Deep learning specialization, finished course 1 [Post #25, Day 51]
February 25, 2025 • 603 words
I have completed the final graded programming assignment of Week 4 of Course 1 for the Deep Learning Specialization. Woohoo! I get a certificate now.
The assignments were interesting and helped me understand how to implement a neural network in Python without using OOP or PyTorch. First, in the Week 2 programming assignment, we implemented a logistic regression model: a single linear unit followed by a sigmoid activation, so no hidden layer(s). The goal was to identify whether a given input photo was a photo of a cat or not. Next, in the Week 3 programming assignment, we implemented a neural network with one hidden layer for classifying data points arranged in a flower shape. In the final graded programming assignment, in Week 4, we built a deep neural network, again for determining whether a photo is of a cat or not. The Jupyter Notebooks are set up a little rigidly, so you stay within the guardrails of the assignment. I wasn't vetting things as much as I did in Andrej's Z2H lectures and exercises; for example, I'm not sure if I have access to the dataset file for the cat photos. I'll have to look into that.
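For my own reference, here is a minimal sketch of the kind of thing the Week 2 assignment has you build: one forward/backward pass of logistic regression in plain NumPy. The function name and layout are mine, not the assignment's.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    """One step of logistic regression: forward pass, cost, gradients.
    X: (n_x, m) inputs; Y: (1, m) labels; w: (n_x, 1) weights; b: scalar bias."""
    m = X.shape[1]
    A = sigmoid(w.T @ X + b)                                  # predictions, shape (1, m)
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))  # binary cross-entropy
    dw = (X @ (A - Y).T) / m                                  # gradient w.r.t. w, (n_x, 1)
    db = np.mean(A - Y)                                       # gradient w.r.t. b
    return dw, db, cost
```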
Overall it was a very good learning experience, and it was good to revisit concepts I'd seen previously in Andrej's Z2H course, but from a different perspective and with different constructions of the neural nets. I think I now have a better understanding of working out the matrix sizes feeding into the different layers, both forward and backward.
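To make that concrete, here's a tiny sketch of the shape rule I keep coming back to (the layer sizes and activation below are made up): W[l] has shape (n[l], n[l-1]), so Z[l] = W[l] A[l-1] + b[l] comes out to (n[l], m).

```python
import numpy as np

# Made-up layer sizes: 4 input features -> 3 hidden units -> 1 output
layer_dims = [4, 3, 1]
m = 5  # number of examples

A = np.random.randn(layer_dims[0], m)  # A[0] = X, shape (n[0], m)
for l in range(1, len(layer_dims)):
    W = np.random.randn(layer_dims[l], layer_dims[l - 1])  # W[l]: (n[l], n[l-1])
    b = np.zeros((layer_dims[l], 1))                       # b[l]: (n[l], 1), broadcasts over m
    Z = W @ A + b                                          # Z[l]: (n[l], m)
    A = np.tanh(Z)                                         # stand-in activation
    print(f"layer {l}: W {W.shape} @ A_prev -> Z {Z.shape}")
```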
Something I thought of recently: I always need a foundational base to come back to, something simple and solid that I can fall back on when things start to feel complicated. For neural network programming, I'm thinking it's matrices of numbers. That is what is being manipulated at the end of the day.
A thought I had today: is there a gap in applying AI solutions to spatial data, like geotechnical data, GIS data, etc.? It seems that if I want to make an impact with my applications to tech companies working in AI, I need to build something on my own that I can show, talk about, and perhaps write a blog post about. So as I'm working through the rest of the Deep Learning Specialization courses and the rest of Andrej's Z2H videos and exercises, I need to be continually thinking about things I could build with my new knowledge and toolkit. I have some ideas for useful apps, but nothing specifically leveraging neural nets yet. I'll continue to think about it.
I am now working through Course 2, Week 1.
Some notes from the second graded programming assignment:
- The value of 𝜆 is a hyperparameter that you can tune using a dev set
- L2 regularization makes your decision boundary smoother; if 𝜆 is too large, it is also possible to "oversmooth", resulting in a model with high bias (a small sketch of how this works follows below)
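Here's a minimal sketch, in plain NumPy, of how the 𝜆 term enters the cost and the weight gradients. The function name and argument layout are my own, not the assignment's.

```python
import numpy as np

def add_l2(cost, Ws, dWs, lambd, m):
    """Fold an L2 penalty into the cost and the weight gradients.
    Ws: list of weight matrices; dWs: matching unregularized gradients."""
    cost += (lambd / (2 * m)) * sum(np.sum(np.square(W)) for W in Ws)
    # The derivative of the penalty is (lambd / m) * W for each layer:
    # this is the extra term pushing the weights toward 0.
    dWs = [dW + (lambd / m) * W for dW, W in zip(dWs, Ws)]
    return cost, dWs
```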
I remember Andrej saying this kind of regularization is like applying pressure on the weights to be closer to 0.
I also learned about dropout, another regularization technique, which is cool. That's a new one whose details I didn't learn in the Z2H series (yet, anyway).
From the graded programming assignment Jupyter Notebook:
When you shut some neurons down, you actually modify your model. The idea behind drop-out is that at each iteration, you train a different model that uses only a subset of your neurons. With dropout, your neurons thus become less sensitive to the activation of one other specific neuron, because that other neuron might be shut down at any time.
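To pin the idea down for myself, here's a minimal sketch of the inverted-dropout forward step in NumPy (the function name is mine). The mask gets cached so the same units can be shut off again in backprop.

```python
import numpy as np

def dropout_forward(A, keep_prob, rng=None):
    """Inverted dropout on a layer's activations A during training.
    Each unit survives with probability keep_prob; survivors are scaled
    by 1/keep_prob so the expected activation matches test time (no dropout)."""
    rng = rng or np.random.default_rng()
    D = rng.random(A.shape) < keep_prob  # boolean mask, one entry per unit per example
    A = (A * D) / keep_prob              # shut dropped units off, rescale the rest
    return A, D                          # cache D to apply the same mask in backprop
```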