Z2H Video 1, exercises complete [Post #13, Day 29]

I was able to slog my way through the exercises in Google Colab.

Section 1 was relatively straightforward. Once I got the hang of the Google Colab notebook and what was being asked, I was able to compute the partial derivatives (e.g., the partial derivative of f with respect to a) with some help from WolframAlpha. I then used the partial derivatives df/da, df/db, and df/dc to compute the analytical gradient for the given inputs a, b, and c.
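
For anyone curious what I mean, here's a minimal sketch of the idea with a toy function of my own (not the exercise's actual function): write the partial derivatives out by hand, then evaluate them at the given inputs.

```python
from math import sin, cos

# Illustrative toy function, not the one from the exercise.
def f(a, b, c):
    return a**3 * b + sin(c)

def analytical_gradient(a, b, c):
    df_da = 3 * a**2 * b           # ∂f/∂a
    df_db = a**3                   # ∂f/∂b
    df_dc = cos(c)                 # ∂f/∂c
    return [df_da, df_db, df_dc]

print(analytical_gradient(2.0, 3.0, 4.0))   # [36.0, 8.0, cos(4.0)]
```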

Next, I repeated the process numerically, using the "nudge by h" method, doing the calculations in Excel. At first my solution was not accurate to enough decimal places. I was using the h from the video, 0.0001; I realized that if I made this bump smaller, my answer became more and more accurate (which follows from the definition of the derivative as the limit as h approaches zero).
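
Roughly, the check looks like this (same toy function as above): nudge one input by h, see how much f changes, and divide by h. Shrinking h moves the estimate closer to the true value.

```python
from math import sin

# Same illustrative toy function as in the sketch above.
def f(a, b, c):
    return a**3 * b + sin(c)

a, b, c = 2.0, 3.0, 4.0
# Forward difference ("nudge by h"): the estimate of df/da improves as h shrinks.
for h in [1e-4, 1e-6, 1e-8]:
    df_da = (f(a + h, b, c) - f(a, b, c)) / h
    print(f"h={h:g}  df/da ~ {df_da:.8f}")   # analytical value is 36.0
```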

Next, I learned a new formula for a better numerical approximation to the derivative of a function – the symmetric derivative. I implemented this in my Excel spreadsheet as well.
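
As a sketch, the symmetric (central) difference nudges the input on both sides, which cancels the first-order error term and gives a much better estimate for the same h:

```python
from math import sin

# Same illustrative toy function as above.
def f(a, b, c):
    return a**3 * b + sin(c)

a, b, c = 2.0, 3.0, 4.0
h = 1e-4
forward   = (f(a + h, b, c) - f(a, b, c)) / h
symmetric = (f(a + h, b, c) - f(a - h, b, c)) / (2 * h)
print(forward, symmetric)   # the symmetric estimate is much closer to 36.0
```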

Section 2 was more challenging. I had to rewrite the methods in the Value class we had developed during the video, which was tough since I'm still not 100% solid on the whole OOP thing.
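
To give a flavour of what those methods look like, here's a rough sketch patterned on the Value class from the video (my exercise version differs in the details): each operation returns a new Value and wires up a _backward closure for the chain rule.

```python
import math

# Rough sketch of a micrograd-style Value class; details differ from my exercise version.
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._prev = set(_children)
        self._backward = lambda: None

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # Addition just routes the gradient through to both inputs.
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def exp(self):
        out = Value(math.exp(self.data), (self,))

        def _backward():
            # d(e^x)/dx = e^x, so scale out.grad by out.data.
            self.grad += out.data * out.grad
        out._backward = _backward
        return out
```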

I struggled with errors and searched the Discord for helpful comments. I found some from people who had hit the same bugs as me. One was with Python's built-in sum function, which didn't want to add an integer to a Value object: by default, sum starts from the integer 0 and adds each element to that. The fix was to pass counts[0] as the start value and sum counts[1:], so that every addition happens between Value objects, and the computation worked (counts[0] is the Value object at position 0 in the counts list). The other bug was a mixup in my Value class methods between self and self.data, and between out and out.data; once I kept those straight, bam, it worked!

This part got a bit ahead of the lectures, since it introduced negative log likelihood loss, logits, and the softmax function. I think it was good for me to struggle through it a bit, because that makes it stick. When I get to these topics again in the next video, I'll be going in with more knowledge off the bat.
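
Reusing the Value sketch above, the sum() fix looks roughly like this (the values here are made up):

```python
# Python's sum() starts from the integer 0 by default, so the very first
# addition is 0 + Value, which fails if Value doesn't define __radd__.
counts = [Value(1.0), Value(2.7), Value(0.4)]   # assumed list of Value objects

# denominator = sum(counts)                 # TypeError: int + Value
denominator = sum(counts[1:], counts[0])    # start from a Value instead of 0
```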

The final part was implementing the same thing in PyTorch. I was able to do that relatively painlessly; the magic Tab completion from Gemini in the Google Colab notebook helped a bit.
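
For illustration, here's a rough sketch of what that PyTorch version looks like (toy numbers, not the exercise's actual values): logits through softmax, take the negative log likelihood of an assumed "correct" class, then call backward() to get the gradients on the logits.

```python
import torch

logits = torch.tensor([0.0, 3.0, -2.0, 1.0], requires_grad=True)
probs = torch.softmax(logits, dim=0)
loss = -probs[1].log()        # NLL of an assumed correct class, index 1 here
loss.backward()
print(logits.grad)            # gradients to compare against the Value version
```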

I plan to monitor the lecture1-micrograd channel on the Discord, and if any questions come up that I can answer, I'll have a go at them. I think this will be good practice for me as well as helpful for the person asking the question: a double benefit.
