Z2H Video 5, back to the start [Post #11, Day 26]

I started watching Z2H Video 5, and Andrej recommended working out the exercises myself before he revealed the solutions in the video. I didn't feel prepared for that, so it seemed like a good time to circle back to the first video and solidify more of the details in my mind.

I'm stepping through Video 1 again now, and I can already tell I understand a lot more of the fine details than I did the first time.

Intuitive explanation of the chain rule from Wikipedia:

Intuitively, the chain rule states that knowing the instantaneous rate of change of z relative to y and that of y relative to x allows one to calculate the instantaneous rate of change of z relative to x as the product of the two rates of change.

As put by George F. Simmons: "If a car travels twice as fast as a bicycle and the bicycle is four times as fast as a walking man, then the car travels 2 × 4 = 8 times as fast as the man."

I like the Leibniz notation version too.
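
For reference, this is the Leibniz form as I understand it, using the same z, y and x as the Wikipedia quote above:

```latex
\frac{dz}{dx} = \frac{dz}{dy} \cdot \frac{dy}{dx}
```

Mapping the Simmons example onto it (my own labelling): the car relative to the bicycle is dz/dy = 2, the bicycle relative to the man is dy/dx = 4, so the car relative to the man is dz/dx = 2 × 4 = 8.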

The local derivatives of a plus node are always one, so you can think of it as routing the incoming gradient straight back to its input nodes (the next nodes in the backward pass, i.e. the previous ones in the forward pass).
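
To convince myself of this, here is a small numerical check in plain Python. It is only a sketch: the helper `numerical_grad`, the input values and the `upstream` gradient are all made up for illustration and are not taken from the video.

```python
def numerical_grad(f, args, i, h=1e-6):
    """Central-difference estimate of the derivative of f w.r.t. args[i]."""
    up = list(args)
    up[i] += h
    down = list(args)
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


def add(a, b):
    return a + b


a, b = 3.0, -2.0

# Local derivatives of the plus node with respect to each input.
da = numerical_grad(add, [a, b], 0)
db = numerical_grad(add, [a, b], 1)

# Hypothetical gradient arriving at the plus node's output from later in the graph.
upstream = 5.0

# Chain rule: upstream gradient times local derivative.
grad_a = upstream * da
grad_b = upstream * db

print(da, db)          # both very close to 1
print(grad_a, grad_b)  # both very close to the upstream gradient
```

Both inputs end up with (approximately) the upstream gradient, unchanged, which is the "routing" behaviour described above.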

Backpropagation is a recursive application of the chain rule backwards through the computation graph.
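
Here is a minimal sketch of that idea, loosely in the spirit of the Value class built in Video 1 but heavily simplified. The attribute names (`_parents`, `_backward`) and the example numbers are my own assumptions, and it only supports + and *.

```python
class Value:
    """Scalar node in a computation graph (simplified sketch)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0                  # d(output)/d(this node), filled in by backward()
        self._parents = parents          # nodes this one was computed from
        self._backward = lambda: None    # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # Plus node: local derivatives are 1, so the gradient is routed through.
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # Times node: local derivative w.r.t. one input is the other input.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order so each node runs after everything that depends on it.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0  # derivative of the output with respect to itself
        for node in reversed(order):
            node._backward()


a = Value(2.0)
b = Value(-3.0)
c = Value(10.0)
d = a * b + c   # forward pass: 2 * -3 + 10

d.backward()    # backward pass: chain rule applied node by node
print(d.data, a.grad, b.grad, c.grad)
```

Each `_backward` multiplies the gradient arriving at a node's output by that node's local derivative, which is the chain rule applied once; `backward()` just repeats that from the output back to the inputs.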
