Z2H Video 7, finished watching [Post #20, Day 43]

I have finished watching Video 7 and am feeling discouraged. There was a lot I didn't fully understand. And so far we have only covered the pre-training stage of LLMs, which produces a document completer. There are still several more steps required to build an actual assistant like ChatGPT.

I did learn that the "attention" in Attention Is All You Need is basically the ability for tokens to communicate with one another. Here are some notes I took down in my Jupyter Notebook:

  • each token knows its content and its position
  • the 8th token creates a query, saying "hey, I'm a vowel in the 8th position, and I'm looking for consonants at positions up to 4"
  • all the other tokens emit keys, and one key channel could mean "I am a consonant at a position up to 4"
  • a token matching that description would have a high number in that specific channel
  • when the query and key are dot-producted together, they find each other and produce a high affinity
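The notes above describe a single attention head. Here is a minimal sketch of that query/key/value dance in PyTorch; the sizes, variable names, and random data are my own illustration, not code from the video:

```python
# Minimal single-head self-attention sketch (PyTorch).
# Each token emits a query ("what I'm looking for") and a key ("what I contain");
# their dot products become affinities, which weight a sum over values.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
B, T, C = 1, 8, 16          # batch, tokens, channels per token (illustrative sizes)
head_size = 8

x = torch.randn(B, T, C)    # token content (a real model would also add position info)

query = torch.nn.Linear(C, head_size, bias=False)
key   = torch.nn.Linear(C, head_size, bias=False)
value = torch.nn.Linear(C, head_size, bias=False)

q = query(x)                                      # "here is what I'm looking for"
k = key(x)                                        # "here is what I contain"
wei = q @ k.transpose(-2, -1) / head_size**0.5    # dot products -> affinities, (B, T, T)

tril = torch.tril(torch.ones(T, T))               # causal mask: only attend to the past
wei = wei.masked_fill(tril == 0, float('-inf'))
wei = F.softmax(wei, dim=-1)                      # each row becomes weights summing to 1

out = wei @ value(x)                              # weighted sum of values, (B, T, head_size)
print(out.shape)
```

Writing it out this small makes the "tokens communicate" idea concrete: row 8 of `wei` is literally how much the 8th token cares about each earlier token.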

I think it would be helpful for me to read back through the notes I took while watching the video. The first two-thirds of the video or so were quite clear; the last third was a challenge. I should keep pushing and get over this hump.

I have started taking notes in the Jupyter Notebooks rather than with pencil and paper; I think it's a pretty good system.

I plan to make a GitHub repository to upload all my materials from the Z2H series.

Still thinking about where I am heading next: probably back to the Video 3 exercises and reading the Bengio et al., 2003 paper.
