Building a basic neural network [Post #2, Day 1]

I followed Graham Ganssle's Neural Networks tutorial to build a basic neural network (a multilayer perceptron) from scratch. I studied this during my PhD, so it's nice to come back to it now. My understanding is that the multilayer perceptron is an 'entry-level' artificial neural network, but its concepts underpin more advanced architectures like Transformers.

The basics are an input layer, one or more hidden layers, and an output layer. Each layer applies an activation function such as sigmoid (which squashes numbers into the range 0 to 1) or ReLU (which I haven't learned about yet). A key aspect is the backpropagation of error, which is used to update the weights and biases applied in the neural network. Backpropagation was popularized for training neural networks by Rumelhart, Hinton, and Williams; I watched Geoffrey Hinton's interview on 60 Minutes. I found the video because I first went to Andrej Karpathy's website and learned that he attended lectures by Hinton.

In the training stage, a portion of the data (e.g. 80%) with known answers is used so that the error (loss) can be calculated and used to update the weights and biases (the parameters of the neural network). Hyperparameters must be chosen before training; these include the number of epochs (complete passes through the training data) and the learning rate, which controls how much the weights and biases are adjusted (using gradient descent, for example) at each update.
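To make those pieces concrete, here is a minimal sketch of a multilayer perceptron in NumPy. This is not Ganssle's tutorial code; it is a toy I put together, trained on the classic XOR problem, to show the input/hidden/output layers, the sigmoid activation, backpropagation with gradient descent, and the epoch and learning-rate hyperparameters:

```python
import numpy as np

# Two activation functions: sigmoid squashes any number into (0, 1);
# ReLU simply replaces negative inputs with zero.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_deriv(s):
    # Derivative of sigmoid, written in terms of the sigmoid output s
    return s * (1.0 - s)

def relu(x):
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)

# Toy dataset: XOR, which a network with no hidden layer cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Parameters: weights and biases for one hidden layer and one output layer.
W1 = rng.normal(scale=0.5, size=(2, 4))   # input -> hidden
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))   # hidden -> output
b2 = np.zeros((1, 1))

# Hyperparameters, chosen before training starts.
learning_rate = 1.0
epochs = 5000

losses = []
for epoch in range(epochs):
    # Forward pass through the layers
    hidden = sigmoid(X @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)

    # Loss (mean squared error) and backpropagation of its gradient
    error = output - y
    losses.append(float(np.mean(error ** 2)))
    d_output = error * sigmoid_deriv(output)
    d_hidden = (d_output @ W2.T) * sigmoid_deriv(hidden)

    # Gradient descent: nudge weights and biases against the gradient
    W2 -= learning_rate * hidden.T @ d_output
    b2 -= learning_rate * d_output.sum(axis=0, keepdims=True)
    W1 -= learning_rate * X.T @ d_hidden
    b1 -= learning_rate * d_hidden.sum(axis=0, keepdims=True)

preds = (output > 0.5).astype(int)
print(preds.ravel())  # final predictions after training
```

The loss should fall over the epochs as backpropagation adjusts the parameters; with a good random start the network typically learns XOR.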

I wanted to get an overall framework of AI in my head, so I used Claude to help me generate this image. I am adding notes to it as I learn more.
[image: an overall framework of AI]

I am using Cursor to help with coding; it has a helpful chat window, and I am using it to add annotations to each line of code to help me remember what is happening.
