How Neural Networks Learn (Explained in plain English)

There are layers to learning anything. It starts with an intuitive understanding, which is then followed by grasping the theory and, finally, being able to apply it yourself.

Many resources follow the “immerse yourself in it right away” method. They aim to teach all of it in one go, diving straight into the math and the proofs. They do this for the sake of completeness, of course. But in the age of seriously shortened attention spans, I don’t think telling people everything there is to know about a subject right off the bat is a good idea.

As you know, I've been working on my new course, which will have a lot of hands-on exercises and complete explanations of how deep learning works. But I’d like to share an intuitive explanation of some of the main concepts of deep learning with you through these emails.

Today we have forward and backward propagation (aka how a neural network learns).

A brief background: deep learning has its base in neural networks. Neural networks are (you guessed it!) networks of neurons. A vertical group of neurons is called a layer. Each neuron accepts inputs, does some calculations and spits out an output that is sent to all the neurons in the next layer. See the example diagram below.

[Diagram: a neural network]
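If you like seeing things in code, here is a minimal sketch of a single neuron in Python. The sigmoid activation and all the numbers are my own assumptions for illustration; real networks use many different activation functions:

```python
import math

def neuron(inputs, weights, bias):
    # A weighted sum of the inputs, plus a bias term...
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ...squashed through an activation function (sigmoid here) into one output.
    return 1 / (1 + math.exp(-total))

# Two inputs, two made-up weights, a made-up bias.
print(neuron([0.5, 0.2], weights=[0.4, -0.6], bias=0.1))  # ~0.545
```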

These calculations are done in all the layers until the network reaches the output layer (depicted on the right, before the arrow, in this example). Forward propagation is the process of collecting outputs from all neurons, layer by layer, and calculating a final output.

It is not a learning step; it is only an inference step. Forward propagation is the network’s way of predicting an output, based on the input that is given to it. And of course, in the beginning, it is not perfect. It makes big mistakes.
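Here is how that might look in code: a sketch of forward propagation that pushes the inputs through the layers, one after the other. The network shape and the weights are made up:

```python
import math

def neuron(inputs, weights, bias):
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-total))  # sigmoid activation, as before

def forward(inputs, layers):
    activations = inputs
    for layer in layers:
        # Every neuron in this layer sees the outputs of the previous layer.
        activations = [neuron(activations, w, b) for w, b in layer]
    return activations  # the last layer's outputs are the prediction

hidden_layer = [([0.4, -0.6], 0.1), ([0.3, 0.8], -0.2)]  # two hidden neurons
output_layer = [([0.5, -0.5], 0.0)]                      # one output neuron
print(forward([0.5, 0.2], [hidden_layer, output_layer]))  # a (probably bad) first prediction
```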

Backpropagation (together with gradient descent) is the name of the process that learns from these mistakes. It looks at the error that was made and updates the neurons, so that next time their calculations will lead to a more accurate prediction.

If a network had only one layer, it would be very easy to know which neuron to tweak to get a better output. But that is not usually the case. We have layers affecting layers, affecting layers affecting the output. And somehow, we still need to know how to get better predictions.

Backpropagation calculates how much each neuron contributes to the output and in what way. As in, it answers the question (for every single neuron): what will happen to the output if I change this neuron?
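Backpropagation answers that question analytically, with calculus. But just to make the question concrete, here is a numerical stand-in (not backpropagation itself): nudge one weight a tiny bit and watch how much the output moves. The toy function and the numbers are made up:

```python
def nudge_effect(f, weights, i, eps=1e-5):
    # Estimate how much f's output changes per unit change in weights[i].
    nudged = list(weights)
    nudged[i] += eps
    return (f(nudged) - f(weights)) / eps

# A toy "network": output = w0 * 0.5 + w1 * 0.2
f = lambda w: w[0] * 0.5 + w[1] * 0.2
print(nudge_effect(f, [0.4, -0.6], i=0))  # ~0.5: this weight matters more
print(nudge_effect(f, [0.4, -0.6], i=1))  # ~0.2: this one matters less
```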

Neurons have varying levels of effect on the output, and once we know what those effects are, gradient descent decides how to update them. It’s kind of like the two are collaborating: backpropagation provides the information, and gradient descent makes the decisions.
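In code, that division of labour might look like this. The gradients below are pretend values (in reality, backpropagation computes them), and the learning rate is a made-up knob:

```python
def gradient_descent_step(weights, gradients, learning_rate=0.1):
    # Move every weight a small step against its gradient,
    # i.e. in the direction that should reduce the error.
    return [w - learning_rate * g for w, g in zip(weights, gradients)]

weights = [0.4, -0.6]
gradients = [0.5, 0.2]  # pretend these came from backpropagation
print(gradient_descent_step(weights, gradients))  # [0.35, -0.62]
```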

This process of forward propagation, calculating the error, doing backpropagation and updating the neurons, repeated over and over, is how a neural network learns.
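Here is the whole loop in one tiny sketch, on the simplest possible “network”: a single weight and no activation, so the backpropagation step is just one line of calculus done by hand. All the numbers are made up:

```python
weight = 0.0
x, target = 2.0, 1.0   # one training example
learning_rate = 0.1

for step in range(5):
    prediction = weight * x                    # forward propagation
    error = (prediction - target) ** 2         # how wrong were we?
    gradient = 2 * (prediction - target) * x   # backpropagation (by hand)
    weight -= learning_rate * gradient         # gradient descent update
    print(f"step {step}: error={error:.4f}, weight={weight:.4f}")
```

Run it and you’ll see the error shrink on every step. That shrinking is the learning.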

You might, of course, want to go further than an intuitive understanding of these concepts. And for that, I’ll be looking forward to welcoming you to the course.

I'm planning on making a video explaining these concepts so keep an eye out for it on YouTube!