How Neural Networks Work - A Primer

I'm Joanna Sandretto and I work as the Lead Developer at Greyfeather Capital. I also happen to be Matt's sister, in case you are wondering. I am a computer scientist whose focus is artificial intelligence. I think it is important for people to understand what neural networks are, and what they can (and cannot) do - so let's start at the beginning.

Artificial intelligence (AI) may seem like the cool new “thing” in computer science, with companies using AI to build assistants like Siri or Cortana, predict what shoppers will buy, and even drive cars, but computer scientists have been researching artificial intelligence for decades. Machine learning (ML) and deep learning (DL) are areas of AI that focus on systems that can learn and improve without human intervention. Neural networks are one way to build systems that learn.

Artificial neural networks are an AI technique that computer scientists have been researching since the 1940s. In fact, Marvin Minsky and Dean Edmonds used 3000 vacuum tubes and a surplus automatic pilot mechanism from a B-24 bomber to build the first neural network computer in 1950 (See Artificial Intelligence: A Modern Approach by Russell and Norvig).

Artificial neural networks are loosely based on the structure of the human brain. In basic terms, a neural network consists of neurons, or units, that are connected to each other. Each unit applies a function that maps multiple inputs to a single output. For example, a network with two inputs and a single layer might look like the following:

A neural network with two inputs and a single layer
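A unit like the one in the diagram can be sketched in a few lines of code. This is a minimal illustration, not a production implementation: the weights, bias, and input values are made-up numbers, and the unit simply takes a weighted sum of its inputs and passes it through an activation function.

```python
import numpy as np

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias term...
    total = np.dot(inputs, weights) + bias
    # ...passed through a simple threshold activation:
    # the neuron "fires" (outputs 1) only if the total is positive.
    return 1.0 if total > 0 else 0.0

# A unit with two inputs, as in the diagram above (values are arbitrary)
output = neuron(np.array([0.5, -0.2]), np.array([0.8, 0.4]), bias=0.1)
print(output)
```

Here the weighted sum is 0.5·0.8 + (−0.2)·0.4 + 0.1 = 0.42, which is above zero, so the unit fires and outputs 1.0.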

The function that inputs are passed through is referred to as the activation function. The activation function controls what value gets passed to the next layer. A neuron is said to “fire” when it passes a value, referred to as an activation value, to the next layer. In some networks, a neuron fires only if its input is above some threshold and otherwise passes no value to the next layer. In more complex networks, neurons can fire with different strengths, and the activation function calculates the value that should be passed to the next layer.
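The two behaviors described above correspond to two common activation functions. A step function captures the all-or-nothing case, while a sigmoid lets a neuron fire with varying strength; both are standard textbook examples, shown here as a sketch.

```python
import math

def step(x, threshold=0.0):
    # All-or-nothing: fires (1) only if the input exceeds the threshold,
    # otherwise passes nothing (0) to the next layer.
    return 1.0 if x > threshold else 0.0

def sigmoid(x):
    # Graded firing: squashes any input into a strength between 0 and 1,
    # so the neuron can fire "weakly" or "strongly".
    return 1.0 / (1.0 + math.exp(-x))
```

For example, `step(0.3)` and `step(5.0)` both return 1.0, while `sigmoid(0.3)` and `sigmoid(5.0)` return roughly 0.57 and 0.99, reflecting different firing strengths.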

There are many types of networks, but in most cases the units are organized into layers, with each layer consisting of several neurons. Networks differ in the number of layers and the number of neurons in each layer. The basic goal of the network is to learn a mapping between inputs and desired outputs, which allows the network to accurately predict the output for a given set of inputs.

The first layer is an input layer. The number of neurons in the input layer is defined by the number of features, or inputs, chosen for the data. As a crude example, let’s say we wanted to investigate to what extent leverage levels are associated with overall stock performance. The output could be the 3-month cumulative return, and the input layer could be the past 12 quarters of debt/equity ratios. In this case, the input layer would have 12 neurons, one for each quarter’s debt/equity ratio.
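Concretely, that input layer is just a vector of 12 numbers, one per neuron. The sketch below uses made-up debt/equity ratios and random weights purely to show the shapes involved; it is not a trained or meaningful model.

```python
import numpy as np

# Hypothetical debt/equity ratios for the past 12 quarters,
# one value per input neuron (numbers are invented for illustration)
debt_equity = np.array([0.8, 0.9, 1.1, 1.0, 1.2, 1.3,
                        1.1, 0.9, 1.0, 1.4, 1.5, 1.3])

rng = np.random.default_rng(0)
weights = rng.normal(size=12)  # one weight per input neuron (untrained, random)
bias = 0.0

# A single output unit combining all 12 inputs into one predicted value
predicted_return = np.dot(debt_equity, weights) + bias
```

The point is the correspondence: 12 features in the data means 12 neurons in the input layer, and each input neuron needs its own weight on the way to the next layer.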

After the input layer, the network has one or more hidden layers, which transform the data using various functions. Different networks and different types of data require different functions. These layers create the mapping from inputs to outputs.
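A forward pass through one hidden layer can be sketched as two matrix operations. The layer sizes here (12 inputs, 4 hidden units, 1 output) and the random weights are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, b1, W2, b2):
    # Hidden layer: transform the raw inputs with weights plus an activation
    hidden = sigmoid(W1 @ x + b1)
    # Output layer: combine the hidden activations into the final prediction
    return W2 @ hidden + b2

rng = np.random.default_rng(0)
x = rng.normal(size=12)                           # 12 input features
W1, b1 = rng.normal(size=(4, 12)), np.zeros(4)    # hidden layer: 4 units
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)     # output layer: 1 unit
y_hat = forward(x, W1, b1, W2, b2)
```

Stacking more hidden layers just repeats the middle step, which is what lets deeper networks build more complex mappings from inputs to outputs.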

As data passes through the network, the weights associated with each unit's function are adjusted according to certain rules; this adjustment is the learning part of the neural network. A popular way to update the weights is backpropagation. In a network trained with backpropagation, the error between the network's output and the desired output is calculated and fed into an optimization routine, which then adjusts the weights of the neurons to reduce that error.
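Backpropagation with plain gradient descent can be sketched on a toy problem. Everything below is invented for illustration (random data, a made-up target, a tiny 3-4-1 network); the point is the loop: run a forward pass, measure the error, push the error backward through the layers with the chain rule, and nudge each weight downhill.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))            # 50 samples, 3 input features (toy data)
y = (X.sum(axis=1) > 0).astype(float)   # made-up target for illustration

# Small network: 3 inputs -> 4 hidden units -> 1 output
W1, b1 = rng.normal(size=(4, 3)) * 0.5, np.zeros(4)
W2, b2 = rng.normal(size=4) * 0.5, 0.0

lr = 0.5        # learning rate
losses = []
for _ in range(500):
    # Forward pass
    h = sigmoid(X @ W1.T + b1)          # hidden activations, shape (50, 4)
    out = sigmoid(h @ W2 + b2)          # predictions, shape (50,)
    err = out - y                       # error of the network
    losses.append(np.mean(err ** 2))

    # Backward pass: chain rule from the output back to each layer's weights
    d_out = err * out * (1 - out)               # error signal at the output
    grad_W2 = h.T @ d_out / len(X)
    grad_b2 = d_out.mean()
    d_h = np.outer(d_out, W2) * h * (1 - h)     # error propagated to hidden layer
    grad_W1 = d_h.T @ X / len(X)
    grad_b1 = d_h.mean(axis=0)

    # Gradient-descent update: adjust weights to reduce the error
    W2 -= lr * grad_W2; b2 -= lr * grad_b2
    W1 -= lr * grad_W1; b1 -= lr * grad_b1
```

After training, `losses[-1]` is well below `losses[0]`: the repeated error-driven weight adjustments are exactly the "learning" the paragraph above describes. Real libraries automate this gradient computation, but the mechanics are the same.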