Complex networks, like the ones used in deep learning, are built from many individual components. In this chapter, we take a look at the perceptron, the building block that neural networks are composed of. Keep scrolling!
Early neural network researchers in the 1950s took inspiration from the neuron, the building block of the brain, to create an algorithm called the perceptron.
To understand what a perceptron is, let’s first take a look at a neuron.
A neuron is composed of 3 main parts:
Dendrites: Receivers of neural information that take in stimuli from other neurons.
Soma: The “core” of a neuron where the nucleus is located; it decides if a signal will be sent to other neurons.
Axon: The long, extending neck of the neuron that passes signals on to neighboring neurons.
The perceptron, similarly, also has 3 major parts:
Inputs: Inputs take in signals in the form of numbers and pass them to the perceptron.
Perceptron: The perceptron performs some calculations on the inputs it receives.
Output: The output spits out the result of those calculations.
Between the inputs and the perceptron, the axon-to-dendrite connection is simplified to an edge, and the signals transmitted are represented with numbers.
The input on top here is connected to the perceptron with a thicker edge, so the signal is amplified, in this case by a factor of 2, by the time it reaches the perceptron.
However, the input at the bottom has a thinner connection to the perceptron, in this case reducing the signal to two thirds of what it was. The perceptron then sums the received signals: 2 from the top and 2 from the bottom, resulting in a combined signal of 4.
The thickness of an edge is called its weight.
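The weighted-sum step above can be sketched in a few lines of Python. The concrete input values (1 and 3) are assumptions chosen so that each edge delivers a signal of 2, matching the walkthrough:

```python
# A minimal sketch of the weighted-sum step.
# Assumed values: the top input sends 1 through a weight of 2,
# and the bottom input sends 3 through a weight of 2/3,
# so each edge delivers a signal of 2 to the perceptron.
inputs = [1.0, 3.0]
weights = [2.0, 2.0 / 3.0]

# Multiply each input by its edge's weight, then sum the results.
weighted_sum = sum(x * w for x, w in zip(inputs, weights))
print(weighted_sum)  # 2.0 + 2.0 = 4.0
```

Changing a weight changes how much that input's signal counts toward the sum, which is exactly what the thickness of an edge represents.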
Next, the perceptron processes the signal by "shaking" itself. In this case, doing so increases the signal by 1.
The “shakiness” of a perceptron is called bias, which can either increase or decrease the value of the signal by any set amount.
Finally, an activation function is used to decide how strong a signal the perceptron should transmit to its neighbors. More on this issue soon!
The sigmoid function, a type of activation function, is used here. It maps the signal value into the range 0–1. In our case, 5 is mapped to 0.99, which becomes the final output.
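Putting the pieces together, here is a minimal sketch of the full forward pass described so far. The input and weight values are the same assumptions as before (a weighted sum of 4), with a bias of 1 and sigmoid applied at the end:

```python
import math

def sigmoid(z):
    # Squash any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias):
    # 1. Weighted sum of the incoming signals.
    # 2. Add the bias (the "shakiness").
    # 3. Pass the result through the activation function.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)

# Assumed values matching the walkthrough:
# weighted sum = 2 + 2 = 4, plus a bias of 1, gives 5;
# sigmoid(5) is approximately 0.99.
output = perceptron(inputs=[1.0, 3.0], weights=[2.0, 2.0 / 3.0], bias=1.0)
print(round(output, 2))  # 0.99
```

Weighted sum, bias, and activation function: those three steps are the entire perceptron.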
Early researchers tried to use these perceptrons to determine how likely it is that an image contains a human face, but they quickly ran into a problem: the model's output was often above 100% or below 0%, which is impossible for a probability.
To tackle this problem, the researchers put an activation function behind the output of the perceptron, which could control the output range of that perceptron. The sigmoid function and ReLU are examples of activation functions that are commonly used to achieve different output ranges.
Sigmoid calculates the probability that something is true. For example, if you want to determine if an image is a cat or not, you would use a sigmoid activation function, which transforms the output value into a probability (a number between 0 and 1).
ReLU makes sure that the final output is non-negative, and is commonly used in multilayer networks. Unlike sigmoid, whose main purpose is to limit the output range, ReLU is used to break linearity so that multilayer networks can learn more complex concepts. We will talk about this notion in depth in Chapter 5.
Up to this point, we have looked at how neurons inspired the perceptron algorithm, and what components make up the perceptron. In the next chapter, we will discuss how data can be used to teach perceptrons and other more complex neural networks.
Perceptron: A neuron-inspired algorithm that is the building block of more complex neural networks.
Weight: A numeric value associated with an edge (the connection between two perceptrons). It can amplify, reduce, or invert signals.
Bias: A numeric value added onto the weighted sum of all input signals.
Activation Function: A function that is used to control the final output of a perceptron.