An artificial neural network is a computing system composed of connected units, called neurons or nodes, that are organized into layers.
A network has an input layer, one or more hidden layers, and an output layer. Each node in the input layer represents an individual feature of the input data. The number of nodes in the output layer matches the number of prediction classes present in the training set. The number of nodes in each hidden layer is a design choice.
The most basic kind of layer is a dense (fully connected) layer, where each output of the layer is computed using every input to the layer, so each node in a layer is connected to all nodes in the next layer.
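A dense layer can be sketched as a matrix multiply: every output is a weighted combination of every input. This is a minimal illustration with made-up weights, not any particular library's API:

```python
import numpy as np

def dense(inputs, weights, biases):
    # inputs: shape (n_in,); weights: shape (n_in, n_out); biases: shape (n_out,)
    # Each of the n_out outputs depends on every one of the n_in inputs.
    return inputs @ weights + biases

x = np.array([1.0, 2.0, 3.0])   # 3 input features
W = np.ones((3, 2)) * 0.5       # 3 inputs fully connected to 2 outputs
b = np.zeros(2)
print(dense(x, W, b))           # [3. 3.] -- each output sums 0.5*1 + 0.5*2 + 0.5*3
```

Because every input contributes to every output, the layer has `n_in * n_out` weights plus `n_out` biases.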
Other layer types include convolutional layers for image data and recurrent layers for sequential (time-series) data; each applies a different transformation to its inputs.
Each connection between two nodes has an associated weight, which is simply a number. An input traveling along a connection is multiplied by that connection's weight.
For each node in the next layer, a weighted sum of the incoming connections is computed. This sum is then passed to an activation function.
node output = activation(weighted sum of inputs) = activation(a_1 w_1 + a_2 w_2 + ... + a_N w_N)
Forward pass: once we obtain the output for a given node, it is passed as input to the nodes in the next layer, and so on until the output layer is reached. The weights are updated as the network learns.
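The forward pass described above can be sketched as a loop: each layer computes its weighted sums, applies the activation, and hands the result to the next layer. The layer sizes and the choice of ReLU here are illustrative assumptions:

```python
import numpy as np

def relu(z):
    # A common activation function: zeroes out negative values.
    return np.maximum(0.0, z)

def forward(x, layers):
    # layers: list of (W, b) pairs; each layer's output feeds the next.
    a = x
    for W, b in layers:
        a = relu(a @ W + b)   # weighted sum of inputs, then activation
    return a

rng = np.random.default_rng(0)
layers = [(rng.standard_normal((3, 4)), np.zeros(4)),   # 3 inputs -> 4 hidden nodes
          (rng.standard_normal((4, 2)), np.zeros(2))]   # 4 hidden -> 2 outputs
out = forward(np.array([1.0, -1.0, 0.5]), layers)
print(out.shape)  # (2,)
```

In practice the final layer usually gets its own activation (e.g. softmax for classification), but the layer-to-layer flow is the same.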
In an artificial neural network, an activation function is a non-linear function applied to the weighted sum of a node's inputs. The result determines whether the node activates, loosely analogous to how biological neurons fire.
If we only had linear transformations of our data values during a forward pass, the learned mapping from input to output would itself be linear, no matter how many layers we stacked. Non-linear activation functions are what allow neural networks to approximate arbitrarily complex functions.
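The collapse of stacked linear layers can be checked directly: two linear maps compose into a single matrix, while inserting a non-linearity breaks that equivalence. The matrices here are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.standard_normal((4, 3))   # "layer 1": 3 inputs -> 4 outputs
W2 = rng.standard_normal((2, 4))   # "layer 2": 4 inputs -> 2 outputs
x = rng.standard_normal(3)

# Two linear layers are exactly one linear layer: W2 @ (W1 @ x) == (W2 @ W1) @ x.
two_linear = W2 @ (W1 @ x)
one_linear = (W2 @ W1) @ x
print(np.allclose(two_linear, one_linear))  # True: depth adds no expressive power

# With a non-linearity between the layers, the composition is no longer
# expressible as a single matrix, so depth now matters.
relu = lambda z: np.maximum(0.0, z)
nonlinear = W2 @ relu(W1 @ x)
```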
- Activation Functions in Neural Networks, by Sagar Sharma