[machine-learning] What is the role of the bias in neural networks?

I think that biases are almost always helpful. In effect, a bias value allows you to shift the activation function to the left or right, which may be critical for successful learning.

It might help to look at a simple example. Consider this 1-input, 1-output network that has no bias:

simple network

The output of the network is computed by multiplying the input (x) by the weight (w0) and passing the result through some kind of activation function (e.g. a sigmoid function.)

Here is the function that this network computes, for various values of w0:

network output, given different w0 weights

Changing the weight w0 essentially changes the "steepness" of the sigmoid. That's useful, but what if you wanted the network to output 0 when x is 2? Just changing the steepness of the sigmoid won't really work -- you want to be able to shift the entire curve to the right.

That's exactly what the bias allows you to do. If we add a bias to that network, like so:

simple network with a bias

...then the output of the network becomes sig(w0*x + w1*1.0). Here is what the output of the network looks like for various values of w1:

network output, given different w1 weights

Having a weight of -5 for w1 shifts the curve to the right, which allows us to have a network that outputs 0 when x is 2.

Examples related to machine-learning

Error in Python script "Expected 2D array, got 1D array instead:"? How to predict input image using trained model in Keras? What is the role of "Flatten" in Keras? How to concatenate two layers in keras? How to save final model using keras? scikit-learn random state in splitting dataset Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? What is the meaning of the word logits in TensorFlow? Can anyone explain me StandardScaler? Can Keras with Tensorflow backend be forced to use CPU or GPU at will?

Examples related to neural-network

How to initialize weights in PyTorch? Keras input explanation: input_shape, units, batch_size, dim, etc What is the role of "Flatten" in Keras? How to concatenate two layers in keras? Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? What is the meaning of the word logits in TensorFlow? How to return history of validation loss in Keras Keras model.summary() result - Understanding the # of Parameters Where do I call the BatchNormalization function in Keras? How to interpret "loss" and "accuracy" for a machine learning model

Examples related to artificial-intelligence

How to get Tensorflow tensor dimensions (shape) as int values? How to compute precision, recall, accuracy and f1-score for the multiclass case with scikit learn? What is the optimal algorithm for the game 2048? Epoch vs Iteration when training neural networks What's is the difference between train, validation and test set, in neural networks? What is the role of the bias in neural networks? What is the difference between supervised learning and unsupervised learning? What are good examples of genetic algorithms/genetic programming solutions? source of historical stock data What algorithm for a tic-tac-toe game can I use to determine the "best move" for the AI?

Examples related to backpropagation

What is the role of the bias in neural networks?