Keras model.summary() result - Understanding the # of Parameters

Question

I have a simple NN model for detecting hand-written digits from a 28x28px image, written in Python using Keras (Theano backend):

    model0 = Sequential()

    # number of epochs to train for
    nb_epoch = 12
    # amount of data each iteration in an epoch sees
    batch_size = 128

    model0.add(Flatten(input_shape=(1, img_rows, img_cols)))
    model0.add(Dense(nb_classes))
    model0.add(Activation('softmax'))
    model0.compile(loss='categorical_crossentropy',
                   optimizer='sgd',
                   metrics=['accuracy'])

    model0.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
               verbose=1, validation_data=(X_test, Y_test))

    score = model0.evaluate(X_test, Y_test, verbose=0)

    print('Test score:', score[0])
    print('Test accuracy:', score[1])

This runs well and I get ~90% accuracy. I then get a summary of my network's structure by doing print(model0.summary()). This outputs the following:

    Layer (type)          Output Shape   Param #     Connected to
    =====================================================================
    flatten_1 (Flatten)   (None, 784)    0           flatten_input_1[0][0]
    dense_1 (Dense)       (None, 10)     7850        flatten_1[0][0]
    activation_1          (None, 10)     0           dense_1[0][0]
    =====================================================================
    Total params: 7850

I don't understand how they get to 7850 total params and what that actually means.

User · Answer

The easiest way to recover the number of inputs feeding a Dense layer from the summary is: Param value / number of units - 1

Number of units is the first argument in predictivemodel.add(Dense(514, ...))
Param value is the Param # column in the model.summary() output

For example, in Paul Lo's answer, the number of inputs to the first layer is 264710 / 514 - 1 = 514

User · Answer

I feed a 514-dimensional real-valued input to a Sequential model in Keras. My model is constructed in the following way:

    predictivemodel = Sequential()
    predictivemodel.add(Dense(514, input_dim=514, W_regularizer=WeightRegularizer(l1=0.000001, l2=0.000001), init='normal'))
    predictivemodel.add(Dense(257, W_regularizer=WeightRegularizer(l1=0.000001, l2=0.000001), init='normal'))
    predictivemodel.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

When I print model.summary(), I get the following result:

    Layer (type)     Output Shape   Param #     Connected to
    ================================================================
    dense_1 (Dense)  (None, 514)    264710      dense_input_1[0][0]
    ________________________________________________________________
    activation_1     (None, 514)    0           dense_1[0][0]
    ________________________________________________________________
    dense_2 (Dense)  (None, 257)    132355      activation_1[0][0]
    ================================================================
    Total params: 397065

For the dense_1 layer, the number of params is 264710. This is obtained as: 514 (input values) * 514 (neurons in the first layer) + 514 (bias values).

For the dense_2 layer, the number of params is 132355. This is obtained as: 514 (input values) * 257 (neurons in the second layer) + 257 (bias values for neurons in the second layer).
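The arithmetic above generalizes to any stack of Dense layers. Here is a minimal sketch in plain Python (no Keras needed); the helper name `dense_param_counts` is made up for illustration and is not part of Keras:

```python
def dense_param_counts(n_inputs, layer_units):
    """Return the parameter count of each Dense layer in a stack.

    Each layer has n_in * n_out weights plus n_out biases.
    """
    counts = []
    n_in = n_inputs
    for n_out in layer_units:
        counts.append(n_in * n_out + n_out)
        n_in = n_out  # this layer's output feeds the next layer
    return counts


# The model above: 514 inputs -> Dense(514) -> Dense(257)
counts = dense_param_counts(514, [514, 257])
print(counts)       # [264710, 132355]
print(sum(counts))  # 397065
```

The per-layer values and their sum match the Param # column and the Total params line of the summary.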

User · Answer

The "None" in the shape means that the dimension does not have a pre-defined size. For example, it can be the batch size you use during training: you keep it flexible by not assigning any value to it, so that you can change your batch size later. The model will infer the shape from the context of the layers.

To get the nodes connected to each layer, you can do the following:

    for layer in model.layers:
        print(layer.name, layer.inbound_nodes, layer.outbound_nodes)
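To make the `None` placeholder concrete, here is a small sketch of how the `Flatten` layer in the question ends up with the output shape `(None, 784)`. It is plain Python that only mimics what Keras reports; `flatten_output_shape` is a hypothetical helper, not a Keras function:

```python
from math import prod


def flatten_output_shape(input_shape):
    """Mimic Flatten: keep the (unknown) batch axis, merge all other axes."""
    return (None, prod(input_shape))


# Input shape from the question: 1 channel, 28 x 28 pixels
shape = flatten_output_shape((1, 28, 28))
print(shape)  # (None, 784)
```

The batch axis stays `None` because it does not affect the layer's weights, so the parameter count is the same for any batch size.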

User · Answer

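For Dense layers:

    output_size * (input_size + 1) == number_parameters

For Conv layers:

    output_channels * (input_channels * window_size + 1) == number_parameters

Consider the following example:

    model = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
        Conv2D(64, (3, 3), activation='relu'),
        Conv2D(128, (3, 3), activation='relu'),
        Dense(num_classes, activation='softmax')
    ])

    model.summary()

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    conv2d_1 (Conv2D)            (None, 222, 222, 32)      896
    _________________________________________________________________
    conv2d_2 (Conv2D)            (None, 220, 220, 64)      18496
    _________________________________________________________________
    conv2d_3 (Conv2D)            (None, 218, 218, 128)     73856
    _________________________________________________________________
    dense_9 (Dense)              (None, 218, 218, 10)      1290
    =================================================================

Calculating params:

    assert 32 * (3 * (3*3) + 1) == 896
    assert 64 * (32 * (3*3) + 1) == 18496
    assert 128 * (64 * (3*3) + 1) == 73856
    assert num_classes * (128 + 1) == 1290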
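The same formulas can be wrapped in small helpers that also reproduce the Output Shape column. The sketch below is plain Python and the function names are made up for illustration. It assumes an `input_shape` of `(224, 224, 3)`, which the answer does not state but which is consistent with the `222` in the first output shape and the `3` input channels in the first assert (an unpadded 3x3 convolution with stride 1 shrinks each spatial dimension by 2):

```python
def conv2d_params(in_channels, out_channels, kernel_h, kernel_w):
    """Per output channel: one kernel_h x kernel_w filter per input channel, plus one bias."""
    return out_channels * (in_channels * kernel_h * kernel_w + 1)


def valid_conv_size(size, kernel):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


in_channels, size = 3, 224  # assumed input: 224 x 224 with 3 channels
for out_channels in (32, 64, 128):
    size = valid_conv_size(size, 3)
    n = conv2d_params(in_channels, out_channels, 3, 3)
    print(size, out_channels, n)
    in_channels = out_channels  # this layer's channels feed the next layer
# prints:
# 222 32 896
# 220 64 18496
# 218 128 73856
```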

User · Answer

The number of parameters is the amount of numbers that can be changed in the model. Mathematically, this is the number of dimensions of your optimization problem. For you as a programmer, each of these parameters is a floating point number, which typically takes 4 bytes of memory, allowing you to predict the size of the model once saved.

The formula for this number is different for each neural network layer type, but for a Dense layer it is simple: each neuron has one bias parameter and one weight per input:

    N = n_neurons * (n_inputs + 1)
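As a rough illustration of the size estimate, the sketch below applies the formula to the model in the question (784 inputs, 10 neurons) and assumes 4-byte float32 parameters. The helper names are hypothetical, and a saved model file will typically be somewhat larger than this figure because it stores more than the raw parameters:

```python
BYTES_PER_FLOAT32 = 4


def dense_params(n_inputs, n_neurons):
    """N = n_neurons * (n_inputs + 1): one weight per input plus one bias."""
    return n_neurons * (n_inputs + 1)


def approx_weight_bytes(n_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Estimate of the memory taken by the parameters alone."""
    return n_params * bytes_per_param


n = dense_params(784, 10)
print(n)                       # 7850
print(approx_weight_bytes(n))  # 31400
```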

User · Answer

The number of parameters is 7850 because every unit has 784 input weights plus one weight for the bias connection. This means that every unit gives you 785 parameters. You have 10 units, so it sums up to 7850.

The role of this additional bias term is really important. It significantly increases the capacity of your model. You can read the details e.g. here: Role of Bias in Neural Networks
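The breakdown can be checked by writing out the shapes of the weight matrix and bias vector by hand. This is a plain-Python sketch, not Keras code, and the variable names are illustrative:

```python
n_inputs = 28 * 28  # flattened 28 x 28 image -> 784 values
n_units = 10        # matches the (None, 10) output shape in the summary

weight_shape = (n_inputs, n_units)  # one weight per (input, unit) pair
bias_shape = (n_units,)             # one bias per unit

n_weights = weight_shape[0] * weight_shape[1]
n_biases = bias_shape[0]

print(n_weights)             # 7840
print(n_biases)              # 10
print(n_weights + n_biases)  # 7850
```

Without the bias terms the layer would have 7840 parameters; the 10 biases account for the rest.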
