[python] Deep-Learning Nan loss reasons

There are lots of things I have seen make a model diverge.

  1. Too high of a learning rate. You can often tell if this is the case if the loss begins to increase and then diverges to infinity.

  2. I am not to familiar with the DNNClassifier but I am guessing it uses the categorical cross entropy cost function. This involves taking the log of the prediction which diverges as the prediction approaches zero. That is why people usually add a small epsilon value to the prediction to prevent this divergence. I am guessing the DNNClassifier probably does this or uses the tensorflow opp for it. Probably not the issue.

  3. Other numerical stability issues can exist such as division by zero where adding the epsilon can help. Another less obvious one if the square root who's derivative can diverge if not properly simplified when dealing with finite precision numbers. Yet again I doubt this is the issue in the case of the DNNClassifier.

  4. You may have an issue with the input data. Try calling assert not np.any(np.isnan(x)) on the input data to make sure you are not introducing the nan. Also make sure all of the target values are valid. Finally, make sure the data is properly normalized. You probably want to have the pixels in the range [-1, 1] and not [0, 255].

  5. The labels must be in the domain of the loss function, so if using a logarithmic-based loss function all labels must be non-negative (as noted by evan pu and the comments below).

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to tensorflow

Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation Module 'tensorflow' has no attribute 'contrib' Tensorflow 2.0 - AttributeError: module 'tensorflow' has no attribute 'Session' Could not install packages due to an EnvironmentError: [WinError 5] Access is denied: How do I use TensorFlow GPU? Which TensorFlow and CUDA version combinations are compatible? Could not find a version that satisfies the requirement tensorflow pip3: command not found How to import keras from tf.keras in Tensorflow? Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2

Examples related to machine-learning

Error in Python script "Expected 2D array, got 1D array instead:"? How to predict input image using trained model in Keras? What is the role of "Flatten" in Keras? How to concatenate two layers in keras? How to save final model using keras? scikit-learn random state in splitting dataset Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? What is the meaning of the word logits in TensorFlow? Can anyone explain me StandardScaler? Can Keras with Tensorflow backend be forced to use CPU or GPU at will?

Examples related to keras

Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Tensorflow 2.0 - AttributeError: module 'tensorflow' has no attribute 'Session' What is the use of verbose in Keras while validating the model? Save and load weights in keras How to import keras from tf.keras in Tensorflow? How to check which version of Keras is installed? Can I run Keras model on gpu? How to check if keras tensorflow backend is GPU or CPU version? Keras input explanation: input_shape, units, batch_size, dim, etc

Examples related to theano

Deep-Learning Nan loss reasons Keras, how do I predict after I trained a model? Keras model.summary() result - Understanding the # of Parameters How do I install Keras and Theano in Anaconda Python on Windows?