I'm looking for the fastest way to check for the occurrence of NaN (`np.nan`) in a NumPy array `X`. `np.isnan(X)` is out of the question, since it builds a boolean array of shape `X.shape`, which is potentially gigantic.

I tried `np.nan in X`, but that seems not to work because `np.nan != np.nan`. Is there a fast and memory-efficient way to do this at all?

(To those who would ask "how gigantic": I can't tell. This is input validation for library code.)
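
For reference, here is a small sketch of why the `np.nan in X` attempt fails. The three-element array is made up for illustration. As far as I understand it, membership on an ndarray boils down to an elementwise `==` followed by `any()`, and NaN never compares equal to anything:

```python
import numpy as np

# Hypothetical tiny array, used only to illustrate the behaviour.
X = np.array([1.0, np.nan, 3.0])

# NaN never compares equal to anything, itself included.
nan_equals_itself = np.nan == np.nan

# Membership on an ndarray compares elementwise with ==, so it misses the NaN.
found_by_in = np.nan in X

# The elementwise check does find it, at the cost of a temporary boolean array.
found_by_isnan = bool(np.isnan(X).any())

print(nan_equals_itself, found_by_in, found_by_isnan)
```

Only the last check reports the NaN, which is why the question is really about avoiding the `X.shape`-sized temporary.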

This question is related to `python`, `performance`, `numpy`, `nan`.

Ray's solution is good. However, on my machine it is about 2.5x faster to use `numpy.sum` in place of `numpy.min`:

```
In [13]: %timeit np.isnan(np.min(x))
1000 loops, best of 3: 244 us per loop
In [14]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 97.3 us per loop
```

Unlike `min`, `sum` doesn't require branching, which on modern hardware tends to be pretty expensive. This is probably the reason why `sum` is faster.
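
If it helps, the `sum`-based check could be wrapped as an input-validation helper along these lines. This is only a sketch: the name `has_nan` is my own and the test arrays are made up.

```python
import numpy as np

def has_nan(x):
    """Return True if the floating-point array `x` contains at least one NaN.

    `has_nan` is a hypothetical helper name, not part of NumPy.
    """
    # NaN propagates through addition, so a single NaN makes the whole sum NaN.
    # np.sum reduces to a scalar, so no boolean array of x.shape is created.
    return bool(np.isnan(np.sum(x)))

clean = np.random.rand(1000)   # values in [0, 1), no NaN
dirty = clean.copy()
dirty[500] = np.nan            # plant a single NaN in the middle

print(has_nan(clean), has_nan(dirty))
```

One caveat worth keeping in mind: a sum can also come out as NaN without any NaN in the input, for example when the array contains both `+inf` and `-inf`. If infinities are legal input, a NaN sum should probably be treated as "maybe" and confirmed with a slower exact check.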

**edit** The above test was performed with a single NaN right in the middle of the array.

It is interesting to note that `min` is slower in the presence of NaNs than in their absence. It also seems to get slower as NaNs get closer to the start of the array. On the other hand, `sum`'s throughput seems constant regardless of whether there are NaNs and where they're located:

```
In [40]: x = np.random.rand(100000)
In [41]: %timeit np.isnan(np.min(x))
10000 loops, best of 3: 153 us per loop
In [42]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 95.9 us per loop
In [43]: x[50000] = np.nan
In [44]: %timeit np.isnan(np.min(x))
1000 loops, best of 3: 239 us per loop
In [45]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 95.8 us per loop
In [46]: x[0] = np.nan
In [47]: %timeit np.isnan(np.min(x))
1000 loops, best of 3: 326 us per loop
In [48]: %timeit np.isnan(np.sum(x))
10000 loops, best of 3: 95.9 us per loop
```
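
To repeat the comparison outside IPython, something like the following should work. It is a rough sketch using the standard `timeit` module. The helper name `time_checks` and the repetition count are my own choices, and the absolute numbers will differ from machine to machine and between NumPy versions.

```python
import timeit
import numpy as np

def time_checks(x, number=200):
    """Return average seconds per call for the min-based and sum-based checks."""
    t_min = timeit.timeit(lambda: np.isnan(np.min(x)), number=number) / number
    t_sum = timeit.timeit(lambda: np.isnan(np.sum(x)), number=number) / number
    return t_min, t_sum

x = np.random.rand(100000)
# Same three scenarios as the session above; the NaNs accumulate between steps.
for label, index in [("no NaN", None), ("NaN in middle", 50000), ("NaN at start", 0)]:
    if index is not None:
        x[index] = np.nan
    t_min, t_sum = time_checks(x)
    print(f"{label:14s} min: {t_min * 1e6:8.1f} us   sum: {t_sum * 1e6:8.1f} us")
```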
