[python] deleting rows in numpy array

I have an array that might look like this:

ANOVAInputMatrixValuesArray = [[ 0.96488889, 0.73641667, 0.67521429, 0.592875, 
0.53172222], [ 0.78008333, 0.5938125, 0.481, 0.39883333, 0.]]

Notice that one of the rows has a zero value at the end. I want to delete any row that contains a zero, while keeping any row that contains non-zero values in all cells.

But the array will have different numbers of rows every time it is populated, and the zeros will be located in different rows each time.

I get the number of non-zero elements in each row with the following line of code:

NumNonzeroElementsInRows    = (ANOVAInputMatrixValuesArray != 0).sum(1)

For the array above, NumNonzeroElementsInRows contains: [5 4]

The five indicates that all possible values in row 0 are nonzero, while the four indicates that one of the possible values in row 1 is a zero.

Therefore, I am trying to use the following lines of code to find and delete rows that contain zero values.

for q in range(len(NumNonzeroElementsInRows)):
    if NumNonzeroElementsInRows[q] < NumNonzeroElementsInRows.max():
        p.delete(ANOVAInputMatrixValuesArray, q, axis=0)

But for some reason, this code does not seem to do anything, even though doing a lot of print commands indicates that all of the variables seem to be populating correctly leading up to the code.

There must be some easy way to simply "delete any row that contains a zero value."

Can anyone show me what code to write to accomplish this?

This question is related to python numpy delete-row

The answer is


I might be too late to answer this question, but wanted to share my input for the benefit of the community. For this example, let me call your matrix 'ANOVA', and I am assuming you're just trying to remove rows from this matrix with 0's only in the 5th column.

indx = []
for i in range(len(ANOVA)):
    if int(ANOVA[i,4]) == int(0):
        indx.append(i)

ANOVA = [x for x in ANOVA if not x in indx]

The simplest way to delete rows and columns from arrays is the numpy.delete method.

Suppose I have the following array x:

x = array([[1,2,3],
        [4,5,6],
        [7,8,9]])

To delete the first row, do this:

x = numpy.delete(x, (0), axis=0)

To delete the third column, do this:

x = numpy.delete(x,(2), axis=1)

So you could find the indices of the rows which have a 0 in them, put them in a list or a tuple and pass this as the second argument of the function.


import numpy as np 
arr = np.array([[ 0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222],[ 0.78008333, 0.5938125, 0.481, 0.39883333, 0.]])
print(arr[np.where(arr != 0.)])

numpy provides a simple function to do the exact same thing: supposing you have a masked array 'a', calling numpy.ma.compress_rows(a) will delete the rows containing a masked value. I guess this is much faster this way...


This is similar to your original approach, and will use less space than unutbu's answer, but I suspect it will be slower.

>>> import numpy as np
>>> p = np.array([[1.5, 0], [1.4,1.5], [1.6, 0], [1.7, 1.8]])
>>> p
array([[ 1.5,  0. ],
       [ 1.4,  1.5],
       [ 1.6,  0. ],
       [ 1.7,  1.8]])
>>> nz = (p == 0).sum(1)
>>> q = p[nz == 0, :]
>>> q
array([[ 1.4,  1.5],
       [ 1.7,  1.8]])

By the way, your line p.delete() doesn't work for me - ndarrays don't have a .delete attribute.


Here's a one liner (yes, it is similar to user333700's, but a little more straightforward):

>>> import numpy as np
>>> arr = np.array([[ 0.96488889, 0.73641667, 0.67521429, 0.592875, 0.53172222], 
                [ 0.78008333, 0.5938125, 0.481, 0.39883333, 0.]])
>>> print arr[arr.all(1)]
array([[ 0.96488889,  0.73641667,  0.67521429,  0.592875  ,  0.53172222]])

By the way, this method is much, much faster than the masked array method for large matrices. For a 2048 x 5 matrix, this method is about 1000x faster.

By the way, user333700's method (from his comment) was slightly faster in my tests, though it boggles my mind why.


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to numpy

Unable to allocate array with shape and data type How to fix 'Object arrays cannot be loaded when allow_pickle=False' for imdb.load_data() function? Numpy, multiply array with scalar TypeError: only integer scalar arrays can be converted to a scalar index with 1D numpy indices array Could not install packages due to a "Environment error :[error 13]: permission denied : 'usr/local/bin/f2py'" Pytorch tensor to numpy array Numpy Resize/Rescale Image what does numpy ndarray shape do? How to round a numpy array? numpy array TypeError: only integer scalar arrays can be converted to a scalar index

Examples related to delete-row

Difference between Destroy and Delete How to delete a certain row from mysql table with same column values? Delete a row from a SQL Server table Deleting records before a certain date Deleting Row in SQLite in Android Deleting specific rows from DataTable deleting rows in numpy array Delete all data in SQL Server database Deleting rows with MySQL LEFT JOIN SQL DELETE with JOIN another table for WHERE condition