[r] How to remove rows with any zero value

I have a problem to solve how to remove rows with a Zero value in R. In others hand, I can use na.omit() to delete all the NA values or use complete.cases() to delete rows that contains NA values.

Is there anyone know how to remove rows with a Zero Values in R?

For example :

Before

|    DateTime      | Mac1  | Mac2  | Mac3  | Mac4  |
----------------------------------------------------
| 2011-04-02 06:00 | 20    | 0     | 20    | 20    |  
| 2011-04-02 06:05 | 21    | 21    | 21    | 21    |  
| 2011-04-02 06:10 | 22    | 22    | 22    | 22    |  
| 2011-04-02 06:15 | 23    | 23    | 0     | 23    |  
| 2011-04-02 06:20 | 24    | 24    | 24    | 24    | 
| 2011-04-02 06:25 | 0     | 25    | 25    | 0     | 

After

|    DateTime      | Mac1  | Mac2  | Mac3  | Mac4  |
----------------------------------------------------
| 2011-04-02 06:05 | 21    | 21    | 21    | 21    |  
| 2011-04-02 06:10 | 22    | 22    | 22    | 22    |  
| 2011-04-02 06:20 | 24    | 24    | 24    | 24    |  

This question is related to r rows zero

The answer is


Well, you could swap your 0's for NA and then use one of those solutions, but for sake of a difference, you could notice that a number will only have a finite logarithm if it is greater than 0, so that rowSums of the log will only be finite if there are no zeros in a row.

dfr[is.finite(rowSums(log(dfr[-1]))),]

I prefer a simple adaptation of csgillespie's method, foregoing the need of a function definition:

d[apply(d!=0, 1, all),]

where d is your data frame.


You can use filter from dplyr package.

Let's call your data frame df

library(dplyr) df1 <- filter(df, Mac1 > 0, Mac2 > 0, Mac3 > 0, Mac4 > 0)

df1 will have only rows with entries above zero. Hope this helps.


In base R, we can select the columns which we want to test using grep, compare the data with 0, use rowSums to select rows which has all non-zero values.

cols <- grep("^Mac", names(df))
df[rowSums(df[cols] != 0) == length(cols), ]

#          DateTime Mac1 Mac2 Mac3 Mac4
#1 2011-04-02 06:05   21   21   21   21
#2 2011-04-02 06:10   22   22   22   22
#3 2011-04-02 06:20   24   24   24   24

Doing this with inverted logic but giving the same output

df[rowSums(df[cols] == 0) == 0, ]

In dplyr, we can use filter_at to test for specific columns and use all_vars to select rows where all the values are not equal to 0.

library(dplyr)
df %>%  filter_at(vars(starts_with("Mac")), all_vars(. != 0))

data

df <- structure(list(DateTime = structure(1:6, .Label = c("2011-04-02 06:00", 
"2011-04-02 06:05", "2011-04-02 06:10", "2011-04-02 06:15", "2011-04-02 06:20", 
"2011-04-02 06:25"), class = "factor"), Mac1 = c(20L, 21L, 22L, 
23L, 24L, 0L), Mac2 = c(0L, 21L, 22L, 23L, 24L, 25L), Mac3 = c(20L, 
21L, 22L, 0L, 24L, 25L), Mac4 = c(20L, 21L, 22L, 23L, 24L, 0L
)), class = "data.frame", row.names = c(NA, -6L))

I would probably go with Joran's suggestion of replacing 0's with NAs and then using the built in functions you mentioned. If you can't/don't want to do that, one approach is to use any() to find rows that contain 0's and subset those out:

set.seed(42)
#Fake data
x <- data.frame(a = sample(0:2, 5, TRUE), b = sample(0:2, 5, TRUE))
> x
  a b
1 2 1
2 2 2
3 0 0
4 2 1
5 1 2
#Subset out any rows with a 0 in them
#Note the negation with ! around the apply function
x[!(apply(x, 1, function(y) any(y == 0))),]
  a b
1 2 1
2 2 2
4 2 1
5 1 2

To implement Joran's method, something like this should get you started:

x[x==0] <- NA

I would do the following.

Set the zero to NA.

 data[data==0] <- NA
 data

Delete the rows associated with NA.

 data2<-data[complete.cases(data),]

Using tidyverse/dplyr, you can also remove rows with any zero value in a subset of variables:

# variables starting with Mac must be non-zero
filter_at(df, vars(starts_with("Mac")), all_vars((.) != 0))

# variables x, y, and z must be non-zero
filter_at(df, vars(x, y, z), all_vars((.) != 0))

# all numeric variables must be non-zero
filter_if(df, is.numeric, all_vars((.) != 0))

As dplyr 1.0.0 deprecated the scoped variants which @Feng Mai nicely showed, here is an update with the new syntax. This might be useful because in this case, across() doesn't work, and it took me some time to figure out the solution as follows.

The goal was to extract all rows that contain at least one 0 in a column.

df %>% 
  rowwise() %>% 
  filter(any(c_across(everything(.)) == 0))

with the data

df <- data.frame(a = 1:4, b= 1:0, c=0:3)
df <- rbind(df, c(0,0,0))
df <- rbind(df, c(9,9,9))

# A tibble: 4 x 3
# Rowwise: 
      a     b     c
  <dbl> <dbl> <dbl>
1     1     1     0
2     2     0     1
3     4     0     3
4     0     0     0

So it correctly doesn't return the last row containing all 9s.


Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to rows

SQL count rows in a table Converting rows into columns and columns into rows using R Delete rows containing specific strings in R How to append rows to an R data frame Excel Create Collapsible Indented Row Hierarchies Excel CSV. file with more than 1,048,576 rows of data Find which rows have different values for a given column in Teradata SQL More than 1 row in <Input type="textarea" /> Update multiple rows with different values in a single SQL query Repeat rows of a data.frame

Examples related to zero

JAVA How to remove trailing zeros from a double Check if bash variable equals 0 How to create a numeric vector of zero length in R How to remove rows with any zero value Remove/ truncate leading zeros by javascript/jquery PHP prepend leading zero before single digit number, on-the-fly Fastest way to zero out a 2d array in C? Nullable types: better way to check for null or zero in c#