Replace all 0 values to NA

Question

I have a data frame with some numeric columns. Some rows have a 0 value, which should be treated as missing in statistical analysis. What is the fastest way to replace all the 0 values with NULL in R?

User · Answer

Because someone asked for the data.table version of this, and because the given data.frame solution does not work with data.table, I am providing the solution below.

Basically, use the := operator:

    DT[x == 0, x := NA]

For example:

    library("data.table")

    status = as.data.table(occupationalStatus)

    head(status, 10)
        origin destination  N
     1:      1           1 50
     2:      2           1 16
     3:      3           1 12
     4:      4           1 11
     5:      5           1  2
     6:      6           1 12
     7:      7           1  0
     8:      8           1  0
     9:      1           2 19
    10:      2           2 40

    status[N == 0, N := NA]

    head(status, 10)
        origin destination  N
     1:      1           1 50
     2:      2           1 16
     3:      3           1 12
     4:      4           1 11
     5:      5           1  2
     6:      6           1 12
     7:      7           1 NA
     8:      8           1 NA
     9:      1           2 19
    10:      2           2 40

User · Answer

In case anyone arrives here via Google looking for the opposite (i.e. how to replace all NAs in a data.frame with 0), the answer is:

    df[is.na(df)] <- 0

Or, using dplyr / tidyverse:

    library(dplyr)
    mtcars %>% replace(is.na(.), 0)

User · Answer

Let me assume that your data.frame is a mix of different data types and not all columns need to be modified. To modify only columns 12 to 18 (of the total 21), just do this:

    df[, 12:18][df[, 12:18] == 0] <- NA
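Positional ranges like 12:18 are easy to get wrong when columns are added or reordered, so here is a minimal sketch of the same idiom with column names instead. The data frame and its column names (id, a, b) are made up for illustration; the assumption is that 0 is a legitimate value in id but means "missing" in a and b.

```r
# Hypothetical data: 0 is a valid id, but means "missing" in a and b.
df <- data.frame(id = c(0, 1, 2),
                 a  = c(0, 5, 0),
                 b  = c(3, 0, 7))

cols <- c("a", "b")              # columns in which 0 should become NA

# Same pattern as df[, 12:18][df[, 12:18] == 0] <- NA, selected by name.
df[cols][df[cols] == 0] <- NA

print(df)
```

The id column keeps its 0 because it is never part of the selection.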

User · Answer

An alternative way without the [<- function:

A sample data frame dat (shamelessly copied from @Chase's answer):

    dat

      x y
    1 0 2
    2 1 2
    3 1 1
    4 2 1
    5 0 0

Zeroes can be replaced with NA by the is.na<- function:

    is.na(dat) <- !dat

    dat

       x  y
    1 NA  2
    2  1  2
    3  1  1
    4  2  1
    5 NA NA
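In case the !dat part looks like magic: negating a numeric value treats 0 as FALSE and everything else as TRUE, so !dat is TRUE exactly where dat is 0. A small sketch, rebuilding the sample data by hand and comparing it with the more explicit dat == 0 mask (this assumes every column is numeric; I would not rely on !dat for data frames with character columns):

```r
# Sample data matching the answer above, typed in by hand.
dat <- data.frame(x = c(0, 1, 1, 2, 0), y = c(2, 2, 1, 1, 0))

mask_not <- !dat        # logical matrix: TRUE wherever the value is 0
mask_eq  <- dat == 0    # the same mask, written explicitly

res <- dat
is.na(res) <- mask_eq   # marks the masked cells as NA, like is.na(res) <- !res

print(res)
```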

User · Answer

Replacing all zeroes with NA:

    df[df == 0] <- NA

Explanation

1. NULL is not what you want to replace zeroes with. As it says in ?'NULL',

    NULL represents the null object in R

which is unique and, I guess, can be seen as the most uninformative and empty object.[1] Then it becomes not so surprising that

    data.frame(x = c(1, NULL, 2))
    #   x
    # 1 1
    # 2 2

That is, R does not reserve any space for this null object.[2] Meanwhile, looking at ?'NA' we see that

    NA is a logical constant of length 1 which contains a missing value indicator. NA can be coerced to any other vector type except raw.

Importantly, NA is of length 1, so R reserves some space for it. E.g.,

    data.frame(x = c(1, NA, 2))
    #    x
    # 1  1
    # 2 NA
    # 3  2

Also, the data frame structure requires all the columns to have the same number of elements, so there can be no "holes" (i.e., NULL values).

Now, you could replace zeroes by NULL in a data frame in the sense of completely removing all the rows containing at least one zero. When using, e.g., var, cov, or cor, that is actually equivalent to first replacing zeroes with NA and setting the value of use to "complete.obs". Typically, however, this is unsatisfactory as it leads to extra information loss.

2. Instead of running some sort of loop, the solution uses the vectorized comparison df == 0. It returns (try it) a matrix of the same size as df, with the entries TRUE and FALSE. Further, we are also allowed to pass this matrix to the subsetting operator [...] (see ?`[`). Lastly, while the result of df[df == 0] is perfectly intuitive, it may seem strange that df[df == 0] <- NA gives the desired effect. The assignment operator <- is indeed not always so smart and does not work in this way with some other objects, but it does so with data frames; see ?`<-`.

[1] The empty set in set theory feels somehow related.
[2] Another similarity with set theory: the empty set is a subset of every set, but we do not reserve any space for it.
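The two claims above (NULL takes up no slot while NA does, and dropping rows matches use = "complete.obs") are easy to check for yourself. A minimal sketch with a made-up five-row data frame; the variable names r_dropped and r_complete are just for illustration:

```r
length(NULL)           # 0: NULL occupies no slot
length(NA)             # 1: NA is a real placeholder element
length(c(1, NULL, 2))  # 2: the NULL simply vanishes
length(c(1, NA, 2))    # 3: the NA keeps its position

# Made-up data with one zero in each column.
df <- data.frame(x = c(1, 0, 3, 4, 5), y = c(2, 4, 0, 8, 11))
df[df == 0] <- NA

# Dropping every row that contains an NA by hand ...
r_dropped <- cor(na.omit(df))

# ... gives the same result as letting cor() use complete rows only.
r_complete <- cor(df, use = "complete.obs")

print(all.equal(r_dropped, r_complete))
```

Note that only three of the five rows survive here, which is the information loss mentioned above.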

User · Answer

    # Sample data
    set.seed(1)
    dat <- data.frame(x = sample(0:2, 5, TRUE), y = sample(0:2, 5, TRUE))
    #-----
      x y
    1 0 2
    2 1 2
    3 1 1
    4 2 1
    5 0 0

    # replace zeros with NA
    dat[dat == 0] <- NA
    #-----
       x  y
    1 NA  2
    2  1  2
    3  1  1
    4  2  1
    5 NA NA

User · Answer

dplyr::na_if() is an option:

    library(dplyr)

    df <- tibble(col1 = c(1, 2, 3, 0),
                 col2 = c(0, 2, 3, 4),
                 col3 = c(1, 0, 3, 0),
                 col4 = c('a', 'b', 'c', 'd'))

    na_if(df, 0)
    # A tibble: 4 x 4
       col1  col2  col3 col4
      <dbl> <dbl> <dbl> <chr>
    1     1    NA     1 a
    2     2     2    NA b
    3     3     3     3 c
    4    NA     4    NA d

User · Answer

You can replace 0 with NA only in numeric fields (i.e. excluding things like factors), but it works on a column-by-column basis:

    col[col == 0 & is.numeric(col)] <- NA

With a function, you can apply this to your whole data frame:

    changetoNA <- function(colnum, df) {
        col <- df[, colnum]
        if (is.numeric(col)) {  # edit: verifying column is numeric
            col[col == 0 & is.numeric(col)] <- NA
        }
        return(col)
    }
    df <- data.frame(sapply(1:5, changetoNA, df))

You could replace the 1:5 with the actual number of columns in your data frame, or with 1:ncol(df).
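One caveat with the sapply() call: sapply() simplifies its result to a matrix, so a data frame with mixed column types gets coerced to a single type and the original column names are lost. A minimal sketch of a variant using lapply() instead, which keeps each column's type and name. The helper name zero_to_na and the sample data are made up for illustration, not taken from the answer above.

```r
# Hypothetical helper: replace 0 with NA in numeric columns only.
zero_to_na <- function(df) {
  # df[] <- ... keeps the data frame structure, names, and non-numeric columns.
  df[] <- lapply(df, function(col) {
    if (is.numeric(col)) {
      col[col == 0] <- NA
    }
    col
  })
  df
}

# Made-up data: one numeric column, one factor, one character column.
df <- data.frame(n = c(0, 1, 2),
                 f = factor(c("a", "b", "a")),
                 s = c("0", "x", "y"),
                 stringsAsFactors = FALSE)

out <- zero_to_na(df)
print(out)
```

The character "0" in column s is left alone, since only numeric columns are touched.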
