How to replace NA values in a table for selected columns

Question

There are a lot of posts about replacing NA values  I am aware that one could replace NAs in the following table frame with the following   x is na x   lt -0   But  what if I want to restrict it to only certain columns  Let s me show you an example   First  let s start with a dataset   set seed 1234  x  lt - data frame a sample c 1 2 NA   10  replace T                   b sample c 1 2 NA   10  replace T                    c sample c 1 5 NA   10  replace T     Which gives       a  b  c 1   1 NA  2 2   2  2  2 3   2  1  1 4   2 NA  1 5  NA  1  2 6   2 NA  5 7   1  1  4 8   1  1 NA 9   2  1  5 10  2  1  1   Ok  so I only want to restrict the replacement to columns  a  and  b   My attempt was   x is na x   1 2  lt -0   and   x is na x 1 2    lt -0   Which does not work   My data table attempt  where y lt -data table x   was obviously never going to work   y is na y  list a b         I want to pass columns inside the is na argument but that obviously wouldn t work   I would like to do this in a data frame and a data table  My end goal is to recode the 1 2 to 0 1 in  a  and  b  while keeping  c  the way it is  since it is not a logical variable  I have a bunch of columns so I don t want to do it one by one  And  I d just like to know how to do this   Do you have any suggestions

User · Answer

Starting from the data table y  you can just write  y    cols   lapply  SD  function i  i is na i    lt - 0  i     SDcols   cols  Don t forget to library data table  before creating y and running this command

User · Answer

Building on  Robert McDonald s tidyr  replace na   answer  here are some dplyr options for controlling which columns the NAs are replaced   library tidyverse     by column type  x   gt     mutate if is numeric   replace na    0      select columns defined in vars col1  col2        x   gt     mutate at vars a  b  c    replace na    0      all columns  x   gt     mutate all  replace na    0

User · Answer

For a specific column  there is an alternative with sapply  DF  lt - data frame A   letters 1 5                B   letters 6 10                C   c 2  5  NA  8  NA    DF NEW  lt - sapply seq 1  nrow DF                        function i  ifelse is na DF i 3                                             TRUE                                         0                                         DF i 3     DF  3   lt - DF NEW DF

User · Answer

this works fine for me  DataTable DT   new DataTable     DT   DT AsEnumerable   Select R   gt          R  Campo1     valor        return  R      ToArray   CopyToDataTable

User · Answer

You can do   x   1 2  is na x   1 2     lt - 0   or better  IMHO   use the variable names   x c  a    b    is na x c  a    b       lt - 0   In both cases  1 2 or c  a    b   can be replaced by a pre-defined vector

User · Answer

Not sure if this is more concise  but this function will also find and allow replacement of NAs  or any value you like  in selected columns of a data table   update mat  lt - function dt  cols  criteria      require data table    x  lt - as data frame which criteria  TRUE  arr ind   TRUE     y  lt - as matrix subset x  x col  in  which  names dt   in  cols   arr ind   TRUE      y     To apply it   y update mat y  c  a    b    is na y     lt - 0   The function creates a matrix of the selected columns and rows  cell coordinates  that meet the input criteria  in this case is na    TRUE

User · Answer

it s quite handy with  data table  and  stringr   library data table  library stringr   x   lapply  SD  function xx   str replace na xx  0       FYI

User · Answer

Edit 2020-06-15  Since data table 1 12 4  Oct 2019   data table gains two functions to facilitate this  nafill and setnafill   nafill operates on columns   cols   c  a    b   y     cols     lapply  SD  nafill  fill 0    SDcols   cols    setnafill operates on tables  the replacements happen by-reference in-place   setnafill y  cols cols  fill 0    print y to show the effect y     This will also be more efficient than the other options  see  nafill for more  the last-observation-carried-forward  LOCF  and next-observation-carried-backward  NOCB  versions of NA imputation for time series     This will work for your data table version   for  col in c  a    b    y is na get col     col     0    Alternatively  as David Arenburg points out below  you can use set  side benefit - you can use it either on data frame or data table    for  col in 1 2  set x  which is na x  col      col  0

User · Answer

This is now trivial in tidyr with replace na    The function appears to work for data tables as well as data frames   tidyr  replace na x  list a 0  b 0

User · Answer

We can solve it in data table way with tidyr  repalce na function and lapply  library data table  library tidyr  setDT df  df  c  a   b   c    lapply  SD function x  replace na x 0    SDcols c  a   b   c      In this way  we can also solve paste columns with NA string  First  we replace na x     then we can use stringr  str c to combine columns

[r] How to replace NA values in a table for selected columns

Examples related to r

Examples related to replace

Examples related to dataframe

Examples related to data.table

Examples related to na