How to randomize (or permute) a dataframe rowwise and columnwise?

Question

I have a dataframe (df1) like this:

         f1   f2   f3   f4   f5
    d1   1    0    1    1    1
    d2   1    0    0    1    0
    d3   0    0    0    1    1
    d4   0    1    0    0    1

The d1...d4 column is the row names, and the f1...f5 row is the column names.

When I do sample(df1), I get a new dataframe with the same count of 1s as df1. So the count of 1s is conserved for the whole dataframe, but not for each row or each column.

Is it possible to do the randomization row-wise or column-wise?

I want to randomize df1 column-wise for each column, i.e. the number of 1s in each column remains the same, and each column needs to be changed at least once. For example, I may have a randomized df2 like this (note that the count of 1s in each column remains the same, but the count of 1s in each row is different):

         f1   f2   f3   f4   f5
    d1   1    0    0    0    1
    d2   0    1    0    1    1
    d3   1    0    0    1    1
    d4   0    0    1    1    0

Likewise, I also want to randomize df1 row-wise for each row, i.e. the number of 1s in each row remains the same, and each row needs to be changed (but the number of changed entries could differ). For example, a randomized df3 could be something like this:

         f1   f2   f3   f4   f5
    d1   0    1    1    1    1   <- two entries are different
    d2   0    0    1    0    1   <- four entries are different
    d3   1    0    0    0    1   <- two entries are different
    d4   0    0    1    0    1   <- two entries are different

PS. Many thanks for the help from Gavin Simpson, Joris Meys and Chase for the previous answers to my previous question on randomizing two columns.

User · Answer

Of course you can sample each row:

    sapply(1:4, function(row) df1[row, ] <<- sample(df1[row, ]))

This shuffles the entries within each row, so the number of 1s in each row doesn't change. With small changes it also works for columns, but that is an exercise for the reader :-P
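For readers who would rather avoid `<<-`, here is a minimal base R sketch of the same idea in both directions. It rebuilds df1 from the question; the names `m`, `by_row` and `by_col` are illustrative only and do not come from the answer above.

```r
# Rebuild df1 from the question.
df1 <- data.frame(
  f1 = c(1, 1, 0, 0),
  f2 = c(0, 0, 0, 1),
  f3 = c(1, 0, 0, 0),
  f4 = c(1, 1, 1, 0),
  f5 = c(1, 0, 1, 1),
  row.names = c("d1", "d2", "d3", "d4")
)

set.seed(42)  # arbitrary seed, only so the run is reproducible
m <- as.matrix(df1)

# Row-wise: permute the entries inside each row, so row sums are preserved.
by_row <- m
for (i in seq_len(nrow(m))) {
  by_row[i, ] <- sample(m[i, ])
}

# Column-wise: permute the entries inside each column, so column sums are preserved.
by_col <- m
for (j in seq_len(ncol(m))) {
  by_col[, j] <- sample(m[, j])
}

df3 <- as.data.frame(by_row)  # row-wise shuffle
df2 <- as.data.frame(by_col)  # column-wise shuffle
```

Neither loop guarantees that a given row or column actually ends up different from the original; a random permutation can reproduce the input by chance.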

User · Answer

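Random samples and permutations in a data frame: if the data is in matrix form, convert it into a data frame, then use the sample function from the base package:

    indexes = sample(1:nrow(df1), size = 1 * nrow(df1))

Indexing with df1[indexes, ] then returns the rows in random order. See the documentation for sample ("Random Samples and Permutations").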

User · Answer

You can also "sample" the same number of items in your data frame with something like this:

    nr <- dim(M)[1]
    random_M = M[sample.int(nr), ]

User · Answer

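If the goal is to randomly shuffle each column, some of the above answers don't work, since the columns are shuffled jointly (this preserves inter-column correlations). Others require installing a package. Yet a one-liner exists:

    df2 = lapply(df1, function(x) { sample(x) })

Note that lapply returns a list; wrap the call in as.data.frame(), or assign into df2[] after copying df1, if you need a data frame back.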
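The question also asks that every column actually changes, which a plain sample() does not guarantee. A minimal sketch of one way to enforce that is to reshuffle until the result differs. The helper name `shuffle_changed` and its `max_tries` argument are hypothetical, introduced only for this illustration.

```r
# Hypothetical helper: reshuffle x until the result differs from x.
# If all values are identical, no permutation can differ, so x is returned as is.
shuffle_changed <- function(x, max_tries = 1000) {
  if (length(unique(x)) < 2) return(x)
  for (k in seq_len(max_tries)) {
    y <- x[sample.int(length(x))]  # random permutation of x
    if (any(y != x)) return(y)
  }
  stop("no differing permutation found within max_tries")
}

# df1 from the question.
df1 <- data.frame(
  f1 = c(1, 1, 0, 0),
  f2 = c(0, 0, 0, 1),
  f3 = c(1, 0, 0, 0),
  f4 = c(1, 1, 1, 0),
  f5 = c(1, 0, 1, 1),
  row.names = c("d1", "d2", "d3", "d4")
)

set.seed(1)  # arbitrary seed, only so the run is reproducible
df2 <- df1
df2[] <- lapply(df1, shuffle_changed)  # df2[] keeps the data.frame class and row names
```

Each column is still shuffled independently, so column sums are preserved while row sums are free to change.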

User · Answer

This is another way to shuffle the data frame, using the dplyr package.

Row-wise:

    df2 <- slice(df1, sample(1:n()))

or

    df2 <- sample_frac(df1, 1L)

Column-wise:

    df2 <- select(df1, one_of(sample(names(df1))))

User · Answer

Take a look at permatswap() in the vegan package. Here is an example maintaining both row and column totals, but you can relax that and fix only one of the row or column sums.

    mat <- matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1), ncol = 5)
    set.seed(4)
    out <- permatswap(mat, times = 99, burnin = 20000, thin = 500, mtype = "prab")

This gives:

    R> out$perm[[1]]
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    0    1    1    1
    [2,]    0    1    0    1    0
    [3,]    0    0    0    1    1
    [4,]    1    0    0    0    1
    R> out$perm[[2]]
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    1    0    1    1
    [2,]    0    0    0    1    1
    [3,]    1    0    0    1    0
    [4,]    0    0    1    0    1

To explain the call:

    out <- permatswap(mat, times = 99, burnin = 20000, thin = 500, mtype = "prab")

times is the number of randomised matrices you want, here 99.

burnin is the number of swaps made before we start taking random samples. This allows the matrix from which we sample to be quite random before we start taking each of our randomised matrices.

thin says only take a random draw every thin swaps.

mtype = "prab" says treat the matrix as presence/absence, i.e. binary 0/1 data.

A couple of things to note: this doesn't guarantee that any column or row has been randomised, but if burnin is long enough there should be a good chance of that having happened. Also, you could draw more random matrices than you need and discard the ones that don't match all your requirements.

Your requirement to have different numbers of changes per row isn't covered here either. Again, you could sample more matrices than you want and then discard the ones that don't meet this requirement.
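The "draw extra and discard" step could be sketched roughly as follows. This assumes only what the output above shows, namely that out$perm is a list of matrices with the same shape as mat. The helper `all_margins_changed` is hypothetical, and the candidate list is built by hand so the sketch runs without vegan installed.

```r
# Hypothetical helper: TRUE if every row and every column of `candidate`
# differs from `original` in at least one cell.
all_margins_changed <- function(candidate, original) {
  changed <- candidate != original
  all(rowSums(changed) > 0) && all(colSums(changed) > 0)
}

mat <- matrix(c(1,1,0,0,0,0,0,1,1,0,0,0,1,1,1,0,1,0,1,1), ncol = 5)

# Stand-in for out$perm. These are NOT permatswap results, just two
# matrices of the right shape to exercise the filter.
candidates <- list(
  mat,         # identical to the original, so it should be rejected
  mat[4:1, ]   # rows in reverse order, so every row and column differs
)

keep <- vapply(candidates, all_margins_changed, logical(1), original = mat)
accepted <- candidates[keep]
```

With real output you would pass out$perm in place of candidates, and tighten the helper if you also need a particular number of changes per row.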

User · Answer

Given the R data frame:

    > df1
      a b c
    1 1 1 0
    2 1 0 0
    3 0 1 0
    4 0 0 0

Shuffle row-wise:

    > df2 <- df1[sample(nrow(df1)),]
    > df2
      a b c
    3 0 1 0
    4 0 0 0
    2 1 0 0
    1 1 1 0

By default, sample() randomly reorders the elements passed as the first argument (given a single number n, it permutes 1:n). This means that the default size is the size of the passed array. Passing replace = FALSE (the default) to sample(...) ensures that sampling is done without replacement, which accomplishes a row-wise shuffle.

Shuffle column-wise:

    > df3 <- df1[,sample(ncol(df1))]
    > df3
      c a b
    1 0 1 1
    2 0 1 0
    3 0 0 1
    4 0 0 0
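It may help to see what this kind of shuffle does and does not change. The sketch below rebuilds the small df1 from this answer and checks that whole rows (or whole columns) are only moved, never altered internally. The names `same_rows` and `same_cols` are illustrative only.

```r
# df1 as printed in this answer.
df1 <- data.frame(a = c(1, 1, 0, 0), b = c(1, 0, 1, 0), c = c(0, 0, 0, 0))

set.seed(7)  # arbitrary seed, only so the run is reproducible
df2 <- df1[sample(nrow(df1)), ]   # reorder whole rows
df3 <- df1[, sample(ncol(df1))]   # reorder whole columns

# Looking rows up by their original row names recovers df1 exactly,
# so each row is intact and only its position changed.
same_rows <- all(df2[rownames(df1), ] == df1)

# Likewise, selecting columns by name recovers df1 exactly.
same_cols <- all(df3[, names(df1)] == df1)
```

So this approach permutes the order of rows or columns. If the goal is to shuffle the entries inside each row or column, as in the question, sample() has to be applied to each row or column separately.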

User · Answer

You can also use the randomizeMatrix function in the R package picante.

Example:

    test <- matrix(c(1,1,0,1,0,1,0,0,1,0,0,1,0,1,0,0), nrow = 4, ncol = 4)
    > test
         [,1] [,2] [,3] [,4]
    [1,]    1    0    1    0
    [2,]    1    1    0    1
    [3,]    0    0    0    0
    [4,]    1    0    1    0

    randomizeMatrix(test, null.model = "frequency", iterations = 1000)

         [,1] [,2] [,3] [,4]
    [1,]    0    1    0    1
    [2,]    1    0    0    0
    [3,]    1    0    1    0
    [4,]    1    0    1    0

    randomizeMatrix(test, null.model = "richness", iterations = 1000)

         [,1] [,2] [,3] [,4]
    [1,]    1    0    0    1
    [2,]    1    1    0    1
    [3,]    0    0    0    0
    [4,]    1    0    1    0
    >

The option null.model = "frequency" maintains column sums and "richness" maintains row sums. Though mainly used for randomizing species presence/absence datasets in community ecology, it works well here.

This function has other null model options as well; see page 36 of the picante documentation for more details.
