[r] How to order a data frame by one descending and one ascending column?

I have a data frame, which looks like that:

    P1  P2  P3  T1  T2  T3  I1  I2
1   2   3   5   52  43  61  6   "b"
2   6   4   3   72  NA  59  1   "a"
3   1   5   6   55  48  60  6   "f"
4   2   4   4   65  64  58  2   "b"

I want to sort it by I1 in descending order, and rows with the same value in I1 by I2 in ascending order, getting the rows in the order 1 3 4 2. But the order function seems to only take one decreasing argument, which is then TRUE or FALSE for all ordering vectors at once. How do I get my sort correct?

This question is related to r sorting dataframe

The answer is


The default sort is stable, so we sort twice: First by the minor key, then by the major key

rum1 <- rum[order(rum$I2, decreasing = FALSE),]
rum2 <- rum1[order(rum1$I1, decreasing = TRUE),]

I use rank:

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
1   5   6   55  48  60  6   c"), header = TRUE)

> rum[order(rum$I1, -rank(rum$I2), decreasing = TRUE), ]
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

The correct way is:

rum[order(rum$T1, rum$T2, decreasing=c(T,F)), ]

rum[order(rum$T1, -rum$T2 ), ]

In @dudusan's example, you could also reverse the order of I1, and then sort ascending:

> rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
+   2   3   5   52  43  61  6   b
+   6   4   3   72  NA  59  1   a
+   1   5   6   55  48  60  6   f
+   2   4   4   65  64  58  2   b
+   1   5   6   55  48  60  6   c"), header = TRUE)
> f=factor(rum$I1)   
> levels(f) <- sort(levels(f), decreasing = TRUE)
> rum[order(as.character(f), rum$I2), ]
  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a
> 

This seems a bit shorter, you don't reverse the order of I2 twice.


I'm afraid Roman Luštrik's answer is wrong. It works on this input by chance. Consider for example its output on a very similar input (with an additional line similar to the original line 3 with "c" in the I2 column):

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
1   5   6   55  48  60  6   c"), header = TRUE)

rum$I2 <- as.character(rum$I2)
rum[order(rum$I1, rev(rum$I2), decreasing = TRUE), ]

  P1 P2 P3 T1 T2 T3 I1 I2
3  1  5  6 55 48 60  6  f
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

This is not the desired result: the first three values of I2 are f b c instead of b c f, which would be expected since the secondary sort is I2 in ascending order.

To get the reverse order of I2, you want the large values to be small and vice versa. For numeric values multiplying by -1 will do it, but for characters its a bit more tricky. A general solution for characters/strings would be to go through factors, reverse the levels (to make large values small and small values large) and change the factor back to characters:

rum <- read.table(textConnection("P1  P2  P3  T1  T2  T3  I1  I2
2   3   5   52  43  61  6   b
6   4   3   72  NA  59  1   a
1   5   6   55  48  60  6   f
2   4   4   65  64  58  2   b
1   5   6   55  48  60  6   c"), header = TRUE)

f=factor(rum$I2)
levels(f) = rev(levels(f))
rum[order(rum$I1, as.character(f), decreasing = TRUE), ]

  P1 P2 P3 T1 T2 T3 I1 I2
1  2  3  5 52 43 61  6  b
5  1  5  6 55 48 60  6  c
3  1  5  6 55 48 60  6  f
4  2  4  4 65 64 58  2  b
2  6  4  3 72 NA 59  1  a

Let df be the data frame with 2 fields A and B

Case 1: if your field A and B are numeric

df[order(df[,1],df[,2]),] - sorts fields A and B in ascending order
df[order(df[,1],-df[,2]),] - sorts fields A in ascending and B in descending order
priority is given to A.

Case 2: if field A or B is non numeric say factor or character

In our case if B is character and we want to sort in reverse order
df[order(df[,1],-as.numeric(as.factor(df[,2]))),] -> this sorts field A(numerical) in ascending and field B(character) in descending.
priority is given to A.

The idea is that you can apply -sign in order function ony on numericals. So for sorting character strings in descending order you have to coerce them to numericals.


    library(dplyr)
    library(tidyr)
    #supposing you want to arrange column 'c' in descending order and 'd' in ascending order. name of data frame is df
    ## first doing descending
    df<-arrange(df,desc(c))
    ## then the ascending order of col 'd;
    df <-arrange(df,d)

Simple one without rank :

rum[order(rum$I1, -rum$I2, decreasing = TRUE), ]

Examples related to r

How to get AIC from Conway–Maxwell-Poisson regression via COM-poisson package in R? R : how to simply repeat a command? session not created: This version of ChromeDriver only supports Chrome version 74 error with ChromeDriver Chrome using Selenium How to show code but hide output in RMarkdown? remove kernel on jupyter notebook Function to calculate R2 (R-squared) in R Center Plot title in ggplot2 R ggplot2: stat_count() must not be used with a y aesthetic error in Bar graph R multiple conditions in if statement What does "The following object is masked from 'package:xxx'" mean?

Examples related to sorting

Sort Array of object by object field in Angular 6 Sorting a list with stream.sorted() in Java How to sort dates from Oldest to Newest in Excel? how to sort pandas dataframe from one column Reverse a comparator in Java 8 Find the unique values in a column and then sort them pandas groupby sort within groups pandas groupby sort descending order Efficiently sorting a numpy array in descending order? Swift: Sort array of objects alphabetically

Examples related to dataframe

Trying to merge 2 dataframes but get ValueError How to show all of columns name on pandas dataframe? Python Pandas - Find difference between two data frames Pandas get the most frequent values of a column Display all dataframe columns in a Jupyter Python Notebook How to convert column with string type to int form in pyspark data frame? Display/Print one column from a DataFrame of Series in Pandas Binning column with python pandas Selection with .loc in python Set value to an entire column of a pandas dataframe