Thursday, May 24, 2018

r - How to sort a dataframe by multiple column(s)



I want to sort a data.frame by multiple columns. For example, with the data.frame below I would like to sort by column z (descending) then by column b (ascending):



dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))
dd
    b x y z
1  Hi A 8 1
2 Med D 3 1
3  Hi A 9 1
4 Low C 9 2


Answer



You can use the order() function directly, without resorting to add-on packages. The approach below uses a trick taken right from the top of the example(order) code: negate a numeric column to sort it in descending order.



R> dd[with(dd, order(-z, b)), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1
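The unary minus only makes sense for numeric columns such as z; it is not meaningful for factors or character vectors. As a hedged sketch (not part of the original answer), two base R ways to sort a non-numeric column in descending order are to negate its xtfrm() rank, or to pass a per-column decreasing vector together with method = "radix". The names by_rank and by_radix are illustrative only.

```r
# Same example data as in the question.
dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))

# Option 1: xtfrm() turns the ordered factor into numeric sort keys,
# which can then be negated for a descending sort on b (z ascending).
by_rank <- dd[order(-xtfrm(dd$b), dd$z), ]

# Option 2: a per-column 'decreasing' vector, which requires method = "radix".
by_radix <- dd[order(dd$b, dd$z, decreasing = c(TRUE, FALSE), method = "radix"), ]
```

Both variants should put the "Hi" rows first and the "Low" row last.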



Edit some 2+ years later: It was just asked how to do this by column index. The answer is to simply pass the desired sorting column(s) to the order() function:



R> dd[order(-dd[,4], dd[,1]), ]
    b x y z
4 Low C 9 2
2 Med D 3 1
1  Hi A 8 1
3  Hi A 9 1



rather than using the name of the column (and with() for easier/more direct access).
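When the sort columns are only known at run time, the same idea can be wrapped in a small helper that builds the order() call with do.call(). This is a minimal sketch under assumptions: sort_by_cols, cols, and decreasing are hypothetical names, not part of base R or of the original answer.

```r
# Hypothetical helper: sort a data.frame by several columns, given either
# column indices or column names, with a per-column descending flag.
sort_by_cols <- function(df, cols, decreasing = rep(FALSE, length(cols))) {
  keys <- lapply(seq_along(cols), function(i) {
    k <- xtfrm(df[[cols[i]]])      # numeric sort key for any sortable column
    if (decreasing[i]) -k else k   # negate the key to sort descending
  })
  df[do.call(order, keys), , drop = FALSE]
}

dd <- data.frame(b = factor(c("Hi", "Med", "Hi", "Low"),
                            levels = c("Low", "Med", "Hi"), ordered = TRUE),
                 x = c("A", "D", "A", "C"), y = c(8, 3, 9, 9),
                 z = c(1, 1, 1, 2))

# Column 4 (z) descending, then column 1 (b) ascending.
res <- sort_by_cols(dd, c(4, 1), decreasing = c(TRUE, FALSE))
```

Here res should contain the rows in the same order as the dd[order(-dd[,4], dd[,1]), ] call above.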

