I'm trying to add a column to a data frame in R based on the order of the values in another column, in which many rows share the same value. The new column should essentially be an ordinal version of the existing one: all rows with the lowest value in that column get 1, and so on. Is there an easy way to do this?

(The data shown here is only the first few rows and contains no repeats, but the full data set has 75 unique values in total, over 100,000 observations.)

…   Value
1   0.6215278
2   0.5801653
3   0.5287239
4   0.5267176
5   0.5295736
6   0.5422419
7   0.5269841
8   0.5302013
9   0.5017794

  • Have you tried mydata$order <- order(mydata$Value) Commented May 12, 2020 at 17:43
  • @DanielO order won't work with the requested tie behavior "all the rows that have the same, lowest value in that column assigned 1". With order there can be only one 1. Commented May 12, 2020 at 17:46
  • Thanks for the lesson @GregorThomas. Commented May 12, 2020 at 17:47

2 Answers

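Another option is frank from data.table. With ties.method = 'dense', tied values share a rank and no gaps are left between ranks.

library(data.table)

# example data
x <- c(1, 1, 2, 3, 3, 4)

frank(x, ties.method = 'dense')
# [1] 1 1 2 3 3 4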

The rank function is what you're looking for. It sounds like you want the argument ties.method = "min", but see the help page (?rank) for other options. This will leave gaps: if 2 entries are tied for first place, the next one will get rank 3.

x <- c(1, 1, 2, 3, 3, 4)
rank(x, ties.method = "min")
# [1] 1 1 3 4 4 6

If you don't want gaps, use dplyr::dense_rank.

dplyr::dense_rank(x)
# [1] 1 1 2 3 3 4
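
To attach the result to the data frame as a new column, something like the following should work. This is only a minimal sketch: mydata and Value follow the names used in the comments on the question, the sample values are made up for illustration, and rank_min and rank_dense are hypothetical column names. The match() line is a base R way to get dense ranks if you would rather not load dplyr.

```r
# Made-up sample data; 'mydata' and 'Value' stand in for the real data frame.
mydata <- data.frame(Value = c(0.52, 0.50, 0.52, 0.58, 0.50))

# Ranks with gaps after ties: tied rows share the lowest rank of their group.
mydata$rank_min <- rank(mydata$Value, ties.method = "min")

# Dense ranks in base R: position of each value among the sorted unique values.
mydata$rank_dense <- match(mydata$Value, sort(unique(mydata$Value)))

print(mydata)
```

With this sample, rank_min comes out as 3 1 3 5 1 and rank_dense as 2 1 2 3 1. Both approaches are vectorised, so they should be fine on 100,000 rows.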
