SQLDF R: Counting unique values in a data frame

Question

I have a data frame with one column. There are 10 rows.

(4.0 * 3.0)
(4.0 * 3.0)
(2.0 * (1.0 * (1.0 * 6.0)))
(4.0 * (3.0 * 1.0))
(6.0 * 2.0)
(6.0 * 2.0)
(2.0 * 6.0)
(2.0 * 6.0)
(2.0 * 6.0)
(6.0 * 2.0)

I need to extract the unique values in the column and the number of times each one occurs. Using the sqldf package I was able to get the unique values, but not the counts.

Query:

sqldf("SELECT V1, COUNT(DISTINCT V1) as DinctC from dataset GROUP BY V1")

Output:

                           V1 DinctC
1 (2.0 * (1.0 * (1.0 * 6.0)))      1
2                 (2.0 * 6.0)      1
3         (4.0 * (3.0 * 1.0))      1
4                 (4.0 * 3.0)      1
5                 (6.0 * 2.0)      1

What I want is:

                           V1 DinctC
1 (2.0 * (1.0 * (1.0 * 6.0)))      1
2                 (2.0 * 6.0)      3
3         (4.0 * (3.0 * 1.0))      1
4                 (4.0 * 3.0)      2
5                 (6.0 * 2.0)      3

Edit: As Tim Biegeleisen pointed out, DISTINCT is a keyword, not a function, so the brackets are unnecessary. I have updated DISTINCT(V1) to DISTINCT V1.

SriniShine · Accepted Answer · 2017-06-11 13:53:41Z

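We do not need the DISTINCT keyword, because we are using the GROUP BY clause. Every row in a group has the same V1, so COUNT(DISTINCT V1) is always 1 within that group. COUNT(V1) counts the rows in each group instead.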

sqldf("SELECT V1, COUNT(V1) as DinctC from dataset GROUP BY V1")

Result:

                           V1 DinctC
1 (2.0 * (1.0 * (1.0 * 6.0)))      1
2                 (2.0 * 6.0)      3
3         (4.0 * (3.0 * 1.0))      1
4                 (4.0 * 3.0)      2
5                 (6.0 * 2.0)      3
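To see the difference side by side, here is a minimal reproducible sketch. It assumes the data frame is called dataset and the column V1, as in the question; the data.frame() call is my own reconstruction of the ten rows shown there, and the names distinct_counts and row_counts are just for illustration.

```r
library(sqldf)

# Reconstruction of the ten-row, one-column data frame from the question
dataset <- data.frame(
  V1 = c("(4.0 * 3.0)", "(4.0 * 3.0)",
         "(2.0 * (1.0 * (1.0 * 6.0)))",
         "(4.0 * (3.0 * 1.0))",
         "(6.0 * 2.0)", "(6.0 * 2.0)",
         "(2.0 * 6.0)", "(2.0 * 6.0)", "(2.0 * 6.0)",
         "(6.0 * 2.0)"),
  stringsAsFactors = FALSE
)

# COUNT(DISTINCT V1) counts the distinct values inside each group.
# Each group holds a single V1 value, so the count is 1 on every row.
distinct_counts <- sqldf(
  "SELECT V1, COUNT(DISTINCT V1) AS DinctC FROM dataset GROUP BY V1"
)

# COUNT(V1) counts the non-NULL rows inside each group,
# which is the number of occurrences of each value.
row_counts <- sqldf(
  "SELECT V1, COUNT(V1) AS DinctC FROM dataset GROUP BY V1"
)

print(distinct_counts)
print(row_counts)
```

If V1 could contain NA values, note that COUNT(V1) skips them while COUNT(*) counts every row in the group, so the two may differ for the NA group.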

akrun · Accepted Answer · 2017-06-11 13:29:03Z

We can use count() from dplyr (here df is the data frame, called dataset in the question):

library(dplyr)
count(df, V1)
# A tibble: 5 x 2
#                          V1     n
#                       <chr> <int>
#1 (2.0 * (1.0 * (1.0 * 6.0)))     1
#2                 (2.0 * 6.0)     3
#3         (4.0 * (3.0 * 1.0))     1
#4                 (4.0 * 3.0)     2
#5                 (6.0 * 2.0)     3

Or table from base R

table(df$V1)
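table() returns a named table object rather than a data frame. If a two-column data frame like the sqldf result is wanted, one possible sketch is below. The df construction is a stand-in for the question's data, and the names counts and n are my own choices, not part of the original answer.

```r
# Stand-in for the question's one-column data frame
df <- data.frame(
  V1 = c("(4.0 * 3.0)", "(4.0 * 3.0)",
         "(2.0 * (1.0 * (1.0 * 6.0)))",
         "(4.0 * (3.0 * 1.0))",
         "(6.0 * 2.0)", "(6.0 * 2.0)",
         "(2.0 * 6.0)", "(2.0 * 6.0)", "(2.0 * 6.0)",
         "(6.0 * 2.0)"),
  stringsAsFactors = FALSE
)

# Tabulate the values, then convert the table to a data frame.
# as.data.frame() on a table yields the columns Var1 and Freq by default.
counts <- as.data.frame(table(df$V1), stringsAsFactors = FALSE)

# Rename the columns to match the other answers
names(counts) <- c("V1", "n")

print(counts)
```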

2 Comments

SriniShine Over a year ago

Thank you very much. However, I'm looking for a way to get the same result with sqldf. If I can't find one, I'm going to use this method.

akrun Over a year ago

@SriniShine Thanks for the comments. I guess you already got another solution with sqldf
