I have the following dataframe (df) - there are more columns, but these are the relevant columns:

    ID  Cost
    1    $100
    1    $200
    2    $50
    2    $0
    2    $40
    3    $10
    4    $100
    5    $0
    5    $50

I would like to subset this dataframe so that if any Cost for a particular ID is $0, all rows for that ID are removed.

In this example, IDs 2 and 5 each contain a $0, so all rows for IDs 2 and 5 should be deleted.

Here is the resulting df I would like:

    ID  Cost 
    1    $100
    1    $200
    3    $10
    4    $100

Could someone help with this? I tried some combinations of the subset function, but they didn't work.

On a similar note: I have another dataframe that contains NAs. Could you help me solve the same problem when the values to match are NAs instead of $0?

Thanks in advance!!

  • A data.table option is: library(data.table); setDT(df)[, if(!any(Cost=='$0')) .SD, ID] (commented Jun 9, 2015 at 17:09)

3 Answers

Try this:

subset(df,!df$ID %in% df$ID[is.na(df$Cost) | df$Cost == "$0"])

This gives you:

  ID Cost
1  1 $100
2  1 $200
6  3  $10
7  4 $100
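
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data and applies the same filter. It assumes Cost is stored as character strings such as "$100" (as printed in the question). The ID 6 row with a missing cost is hypothetical, added only to exercise the is.na() branch, and bad_ids and result are names chosen for illustration.

```r
# Rebuild the example data; Cost is assumed to be character, as printed in the question.
# ID 6 is a hypothetical extra row with a missing cost.
df <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 3, 4, 5, 5, 6),
  Cost = c("$100", "$200", "$50", "$0", "$40", "$10", "$100", "$0", "$50", NA),
  stringsAsFactors = FALSE
)

# IDs that have at least one "$0" or NA cost
bad_ids <- unique(df$ID[is.na(df$Cost) | df$Cost == "$0"])

# Keep only the rows whose ID was not flagged
result <- subset(df, !ID %in% bad_ids)
print(result)
```

Storing the flagged IDs in their own variable makes it easy to check which groups are about to be dropped before filtering.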

1 Comment

+1, nice job using subset. You may save keystrokes with with(df, subset(df,!ID %in% ID[is.na(Cost) | Cost == "$0"]))

Try:

df[!df$ID %in% df$ID[df$Cost=="$0"],]

You can compute the IDs that you want to remove with something like tapply (this assumes Cost is stored as a number, without the $ sign):

(has.zero <- tapply(df$Cost, df$ID, function(x) sum(x == 0) > 0))
#     1     2     3     4     5 
# FALSE  TRUE FALSE FALSE  TRUE 

Then you can subset, limiting to IDs that you don't want to remove:

df[!df$ID %in% names(has.zero)[has.zero],]
#   ID Cost
# 1  1  100
# 2  1  200
# 6  3   10
# 7  4  100

This is pretty flexible, because it enables you to limit IDs based on more complicated criteria (e.g. "the average cost for the ID must be at least xyz").
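
As a rough sketch of that kind of criterion, the same tapply pattern can flag IDs whose average cost falls below a cutoff. The threshold min_mean and the names low.mean and kept are hypothetical, chosen only for illustration, and Cost is assumed to be numeric as in the output above.

```r
# Numeric costs, matching this answer's output (no "$" prefix).
df <- data.frame(
  ID   = c(1, 1, 2, 2, 2, 3, 4, 5, 5),
  Cost = c(100, 200, 50, 0, 40, 10, 100, 0, 50)
)

min_mean <- 50  # hypothetical threshold, chosen only for illustration

# One logical per ID: TRUE when that ID's mean cost is below the threshold
low.mean <- tapply(df$Cost, df$ID, function(x) mean(x) < min_mean)

# Drop every row that belongs to a flagged ID
kept <- df[!df$ID %in% names(low.mean)[low.mean], ]
print(kept)
```

Only the function passed to tapply changes; the subsetting step is the same as before.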

2 Comments

Thanks @josilber! What if I want to remove the rows based on NAs?
Then you would change sum(x == 0) > 0 to sum(is.na(x)) > 0.
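
A minimal sketch of the NA variant, using a small hypothetical data frame with numeric costs (the data and the names has.na, has.bad, no_na and clean are made up for illustration). It also shows one way to test for zeros and NAs together, since sum(x == 0) > 0 on its own returns NA for a group that contains an NA and no zero.

```r
# Hypothetical data: ID 2 has a missing cost, ID 4 has a zero cost.
df <- data.frame(
  ID   = c(1, 1, 2, 2, 3, 4),
  Cost = c(100, 200, 50, NA, 10, 0)
)

# Flag IDs with at least one NA cost
has.na <- tapply(df$Cost, df$ID, function(x) sum(is.na(x)) > 0)
no_na <- df[!df$ID %in% names(has.na)[has.na], ]

# Flag IDs with at least one NA or zero cost in a single pass.
# is.na(x) | x == 0 is TRUE for NA entries, so any() never sees an NA here.
has.bad <- tapply(df$Cost, df$ID, function(x) any(is.na(x) | x == 0))
clean <- df[!df$ID %in% names(has.bad)[has.bad], ]
print(clean)
```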
