
I have a very large data frame from which I want to create smaller data frames based on specific conditions, but I'm not sure how to do this efficiently. My data frame looks like this (but with hundreds more variables):

A      B      C    D
CAT    ABC    5    7
COW    DEF    5    8
DOG    GHI    5    8
BAT    JKL    5    8
MAN    MNO    6    8
HAT    PQR    6    8

What I would like to do is create data frames like this:

CAT.ABC.5.7 <- subset(df, A == "CAT" & B == "ABC" & C == 5 & D == 7)
COW.DEF.5.8 <- subset(df, A == "COW" & B == "DEF" & C == 5 & D == 8)

etc.

It is important that I have the data frames labeled like this so I can quickly "pull" them later based on the values I need.

Thank you!

  • If you have "hundreds more variables", do you plan to string them all out like that CAT.DOG.TREE.5.7......WOTSIT.Q.FVAL? Commented Nov 24, 2015 at 18:11
  • Frank. No, I anticipate it would be (at most) 3 or 4 variables. Commented Nov 24, 2015 at 18:24

1 Answer

You can use interaction + split:

split(df, interaction(df, drop = TRUE))
## $CAT.ABC.5.7
##     A   B C D
## 1 CAT ABC 5 7
## 
## $COW.DEF.5.8
##     A   B C D
## 2 COW DEF 5 8
## 
## $DOG.GHI.5.8
##     A   B C D
## 3 DOG GHI 5 8
## 
## $BAT.JKL.5.8
##     A   B C D
## 4 BAT JKL 5 8
## 
## $MAN.MNO.6.8
##     A   B C D
## 5 MAN MNO 6 8
## 
## $HAT.PQR.6.8
##     A   B C D
## 6 HAT PQR 6 8
## 

If you really want to create separate data.frames in the global environment, use list2env:

list2env(split(df, interaction(df, drop = TRUE)), envir = .GlobalEnv)
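Since the goal is to "pull" a piece later by its label, keeping the result of split as a named list may already be enough. The following is a minimal sketch, assuming the six-row sample data from the question; the name `pieces` is only illustrative.

```r
# Sample data copied from the question
df <- data.frame(
  A = c("CAT", "COW", "DOG", "BAT", "MAN", "HAT"),
  B = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
  C = c(5, 5, 5, 5, 6, 6),
  D = c(7, 8, 8, 8, 8, 8),
  stringsAsFactors = FALSE
)

# One list element per combination of values that actually occurs
pieces <- split(df, interaction(df, drop = TRUE))

# Look up a piece by its label instead of creating a global variable
pieces[["CAT.ABC.5.7"]]

# List all available labels
names(pieces)
```

A named list keeps the workspace tidy and lets you loop over the pieces with lapply, which is harder once they are separate objects in the global environment.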

2 Comments

Ananda, Thanks. But I want to specify the specific variables that will go into making each data frame. There are hundreds of other variables that will be subsetted dependent on these 3-5 variables.
@Orion Just pass the relevant var names like split(DF, interaction(DF[c("A","B","C","D")], drop=TRUE)), I think.
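To illustrate the suggestion in the comment above, here is a small sketch of splitting on a chosen set of key columns while every other column stays in each piece. The column `E` and the names `keys` and `pieces` are hypothetical stand-ins for the asker's real variables.

```r
# Toy data: A-D are the grouping keys, E stands in for the other variables
DF <- data.frame(
  A = c("CAT", "CAT", "COW"),
  B = c("ABC", "ABC", "DEF"),
  C = c(5, 5, 5),
  D = c(7, 7, 8),
  E = c(1.5, 2.5, 3.5),  # hypothetical extra variable, not used for grouping
  stringsAsFactors = FALSE
)

# Only these columns define the groups and the labels
keys <- c("A", "B", "C", "D")

# Each piece keeps all columns of DF, including E
pieces <- split(DF, interaction(DF[keys], drop = TRUE))

pieces[["CAT.ABC.5.7"]]
```

Changing `keys` to a different set of 3 to 5 column names changes both the grouping and the labels, without touching the rest of the code.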
