1

df:

  y x
  F T
  F F
  T T
  T F

df1:

 y z probs.x x probs.y   new
 F F    0.08 T     0.4 0.032
 F F    0.24 F     0.4 0.096
 F T    0.12 T     0.6 0.072
 F T    0.36 F     0.6 0.216
 T F    0.40 T     0.5 0.200
 T F    0.20 F     0.5 0.100
 T T    0.40 T     0.5 0.200
 T T    0.20 F     0.5 0.100

df and df1 are the two data frames. And for each row of df, I want to select the matching rows in df1, add the values in column “new”, and store output in a new data frame like this.

df_res:

  y x new
  F T .104
  F F .312
  T T .4
  T F .2

Kindly help me out! I have been toiling over this for a long time now. The table headers will change according to the variables, so please do do not hard code the table headers.

Thanks.

1
  • what is the criteria of matching df and df1? both y and x matching? and some them together? So, no matter what else the final will have only 4 rows? Commented Feb 22, 2014 at 3:39

3 Answers 3

2

I don't know how long is your data but this can be one approach.

df<- read.table(text="y x
F T
F F
T T
T F",header=T,sep="")

df1 <- read.table(text="y z probs.x x probs.y   new
 F F    0.08 T     0.4 0.032
 F F    0.24 F     0.4 0.096
 F T    0.12 T     0.6 0.072
 F T    0.36 F     0.6 0.216
 T F    0.40 T     0.5 0.200
 T F    0.20 F     0.5 0.100
 T T    0.40 T     0.5 0.200
 T T    0.20 F     0.5 0.100", header=T, sep="")

df$yx <- paste0(df$y,df$x)
df1$yx <- paste0(df1$y, df1$x)

# Update automatically using the for loop

for (i in 1:4){
  new[i] <- sum(df1[which(df1[,7]==df[i,3]),6])
}

df$new <- new
df
      y     x         yx   new
1 FALSE  TRUE  FALSETRUE 0.104
2 FALSE FALSE FALSEFALSE 0.312
3  TRUE  TRUE   TRUETRUE 0.400
4  TRUE FALSE  TRUEFALSE 0.200

Using sapply

new <- sapply(1:4, function(x) sum(df1[which(df1[,7]==df[x,3]),6]))
Sign up to request clarification or add additional context in comments.

Comments

1

it seems like if all you want is F,T combination. this works. otherwise you have to write more clearly.

text=" y z probs.x x probs.y   new
 F F    0.08 T     0.4 0.032
 F F    0.24 F     0.4 0.096
 F T    0.12 T     0.6 0.072
 F T    0.36 F     0.6 0.216
 T F    0.40 T     0.5 0.200
 T F    0.20 F     0.5 0.100
 T T    0.40 T     0.5 0.200
 T T    0.20 F     0.5 0.100"

df<-read.table(text=text, header=T)
df_res<-aggregate(data=df, new~interaction(y,x),sum)

  interaction(y, x)   new
1       FALSE.FALSE 0.312
2        TRUE.FALSE 0.200
3        FALSE.TRUE 0.104
4         TRUE.TRUE 0.400

Comments

1

Here's an answer using merge and plyr.

Read in your example data.frame:

df1 <- read.table(text="y z probs.x x probs.y   new
 F F    0.08 T     0.4 0.032
 F F    0.24 F     0.4 0.096
 F T    0.12 T     0.6 0.072
 F T    0.36 F     0.6 0.216
 T F    0.40 T     0.5 0.200
 T F    0.20 F     0.5 0.100
 T T    0.40 T     0.5 0.200
 T T    0.20 F     0.5 0.100", header=T, sep="")

If I understand, there are 2 steps to what your asking. First is to select rows in df1 that match patterns in df. That can be done with merge. The df you gave has all combinations of True and False for x and y. Let's leave one out so we can see the effect:

df <- read.table(text="y x
F T
T T
T F",header=T,sep="")

df_merged <- merge(df, df1, all.y=F)

The results are a new data.frame the omits the rows where both x and y are F. This is equivalent to a left join in a SQL database.

      y     x     z probs.x probs.y   new
1 FALSE  TRUE FALSE    0.08     0.4 0.032
2 FALSE  TRUE  TRUE    0.12     0.6 0.072
3  TRUE FALSE FALSE    0.20     0.5 0.100
4  TRUE FALSE  TRUE    0.20     0.5 0.100
5  TRUE  TRUE FALSE    0.40     0.5 0.200
6  TRUE  TRUE  TRUE    0.40     0.5 0.200

The second part of the question is to group the data and apply a sum to the groups. Plyr is a great tool for this kind of data manipulation:

library(plyr)
ddply(df_merged, .(y,x), function(df) c(new=sum(df$new)))

The dd means we are giving a data.frame and want a data.frame as a result. The next argument .(y,x) is a quoted expression and names the variables we're grouping by. The result is this:

      y     x   new
1 FALSE  TRUE 0.104
2  TRUE FALSE 0.200
3  TRUE  TRUE 0.400

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.