Update columns in a data frame based on another data frame using a loop

Question

This is similar to this question. But I need to do this for 1000 dataframes.

I have created 1000 data frames using the code below:

df <- replicate(1000, sensitivity.score, simplify = FALSE)
names(df) <- paste("score.rand", 1:length(df), sep = "")
list2env(df, envir = .GlobalEnv)

And the first data frame looks like this:

head(score.rand1)
                     Binomial S1 S2 S3 S4 S5 S6   S7   S8   S9
1    Astacoides betsileoensis  H  L  L  L  H  L <NA>    L    L
2        Astacoides caldwelli  H  L  L  L  H  L <NA>    H <NA>
3        Astacoides crosnieri  L  L  L  L  H  L <NA> <NA>    L
4     Astacoides granulimanus  L  L  L  L  H  L    H <NA>    L
5           Astacoides hobbsi  H  L  H  L  H  L <NA> <NA>    L
6 Astacoides madagascariensis  H  L  L  L  H  L <NA> <NA>    L

I have created another 1000 data frames using the code below:

mydf <- data.frame(Binomial, S4, S5, S6)
lst <- replicate(1000, mydf[sample(nrow(mydf)),] , simplify = FALSE)
names(lst) <- paste("rand.val", 1:length(lst), sep = "")
list2env(lst , envir = .GlobalEnv)

that look like this:

head(rand.val1)
                  Binomial       S4       S5   S6
229  Euastacus girurmulayn 46.63442 3.399884 39.0
168 Distocambarus crockeri 15.76044 6.322875 34.7
235      Euastacus jagabar 46.63442 3.399884 40.6
163        Cherax robustus 44.04395 3.108239 42.5
506  Procambarus ortmannii 88.58447 4.422301 24.0
392    Pacifastacus fortis 30.40509 5.860764 42.0

I need to replace columns S4, S5, S6 of the 'score.rand1' dataframe with the S4, S5, S6 columns of the 'rand.val1' dataframe, matching rows on 'Binomial'. Then the same for 'score.rand2' with 'rand.val2', 'score.rand3' with 'rand.val3', and so on for all 1000 dataframes.

You are looking for a merge function, something along the lines of merge(score.rand1, rand.val1, by = "Binomial").
– Roman Luštrik, Commented Mar 27, 2017 at 7:28
Thank you. But how can I do this for 1000 dataframes? I have tried using for loops but failed :(
– Tiny_hopper, Commented Mar 27, 2017 at 7:31
This is why R has the list object, into which you store your data.frames and then work on the entire set using the apply family of functions. If you have data.frames "littered" in your workspace, you'll have to manually scrape it (using ls()) and then retrieve them using get().
– Roman Luštrik, Commented Mar 27, 2017 at 7:41
It's really easier to work on lists. Why do you want to "unlist" your dfs to .Global?
– utubun, Commented Mar 27, 2017 at 7:48
Possible duplicate of stackoverflow.com/questions/8091303/…
– zx8754, Commented Mar 27, 2017 at 7:49
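
For data frames that are already scattered in the workspace, the ls()/get() route mentioned in the comments can be sketched as below. This is only a minimal sketch: the two toy data frames are hypothetical stand-ins for the real score.rand objects, and mget() is used as a shortcut for calling get() on each name.

```r
# Toy stand-ins (hypothetical) for the objects created by list2env() in the question
score.rand1 <- data.frame(Binomial = c("a", "b"), S4 = c("L", "H"))
score.rand2 <- data.frame(Binomial = c("a", "b"), S4 = c("H", "L"))

# Names of all objects matching the pattern
nms <- ls(pattern = "^score\\.rand[0-9]+$")

# ls() sorts alphabetically (score.rand10 before score.rand2),
# so reorder by the numeric suffix
nms <- nms[order(as.integer(sub("^score\\.rand", "", nms)))]

# mget() retrieves several objects by name and returns a named list
scores <- mget(nms)
```

Building the list directly with replicate(), as the question already does before calling list2env(), avoids this step entirely.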

SolomonRoberts · Accepted Answer · 2017-03-27 10:57:46Z

Keep your data frames in two separate lists, then use mapply with merge to walk through both lists in parallel: drop the columns to be replaced (S4, S5, S6) from each score data frame, then merge it with the matching rand data frame on Binomial. It would look something like this:

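# scores and rand are the two lists of data frames (df and lst in the question)
combined <- mapply(function(score, rand) {
    # drop the columns that are about to be replaced
    score$S4 <- NULL
    score$S5 <- NULL
    score$S6 <- NULL
    # bring in S4, S5, S6 from the matching rand data frame
    merge(x = score, y = rand, by = "Binomial")
}, scores, rand, SIMPLIFY = FALSE)  # SIMPLIFY = FALSE keeps the result a list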
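
Note that merge() sorts the rows by the key and moves the key column to the front, with the replaced columns ending up last. If the original row and column order matters, one alternative (not part of this answer, just a sketch) is to look up rows with match() and overwrite the columns in place. The toy data and the helper name replace_cols below are hypothetical.

```r
# Hypothetical toy data standing in for one score and one rand data frame
score <- data.frame(
  Binomial = c("sp_a", "sp_b", "sp_c"),
  S1 = c("H", "L", "L"),
  S4 = c("L", "L", "H"),
  S5 = c("H", "H", "L"),
  S6 = c("L", "H", "L")
)
vals <- data.frame(
  Binomial = c("sp_a", "sp_b", "sp_c"),
  S4 = c(46.6, 15.8, 44.0),
  S5 = c(3.4, 6.3, 3.1),
  S6 = c(39.0, 34.7, 42.5)
)

# Two lists of data frames; the rand copies have shuffled rows
scores <- list(score.rand1 = score, score.rand2 = score)
rand <- list(rand.val1 = vals[c(3, 1, 2), ], rand.val2 = vals[c(2, 3, 1), ])

# Hypothetical helper: overwrite cols in score with values from rand
replace_cols <- function(score, rand, cols = c("S4", "S5", "S6")) {
  # Row of rand that holds each score row's species (NA if absent)
  idx <- match(score$Binomial, rand$Binomial)
  # Overwrite in place; row and column order of score are preserved
  score[cols] <- rand[idx, cols]
  score
}

# Map() pairs the i-th element of each list and always returns a list
combined <- Map(replace_cols, scores, rand)
```

Species missing from the rand data frame get NA in the replaced columns, which is roughly what merge(..., all.x = TRUE) would give.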

1 Comment

Tiny_hopper Over a year ago

Thank you. The problem is now solved. What I did: I first removed the columns from the data frames in list 'df' and then applied the following code: combined.list <- mapply(function(x, y) merge(x, y, by = "Binomial", all = TRUE), x = df, y = lst, SIMPLIFY = FALSE)
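
The all = TRUE argument in that call turns the merge into a full outer join. A small sketch of the difference, using hypothetical toy data in which each side is missing one species:

```r
# Hypothetical toy data: sp_a is only in score, sp_d is only in vals
score <- data.frame(Binomial = c("sp_a", "sp_b", "sp_c"), S1 = c("H", "L", "L"))
vals  <- data.frame(Binomial = c("sp_b", "sp_c", "sp_d"), S4 = c(15.8, 44.0, 30.4))

df  <- list(score.rand1 = score)
lst <- list(rand.val1 = vals)

# Default merge: keeps only species present in both data frames
inner <- mapply(function(x, y) merge(x, y, by = "Binomial"),
                x = df, y = lst, SIMPLIFY = FALSE)

# all = TRUE: keeps every species from either side, filling gaps with NA
outer <- mapply(function(x, y) merge(x, y, by = "Binomial", all = TRUE),
                x = df, y = lst, SIMPLIFY = FALSE)

nrow(inner[[1]])  # rows for species found on both sides
nrow(outer[[1]])  # rows for species found on either side
```

When every species appears in both data frames, as seems to be the case in the question, the two calls should return the same rows.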
