I am using doSMP as a parallel backend on Windows 7 with R 2.12.2. I run into an error and would like to understand the likely cause. Here is some sample code that reproduces it.

require(foreach)
require(doSMP)
require(data.table)
wrk <- startWorkers(workerCount = 2)
registerDoSMP(wrk)
DF = data.table(x=c("b","b","b","a","a"),v=rnorm(5))
setkey(DF,x)
foreach( i=1:2)  %dopar% {
    DF[J("a"),]
}

The error message is

Error in { : task 1 failed - "could not find function "J""
  • I'll ask the obvious question: what is J()? I ran your code and get the same error on a Linux box with R 2.13-0-alpha, and I can't find J() anywhere on that system. Commented Apr 1, 2011 at 13:03
  • this is the same question as Gavin, but when you call DF[J("a"),] what is J? Commented Apr 1, 2011 at 13:11
  • Ah, ignore that, I see that DF[J("a"),] works when not in the foreach() wrapper so it must be something particular to data.table. Will investigate more. Commented Apr 1, 2011 at 13:16
  • OK, J is a data.table function. I see ;) It looks like the spawned worker R instances need to have the data.table package loaded. Commented Apr 1, 2011 at 13:30

2 Answers

8

I've not used doSMP, but I did some digging around and it looks like this post gets at a similar issue.

So it looks like you should be able to do:

foreach( i=1:2, .packages="data.table")  %dopar% {
    DF[J("a"),]
}

I can't test as I don't have a Windows machine handy.
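
For readers who want something they can run end to end, here is a minimal sketch of the same fix. It assumes the doParallel backend in place of doSMP (that swap is my assumption, not part of the original setup), and the table is named DT rather than DF. The `.packages` argument asks foreach to load data.table on each worker before the loop body runs, so J() can be resolved there.

```r
# Sketch only: doParallel stands in for doSMP (an assumption).
library(foreach)
library(doParallel)
library(data.table)

cl <- parallel::makeCluster(2)   # two fresh R worker sessions
registerDoParallel(cl)

DT <- data.table(x = c("b", "b", "b", "a", "a"), v = rnorm(5))
setkey(DT, x)

# .packages loads data.table on every worker; DT itself is exported
# automatically because the loop body refers to it.
res <- foreach(i = 1:2, .packages = "data.table") %dopar% {
  DT[J("a"), ]
}

parallel::stopCluster(cl)
```

Each element of `res` should hold the rows of DT whose key is "a". Dropping `.packages` from the call should bring back the "could not find function" error, since the workers would then have no data.table attached.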

6

OK, I asked Revolution Computing, and Steve Weller (of RC) replied:

The problem is an R scoping issue. By default, foreach() will look for variables defined in its own 'environment'. Any objects defined outside of its scope need to be explicitly passed to it via the '.export' argument.

In your case, you will need to modify your 'foreach()' call to pass in the objects 'DF' and 'J':

...

foreach(i=1:2, .export=c("DF","J")) %dopar% {
...

I haven't tried either solution yet, but I trust both JD and RC...

2 Comments

I don't think you need "DF" in there. If I use your line, I get a warning: "already exporting variable(s): DF". If you leave out "DF", it works without a warning.
Since your foreach loop needs to use the data.table package, you should use .packages, as in @JD Long's answer, and then exporting isn't necessary. Exporting may also hurt your performance since I believe it will serialize the whole package along with the function. Loading a package is the best way to "export" functions to the workers.
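
The underlying point, that each worker is a separate R session with its own search path, can be checked without foreach at all. The sketch below uses base R's parallel package and the tools package purely as stand-ins (both are my assumptions; neither appears in the question): a package attached on the master is not visible on the workers until they load it themselves.

```r
library(parallel)
library(tools)   # attached in the master session only

cl <- makeCluster(2)   # two separate R worker processes

# file_ext() comes from tools; the workers have not attached it yet.
before <- unlist(clusterEvalQ(cl, exists("file_ext")))

# Load the package on each worker, then check again.
invisible(clusterEvalQ(cl, library(tools)))
after <- unlist(clusterEvalQ(cl, exists("file_ext")))

stopCluster(cl)
```

Here `before` reports whether each worker could see `file_ext` at startup and `after` reports the same once the package is loaded there. This is roughly what `.packages` arranges for you inside foreach().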
