
Suppose I have a dataframe that looks like this:

SNP   Frequency
A     20
B     50
C     7

(The real dataframe has many more rows, of course.)

What I would like to do is pass arguments on the command line that set the input dataframe and the frequency threshold. Here is what I have tried:

args = commandArgs()
df <-args[1]
freqsub <- subset(df, args[2],header=TRUE)

In the args[2] part I would ordinarily have Frequency > somenumber.

I know how to make it work when I only have df <- args[1], but the args[2] part doesn't work.

$ Rscript sumtest.R test.txt Frequency>20

"Error in subset.default(df, args[2], header = TRUE) : 
  argument "subset" is missing, with no default
Calls: subset -> subset.default
Execution halted"

Any ideas? Happy to edit if more information is required (I can't tell if it is the case, sorry).

1 Answer


I think you have to use the option trailingOnly = TRUE:

args = commandArgs (trailingOnly = TRUE)

Otherwise args[1] and args[2] are not what you expect them to be.

With trailingOnly = FALSE what you get in the first positions of args is information about how the R process is being run.

You can do:

print (args)

to see in your shell what the args vector really contains.
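
To make the difference concrete, here is a small sketch that prints both versions of the argument vector side by side. The script name show_args.R is hypothetical and only used for illustration.

```r
# show_args.R (hypothetical name); run as: Rscript show_args.R test.txt 20

# Full vector: the R executable and its own options come first.
all_args <- commandArgs()

# Only the arguments after --args; with Rscript these are the ones
# typed after the script name.
user_args <- commandArgs(trailingOnly = TRUE)

cat("All arguments:\n")
print(all_args)
cat("Trailing arguments only:\n")
print(user_args)
```

The exact contents of the full vector depend on how R was started, so the first listing will differ between systems.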

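Besides that, "Frequency>20" would arrive in args[2] as a character string, so you would have to process it before using it as the condition in the subset function. Note also that in most Unix shells an unquoted > is an output redirection, so Frequency>20 has to be quoted on the command line; otherwise the script only receives Frequency and its output goes to a file named 20.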
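
One possible way to process such a string is to parse it into an R expression and evaluate it against the columns of the data frame. The sketch below assumes the text comes from a trusted user, since parse() and eval() will run whatever R code it contains. filter_by_text is a hypothetical helper name, and the small data frame mirrors the example in the question.

```r
# Hypothetical helper: keep the rows of df for which the condition given
# as text (e.g. "Frequency>20") is TRUE.
# Assumption: condition_text is trusted input.
filter_by_text <- function(df, condition_text) {
  # Turn the text into an unevaluated R expression.
  condition <- parse(text = condition_text)[[1]]
  # Evaluate it with the data frame's columns visible as variables.
  keep <- eval(condition, envir = df, enclos = parent.frame())
  # Treat NA results as "do not keep".
  df[!is.na(keep) & keep, , drop = FALSE]
}

snps <- data.frame(SNP = c("A", "B", "C"), Frequency = c(20, 50, 7))
print(filter_by_text(snps, "Frequency>20"))
```

With the example data this should keep only SNP B. In a script the condition would come from args[2], quoted on the command line, for example Rscript sumtest.R test.txt "Frequency>20".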

In this case I would pass just the number as the parameter to be read from args[2]. Then you can do:

# header = TRUE is an argument of read.table, not of subset
df <- read.table(args[1], header = TRUE)
subset(df, Frequency > as.numeric(args[2]))

Following your comments, I would write two R scripts.

The first one, just to make sure that you are reading the right parameters, would be:

args = commandArgs (trailingOnly = TRUE)
myfile = args[1]
myfreq = as.numeric (args[2])
print (myfile)
print (myfreq)    

Run it from your shell as:

Rscript script1.R file.txt 5

and you should get an output like:

[1] "file.txt"
[1] 5

In your second script do:

myfile = "file.txt"
myfreq = 5

## and all the computations you need
df = read.table (myfile, header = TRUE)
freqsub = subset (df, Frequency > myfreq)

Debug this second file interactively until it works, and then replace its first two lines with the three commandArgs lines from the first script.
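
Once both scripts behave, the combined version could look roughly like the sketch below. The file name sumtest.R is taken from the question; parse_threshold and filter_snps are hypothetical helper names, and the column name Frequency is assumed from the example data. The explicit check is there because as.numeric() returns NA with a warning when the text is not a number.

```r
# sumtest.R: a sketch combining the two scripts above.
# Usage: Rscript sumtest.R test.txt 20

# Hypothetical helper: convert the threshold, failing clearly if it is
# not a number instead of silently carrying an NA forward.
parse_threshold <- function(x) {
  value <- suppressWarnings(as.numeric(x))
  if (is.na(value)) {
    stop("Frequency threshold must be a number, got: ", x)
  }
  value
}

# Hypothetical helper: read the table and keep rows above the threshold.
filter_snps <- function(myfile, myfreq) {
  df <- read.table(myfile, header = TRUE)
  subset(df, Frequency > myfreq)
}

args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 2) {
  print(filter_snps(args[1], parse_threshold(args[2])))
} else {
  message("Usage: Rscript sumtest.R <file> <min_frequency>")
}
```

Keeping the filtering in a function means the same code can be tried from an interactive session with hard-coded values before it is run through Rscript.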


4 Comments

Thanks for your suggestions. I tried what you suggested and got this issue: "There were 50 or more warnings (use warnings() to see the first 50)" and "Warning message: In eval(expr, envir, enclos) : NAs introduced by coercion". Also, may I ask where you put the print (args) check? When you said shell, I assumed you meant this line: $ Rscript sumtest.R test.txt 20
I think you should first debug the analysis part of your code in an interactive R session and then turn your script into an executable one using commandArgs. The problem seems more related to the data you are processing or how you are processing it.
You can put the print (args) anywhere that is convenient for you. It is just to check what you are actually reading...
Thanks for the further suggestions. I'm going to try some of these things and report back.
