2

What is the most memory-efficient and easiest (yes, I know those are sometimes mutually exclusive) way to create an R data frame and save it to an .RData file using Java?

Go easy on me though, I'm not a Java developer.

  • stackoverflow.com/questions/4034936/using-r-programming-in-java, asked a week ago, will probably help. Commented Oct 30, 2010 at 7:04
  • Does it need to be in .RData form? A CSV file would import/save just as well. Commented Oct 31, 2010 at 9:27
  • CSV is what I've been using. It works fine 99% of the time, but sometimes R gets the column data types wrong. Commented Oct 31, 2010 at 11:04
  • Would just using colClasses help? Commented Oct 31, 2010 at 12:54

3 Answers

2

How about building a text datafile with structure() and retrieving it with dget()?

data.frame(x= 1:5, y= as.factor(1:5), z= as.character(1:5))

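In R versions before 4.0.0, where data.frame() converts character columns to factors by default, this gives the same result as: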

structure(list(x = 1:5, y = structure(1:5, .Label = c("1", "2", 
"3", "4", "5"), class = "factor"), z = structure(1:5, .Label = c("1", 
"2", "3", "4", "5"), class = "factor")), .Names = c("x", "y", 
"z"), row.names = c(NA, -5L), class = "data.frame")

It is not memory efficient per se, but you have more control over the data types. From R, you can show a data frame in the above long format by using dput() and retrieve it from a text file with dget(), and it shouldn't take too much parsing to write it from Java.
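As a rough illustration of the "write it from Java" part, here is a minimal sketch that renders three columns as R expressions and wraps them in a structure() call like the one above. The class DputWriter, its method names, and the file name df.txt are all made up for this example; it assumes non-empty columns of equal length and does not handle missing values.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper class: these names are illustrative, not part of R or any library.
// Assumes non-empty columns of equal length and no missing values (null is not mapped to NA).
class DputWriter {

    // Quote a Java string as an R string literal, escaping backslashes and double quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Integer column -> c(1L, 2L, ...); the L suffix makes R store integers, not doubles.
    static String intColumn(int[] values) {
        StringJoiner j = new StringJoiner(", ", "c(", ")");
        for (int v : values) j.add(v + "L");
        return j.toString();
    }

    // Character column -> c("a", "b", ...)
    static String stringColumn(String[] values) {
        StringJoiner j = new StringJoiner(", ", "c(", ")");
        for (String v : values) j.add(quote(v));
        return j.toString();
    }

    // Factor column -> 1-based integer codes plus a .Label vector of levels.
    // Levels are kept in order of first appearance (R would sort them by default).
    static String factorColumn(String[] values) {
        List<String> levels = new ArrayList<>();
        StringJoiner codes = new StringJoiner(", ", "c(", ")");
        for (String v : values) {
            int idx = levels.indexOf(v);
            if (idx < 0) { levels.add(v); idx = levels.size() - 1; }
            codes.add((idx + 1) + "L");
        }
        return "structure(" + codes + ", .Label = "
                + stringColumn(levels.toArray(new String[0])) + ", class = \"factor\")";
    }

    // Wrap already-rendered columns (name -> R expression) in a data.frame structure() call.
    static String dataFrame(Map<String, String> columns, int nrow) {
        StringJoiner cols = new StringJoiner(", ", "list(", ")");
        StringJoiner names = new StringJoiner(", ", "c(", ")");
        for (Map.Entry<String, String> e : columns.entrySet()) {
            cols.add(e.getValue());
            names.add(quote(e.getKey()));
        }
        return "structure(" + cols + ", .Names = " + names
                + ", row.names = c(NA, -" + nrow + "L), class = \"data.frame\")";
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("x", intColumn(new int[] {1, 2, 3, 4, 5}));
        columns.put("y", factorColumn(new String[] {"1", "2", "3", "4", "5"}));
        columns.put("z", stringColumn(new String[] {"1", "2", "3", "4", "5"}));
        // "df.txt" is an arbitrary file name chosen for this example.
        Files.write(Paths.get("df.txt"),
                dataFrame(columns, 5).getBytes(StandardCharsets.UTF_8));
    }
}
```

On the R side, something like df <- dget("df.txt") followed by save(df, file = "df.RData") should then produce the .RData file. Because each column's type is spelled out in the text, R does not have to guess it the way it does when reading a CSV.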

1 Comment

Hey, that's a neat idea I had not thought of. I have not used structure() before, so it hadn't crossed my mind. I will play with that. Thanks!
0

It might be a bit of overkill, but rJava/JRI (http://rosuda.org/rJava/) gives you a Java API to R. Essentially, you get an R process that you can control programmatically from your Java code, so you can share data with it and create an .RData file through R calls.

0

My first inclination is to throw stuff in MySQL, but the overhead of creating tables, etc. probably doesn't make sense if these files are temporary in nature.

I agree with the others that if you want to run R from Java, rJava is the way to go, but this solution seems a little clumsy.

In the spirit of keeping things as simple as CSV files, how about using a portable data format like NetCDF ( http://en.wikipedia.org/wiki/NetCDF ) instead? It should preserve data types better than CSV and can be accessed from Java ( http://www.unidata.ucar.edu/software/netcdf-java/ ), R ( http://cran.r-project.org/web/packages/RNetCDF/ ), and even GDAL.

(My astro background forces me to mention FITS as an option too.)
