I have the following code, which goes through a list of files in a folder and reads each one into a data frame named after the original file:

myfiles = list.files(pattern="\\.avg$")  # pattern is a regular expression, not a glob
for (file in myfiles){
    dataname = paste("S",gsub("_","",gsub("\\.avg", "", file)), sep = "")
    assign(dataname, read.table(file, header = FALSE, col.names = c("wl","ref","sd")))
}

My problem is that once the 100 data frames have been read into R, I cannot access them directly through the variable dataname, because it only holds a character string (e.g. "dataframe1") rather than the data frame itself. I would like to add a few more statements inside the loop to process each data frame, so I need a way to get at the data frame from its name. Any ideas on how I can do this?

  • Take a look at ?get Commented Dec 16, 2015 at 18:18
  • Yes to get, but even better would be to store all the data.frames in a list and then access them via normal list indexing (name or number). Commented Dec 16, 2015 at 18:21
  • Good thinking, @arvi1000. In that case, I'd recommend mget. You can pass mget a vector of data frame names and it will return the list. Commented Dec 16, 2015 at 18:30
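
For illustration, the get and mget suggestions in the comments above could look roughly like this. It is only a sketch: the names Sdemo1 to Sdemo3 and the ref.pct column are made up for the example and stand in for the data frames that the loop in the question creates with assign.

```r
# Hypothetical stand-ins for the data frames created with assign() in the question
for (i in 1:3) {
  assign(paste0("Sdemo", i), data.frame(wl = 400:402, ref = i * 0.1, sd = 0.01))
}

dataname = "Sdemo2"          # a character string, as in the question's loop

df = get(dataname)           # look up the object whose name is stored in dataname
df$ref.pct = df$ref * 100    # example processing step (assumed, not from the question)
assign(dataname, df)         # write the modified copy back under the same name

# mget() takes a vector of names and returns a named list of the objects
df.list = mget(paste0("Sdemo", 1:3))
names(df.list)
```

Note that get returns a copy, so any change has to be written back with assign, which is one reason the list approach tends to be easier to work with.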

1 Answer

Expanding on @arvi1000's suggestion, you can just read the files directly into a list:

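myfiles = list.files(pattern="\\.avg$")
file.list = sapply(myfiles, read.table, header=FALSE, col.names=c("wl","ref","sd"), simplify=FALSE)

The simplify=FALSE argument makes sapply return a plain named list; without it, sapply tries to simplify the results and, because every data frame has the same number of columns, returns a matrix of columns instead. Now each element of the list is a data frame, and the name of each element is the name of the file that the data frame was read from.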
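
As a rough sketch of how such a list can then be used, the example below writes two small made-up .avg files to a temporary folder (the folder, file names, and values are assumptions for the demo only), reads them into a list, and accesses the data frames by name, by position, and in a loop.

```r
# Hypothetical demo data: two small .avg files in a temporary folder
demo.dir = tempfile("avg_demo")
dir.create(demo.dir)
write.table(data.frame(400:402, c(0.10, 0.12, 0.15), c(0.01, 0.02, 0.01)),
            file.path(demo.dir, "plot_1.avg"), row.names = FALSE, col.names = FALSE)
write.table(data.frame(400:402, c(0.20, 0.22, 0.25), c(0.03, 0.02, 0.02)),
            file.path(demo.dir, "plot_2.avg"), row.names = FALSE, col.names = FALSE)

# Read every .avg file into one named list of data frames
myfiles = list.files(demo.dir, pattern = "\\.avg$", full.names = TRUE)
file.list = sapply(myfiles, read.table, header = FALSE,
                   col.names = c("wl", "ref", "sd"), simplify = FALSE)
names(file.list) = basename(myfiles)   # drop the folder part from the names

first.df = file.list[["plot_1.avg"]]   # access by name
second.df = file.list[[2]]             # access by position

# Loop over the names to process each data frame in turn
for (nm in names(file.list)) {
  cat(nm, ": mean ref =", mean(file.list[[nm]]$ref), "\n")
}
```

Because the name is just a key into the list, there is no need for assign or get to reach a particular data frame.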

If you want to process each file, you can also do that within sapply. For example:

file.list = sapply(myfiles, function(file.name) {
  df = read.table(file.name, header=FALSE, col.names=c("wl","ref","sd"))
  df$file.name = file.name  # Add name of origin file as a new column
  # ... additional data frame processing statements ...
  df  # Return the processed data frame so it becomes the list element
}, simplify=FALSE)

The idea is that you do all the data cleaning and processing you want inside the function, and each cleaned data frame then becomes an element of the list. In the example above, I added a column holding the name of the file that each data frame came from. This is useful if you later want to combine all of your data frames into a single data frame and still be able to tell which file a given row came from. In that case, after reading the data frames into a list, you can combine them into a single data frame as follows:

my_df = do.call(rbind, file.list)
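
To make the combine step concrete, here is a small self-contained sketch. The temporary folder, the plot_1.avg and plot_2.avg files, and their values are invented for the demo; the reading and rbind steps follow the approach described above.

```r
# Hypothetical demo data: two small .avg files in a temporary folder
demo.dir = tempfile("avg_demo")
dir.create(demo.dir)
for (i in 1:2) {
  write.table(data.frame(400:402, i * c(0.10, 0.12, 0.15), 0.01),
              file.path(demo.dir, paste0("plot_", i, ".avg")),
              row.names = FALSE, col.names = FALSE)
}

myfiles = list.files(demo.dir, pattern = "\\.avg$", full.names = TRUE)

# Read and process each file, tagging every row with its file of origin
file.list = sapply(myfiles, function(file.name) {
  df = read.table(file.name, header = FALSE, col.names = c("wl", "ref", "sd"))
  df$file.name = basename(file.name)
  df
}, simplify = FALSE)

# Stack all data frames into one
my_df = do.call(rbind, file.list)
rownames(my_df) = NULL   # discard the row names that rbind derives from the list names

# The file.name column now allows per-file summaries
aggregate(ref ~ file.name, data = my_df, FUN = mean)
```

For a large number of files, do.call(rbind, ...) can be slow; that is a general observation and not something measured here.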