I'm quite new to R, and came across a problem I can't solve based on my knowledge/books/internet.

So here is the problem:

I've got 60 CSV files, and I want to draw a scatter plot for each of them. They're all formatted the same, so I should (theoretically) be able to solve this with a simple loop. Here is my code:

library(tools)
library(ggplot2)
files = dir('~/Klima_hist_CPL/tillnov/ClimDatK1/*.csv')
for (Y in list.files(path = "~/Klima_hist_CPL/tillnov/ClimDatK1/",pattern =".csv", 
     all.files = FALSE, full.names = TRUE, recursive = FALSE,
     ignore.case = FALSE, include.dirs = FALSE, no.. = FALSE)){
 myData<-read.csv(Y)
 pdf("~/Klima_hist_CPL/tillnov/ClimDatK1/mypdf.pdf", width = 4, height = 4)   
 print(ggplot(data = myData, aes(ACTION_DATE, TEMP)) 
   + geom_point(aes(x = myData$ACTION_DATE, y = myData$TEMP_SET),colour=('blue')) 
   + geom_point(aes(x = myData$ACTION_DATE, y = myData$TEMP_MEASURED), colour=('red') ))
#newFilename <-paste(file_path_sans_ext(basename(Y)),".jpg")
#fp <-paste('~/Klima_hist_CPL/tillnov/ClimDatK1/',newFilename)
#writeJPEG(output,file=fp,append=FALSE)
 dev.off()
}

As you can see I tried around a bit and used fragments of code from previous tasks. Unfortunately they don't work when combined.

Summing up:

  • Multiple CSV files
  • all formatted the same
  • each one should be plotted
  • I do not care if this results in one pdf or 60 of them

2 Answers

5

I would read all the data into one big data.frame and use facets to make all the plots in ggplot2. Some pseudo code which shows the general code pattern:

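library(dplyr)
library(ggplot2)
# full.names = TRUE returns paths that read.csv() can open from any working directory
csv_files = list.files('path/to/files', pattern = '\\.csv$', full.names = TRUE)
list_of_dfs = lapply(csv_files,
    function(x) {
        dat = read.csv(x)
        dat$fname = basename(x)  # remember which file each row came from
        return(dat)
    })
one_big_df = list_of_dfs %>% bind_rows()
# x and y are placeholders for the actual column names
one_big_df %>% ggplot(aes(x = x, y = y)) + geom_point() + facet_wrap(~ fname)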

Saving the plot can then be done using ggsave:

ggsave('plot.png', width = 16, height = 9)
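
For anyone who wants to try the pattern end to end, here is a rough sketch under some assumptions: the three CSV files are generated on the fly in a temporary directory, the column names (ACTION_DATE, TEMP_SET, TEMP_MEASURED) are borrowed from the question, the values are random, and the names csv_dir, demo and out_file are made up for the illustration. It is not the asker's real data, just the same read, bind, facet, save sequence.

```r
library(dplyr)
library(ggplot2)

# Hypothetical setup: write three small CSV files into a temporary directory.
# The column names mirror the question; the values are random.
csv_dir <- file.path(tempdir(), "facet_demo")
dir.create(csv_dir, showWarnings = FALSE)
set.seed(1)
for (i in 1:3) {
  demo <- data.frame(
    ACTION_DATE   = 1:10,
    TEMP_SET      = rep(20, 10),
    TEMP_MEASURED = 20 + rnorm(10)
  )
  write.csv(demo, file.path(csv_dir, sprintf("%d-2016.csv", i)), row.names = FALSE)
}

# Read every file and tag each row with the file it came from.
csv_files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
one_big_df <- bind_rows(lapply(csv_files, function(path) {
  dat <- read.csv(path)
  dat$fname <- basename(path)
  dat
}))

# One panel per file; blue = set temperature, red = measured temperature.
p <- ggplot(one_big_df, aes(x = ACTION_DATE)) +
  geom_point(aes(y = TEMP_SET), colour = "blue") +
  geom_point(aes(y = TEMP_MEASURED), colour = "red") +
  facet_wrap(~ fname)

# Passing the plot explicitly avoids relying on "the last plot displayed".
out_file <- file.path(csv_dir, "all_files.pdf")
ggsave(out_file, plot = p, width = 16, height = 9)
```

With 60 files the panels get small, so a larger width and height in ggsave, or splitting the files into a few batches, may be needed.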

2 Comments

Thank you very much! I tried your approach, but was hindered by some errors. But finally I managed to debug my own code and now it's working!
If you have specific questions, please feel free to ask more questions on SO.
0

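Here is what worked for me (just to lend a hand to anybody struggling with a similar problem).

The fix was to add full.names = TRUE to my list.files() call:

files<-list.files(path = "~/Klima_hist_CPL/tillnov/ClimDatK1/", pattern = ".csv", full.names = TRUE)

Without it, list.files() returns bare file names such as 1-2016.csv, so read.csv() looks for them in the working directory and fails with "cannot open file '1-2016.csv': No such file or directory".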

library(tools)
library(ggplot2)
# full.names = TRUE makes list.files() return complete paths instead of bare file names
files <- list.files(path = "~/Klima_hist_CPL/tillnov/ClimDatK1/", pattern = "\\.csv$", full.names = TRUE)
for (Y in files) {
  myData <- read.csv(Y)
  # name each PDF after its CSV file, without the extension
  fname <- file_path_sans_ext(basename(Y))
  pdf(
    sprintf("~/Klima_hist_CPL/tillnov/ClimDatK1/%s.pdf", fname),
    width = 10,
    height = 8
  )
  # print() is required to draw a ggplot inside a for loop
  print(ggplot(data = myData, aes(x = ACTION_DATE)) +
      geom_point(aes(y = TEMP_SET), colour = "blue") +
      geom_point(aes(y = TEMP_MEASURED), colour = "red") +
      ylab("TEMP"))
  dev.off()
}
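
To make the difference visible without touching real data, here is a small sketch, assuming a throwaway directory under tempdir() and a single made-up file named after the one in the error message. The names demo_dir, bare_names and full_paths are only for this illustration.

```r
# Hypothetical setup: one small CSV file in a temporary directory that is
# not the working directory.
demo_dir <- file.path(tempdir(), "full_names_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(TEMP_SET = 20, TEMP_MEASURED = 19.5),
          file.path(demo_dir, "1-2016.csv"), row.names = FALSE)

# Default (full.names = FALSE): bare file names, relative to demo_dir.
bare_names <- list.files(demo_dir, pattern = "\\.csv$")
# full.names = TRUE: the directory is prepended to each name.
full_paths <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

print(bare_names)
print(full_paths)

# A bare name only resolves if the working directory contains that file;
# the full path resolves from anywhere.
print(file.exists(bare_names))
print(file.exists(full_paths))

# Reading through the full path works regardless of getwd().
myData <- read.csv(full_paths[1])
```

An alternative that also works is to keep the bare names and build the path yourself with file.path(demo_dir, bare_names), but full.names = TRUE is shorter.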

3 Comments

Could you explain what went wrong in the example in your question? Otherwise we have to work it out ourselves and guess what the exact fix is.
Sure. I always got this error message: Error in file(file, "rt") : cannot open the connection. In addition: Warning message: In file(file, "rt") : cannot open file '1-2016.csv': No such file or directory. After adding full.names = T, the file could be found and the loop worked.
Aah, that is a classic error :), I have made that one a few times myself. Could you edit this information into the answer? Not everyone reads the comments.
