I am not a programmer but I have learned the basics in order to use R for statistics. I will try my best to describe my problem:

I have a table of 120 columns in which each column holds cross-sectional coordinates, either x or y, of a small channel at two points in time: 2017 and 1920. The first row of the table contains the names of the cross-section coordinates; for instance, "7X" and "7Y" are the (x, y) coordinates of a section named "7" in 2017, whereas "7BX" and "7BY" are the coordinates of the same section "7" in 1920. I managed to make a line plot of each section in R and to arrange 4 of these plots in a single figure for printing with the following code, where the table in R is named "SEC". I used the package ggpubr to put the separate graphs together:

library(ggpubr)

g <- ggplot(SEC, aes(x=`7X`, y = `7Y`, colour = "Observed"))+geom_line()+
  geom_line(aes(x = `7BX`,y = `7BY`, colour = "1974"), linetype = "dashed") +
  labs(x = "Distance [cm]", y = "Depth [cm]") + coord_equal() + 
  scale_x_continuous(position = "top", limits = c(0,750)) + 
  scale_y_continuous(limits = c(-280,0)) + 
  scale_colour_manual("", breaks = c("Observed", "1974"), values = c("Observed"="black", "1974"="blue"))
g2 <- ggplot(SEC, aes(x=`10X`, y = `10Y`, colour = "Observed"))+geom_line()+
  geom_line(aes(x = `10BX`,y = `10BY`, colour = "1974"), linetype = "dashed") +
  labs(x = "Distance [cm]", y = "Depth [cm]") + coord_equal() + 
  scale_x_continuous(position = "top", limits = c(0,750)) + 
  scale_y_continuous(limits = c(-280,0)) + 
  scale_colour_manual("", breaks = c("Observed", "1974"), values = c("Observed"="black", "1974"="blue"))
g3 <- ggplot(SEC, aes(x=`13X`, y = `13Y`, colour = "Observed"))+geom_line()+
  geom_line(aes(x = `13BX`,y = `13BY`, colour = "1974"), linetype = "dashed") +
  labs(x = "Distance [cm]", y = "Depth [cm]") + coord_equal() + 
  scale_x_continuous(position = "top", limits = c(0,750)) + 
  scale_y_continuous(limits = c(-280,0)) + 
  scale_colour_manual("", breaks = c("Observed", "1974"), values = c("Observed"="black", "1974"="blue"))
g4 <- ggplot(SEC, aes(x=`14X`, y = `14Y`, colour = "Observed"))+geom_line()+
  geom_line(aes(x = `14BX`,y = `14BY`, colour = "1974"), linetype = "dashed") +
  labs(x = "Distance [cm]", y = "Depth [cm]") + coord_equal() + 
  scale_x_continuous(position = "top", limits = c(0,750)) + 
  scale_y_continuous(limits = c(-280,0)) + 
  scale_colour_manual("", breaks = c("Observed", "1974"), values = c("Observed"="black", "1974"="blue"))


ggarrange(g, g2, g3, g4, ncol = 2, nrow = 2, common.legend = TRUE, legend = "bottom")

The code above produces the following graph (please note that I zoomed in the RStudio environment, used right click -> copy image on the zoomed picture, and pasted it into Paint, because I still do not know how to save it as a picture with the right "zoom" level): [Figure: plot generated with the above code]

Everything is perfect up to that point. My question is how to add a loop to my code to do this graph every 4 columns and save it as a png, jpg, or something similar.

The data I used (modified for sharing) is:

  SEC <- structure(list(`7X` = c(7.5, 15, 22.5, 30, 37.5, 45, 52.5, 60, 
67.5, 75, 82.5, 90, 97.5, 105, 112.5, 120, 127.5, 135, 142.5, 
150, 157.5, 165, 172.5, 180, 187.5, 195, 202.5, 210, 217.5, 225, 
232.5, 240, 247.5, 255, 262.5, 270, 277.5, 285, 292.5, 300, 307.5, 
315, 322.5, 330, 337.5, 345, 352.5, 360, 367.5, NA, NA, NA, NA, 
NA, NA, NA, NA, NA), `7Y` = c(-25.9671090715505, -47.4607397762232, 
-53.7559172609319, -63.3665293310876, -66.6777325668064, -73.7850158514536, 
-75.8786077662389, -78.4717300522204, -86.6122602392833, -86.5085656086825, 
-99.7082525346791, -106.066956054077, -104.893267727827, -103.768964560977, 
-101.143312965043, -103.962172334764, -104.758547162389, -102.136349931386, 
-110.815517978626, -111.363366631309, -111.050166912353, -105.649062617965, 
-105.910377967987, -104.4320913694, -113.768783085737, -119.518754325158, 
-131.902196495777, -132.44782879906, -135.956263880875, -133.892807725805, 
-133.693311165822, -136.954487539369, -136.880936445156, -136.861399724998, 
-137.24878640853, -139.889844889866, -140.123989192931, -139.964791362668, 
-142.767842490807, -139.984728213883, -139.514265170192, -133.47785217087, 
-82.7273919344385, -75.020643340269, -61.9680666387492, -53.2860080778223, 
-51.0896682486046, -44.6102547614017, -35.7014461630998, NA, 
NA, NA, NA, NA, NA, NA, NA, NA), `7BX` = c(0, 440, 640, 1080, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA), `7BY` = c(0, -210, -210, 0, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA), `10X` = c(32, 40, 48, 56, 64, 72, 80, 88, 96, 
104, 112, 120, 128, 136, 144, 152, 160, 168, 176, 184, 192, 200, 
208, 216, 224, 232, 240, 248, 256, 264, 272, 280, 288, 296, 304, 
312, 320, 328, 336, 344, 352, 360, 368, 376, 384, 392, 400, 408, 
416, 424, 432, 440, 448, 456, 464, 472, 480, 488), `10Y` = c(-94.5966356796394, 
-98.9763004291606, -103.076968535357, -106.962218988179, -110.617820502447, 
-114.115499665262, -116.110479384182, -120.384670012772, -135.012443220999, 
-140.641277783522, -149.397077818365, -152.23251255149, -154.594844651231, 
-161.870765592212, -169.050648283188, -168.468938070109, -178.406458075646, 
-189.60326884302, -185.215711843659, -192.652302594034, -204.420567844116, 
-214.802445709178, -262.006760906245, -269.627846515966, -271.928416747414, 
-280.842869544577, -286.192359059652, -286.393432557465, -287.096960178529, 
-286.681850224408, -286.247209161192, -283.325346268317, -280.952049206594, 
-275.950384188228, -258.70613971596, -259.410546763113, -245.655256400078, 
-236.838966940681, -228.287891246208, -225.674662960305, -225.790568242069, 
-226.182932581986, -226.575239267478, -227.964898636738, -226.343652570147, 
-200.896351276318, -191.905220163245, -175.399533006979, -168.597240169831, 
-163.1128036503, -157.861050484961, -155.229423199991, -139.207319012034, 
-127.927733637759, -120.782994141792, -113.149068161756, -109.895475650145, 
-94.4163178937629), `10BX` = c(0, 55, 300, 380, 550, 740, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA), `10BY` = c(0, -20, -155, -155, -30, 0, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA), `13X` = c(30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 
330, 360, 390, 420, 450, 480, 510, 540, 570, 600, 630, 660, 690, 
720, 750, 780, 810, 840, 870, 900, 930, 960, 990, 1020, 1050, 
1080, 1110, 1140, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA), `13Y` = c(-130.280130140096, 
-133.314602565698, -155.755735588693, -165.349822633039, -163.527278504803, 
-164.127092741566, -168.544964800923, -168.010043126269, -172.859848036266, 
-182.767172542781, -172.116768890092, -172.5868812035, -173.634903800562, 
-176.611077660323, -179.665100040058, -176.989870773949, -180.77134156612, 
-183.742221306137, -183.799677917615, -180.703438314547, -195.745531287296, 
-207.31260678753, -222.757679568742, -225.343317270965, -230.478091545319, 
-232.25420677185, -224.230717742185, -217.685383613481, -213.890519933422, 
-203.152992365013, -200.464974159305, -195.833697602067, -175.547017122402, 
-172.802992846061, -160.173459133272, -159.843210575388, -155.227573251256, 
-130.275570551425, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA), `13BX` = c(0, 308, 378, 
700, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA), `13BY` = c(0, -70.5943380693977, 
-142.827413298, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), `14X` = c(40, 80, 
120, 160, 200, 240, 280, 320, 360, 400, 440, 480, 520, 560, 600, 
640, 680, 720, 760, 800, 840, 880, 920, 960, 1000, 1040, 1080, 
1120, 1160, 1200, 1240, 1280, 1320, 1360, 1400, 1440, 1480, 1520, 
1560, 1600, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA), `14Y` = c(-145.990632758813, -150.826188851428, 
-163.682940701739, -172.043833955967, -182.53083213644, -191.353726599893, 
-197.584471071481, -200.834572726043, -207.495959099511, -210.65543163322, 
-209.939464279794, -216.671860614474, -225.310844045373, -232.206404957882, 
-234.306313434513, -243.524048340371, -245.209549795867, -249.902953463223, 
-255.057143558744, -245.369858504693, -220.664700874663, -206.676224685967, 
-205.23664115722, -200.759982388337, -200.092376111362, -200.431526555313, 
-200.338637172383, -200.111899718351, -203.759654556748, -206.71146837615, 
-204.674270849751, -201.336543870959, -200.845407082769, -197.435021642656, 
-192.266899943151, -191.237294125464, -173.518399500314, -166.786712970063, 
-165.921143424977, -145.856527067335, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), `14BX` = c(0, 
360, 460, 800, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), `14BY` = c(0, -43.3291743105714, 
-118.074399602666, 0, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("7X", 
"7Y", "7BX", "7BY", "10X", "10Y", "10BX", "10BY", "13X", "13Y", 
"13BX", "13BY", "14X", "14Y", "14BX", "14BY"), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -58L))

Thanks for any suggestions, and apologies if something is missing or the question is not properly posed.

  • see ?by -- you might not have to write an explicit loop. You can then call do.call(ggarrange, ...) on the resulting list. But you do need to provide a reproducible dataset for us to answer your question. Commented Aug 28, 2017 at 14:45
  • also, if you want to create separate plots for each combination of two grouping variables, you should check out facet_grid() which could eliminate the need for multiple plotting statements and ggarrange() Commented Aug 28, 2017 at 15:26
  • Hi @C8H10N4O2 I added the data set I used. Thanks for the comments. Commented Aug 30, 2017 at 12:16

1 Answer

1. Sorry to tell you this, but your dataset is a mess.

Yes, you could loop through the column indices and use get() or something similar to access the columns programmatically. However, this code would be pretty unreadable and useless for any other situation.

The best practice would be to recognize that you really just have two variables, X and Y, observed at different times and locations. Your current rows don't signify anything other than group-wise observation IDs. It's as if someone pasted a bunch of XY pairs side by side on a spreadsheet.

Your ideal dataset would have four columns:

  • Section ID (e.g. 7)
  • Point in time (i.e., Observed or B/1974)
  • X
  • Y

So first let's put your data into this "long" form. You should check out tidyverse for this and many related questions on SO. We will also use regular expressions for parsing out the key information.

#----- These functions help you parse your section/time codes ------
#----- To understand them, use the regex tutorial linked above.  ---
keyStartingDigits <- function(s) as.integer(regmatches(s ,regexpr('^\\d+',s) ))
keyEndingXorY     <- function(s) regmatches(s ,regexpr('[XY]$',s) )
#----- This function helps you parse observed/historical  ----------
keyTime <- function(s) {
  factor(
    ifelse(substr(s,nchar(s)-1,nchar(s)-1)=='B',
           '1974',
           'Observed'
    ), levels=c('1974','Observed')
  )
}

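A quick way to see what these helpers extract is to apply the same expressions to a few column names. This is only an illustrative check; the `keys` vector is made up from names that appear in SEC.

```r
# Illustrative check of the parsing rules on a few column names from SEC
keys <- c("7X", "7BY", "10BX", "14Y")

# Leading digits give the section ID
section <- as.integer(regmatches(keys, regexpr("^\\d+", keys)))

# Trailing X or Y gives the coordinate
dimension <- regmatches(keys, regexpr("[XY]$", keys))

# A "B" in the second-to-last position marks the historical survey
is_historical <- substr(keys, nchar(keys) - 1, nchar(keys) - 1) == "B"

print(data.frame(key = keys, section, dimension, is_historical))
```
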
library(dplyr) # \  popular libraries for data manipulation
library(tidyr) # /  part of the 'tidyverse'
df <- 
  SEC %>% 
    mutate(obs_id = 1:n()) %>%
    gather(key=key, value=measurement,-obs_id, na.rm=TRUE) %>% # produces key-value pairs, e.g. [7X, 7.5]
    mutate(section=keyStartingDigits(key), # section ID, e.g. 7
           time = keyTime(key),            # 1974 or Observed
           dimension = keyEndingXorY(key)  # X or Y
           ) %>%
    select(-key) %>%
    spread(dimension,measurement) %>%
    select(-obs_id)

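Now you have a nice dataset that looks like this. (In this printout section is shown as <chr>; with the as.integer() call in keyStartingDigits above it will be <int>, and section 7 will sort first.)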

# > head(df)
# # A tibble: 6 x 4
#   section     time     X          Y
#     <chr>   <fctr> <dbl>      <dbl>
# 1      10     1974     0    0.00000
# 2      10 Observed    32  -94.59664
# 3      13     1974     0    0.00000
# 4      13 Observed    30 -130.28013
# 5      14     1974     0    0.00000
# 6      14 Observed    40 -145.99063

Now your grouping variables live in column values (not column names!), which makes grouping by them much more natural.
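In current tidyr releases, gather() and spread() are marked as superseded in favour of pivot_longer() and pivot_wider(). Below is a minimal sketch of the same reshaping with pivot_longer(), assuming tidyr 1.0.0 or later and column names of the form digits, optional "B", then "X" or "Y". `SEC_demo` and `df_demo` are small made-up stand-ins, not the real data.

```r
library(dplyr)
library(tidyr)

# Made-up stand-in for SEC: one section, padded with NA like the real table
SEC_demo <- data.frame(
  `7X`  = c(7.5, 15, 22.5), `7Y`  = c(-26, -47, -54),
  `7BX` = c(0, 440, NA),    `7BY` = c(0, -210, NA),
  check.names = FALSE
)

df_demo <- SEC_demo %>%
  pivot_longer(
    cols = everything(),
    # capture groups: section digits, optional "B", and X/Y (becomes columns)
    names_to = c("section", "time", ".value"),
    names_pattern = "^(\\d+)(B?)([XY])$",
    values_drop_na = TRUE   # drops the rows where both X and Y are NA
  ) %>%
  mutate(
    section = as.integer(section),
    # %in% treats an empty or missing marker as "not B"
    time = factor(ifelse(time %in% "B", "1974", "Observed"),
                  levels = c("1974", "Observed"))
  )

print(df_demo)
```

The result has the same four columns (section, time, X, Y) as the gather()/spread() version, without needing the temporary obs_id column.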

2. Don't repeat yourself (DRY)

You need one plotting function with one geom, not four plots each with two geoms. When you have your data in a proper normal form, it is easy to create the observed and historic lines using the same geom rather than plotting two subsets.

library(ggplot2)
myPlot <- function(section){
  line_cols <- c('Observed'='black', '1974'='blue')
  line_types <- c('Observed'='solid', '1974'='dashed')
  
  ggplot(df[df$section==section,], aes(x=X, y = Y))+
    geom_line(aes(colour = time, linetype=time)) +
    labs(x = "Distance [cm]", y = "Depth [cm]", caption=paste('Section',section)) + 
    coord_equal() + 
    scale_x_continuous(position = "top", limits = c(0,750)) + 
    scale_y_continuous(limits = c(-280,0)) + 
    scale_colour_manual(values = line_cols, guide="none") +   # guide=FALSE is deprecated in recent ggplot2
    scale_linetype_manual(values=line_types, guide="none") +
    theme(plot.caption = element_text(hjust=0.5,size=rel(1.5)))
}
myPlot(7) # example


3. Now it's easy to create plots by section

There are lots of options to arrange plots into a grid. You can pass a list of plots to your arrange function, and that list can be built by calling lapply() with the myPlot function created above.
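One further option, hinted at in the comments on the question, is to let ggplot2 build the grid itself with faceting once the data is in long form. The following is only a sketch on made-up data (`df_demo` is hypothetical and merely mimics the section/time/X/Y layout); it uses facet_wrap() rather than facet_grid() because there is a single grouping variable per panel.

```r
library(ggplot2)

# Made-up long-format data with the same columns as the reshaped table
df_demo <- data.frame(
  section = rep(c(7, 10), each = 7),
  time    = rep(c("Observed", "Observed", "Observed", "Observed",
                  "1974", "1974", "1974"), times = 2),
  X       = rep(c(0, 100, 200, 300, 0, 150, 300), times = 2),
  Y       = c(0, -60, -80, 0, 0, -100, 0,
              0, -90, -120, 0, 0, -70, 0)
)

# One panel per section; colour and linetype distinguish the two surveys
p <- ggplot(df_demo, aes(x = X, y = Y, colour = time, linetype = time)) +
  geom_line() +
  facet_wrap(~ section, ncol = 2) +
  coord_equal() +
  scale_colour_manual(values = c(Observed = "black", `1974` = "blue")) +
  scale_linetype_manual(values = c(Observed = "solid", `1974` = "dashed")) +
  labs(x = "Distance [cm]", y = "Depth [cm]", colour = NULL, linetype = NULL) +
  theme(legend.position = "bottom")

print(p)
```

Faceting gives a shared legend and shared axes for free, at the cost of less control over each individual panel than separate plots arranged with gridExtra or ggpubr.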

library(gridExtra) # could also use ggarrange instead of marrangeGrob, or grid.arrange
pl <- lapply(sort(unique(df$section)), 
         function(i) myPlot(i)) # list of 4 plots
ggsave("plotgrid.png", 
       plot = marrangeGrob(pl, nrow=2, ncol=2), 
       device='png')
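With 120 columns there are more sections than fit on one 2x2 page, so the same idea can be repeated per group of four sections. The sketch below is one possible way to do that, not the only one: `chunkSections()` and `savePages()` are hypothetical helper names, the file prefix and the width/height/dpi values are assumptions to adjust, and `demo`/`demoPlot()` stand in for the real `df` and `myPlot()`.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical helper: split section IDs into pages of `per_page` sections
chunkSections <- function(sections, per_page = 4) {
  split(sections, ceiling(seq_along(sections) / per_page))
}

# Hypothetical helper: write one PNG per page of up to four sections
savePages <- function(sections, plot_fun, prefix = "sections_page", per_page = 4) {
  pages <- chunkSections(sort(unique(sections)), per_page)
  files <- character(0)
  for (i in seq_along(pages)) {
    plots <- lapply(pages[[i]], plot_fun)          # one plot per section
    file  <- sprintf("%s_%02d.png", prefix, i)     # e.g. sections_page_01.png
    ggsave(file,
           plot = arrangeGrob(grobs = plots, nrow = 2, ncol = 2),
           width = 10, height = 8, dpi = 300)      # assumed print size in inches
    files <- c(files, file)
  }
  files
}

# Made-up data and plot function, only to make the sketch runnable on its own
demo <- data.frame(section = rep(1:6, each = 3),
                   X = rep(c(0, 50, 100), times = 6),
                   Y = rep(c(0, -40, 0), times = 6))
demoPlot <- function(s) {
  ggplot(demo[demo$section == s, ], aes(x = X, y = Y)) +
    geom_line() +
    labs(caption = paste("Section", s))
}

files <- savePages(demo$section, demoPlot,
                   prefix = file.path(tempdir(), "sections_page"))
print(basename(files))
```

With the real data the call would look like savePages(df$section, myPlot). Setting width, height and dpi explicitly in ggsave() also fixes the output size, which avoids copying a zoomed image out of the RStudio viewer.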



3 Comments

Thanks. It is impressive what you did with my data set, and it is very handy and quick to produce the plots per section separately with your code. What I was trying to achieve, though, was to put 4 of these plots (let's say sections 7, 10, 13 and 14) in the same figure (a 2x2 grid) for printing... but from here it should be easier.
@Daniel you can ggsave a grid of plots produced by gridExtra::grid.arrange or your ggarrange. Or, see ?png and dev.off. Open the png device, plot to device, and then close it. I'll update
C8H10N4O2 the help is very much appreciated. Now I will be able to handle it. Thanks.
