I have a CSV file ("data.csv") with several columns. My dependent variable is J and my independent variables are S1, S2, S3 and S4.

J  S1  S2  S3  S4  Z
1   4   5   3   2  0
2  12  11  34  44  0
3  12  15  22  21  1
4  10   9  10  11  1

I have managed to plot J and S1:

reg.data <- read.csv("C:\\Users\\data.csv", header = TRUE, sep = ";")
library(ggplot2)
qplot(data = reg.data, x = J, y = mean(S1), color = "red")

Now I would like to plot all my independent variables S1, S2, S3 and S4 in the same graph, each in a different colour. I have tried, and searched the forum, but I cannot get it to work.

I would also like to know how to plot three dimensions: variable J, the S variables (sharing one axis) and the covariate Z.

2 Answers

Without being sure I have correctly understood the question, you could try the following:

require(data.table)
require(ggplot2)

dat1 <- fread('J  S1  S2  S3  S4  Z
              1   4   5   3   2  0
              2  12  11  34  44  0
              3  12  15  22  21  1
              4  10   9  10  11  1')

temp <- melt(dat1, id.vars = c("J", "Z"))

ggplot(temp, aes(x = J, y = value, color = variable, shape = as.factor(Z))) +
  geom_point() 

This gives you a scatter plot of value against J, with one colour per S variable and one point shape per level of Z.
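In case melt() is unfamiliar, here is a minimal sketch of what the reshaping step does. It builds the same four rows inline with data.table() (my choice for illustration, rather than fread() on a text block) so the snippet can be run on its own.

```r
library(data.table)

# Same four rows as in the question, built inline for illustration.
dat1 <- data.table(
  J  = 1:4,
  S1 = c(4, 12, 12, 10),
  S2 = c(5, 11, 15, 9),
  S3 = c(3, 34, 22, 10),
  S4 = c(2, 44, 21, 11),
  Z  = c(0, 0, 1, 1)
)

# Wide -> long: J and Z are kept as identifier columns, while S1..S4 are
# stacked into a 'variable' column (series name) and a 'value' column.
temp <- melt(dat1, id.vars = c("J", "Z"))

print(temp)  # one row per (J, series) pair
```

Once the data is in this long shape, a single color = variable mapping is enough to get one colour per S column.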

One limitation of this approach is that it assumes Z is a discrete variable with a small number of distinct values. If that is not the case, you could map Z to alpha instead.
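A rough sketch of the alpha variant, assuming Z is left as a numeric column. It reuses the question's four rows (where Z is only 0 or 1), so it only illustrates the mapping; with a genuinely continuous Z the same code should give a gradient of transparencies.

```r
library(data.table)
library(ggplot2)

# Same four rows as in the question, built inline for illustration.
dat1 <- data.table(
  J  = 1:4,
  S1 = c(4, 12, 12, 10),
  S2 = c(5, 11, 15, 9),
  S3 = c(3, 34, 22, 10),
  S4 = c(2, 44, 21, 11),
  Z  = c(0, 0, 1, 1)
)
temp <- melt(dat1, id.vars = c("J", "Z"))

# Z stays numeric, so ggplot2 uses a continuous alpha scale
# rather than one point shape per level.
p <- ggplot(temp, aes(x = J, y = value, color = variable, alpha = Z)) +
  geom_point(size = 3)

print(p)
```

Low Z values come out faint with the default alpha range, so you may want to adjust it with scale_alpha().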

5 Comments

Thanks for the answer. I'm going to try this. But my data file is much larger, so is there any way to do something similar directly from the CSV file? I mean, not from a table.
No, R loads everything into memory. So you will have to read the CSV file and store it in a data table / data frame...
That being said, if you have a file that fits in memory (i.e. you have enough RAM for it), data.table and ggplot2 are quite efficient :)
What if I'd like to select only one level of Z? I mean, what if I'd like to show only the data points for Z = 0?
Well, you can subset the dataset you plot, for example: temp_new = temp[Z == 0]
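To spell that out, here is a small sketch. The temp[Z == 0] form relies on temp being a data.table; for a plain data.frame the row filter has to name the column explicitly. The names temp_df and temp_df_new are made up here for illustration.

```r
library(data.table)
library(ggplot2)

# Same four rows as in the question, built inline for illustration.
dat1 <- data.table(
  J  = 1:4,
  S1 = c(4, 12, 12, 10),
  S2 = c(5, 11, 15, 9),
  S3 = c(3, 34, 22, 10),
  S4 = c(2, 44, 21, 11),
  Z  = c(0, 0, 1, 1)
)
temp <- melt(dat1, id.vars = c("J", "Z"))

# data.table syntax: columns can be referenced directly inside [ ].
temp_new <- temp[Z == 0]

# Equivalent filter for a plain data.frame (hypothetical names).
temp_df <- as.data.frame(temp)
temp_df_new <- temp_df[temp_df$Z == 0, ]

# Plot only the Z = 0 points; the shape mapping is no longer needed.
p <- ggplot(temp_new, aes(x = J, y = value, color = variable)) +
  geom_point()

print(p)
```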

I slightly extended the very helpful example above so that it reads the data from a CSV file named 'exampledata.csv', which looks like this:

J,S1,S2,S3,S4,Z
1,4,5,3,2,0
2,12,11,34,44,0
3,12,15,22,21,1
4,10,9,10,11,1

To generate the same plot from this CSV file, I used the following:

require(data.table)
require(ggplot2)
# read.csv() returns a data.frame; convert it so data.table's melt() applies
dat1 <- as.data.table(read.csv(file = "exampledata.csv"))
temp <- melt(dat1, id.vars = c("J", "Z"))
ggplot(temp, aes(x = J, y = value, color = variable, shape = as.factor(Z))) +
  geom_point(size = 3)

PS - I admittedly made the data points a bit larger because the eyes aren't as great as they used to be...

If you want more data for a slightly more interesting plot, use the following longer version of 'exampledata.csv' instead:

J,S1,S2,S3,S4,Z
1,9,12,13,15,0
2,12,11,18,24,0
3,12,15,22,21,1
4,10,9,10,11,1
5,15,11,14,17,0
6,11,8,12,14,0
7,13,10,19,15,0
8,11,14,15,17,1
9,16,17,14,12,0
10,13,11,14,12,1
