How do I add multiple regression lines to the same plot in plotly?

I want to draw the scatter plot together with a regression line for each CATEGORY.

The scatter plot renders fine, but the regression lines are not drawn correctly (compared with the Excel output, see below).

library(plotly)
df <- as.data.frame(1:19)

df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)

df[,1] <- NULL

fv <- df %>%
  filter(!is.na(x)) %>%
  lm(x ~ y + y*CATEGORY,.) %>%
  fitted.values()

p <- plot_ly(data = df,
         x = ~x,
         y = ~y,
         color = ~CATEGORY,
         type = "scatter",
         mode = "markers"
) %>%
  add_trace(x = ~y, y = ~fv, mode = "lines")

p
  • Apologies for not including all the information beforehand, and thanks for suggesting "y*CATEGORY" to fix the parallel line issue.

Excel Output https://i.sstatic.net/WYSfC.png

R Output https://i.sstatic.net/SCIJb.png

  • Please create a reproducible example, including data or at the very least the output of fv. See this post for guidance: stackoverflow.com/questions/5963269/… Commented Dec 18, 2018 at 15:52
  • Also, is plotly syntax compulsory? Commented Dec 18, 2018 at 15:53
  • Please use the r-plotly tag instead of plotly. You'll also need to provide dput(df), or dput(head(df, 20)) if there is too much data, so we can help. Commented Dec 18, 2018 at 16:08
  • The lines should be parallel based on your model. If you expect the slopes to differ in each category, you need to add an interaction to your model (e.g. lm(x ~ y + y*CATEGORY, .)). Commented Dec 18, 2018 at 16:20
  • @emilliman5 Thanks for that! I have added the new information to the original question. I'm not sure whether the R regression line should match the one in Excel, but I have linked both images in the question. Commented Dec 18, 2018 at 16:58

2 Answers

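Try this. It regresses y on x (rather than x on y), stores the fitted values in df, and plots them against x: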

library(plotly)
df <-  as.data.frame(1:19)

df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)

df[,1] <- NULL

df$fv <- df %>%
  filter(!is.na(x)) %>%
  lm(y ~ x*CATEGORY,.) %>%
  fitted.values()

p <- plot_ly(data = df,
         x = ~x,
         y = ~y,
         color = ~CATEGORY,
         type = "scatter",
         mode = "markers"
) %>%
  add_trace(x = ~x, y = ~fv, mode = "lines")

p
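
A possible variation, offered only as a sketch: instead of plotting the fitted value at every observation, predict at the smallest and largest x of each category and draw one segment per category. The names `ranges` and `line_df` are illustrative and not part of the original answer, and the plotting step is guarded so that the block still runs when plotly is not installed.

```r
# Same data as in the question.
df <- data.frame(
  CATEGORY = c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B"),
  x = c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196),
  y = c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
)

# Interaction model: a separate slope and intercept for each CATEGORY.
fit <- lm(y ~ x * CATEGORY, data = df)

# Two endpoints per category are enough to draw a straight line.
# `ranges` and `line_df` are hypothetical names used for illustration.
ranges <- lapply(split(df$x, df$CATEGORY), range)
line_df <- data.frame(
  CATEGORY = rep(names(ranges), each = 2),
  x = unlist(ranges, use.names = FALSE)
)
line_df$fv <- predict(fit, newdata = line_df)

# Plot only if plotly is available; the data preparation above is base R.
if (requireNamespace("plotly", quietly = TRUE)) {
  library(plotly)
  p <- plot_ly(df, x = ~x, y = ~y, color = ~CATEGORY,
               type = "scatter", mode = "markers") %>%
    add_lines(data = line_df, x = ~x, y = ~fv, color = ~CATEGORY)
  if (interactive()) print(p)
}
```

Because each line is defined by its two endpoints, the order of the rows in df no longer affects how the segments are drawn.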

Another option is to fit the data to a lm() and then plot the fit using add_lines() with fitted():

library(plotly)
df <-  as.data.frame(1:19)

df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)

df[,1] <- NULL

p <- plot_ly(data = df,
         x = ~x,
         y = ~y,
         color = ~CATEGORY,
         type = "scatter",
         mode = "markers"
) 

# Fit y on x with a CATEGORY interaction, then add the fitted values as lines
fit <- lm(y ~ x * CATEGORY, data = df)
p %>% add_lines(x = ~x, y = fitted(fit))

Note: x*CATEGORY (main effects plus interaction) gives each CATEGORY its own slope and intercept, while x+CATEGORY (main effects only) fits a single shared slope and changes only the intercept for each CATEGORY.
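
To make the difference concrete, here is a small base R sketch on the question's data. It fits both formulas and compares the per-category slopes implied by the interaction model with the slopes from fitting a separate simple regression to each CATEGORY. The names `fit_add`, `fit_int`, `fits_sep`, `slopes_sep` and `slopes_int` are illustrative only.

```r
# Same data as in the question.
df <- data.frame(
  CATEGORY = c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B"),
  x = c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196),
  y = c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
)

# Main effects only: one shared slope, a separate intercept per CATEGORY.
fit_add <- lm(y ~ x + CATEGORY, data = df)

# Main effects plus interaction: a separate slope and intercept per CATEGORY.
fit_int <- lm(y ~ x * CATEGORY, data = df)

# Separate simple regressions, one per CATEGORY.
fits_sep <- lapply(split(df, df$CATEGORY), function(d) lm(y ~ x, data = d))
slopes_sep <- sapply(fits_sep, function(f) unname(coef(f)["x"]))

# Slopes implied by the interaction model: the baseline slope (category A,
# the first level alphabetically) plus each category's interaction term.
b <- coef(fit_int)
slopes_int <- c(
  A = unname(b["x"]),
  B = unname(b["x"] + b["x:CATEGORYB"]),
  C = unname(b["x"] + b["x:CATEGORYC"])
)

print(coef(fit_add))                        # a single "x" coefficient
print(rbind(slopes_int, slopes_sep))        # per-category slopes side by side
print(all.equal(slopes_int, slopes_sep))    # compare the two sets of slopes
```

If the goal is one independent trend line per series, the x*CATEGORY model is the one to compare against, since x+CATEGORY forces all lines to be parallel.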
