Hello all,
I have been using R for about 3 weeks and I am frustrated by a problem. I have
read R in a Nutshell and scoured the internet for help, but I either am not
understanding the examples or am missing something completely basic. Here is the
problem:
I want to plot data that contains dates on the x axis. Then I want to fit a
line to the data. I have been unable to do it.
This is an example of the data (in a dataframe called
"tradeflavorbyday"), 40 lines of it (I'm sorry it's not in a
runnable form, not sure how to get that from R) :
tradeflavor timestamp x
1 1 2009-01-22 1
2 2 2009-01-22 1
3 1 2009-01-23 1
4 1 2009-01-27 54
5 1 2009-01-28 105
6 2 2009-01-28 2
7 16 2009-01-28 2
8 1 2009-01-29 71
9 16 2009-01-29 2
10 1 2009-01-30 42
11 1 2009-02-02 19
12 16 2009-02-02 2
13 1 2009-02-03 36
14 4 2009-02-03 2
15 8 2009-02-03 3
16 1 2009-02-04 73
17 8 2009-02-04 12
18 16 2009-02-04 7
19 1 2009-02-05 53
20 8 2009-02-05 6
21 16 2009-02-05 9
22 1 2009-02-06 38
23 4 2009-02-06 6
24 8 2009-02-06 2
25 16 2009-02-06 3
26 1 2009-02-09 42
27 2 2009-02-09 2
28 4 2009-02-09 1
29 8 2009-02-09 2
30 1 2009-02-10 87
31 4 2009-02-10 2
32 8 2009-02-10 4
33 16 2009-02-10 3
34 1 2009-02-11 55
35 2 2009-02-11 6
36 4 2009-02-11 4
37 8 2009-02-11 2
38 16 2009-02-11 8
39 1 2009-02-12 153
40 2 2009-02-12 6
The plot displays the x column on the y axis and the date on the x axis, grouped
by the tradeflavor column.
The timestamp column:
> class(tradeflavorbyday$timestamp)
[1] "POSIXlt" "POSIXt"
So in this case I want to plot tradeflavor 1 (method 1):
xdates <- tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor == 1]
ydata <- tradeflavorbyday$x[tradeflavorbyday$tradeflavor == 1]
plot(xdates, ydata, col="black", xlab="Dates",
ylab="Count")
Up to here it works great.
Now an abline through lm:
xylm <- lm(ydata~xdates)  # <------ this fails, can't do dates, as shown below
abline(xylm, col="black")
> lm(ydata~xdates)
Error in model.frame.default(formula = ydata ~ xdates, drop.unused.levels =
TRUE) :
invalid type (list) for variable 'xdates'
So I try this instead (method 2):
xdata <- 1:length(tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor ==
1])
ydata <- tradeflavorbyday$x[tradeflavorbyday$tradeflavor == 1]
xylm <- lm(ydata~xdata)  # <------ now this works, great
abline(xylm, col="black")
The problem now is that I can't get the dates onto the x axis. I have tried
turning off the axis using xaxt="n" and replotting it with the
axis.POSIXct() call, but it does not want to display the dates:
dateseq = seq(xdates[1], xdates[length(xdates)], by="month")
axis.POSIXct(1, at=dateseq, format="%Y\n%b")
I have tried combining both approaches by plotting dates and trying to fit the
line using method 2:
xdates <- tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor == 1]
xdata <- 1:length(tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor ==
1])
ydata <- tradeflavorbyday$x[tradeflavorbyday$tradeflavor == 1]
plot(xdates, ydata, col="black", xlab="Dates",
ylab="Count", xaxt="n")
dateseq = seq(xdates[1], xdates[length(xdates)], by="month")
axis.POSIXct(1, at=dateseq, format="%Y\n%b")
xylm <- lm(ydata~xdata)  # <- works
abline(xylm, col="black")  # <- does nothing
In this case the calls to lm and abline "work" but nothing is drawn.
Confused, I plugged in the coefficients manually (I have the complete data, so
they differ from what the example data I pasted would give):
> lm(ydata~xdata)
Call:
lm(formula = ydata ~ xdata)
Coefficients:
(Intercept) xdata
6.11491 -0.02577
abline(6.11491, -0.02577)  # <- call worked, but nothing shown
Just by chance I added many zeros to flatten out the slope:
abline(6.11491, -0.0000000002577)  # <- call worked and a horizontal line appeared?????
So I took off a 0:
abline(6.11491, -0.000000002577)  # <- the line moved significantly down
So I took off another 0:
abline(6.11491, -0.00000002577)  # <- line disappeared
I guess the slope causes it to go vertical and disappear off the graph.
I have no idea how to solve my issue. If anyone can see my basic idiotic error
please point it out, or maybe you have another suggestion, I will gladly try it.
Thanks for your help!!
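
A likely explanation for the disappearing line, sketched here with illustrative timestamps rather than the poster's full data: a date-time x axis is numeric underneath (seconds since 1970-01-01), so a slope fitted against the index 1, 2, 3, ... is evaluated at x values around 1.2e9 and the line ends up far outside the plotted y range. The coefficients are the ones quoted above; the variable names x_seconds and y_at_plot are made up for this sketch.

```r
# Illustrative values only: five of the dates from the post, as POSIXct in UTC.
xdates <- as.POSIXct(c("2009-01-22", "2009-01-23", "2009-01-27",
                       "2009-01-28", "2009-01-29"), tz = "UTC")

# The plot's x coordinates are seconds since 1970-01-01, not 1, 2, 3, ...
x_seconds <- as.numeric(xdates)
x_seconds[1]

# Coefficients quoted in the post, fitted against the row index.
intercept <- 6.11491
slope     <- -0.02577

# Height of abline(intercept, slope) at the x positions actually on screen.
y_at_plot <- intercept + slope * x_seconds
range(y_at_plot)   # tens of millions below zero, so nothing is visible

# The flattened slopes tried in the post give heights that match what was seen:
intercept + c(-2.577e-10, -2.577e-9, -2.577e-8) * x_seconds[1]
```

With the slope divided by 1e8 the line sits just under the intercept and looks horizontal; each zero removed pushes it roughly ten times further down, which is consistent with the line moving and then vanishing.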
Nordlund, Dan (DSHS/RDA)
2012-Aug-28 17:23 UTC
[R] date in plot, can't add regression line
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Norbert Skalski
> Sent: Tuesday, August 28, 2012 9:49 AM
> To: r-help at r-project.org
> Subject: [R] date in plot, can't add regression line
>
> xylm <- lm(ydata~xdates) <------ this fails, can't do dates as below
> abline(xylm, col="black")
>
> > lm(ydata~xdates)
> Error in model.frame.default(formula = ydata ~ xdates,
> drop.unused.levels = TRUE) :
> invalid type (list) for variable 'xdates'

You might try converting timestamp as follows

xdates <- as.POSIXct(tradeflavorbyday$timestamp[tradeflavorbyday$tradeflavor == 1])

Your original code should now work.

Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
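
A minimal sketch of why the conversion helps, using a few values copied from the posted table (the name xdates_lt is made up here): a POSIXlt object is stored as a list of date components, which is what the "invalid type (list)" message refers to, while POSIXct is a single number per time point that lm() can use as a predictor.

```r
# A few tradeflavor == 1 rows from the posted table, for illustration.
xdates_lt <- as.POSIXlt(c("2009-01-22", "2009-01-23", "2009-01-27",
                          "2009-01-28", "2009-01-29"), tz = "UTC")
ydata <- c(1, 1, 54, 105, 71)

typeof(xdates_lt)   # POSIXlt is a list underneath

# Convert once to POSIXct (seconds since 1970-01-01 stored as a number).
xdates <- as.POSIXct(xdates_lt)
typeof(xdates)

# Fit against the real dates so the line lives on the same scale as the plot.
xylm <- lm(ydata ~ xdates)

plot(xdates, ydata, col = "black", xlab = "Dates", ylab = "Count")
abline(xylm, col = "black")
```

Because the model is fitted on the same numeric scale that plot() uses for the x axis, abline() needs no manual rescaling.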
First of all, a practical way to supply data is to use the function dput().
Just do dput(mydata) and copy and paste the results into your email. The reader
can copy and paste that into R and have an identical data set.
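
A small sketch of that round trip, assuming a three-row stand-in for the real data (the names txt and rebuilt are made up for illustration):

```r
# Three-row stand-in for the poster's data frame.
mydata <- data.frame(
  tradeflavor = c(1L, 2L, 1L),
  timestamp   = as.Date(c("2009-01-22", "2009-01-22", "2009-01-23")),
  x           = c(1L, 1L, 1L)
)

# dput() prints R code that rebuilds the object; this is what goes in the email.
dput(mydata)

# What the reader does: paste the text back into R. Simulated here by
# capturing the printed text and evaluating it.
txt     <- paste(capture.output(dput(mydata)), collapse = "\n")
rebuilt <- eval(parse(text = txt))

# The rebuilt copy should match the original, including the Date column.
all.equal(mydata, rebuilt)
```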
I am not sure I have followed exactly what you are doing, but here is something
that may approach what you want, done using the ggplot2 package. Do
install.packages("ggplot2") if you do not have it.
Anyway here is roughly your data set in the dput format
mydata <- structure(list(tradeflavor = c(1L, 2L, 1L, 1L, 1L, 2L, 16L, 1L,
16L, 1L, 1L, 16L, 1L, 4L, 8L, 1L, 8L, 16L, 1L, 8L, 16L, 1L, 4L,
8L, 16L, 1L, 2L, 4L, 8L, 1L, 4L, 8L, 16L, 1L, 2L, 4L, 8L, 16L,
1L, 2L), timestamp = structure(c(14266, 14266, 14267, 14271,
14272, 14272, 14272, 14273, 14273, 14274, 14277, 14277, 14278,
14278, 14278, 14279, 14279, 14279, 14280, 14280, 14280, 14281,
14281, 14281, 14281, 14284, 14284, 14284, 14284, 14285, 14285,
14285, 14285, 14286, 14286, 14286, 14286, 14286, 14287, 14287
), class = "Date"), x = c(1L, 1L, 1L, 54L, 105L, 2L, 2L, 71L,
2L, 42L, 19L, 2L, 36L, 2L, 3L, 73L, 12L, 7L, 53L, 6L, 9L, 38L,
6L, 2L, 3L, 42L, 2L, 1L, 2L, 87L, 2L, 4L, 3L, 55L, 6L, 4L, 2L,
8L, 153L, 6L)), .Names = c("tradeflavor", "timestamp",
"x"), row.names = c(NA,
-40L), class = "data.frame")
#====================================
library(ggplot2)
# first subset
m1data <- subset(mydata, tradeflavor == 1)
# plot for tradeflavor = 1
p1 <- ggplot(m1data , aes( timestamp, x)) + geom_point() +
geom_smooth(method = lm, se = FALSE)
p1
m2data <- subset(mydata, tradeflavor == 2)
p2 <- ggplot(m2data , aes( timestamp, x )) + geom_point() +
geom_smooth(method = lm, se = FALSE)
p2
# plot a grid of results
pgrid <- ggplot(mydata , aes( timestamp, x)) + geom_point() +
geom_smooth(method = lm, se = FALSE) + facet_grid(tradeflavor ~ .)
pgrid
# Have fun with R.
John Kane
Kingston ON Canada
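
If the fitted numbers are wanted as well as the picture, a base R counterpart to the faceted plot might look like the sketch below. It uses ten rows copied from the posted table (tradeflavor 1 and 16 only), and the names fits and slopes are made up for illustration.

```r
# Ten rows taken from the posted table: five for tradeflavor 1, five for 16.
mydata <- data.frame(
  tradeflavor = rep(c(1L, 16L), each = 5),
  timestamp   = as.Date(c("2009-01-22", "2009-01-23", "2009-01-27",
                          "2009-01-28", "2009-01-29",
                          "2009-01-28", "2009-01-29", "2009-02-02",
                          "2009-02-04", "2009-02-05")),
  x           = c(1L, 1L, 54L, 105L, 71L, 2L, 2L, 2L, 7L, 9L)
)

# One linear fit per tradeflavor, mirroring facet_grid(tradeflavor ~ .).
fits <- lapply(split(mydata, mydata$tradeflavor),
               function(d) lm(x ~ timestamp, data = d))

# A Date is stored as days since 1970-01-01, so each slope is in counts per day.
slopes <- sapply(fits, function(f) coef(f)[["timestamp"]])
slopes
```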
> -----Original Message-----
> From: norbert.skalski at ronin-capital.com
> Sent: Tue, 28 Aug 2012 11:48:32 -0500
> To: r-help at r-project.org
> Subject: [R] date in plot, can't add regression line
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Thanks for your suggestions.
The solution was as proposed by Mr. Nordlund: I had to convert the timestamps
again using as.POSIXct(). I will have to remember to reconvert dates any time I
do any kind of filtering/subselection on them. Lesson learned.
Also, thank you for the dput() suggestion for sharing data, another tip I shall
not forget. This question is now closed.
Thanks!
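
A hedged footnote to the lesson above: as far as base R's behaviour goes, subsetting does not change the class of a date-time vector, so the reconversion is needed because the column was POSIXlt to begin with, not because of the filtering. A sketch with made-up vector names (ts_lt, ts_ct, flavor) suggests converting once up front instead:

```r
# Illustrative timestamps and group labels.
ts_lt  <- as.POSIXlt(c("2009-01-22", "2009-01-23", "2009-01-27"), tz = "UTC")
flavor <- c(1, 2, 1)

# Subsetting keeps whatever class the vector already had.
class(ts_lt[flavor == 1])

# Convert once; every later subset is then POSIXct and usable in lm().
ts_ct <- as.POSIXct(ts_lt)
class(ts_ct[flavor == 1])
```

Applied to the data frame in this thread, that would mean converting the timestamp column a single time right after it is created, rather than after each subselection.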
-----Original Message-----
From: Norbert Skalski
Sent: Tuesday, August 28, 2012 11:49 AM
To: 'r-help at r-project.org'
Subject: date in plot, can't add regression line