Dear Helpers,

I have a dataset X, with no missing values, everything is in order, R reads it correctly, and I have already done some statistical analyses on the dataset. The data are in order by date (six sampling dates in one year, earliest to latest) and I now want to generate boxplots for each parameter for each date.

However, R outputs the boxplots in some order that I do not understand (e.g. 10.5.2011, 11.21.2011, 4.5.2011, 5.17.2011, 6.27.2011, 8.16.2011) instead of chronologically as the data are in the dataframe.

I tried reformatting the date field from English (US) to German, just in case my R was confused, but still R seems to use its own rules. The data do not occur in any rank or order with R's way of organizing the dates (not highest to lowest, or lowest to highest), so I don't know why it is ordering the dates as it is.

This also happened when I ran ANOVAs on the parameters. I got the same output (mean values and Tukey HSD significance) using the dates formatted in both ways (whew! I would have really been worried, otherwise!). But now for the boxplots, which might eventually be included in the paper, I want to make it easier for the reader to interpret, so I would like to have them ordered from earliest to latest (4.5.2011, April, to 11.21.2011, November).

Does anyone have a suggestion for how I can correct this?

Thanks very much and my apologies if this has been covered before. I looked before posting my question, however, and couldn't find anything.

--
Kathleen Regan
University of Hohenheim
Institute of Soil Science and Land Evaluation
Soil Biology Section
Emil-Wolff-Str. 27
D-70593 Stuttgart-Hohenheim

"Traveler, there is no road. We make the road by walking."

phone: +49(0)711 459 23118
fax: +49(0)711 459 23117
E-mail:kath.regan@gmail.com
I think we need a bit more detail about the code that you are using and the structure of the data set. Would you please supply some sample data (see ?dput for a convenient way to supply data to the R-help list)? The output of str() would also be useful.

A good guideline for asking a question on the list can be found at http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

John Kane
Kingston ON Canada

> -----Original Message-----
> From: kath.regan at gmail.com
> Sent: Wed, 28 Nov 2012 13:13:32 +0100
> To: r-help at r-project.org
> Subject: [R] output data by date?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
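For anyone unsure what such a sample might look like, here is a minimal sketch of the dput()/str() request. The data frame X and its columns date and value are made up for illustration (they are assumptions, not the poster's actual data), and stringsAsFactors is set explicitly because its default changed in R 4.0.0.

```r
# Hypothetical stand-in for the poster's data: the column names and the
# numbers are assumptions for illustration only.
X <- data.frame(
  date  = c("4.5.2011", "4.5.2011", "10.5.2011", "10.5.2011"),
  value = c(1.2, 1.5, 2.1, 2.4),
  stringsAsFactors = TRUE  # was the default before R 4.0.0
)

# str() shows how R has stored each column; here 'date' is a factor,
# not a Date, which is the kind of detail the list needs to see
str(X)

# dput() prints a plain-text version of the object that can be pasted
# straight into a reply and re-created by anyone on the list
dput(X)
```

Pasting the dput() output into a message lets others rebuild exactly the same object, including the column classes, which a printed table does not convey.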
Your date "string" is sorted in alphabetic order. You need to convert to a Date class and then sort as this example shows:

> x <- c('10.5.2011', '11.21.2011', '4.5.2011', '5.17.2011', '6.27.2011', '8.16.2011')
> # convert to Date
> xD <- as.Date(x, format = "%m.%d.%Y")
> # notice difference in ordering
> sort(x)
[1] "10.5.2011"  "11.21.2011" "4.5.2011"   "5.17.2011"  "6.27.2011"  "8.16.2011"
> sort(xD)
[1] "2011-04-05" "2011-05-17" "2011-06-27" "2011-08-16" "2011-10-05" "2011-11-21"

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
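To carry the as.Date() idea over to the boxplots themselves: boxplot() with a formula draws one box per factor level, in level order, and the default level order is alphabetical. One possible approach is to rebuild the factor with its levels sorted chronologically. The sketch below is only an illustration under assumptions: the data frame X, the column names date and value, and the simulated numbers are hypothetical, since the original data were not posted.

```r
# Hypothetical data: 'date' and 'value' are assumed column names, and the
# values are simulated, not the poster's measurements.
set.seed(1)
dates <- c("4.5.2011", "5.17.2011", "6.27.2011",
           "8.16.2011", "10.5.2011", "11.21.2011")
X <- data.frame(date  = rep(dates, each = 5),
                value = rnorm(30),
                stringsAsFactors = TRUE)

# Default factor levels are alphabetical; this is the order boxplot() uses
levels(X$date)

# Parse the level labels as dates, then rebuild the factor with its
# levels in chronological order (the labels themselves are unchanged)
lev   <- levels(X$date)
chron <- lev[order(as.Date(lev, format = "%m.%d.%Y"))]
X$date <- factor(X$date, levels = chron)

levels(X$date)

# One box per sampling date, now drawn from April to November
boxplot(value ~ date, data = X, xlab = "Sampling date", ylab = "value")
```

Reordering the levels changes only the display order; the group means and the ANOVA results are unaffected, which matches what was reported earlier in the thread.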