Dear anyone who knows more about R than me (so everyone). I have been bashing my head on the keyboard all day trying to do something with my table. I have some data, like so: yyyy-mm Rainfall(mm) 1 1977-02 17.4 2 1977-03 34.0 3 1977-04 26.2 4 1977-05 42.6 5 1977-06 58.6 6 1977-07 23.2 7 1977-08 26.8 8 1977-09 48.4 9 1977-10 47.0 10 1977-11 37.2 11 1977-12 15.0 12 1978-01 2.6 13 1978-02 6.8 14 1978-03 9.0 15 1978-04 46.6 The data continues on for x number of hundreds of data points. Simply put, I need to make that data.frame into this data.frame/table/matrix/grid/... you get the picture. Jan Feb Mar ... etc Year Rain Rain Rain Year Rain Rain Rain Year Rain Rain Rain Year etc... Year Year How on earth do I do it? I have made little to no progress on it all day. Also, just like all the other problems, the year and month will change every time depending upon which csv file or sql query I load into the program. If anyone has any pointers, that would be awesome. Cheers lads. -- View this message in context: http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555806.html Sent from the R help mailing list archive at Nabble.com.
Hi Raging Jim may be this is a starting point. myDF<-read.table(stdin(),head=T,sep=",") yyyymm,Rainfall 1977-02,17.4 1977-03,34.0 1977-04,26.2 1977-05,42.6 1977-06,58.6 1977-07,23.2 1977-08,26.8 1977-09,48.4 1977-10,47.0 1977-11,37.2 1977-12,15.0 1978-01,2.6 1978-02,6.8 1978-03,9.0 1978-04,46.6 myDF$yyyy<-substr(myDF$yyyymm,1,4) myDF$mm<-substr(myDF$yyyymm,6,7) myDF<-subset(myDF, select=c(yyyy,mm,Rainfall)) myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy", timevar="mm",direction="wide") myDF.reshape best regards milton [[alternative HTML version deleted]]
the other alternative would be to edit my sql query so that that data is brought in from the database and put in to the correct format initially. "sqlQuery(conn, "select lsd,ttl_mo_prcp from mo_rains where stn_num=23090")" That is my very basic query. I have also been given this for use in orcale (I believe): my $query = "select to_char(stn_num,'00009') as stn, to_char(lsd,'yyyy') as yr, nvl(max(decode(to_char(lsd,'MON'),'JAN',ttl_mo_prcp)),-9999) jan, nvl(max(decode(to_char(lsd,'MON'),'FEB',ttl_mo_prcp)),-9999) feb, nvl(max(decode(to_char(lsd,'MON'),'MAR',ttl_mo_prcp)),-9999) mar, nvl(max(decode(to_char(lsd,'MON'),'APR',ttl_mo_prcp)),-9999) apr, nvl(max(decode(to_char(lsd,'MON'),'MAY',ttl_mo_prcp)),-9999) may, nvl(max(decode(to_char(lsd,'MON'),'JUN',ttl_mo_prcp)),-9999) jun, nvl(max(decode(to_char(lsd,'MON'),'JUL',ttl_mo_prcp)),-9999) jul, nvl(max(decode(to_char(lsd,'MON'),'AUG',ttl_mo_prcp)),-9999) aug, nvl(max(decode(to_char(lsd,'MON'),'SEP',ttl_mo_prcp)),-9999) sep, nvl(max(decode(to_char(lsd,'MON'),'OCT',ttl_mo_prcp)),-9999) oct, nvl(max(decode(to_char(lsd,'MON'),'NOV',ttl_mo_prcp)),-9999) nov, nvl(max(decode(to_char(lsd,'MON'),'DEC',ttl_mo_prcp)),-9999) dec from mo_rains where (stn_num in ($stns)) group by stn_num, to_char(lsd,'yyyy') order by to_char(lsd,'yyyy') desc;"; But I think that sorts by station number, as it is designed for multiple stations at a time, whereas mine is for one station only. Yet if I plug it into R just to see what happens, I get a plethora of extraordinarily long errors which I can post if needed. -- View this message in context: http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555813.html Sent from the R help mailing list archive at Nabble.com.
On Feb 15, 2010, at 1:22 AM, milton ruser wrote:> Hi Raging Jim > > may be this is a starting point. > > myDF<-read.table(stdin(),head=T,sep=",")Those "yyyymm" entries will become factors, which can lead to confusion for newbies. Might be more straightforward to always use stringsAsFactors=FALSE in the read.table arguments. I see that the yyymm column later gets thrown away so it may not matter here.> yyyymm,Rainfall > 1977-02,17.4 > 1977-03,34.0 > 1977-04,26.2 > 1977-05,42.6 > 1977-06,58.6 > 1977-07,23.2 > 1977-08,26.8 > 1977-09,48.4 > 1977-10,47.0 > 1977-11,37.2 > 1977-12,15.0 > 1978-01,2.6 > 1978-02,6.8 > 1978-03,9.0 > 1978-04,46.6 >When I did a very similar maneuver, I added an extra NA entry at the beginning: myDF <- rbind(list(yyyymm="1977-01", Rainfall=NA), myDF) ... so the columns would start with January. (The warning is harmless.)> myDF$yyyy<-substr(myDF$yyyymm,1,4) > myDF$mm<-substr(myDF$yyyymm,6,7) > myDF<-subset(myDF, select=c(yyyy,mm,Rainfall)) > myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy", > timevar="mm",direction="wide") > myDF.reshape > best regardsWhen the time comes to rename those columns, knowing that there is a system constant called month.names may come in handy. Perhaps (untested): names(myDF.reshape) <- c("Year", month.names[1:12])> > milton-- David Winsemius, MD Heritage Laboratories West Hartford, CT
Try this: Lines <- " yyyy-mm Rainfall(mm) 1 1977-02 17.4 2 1977-03 34.0 3 1977-04 26.2 4 1977-05 42.6 5 1977-06 58.6 6 1977-07 23.2 7 1977-08 26.8 8 1977-09 48.4 9 1977-10 47.0 10 1977-11 37.2 11 1977-12 15.0 12 1978-01 2.6 13 1978-02 6.8 14 1978-03 9.0 15 1978-04 46.6" library(zoo) z <- read.zoo(textConnection(Lines), FUN = as.yearmon) sq <- seq(floor(start(z)), ceiling(end(z)) - 1/12, 1/12) DF <- matrix(merge(z, zoo(, sq)), byrow = TRUE, nc = 12, dimnames list(unique(floor(sq)), month.abb)) The first statement reads the data into a zoo object z using yearmon (year/month) time class. yearmon objects represent Jan 1977 as 1977, Feb 1977 as 1977+1/12, Mar 1977 as 1977+2/12, ..., Dec 1977 as 1977+11/12, etc. so the floor and ceiling operations on the start and end of the series can be used to form a sequence from the beginning of the first year to the end of the last. Finally we merge z together with a zero width series (i.e. no data, only times) and then form the result into a matrix with the appropriate dimension names.> DFJan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec 1977 NA 17.4 34 26.2 42.6 58.6 23.2 26.8 48.4 47 37.2 15 1978 2.6 6.8 9 46.6 NA NA NA NA NA NA NA NA On Mon, Feb 15, 2010 at 1:07 AM, RagingJim <nowa0021 at flinders.edu.au> wrote:> > Dear anyone who knows more about R than me (so everyone). I have been bashing > my head on the keyboard all day trying to do something with my table. > > I have some data, like so: > ? ? yyyy-mm ? ? Rainfall(mm) > 1 ? 1977-02 ? ? ? ?17.4 > 2 ? 1977-03 ? ? ? ?34.0 > 3 ? 1977-04 ? ? ? ?26.2 > 4 ? 1977-05 ? ? ? ?42.6 > 5 ? 1977-06 ? ? ? ?58.6 > 6 ? 1977-07 ? ? ? ?23.2 > 7 ? 1977-08 ? ? ? ?26.8 > 8 ? 1977-09 ? ? ? ?48.4 > 9 ? 1977-10 ? ? ? ?47.0 > 10 ?1977-11 ? ? ? ?37.2 > 11 ?1977-12 ? ? ? ?15.0 > 12 ?1978-01 ? ? ? ? 2.6 > 13 ?1978-02 ? ? ? ? 6.8 > 14 ?1978-03 ? ? ? ? 9.0 > 15 ?1978-04 ? ? ? ?46.6 > > The data continues on for x number of hundreds of data points. Simply put, I > need to make that data.frame into this data.frame/table/matrix/grid/... you > get the picture. > > ? ? ? ? Jan ?Feb ?Mar ... etc > Year ? Rain Rain Rain > Year ? Rain Rain Rain > Year ? Rain Rain Rain > Year ? etc... > Year > Year > > How on earth do I do it? I have made little to no progress on it all day. > Also, just like all the other problems, the year and month will change every > time depending upon which csv file or sql query I load into the program. If > anyone has any pointers, that would be awesome. > > Cheers lads. > -- > View this message in context: http://n4.nabble.com/Separating-columns-and-sorting-by-rows-tp1555806p1555806.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >