thr3ads.net - R help - [R] Separating columns, and sorting by rows [Feb 2010]

RagingJim

2010-Feb-15 06:07 UTC

[R] Separating columns, and sorting by rows

Dear anyone who knows more about R than me (so everyone). I have been bashing
my head on the keyboard all day trying to do something with my table.

I have some data, like so:
     yyyy-mm     Rainfall(mm)
1   1977-02        17.4
2   1977-03        34.0
3   1977-04        26.2
4   1977-05        42.6
5   1977-06        58.6
6   1977-07        23.2
7   1977-08        26.8
8   1977-09        48.4
9   1977-10        47.0
10  1977-11        37.2
11  1977-12        15.0
12  1978-01         2.6
13  1978-02         6.8
14  1978-03         9.0
15  1978-04        46.6

The data continues on for x number of hundreds of data points. Simply put, I
need to make that data.frame into this data.frame/table/matrix/grid/... you
get the picture.

         Jan  Feb  Mar ... etc
Year   Rain Rain Rain
Year   Rain Rain Rain
Year   Rain Rain Rain
Year   etc...
Year
Year

How on earth do I do it? I have made little to no progress on it all day.
Also, just like all the other problems, the year and month will change every
time depending upon which csv file or sql query I load into the program. If
anyone has any pointers, that would be awesome.

Cheers lads.

milton ruser

2010-Feb-15 06:22 UTC

Hi Raging Jim

Maybe this is a starting point.

myDF<-read.table(stdin(),head=T,sep=",")
yyyymm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6

myDF$yyyy<-substr(myDF$yyyymm,1,4)
myDF$mm<-substr(myDF$yyyymm,6,7)
myDF<-subset(myDF, select=c(yyyy,mm,Rainfall))
myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy",
 timevar="mm",direction="wide")
myDF.reshape
best regards

milton
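
For readers who want to run the approach above without typing at stdin(), here is a minimal sketch of the same substr() plus reshape() steps. It assumes the data is passed through the text= argument of read.table and that stringsAsFactors=FALSE is acceptable; the object name wide is only illustrative.

```r
# Same sample data as in the post, supplied via text= so the script
# runs non-interactively (an assumption; the post reads from stdin()).
myDF <- read.table(text = "yyyymm,Rainfall
1977-02,17.4
1977-03,34.0
1977-04,26.2
1977-05,42.6
1977-06,58.6
1977-07,23.2
1977-08,26.8
1977-09,48.4
1977-10,47.0
1977-11,37.2
1977-12,15.0
1978-01,2.6
1978-02,6.8
1978-03,9.0
1978-04,46.6", header = TRUE, sep = ",", stringsAsFactors = FALSE)

myDF$yyyy <- substr(myDF$yyyymm, 1, 4)   # characters 1-4: the year
myDF$mm   <- substr(myDF$yyyymm, 6, 7)   # characters 6-7: the month
myDF <- myDF[, c("yyyy", "mm", "Rainfall")]

# One row per year, one Rainfall.<mm> column per month found in the data.
wide <- reshape(myDF, v.names = "Rainfall", idvar = "yyyy",
                timevar = "mm", direction = "wide")

names(wide)   # inspect the generated column names
wide
```

The month columns appear in the order the months are first seen in the data, so a series that starts in February gives a table whose first month column is February. Months with no observation for a given year are filled with NA.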

RagingJim

2010-Feb-15 06:25 UTC

The other alternative would be to edit my SQL query so that the data is
brought in from the database already in the correct format.

sqlQuery(conn, "select lsd,ttl_mo_prcp from mo_rains where stn_num=23090")

That is my very basic query. I have also been given this for use in Oracle
(I believe):

my $query = "select to_char(stn_num,'00009') as stn,
                    to_char(lsd,'yyyy') as yr,
                    nvl(max(decode(to_char(lsd,'MON'),'JAN',ttl_mo_prcp)),-9999) jan,
                    nvl(max(decode(to_char(lsd,'MON'),'FEB',ttl_mo_prcp)),-9999) feb,
                    nvl(max(decode(to_char(lsd,'MON'),'MAR',ttl_mo_prcp)),-9999) mar,
                    nvl(max(decode(to_char(lsd,'MON'),'APR',ttl_mo_prcp)),-9999) apr,
                    nvl(max(decode(to_char(lsd,'MON'),'MAY',ttl_mo_prcp)),-9999) may,
                    nvl(max(decode(to_char(lsd,'MON'),'JUN',ttl_mo_prcp)),-9999) jun,
                    nvl(max(decode(to_char(lsd,'MON'),'JUL',ttl_mo_prcp)),-9999) jul,
                    nvl(max(decode(to_char(lsd,'MON'),'AUG',ttl_mo_prcp)),-9999) aug,
                    nvl(max(decode(to_char(lsd,'MON'),'SEP',ttl_mo_prcp)),-9999) sep,
                    nvl(max(decode(to_char(lsd,'MON'),'OCT',ttl_mo_prcp)),-9999) oct,
                    nvl(max(decode(to_char(lsd,'MON'),'NOV',ttl_mo_prcp)),-9999) nov,
                    nvl(max(decode(to_char(lsd,'MON'),'DEC',ttl_mo_prcp)),-9999) dec
             from mo_rains
             where (stn_num in ($stns))
             group by stn_num, to_char(lsd,'yyyy')
             order by to_char(lsd,'yyyy') desc;";

But I think that sorts by station number, as it is designed for multiple
stations at a time, whereas mine is for one station only. Yet if I plug it
into R just to see what happens, I get a plethora of extraordinarily long
errors which I can post if needed.
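
Since the errors were not posted, the following is only a guess at one likely cause: the snippet above is a Perl assignment (my $query = "...", with a $stns placeholder and a trailing semicolon), and none of that wrapper is valid in R. A minimal sketch of building the same pivot query as a single R string for one station is shown below. The helper name build_pivot_query is hypothetical, the stn column is dropped because only one station is requested, the sort is ascending rather than descending, and the sqlQuery call is left commented out because it needs RODBC and an open connection conn.

```r
# Hypothetical helper: builds the Oracle pivot query as one plain R string.
build_pivot_query <- function(stn_num) {
  mons <- toupper(month.abb)   # "JAN", "FEB", ..., "DEC"
  # One nvl(max(decode(...))) column per month, aliased jan, feb, ...
  cols <- sprintf(
    "nvl(max(decode(to_char(lsd,'MON'),'%s',ttl_mo_prcp)),-9999) %s",
    mons, tolower(mons))
  paste0(
    "select to_char(lsd,'yyyy') as yr, ",
    paste(cols, collapse = ", "),
    " from mo_rains where stn_num = ", as.integer(stn_num),
    " group by to_char(lsd,'yyyy')",
    " order by to_char(lsd,'yyyy')")
}

query <- build_pivot_query(23090)
cat(query, "\n")

# Assumed usage, not run here:
# library(RODBC)
# rain <- sqlQuery(conn, query)
# rain[rain == -9999] <- NA   # turn the -9999 sentinel back into NA
```

Printing the string with cat() before sending it makes it easier to spot quoting problems than reading the driver's error output.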



David Winsemius

2010-Feb-15 07:09 UTC

On Feb 15, 2010, at 1:22 AM, milton ruser wrote:
> Hi Raging Jim
>
> may be this is a starting point.
>
> myDF<-read.table(stdin(),head=T,sep=",")
Those "yyyymm" entries will become factors, which can lead to
confusion for newbies. It might be more straightforward to always use
stringsAsFactors=FALSE in the read.table arguments. I see that the
yyyymm column later gets thrown away, so it may not matter here.
> yyyymm,Rainfall
> 1977-02,17.4
> 1977-03,34.0
> 1977-04,26.2
> 1977-05,42.6
> 1977-06,58.6
> 1977-07,23.2
> 1977-08,26.8
> 1977-09,48.4
> 1977-10,47.0
> 1977-11,37.2
> 1977-12,15.0
> 1978-01,2.6
> 1978-02,6.8
> 1978-03,9.0
> 1978-04,46.6
>
When I did a very similar maneuver, I added an extra NA entry at the  
beginning:

myDF <- rbind(list(yyyymm="1977-01", Rainfall=NA), myDF)

... so the columns would start with January. (The warning is harmless.)
> myDF$yyyy<-substr(myDF$yyyymm,1,4)
> myDF$mm<-substr(myDF$yyyymm,6,7)
> myDF<-subset(myDF, select=c(yyyy,mm,Rainfall))
>
> myDF.reshape<-reshape(myDF,v.names="Rainfall",idvar="yyyy",
> timevar="mm",direction="wide")
> myDF.reshape
> best regards
When the time comes to rename those columns, knowing that there is a
system constant called month.name may come in handy. Perhaps
(untested):

names(myDF.reshape) <- c("Year", month.name[1:12])
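
The padding and renaming ideas above can be generalised to files that start or stop in any month. What follows is a minimal sketch, not the method used in this thread: it assumes a full year/month grid is built with expand.grid and left-joined to the data, and it uses the built-in constant month.abb for the short month names shown in the question. Only three of the original observations are used to keep it short, and the names grid, full and wide are illustrative.

```r
# A few rows from the sample data; characters, not factors.
myDF <- data.frame(yyyymm = c("1977-02", "1977-03", "1978-01"),
                   Rainfall = c(17.4, 34.0, 2.6),
                   stringsAsFactors = FALSE)
myDF$yyyy <- substr(myDF$yyyymm, 1, 4)
myDF$mm   <- substr(myDF$yyyymm, 6, 7)

# Every year/month combination for the years present in the data.
grid <- expand.grid(mm = sprintf("%02d", 1:12),
                    yyyy = unique(myDF$yyyy),
                    stringsAsFactors = FALSE)

# Left join on yyyy and mm: months absent from the data get Rainfall = NA.
full <- merge(grid, myDF[, c("yyyy", "mm", "Rainfall")], all.x = TRUE)
full <- full[order(full$yyyy, full$mm), ]

# The first year now contains months 01..12, so columns come out in order.
wide <- reshape(full, v.names = "Rainfall", idvar = "yyyy",
                timevar = "mm", direction = "wide")
names(wide) <- c("Year", month.abb)
rownames(wide) <- NULL
wide
```

Because every year is padded to twelve months before reshaping, the renaming step can rely on there being exactly one year column followed by twelve month columns.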
>
> milton

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Gabor Grothendieck

2010-Feb-15 08:25 UTC

Try this:

Lines <- "    yyyy-mm     Rainfall(mm)
1   1977-02        17.4
2   1977-03        34.0
3   1977-04        26.2
4   1977-05        42.6
5   1977-06        58.6
6   1977-07        23.2
7   1977-08        26.8
8   1977-09        48.4
9   1977-10        47.0
10  1977-11        37.2
11  1977-12        15.0
12  1978-01         2.6
13  1978-02         6.8
14  1978-03         9.0
15  1978-04        46.6"

library(zoo)
z <- read.zoo(textConnection(Lines), FUN = as.yearmon)

sq <- seq(floor(start(z)), ceiling(end(z)) - 1/12, 1/12)
DF <- matrix(merge(z, zoo(, sq)), byrow = TRUE, nc = 12,
  dimnames = list(unique(floor(sq)), month.abb))

The first statement reads the data into a zoo object z using yearmon
(year/month) time class.

yearmon objects represent Jan 1977 as 1977, Feb 1977 as 1977+1/12, Mar
1977 as 1977+2/12, ..., Dec 1977 as 1977+11/12, etc. so the floor and
ceiling operations on the start and end of the series can be used to
form a sequence from the beginning of the first year to the end of the
last.

Finally we merge z together with a zero width series (i.e. no data,
only times) and then form the result into a matrix with the
appropriate dimension names.

> DF
     Jan  Feb Mar  Apr  May  Jun  Jul  Aug  Sep Oct  Nov Dec
1977  NA 17.4  34 26.2 42.6 58.6 23.2 26.8 48.4  47 37.2  15
1978 2.6  6.8   9 46.6   NA   NA   NA   NA   NA  NA   NA  NA
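
To make the year/month arithmetic described above concrete, here is a minimal base R sketch of the same idea without zoo. It is an illustration under stated assumptions, not a replacement for the zoo code: instead of the fractional year + (month - 1)/12 representation it uses an integer month count from January of the first year, which avoids floating-point comparisons, and it uses only five of the original observations. The variable names pos and cells are illustrative.

```r
yyyymm   <- c("1977-02", "1977-03", "1977-12", "1978-01", "1978-04")
rainfall <- c(17.4, 34.0, 15.0, 2.6, 46.6)

year  <- as.integer(substr(yyyymm, 1, 4))
month <- as.integer(substr(yyyymm, 6, 7))

# Position of each observation counted in months from January of the
# first year (the integer analogue of year + (month - 1)/12).
pos <- (year - min(year)) * 12L + month

years <- min(year):max(year)

# An all-NA vector covering every month of every year: this plays the
# role of the zero-width series spanning whole years.
cells <- rep(NA_real_, length(years) * 12)

# Dropping the observations into their slots plays the role of merge().
cells[pos] <- rainfall

# Fill row by row: one row per year, one column per month.
DF <- matrix(cells, byrow = TRUE, ncol = 12,
             dimnames = list(as.character(years), month.abb))
DF
```

Years that are missing entirely from the data still get a row of NA values, because the grid is built from the full range of years rather than from the years observed.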

