thr3ads.net - R help - [R] Reshape (pivot) question [Feb 2007]

Lauri Nikkinen

2007-Feb-20 08:02 UTC

[R] Reshape (pivot) question

Hi R-users,

I have a data set like this (first ten rows):

    id         patient_id  date             code  class  eala
    ID1564262  1562        6.4.2006 12:00   5555  1      NA
    ID1564262  1562        6.4.2006 12:00   5555  1      NA
    ID1564264  1365        14.2.2006 14:35  5555  1      50
    ID1564265  1342        7.4.2006 14:30   2222  2      50
    ID1564266  1648        7.4.2006 14:30   2222  2      50
    ID1564267  1263        10.2.2006 15:45  2222  2      10
    ID1564267  1263        10.2.2006 15:45  3333  3      10
    ID1564269  5646        13.5.2006 17:02  3333  3      10
    ID1564270  7561        13.5.2006 17:02  6666  1      10
    ID1564271  1676        15.5.2006 20:41  2222  2      20

How can I create a new (pivot?) data.frame in R like the one I can get with this MS SQL query:

select eala,
       datepart(month, date) as month,
       datepart(year, date) as year,
       count(distinct id) as id_count,
       count(distinct patient_id) as patient_count,
       count(distinct(case when class = 1 then code else null end)) as count_1,
       count(distinct(case when class = 2 then code else null end)) as count_2,
       count(distinct(case when class = 3 then code else null end)) as count_3
into temp2
from temp1
group by datepart(month, date), datepart(year, date), eala
order by datepart(month, date), datepart(year, date), eala

I tried something like this but could not go further:

stats <- function(x) {
    count <- function(x) length(na.omit(x))
    c(
      n = count(x),
      uniikit = length(unique(x))
      )
}
library(reshape)
attach(dframe)
dfm <- melt(dframe, measure.var=c("id","patient_id"),
            # "month" and "year" do not exist yet; they still have to be derived from date
            id.var=c("code", "month", "year"),
            variable_name="variable")

cast(dfm, code + month + year ~ variable, stats)
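
The month and year columns that the melt() call above is waiting for can be derived from the date strings first. A minimal base R sketch, assuming the dates are stored as text in day.month.year hour:minute form as in the sample; the two-row `dframe` here is only a hypothetical stand-in for the real data:

```r
# Hypothetical stand-in for the poster's data frame (two rows of the sample).
dframe <- data.frame(
  id   = c("ID1564264", "ID1564271"),
  date = c("14.2.2006 14:35", "15.5.2006 20:41"),
  stringsAsFactors = FALSE
)

# Parse "day.month.year hour:minute"; a fixed time zone avoids
# surprises around daylight-saving changes.
parsed <- strptime(dframe$date, format = "%d.%m.%Y %H:%M", tz = "UTC")

# POSIXlt stores the year as years since 1900 and the month as 0-11.
dframe$year  <- parsed$year + 1900
dframe$month <- parsed$mon + 1
```

With these two columns in place, "month" and "year" can be used as id variables alongside "code".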

Regards,

Lauri


jim holtman

2007-Feb-20 13:12 UTC

[R] Reshape (pivot) question

Haven't quite learned to 'cast' yet, but I have always used the 'apply'
functions for this type of processing:

> x <- "id patient_id date code class eala
+     ID1564262 1562 6.4.200612:00 5555 1 NA
+     ID1564262 1562 6.4.200612:00 5555 1 NA
+     ID1564264 1365 14.2.200614:35 5555 1 50
+     ID1564265 1342 7.4.200614:30 2222 2 50
+     ID1564266 1648 7.4.200614:30 2222 2 50
+     ID1564267 1263 10.2.200615:45 2222 2 10
+     ID1564267 1263 10.2.200615:45 3333 3 10
+     ID1564269 5646 13.5.200617:02 3333 3 10
+     ID1564270 7561 13.5.200617:02 6666 1 10
+     ID1564271 1676 15.5.200620:41 2222 2 20"
> x.in <- read.table(textConnection(x), header=TRUE)
> # 'by' seems to drop NAs so convert to a character string for processing
> x.in$eala <- ifelse(is.na(x.in$eala), "NA", as.character(x.in$eala))
> # convert date to POSIXlt so we can access the year and month
> myDate <- strptime(x.in$date, "%d.%m.%Y%H:%M")
> x.in$year <- myDate$year + 1900
> x.in$month <- myDate$mon+1
> # split the data by eala, year, month and summarize
> x.by <- by(x.in, list(x.in$eala, x.in$year, x.in$month), function(x){
+     data.frame(eala=x$eala[1], month=x$month[1], year=x$year[1],
+         icount=length(unique(x$id)), pcount=length(unique(x$patient_id)),
+         count1=sum(x$class == 1), count2=sum(x$class == 2),
+         count3=sum(x$class == 3))
+ })
> # convert back to a data frame
> do.call(rbind, x.by)
  eala month year icount pcount count1 count2 count3
1   10     2 2006      1      1      0      1      1
2   50     2 2006      1      1      1      0      0
3   50     4 2006      2      2      0      2      0
4   NA     4 2006      1      1      2      0      0
5   10     5 2006      2      2      1      0      1
6   20     5 2006      1      1      0      1      0
>
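
Note that count1 to count3 in the session above count rows per class, whereas the SQL in the question counts distinct codes per class. A minimal base R sketch of the distinct-count variant follows, under the assumption that the SQL semantics are what is wanted. The names `temp1`, `temp2`, `n_distinct` and `key` are illustrative only, and the date and time are fused as in the session above so that read.table() sees a single column:

```r
# Sample rows from the original post.
txt <- "id patient_id date code class eala
ID1564262 1562 6.4.200612:00 5555 1 NA
ID1564262 1562 6.4.200612:00 5555 1 NA
ID1564264 1365 14.2.200614:35 5555 1 50
ID1564265 1342 7.4.200614:30 2222 2 50
ID1564266 1648 7.4.200614:30 2222 2 50
ID1564267 1263 10.2.200615:45 2222 2 10
ID1564267 1263 10.2.200615:45 3333 3 10
ID1564269 5646 13.5.200617:02 3333 3 10
ID1564270 7561 13.5.200617:02 6666 1 10
ID1564271 1676 15.5.200620:41 2222 2 20"
temp1 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Derive year and month from the fused date-time string.
d <- strptime(temp1$date, "%d.%m.%Y%H:%M", tz = "UTC")
temp1$year  <- d$year + 1900
temp1$month <- d$mon + 1

# Number of distinct non-missing values, like SQL count(distinct ...).
n_distinct <- function(v) length(unique(v[!is.na(v)]))

# Group label; paste() turns a missing eala into the string "NA",
# so that group is kept instead of being dropped.
key <- paste(temp1$eala, temp1$month, temp1$year)

temp2 <- do.call(rbind, lapply(split(temp1, key), function(g) {
  data.frame(
    eala = g$eala[1], month = g$month[1], year = g$year[1],
    id_count      = n_distinct(g$id),
    patient_count = n_distinct(g$patient_id),
    count_1 = n_distinct(g$code[g$class == 1]),
    count_2 = n_distinct(g$code[g$class == 2]),
    count_3 = n_distinct(g$code[g$class == 3])
  )
}))

# Same ordering as the SQL: month, then year, then eala (NA sorts last).
temp2 <- temp2[order(temp2$month, temp2$year, temp2$eala), ]
rownames(temp2) <- NULL
```

On this sample the two approaches differ wherever a group repeats a code: for the NA / April 2006 group the sketch gives count_1 = 1, because both rows carry code 5555, where the row count above gives 2.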



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

