thr3ads.net - R help - [R] Help with reshape/reShape and indexing [May 2009]

Dana Sevak

2009-May-13 06:02 UTC

[R] Help with reshape/reShape and indexing

Dear R Helpers,

I have trouble applying reShape and reshape although I read the documentation
and several posts, so I would very much appreciate your help on the two points
below.

I have a dataframe

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"),
                X1=c("12", "13", "14", "20", "25", "30"),
                X2 = c(200, 250, 300, 600, 700, 4))
> df
  Name X1  X2
1    a 12 200
2    a 13 250
3    a 14 300
4    b 20 600
5    b 25 700
6    c 30 900

First I need to add a column to this dataframe that numbers the rows within
each Name entry (1, 2, 3, ... restarting for every new Name).  The resulting df
should look like:

df.index = data.frame(Name=c("a", "a", "a", "b", "b", "c"),
                      X1=c(12, 13, 14, 20, 25, 30),
                      X2 = c(200, 250, 300, 600, 700, 4),
                      Index=c(1,2,3,1,2,1))
> df.index
  Name X1  X2 Index
1    a 12 200     1
2    a 13 250     2
3    a 14 300     3
4    b 20 600     1
5    b 25 700     2
6    c 30 900     1

How can I do this?


Secondly, I would like to reshape this dataframe in the form:
> df2
   1  2  3
a 12 13 14
b 20 25 NA
c 30 NA NA

Since the df is sorted by Name and X2, the available X1 values should fill each
row of df2 from left to right (i.e. if only one value is available, it goes in
the first column and the remaining columns get NAs).
If I could generate the Index column, I think I could accomplish this with:

df2 = reShape(df.index$X1, id=df.index$Name, colvar=df.index$Index)
colnames(df2) = c("V1", "V2", "V3")

However, is there a way to get to df2 without using the Index column and still
have the NAs written as described above?

Thank you so much for your help on these two issues.

With best regards,
Dana Sevak

Uwe Ligges

2009-May-13 09:14 UTC

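This does it more or less your way:

# split into one data frame per Name and number the rows within each piece
ds <- split(df, df$Name)
ds <- lapply(ds, function(x){x$Index <- seq_along(x[,1]); x})
# reassemble in the original row order, then lay X1 out by Name x Index
df2 <- unsplit(ds, df$Name)
tapply(df2$X1, df2[,c("Name", "Index")], function(x) x)

although there may be much easier ways ...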

Uwe Ligges
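
As one possible shortcut for the first step, and not part of the reply above: if rows with the same Name are adjacent (the original post says df is sorted by Name), the running index can be built from run lengths with base R's rle() and sequence(). This is only a sketch under that assumption; X1 is entered as numbers here for simplicity.

```r
# Example data from the original post (X1 entered as numbers for simplicity).
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 4))

# Assumption: rows sharing a Name are adjacent, as stated in the post.
# rle() reports the length of each run of identical names (here 3, 2, 1).
runs <- rle(as.character(df$Name))

# sequence() expands each run length n into 1, 2, ..., n.
df$Index <- sequence(runs$lengths)
df$Index
```

If the rows were not grouped by Name, this would restart the counter at every change of Name, so a grouping-aware approach would be safer in that case.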


> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Dimitris Rizopoulos

2009-May-13 09:28 UTC

one way is the following:

df.index <- df
df.index$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)
df.index

df2 <- reshape(df.index[c("Name", "Index", "X1")], timevar = "Index",
               idvar = "Name", direction = "wide")
df2


I hope it helps.

Best,
Dimitris
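
For readers who want to run the ave() plus reshape() route end to end, here is a self-contained sketch. It is an illustration rather than the poster's exact code: X1 is entered as numbers, and the final renaming to V1..V3 with Name as row names is an assumption about the desired layout, taken from the colnames() call in the original question.

```r
# Example data from the original post (X1 entered as numbers for simplicity).
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 4))

# Running counter within each Name group.
df$Index <- ave(seq_along(df$Name), df$Name, FUN = seq_along)

# Keep only the id, the "time" variable and the value to spread out.
wide <- reshape(df[c("Name", "Index", "X1")],
                timevar = "Index", idvar = "Name", direction = "wide")

# reshape() names the new columns X1.1, X1.2, X1.3. Convert them to a
# matrix with Name as row names and V1..V3 as column names.
df2 <- as.matrix(wide[-1])
dimnames(df2) <- list(wide$Name, c("V1", "V2", "V3"))
df2
```

Groups with fewer rows than the largest group are padded with NA on the right, which is the behaviour asked for in the question.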


-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

Gabor Grothendieck

2009-May-13 12:01 UTC

Try this:

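DF <- df  # df is the data frame from the original post
DF$Index <- ave(1:nrow(DF), DF$Name, FUN = seq_along)
# drop X2 (column 3) and spread X1 into one column per Index value
reshape(DF[-3], direction = "wide", idvar = "Name", timevar = "Index")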

Also see the reshape package for another similar facility.




Ista Zahn

2009-May-13 13:02 UTC

Hi Dana,
> ---------- Forwarded message ----------
> From: Dana Sevak <dana.sevak at yahoo.com>
> To: r-help at r-project.org
> Date: Tue, 12 May 2009 23:02:00 -0700 (PDT)
> Subject: [R] Help with reshape/reShape and indexing
>
> Dear R Helpers,
>
> I have trouble applying reShape and reshape although I read the
> documentation and several posts, so I would very much appreciate your help on
> the two points below.

There are usually many ways to accomplish any given task in R, and
which one you use is a matter of preference. I've settled on using the
reshape package for these kinds of tasks. If you're comfortable with
the solutions already suggested there's no need to continue reading.
Otherwise here's another approach:

> First I need to add an additional column to this dataframe that will count
> the number of rows per each Name entry.
> [...]
> How can I do this?

Easy enough with the plyr package (loaded with reshape):

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"),
                X1=c("12", "13", "14", "20", "25", "30"),
                X2 = c(200, 250, 300, 600, 700, 4))
library(reshape)
library(plyr)  # ddply() and colwise() come from plyr; attach it explicitly in case reshape does not
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]

> Secondly, I would like to reshape this dataframe in the form:
>
> > df2
>    1  2  3
> a 12 13 14
> b 20 25 NA
> c 30 NA NA
>
> Since the df is sorted by Name and X2, I would need that the available X1
> values populate the resulting rows in df2 from left to right (i.e. if only one
> value is available, it is written in the first column and the remaining columns
> get NAs).

I don't really understand this. What happened to X2? Anyway, I would
do it like this:
> df$X2 <- NULL
> m.df <- melt(df, measure.vars="X1")
> df.final <- cast(m.df, ... ~ Index)
> df.final
  Name variable  1    2    3
1    a       X1 12   13   14
2    b       X1 20   25 <NA>
3    c       X1 30 <NA> <NA>

But I don't see why you want to drop X2, so I would actually do

df = data.frame(Name=c("a", "a", "a", "b", "b", "c"),
                X1=c("12", "13", "14", "20", "25", "30"),
                X2 = c(200, 250, 300, 600, 700, 4))
df$Index <- ddply(df, "Name", colwise(seq_along))[,1]
df$X2 <- as.character(df$X2)
m.df <- melt(df, measure.vars=c("X1","X2"))
df.final <- cast(m.df, ... ~ Index)
df.final
  Name variable   1    2    3
1    a       X1  12   13   14
2    a       X2 200  250  300
3    b       X1  20   25 <NA>
4    b       X2 600  700 <NA>
5    c       X1  30 <NA> <NA>
6    c       X2   4 <NA> <NA>

All the best,
Ista

Dana Sevak

2009-May-13 15:34 UTC

To all of you who answered me: Thank you so much!
Each approach taught me something new and I really appreciate your help!

Best regards,
Dana Sevak
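
The last part of the original question, whether df2 can be produced without an Index column, did not get a direct reply in the thread. One possible base R sketch is shown below. It is an assumption about what was wanted, not a method from any of the replies: it splits X1 by Name and pads each group with NA on the right. X1 is entered as numbers, and the V1..V3 column names follow the colnames() call in the question.

```r
# Example data from the original post (X1 entered as numbers for simplicity).
df <- data.frame(Name = c("a", "a", "a", "b", "b", "c"),
                 X1 = c(12, 13, 14, 20, 25, 30),
                 X2 = c(200, 250, 300, 600, 700, 4))

# One vector of X1 values per Name: a -> 12 13 14, b -> 20 25, c -> 30.
groups <- split(df$X1, df$Name)

# Width of the result is the size of the largest group.
width <- max(lengths(groups))

# Indexing past the end of a vector yields NA, so x[seq_len(width)]
# pads the shorter groups on the right.
df2 <- t(sapply(groups, function(x) x[seq_len(width)]))
colnames(df2) <- paste0("V", seq_len(width))
df2
```

This relies on the rows already being in the desired order within each Name. If every group had only one row, sapply() would return a plain vector rather than a matrix, so that edge case would need separate handling.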
