thr3ads.net - R help - [R] reshaping data [Jul 2012]

AC Del Re

2012-Jul-25 00:59 UTC

[R] reshaping data

Hi,

I am trying to reshape data from a long to wide format but have a specific
task that I cannot get to output properly.

# SAMPLE DATA
id <- c(1, 2, 2, 3, 3, 3)
time <- c(0, 0, 5, 0, 2, 10)
x <- rnorm(length(id))
long <- data.frame(id, time, x)

# To reshape, I would like to exclude 'id' values that have NO duplicate
# (i.e., remove id=1 in this case). My attempts failed because the functions
# remove the first row of each id (and I would like to preserve those rows
# whenever an id has more than one row), e.g.:

junk <- long[duplicated(long$id), ]  # REMOVES TOO MANY ROWS!
junk <- subset(long, seq(id) - match(id, id) > 0)  # SAME

Essentially, I would like to preserve all rows of any id that has more than
one row. Any ideas are much appreciated.

In addition, is there an easy way to create a new variable that numbers the
instances of each id (in the long dataset)? E.g.:

  id time           x NEW_VARIABLE
1  1    0 -0.03921791            1
2  2    0 -1.07869262            1
3  2    5  1.73442621            2
4  3    0 -0.64356207            1
5  3    2  1.19691074            2
6  3   10  0.62035225            3

Thank you,

AC


Rui Barradas

2012-Jul-25 05:01 UTC

[R] reshaping data

Hello,

Try the following.


# We are going to use this twice
sl <- split(long, long$id)

# Remove groups with only one row: the function returns NULL for those,
# and rbind() drops NULL entries
l2 <- lapply(sl, function(x) if (nrow(x) > 1) x)
l2 <- do.call(rbind, l2)
l2

# Create a new variable: a running row counter within each id
l3 <- lapply(sl, function(x) cbind(x, NEW_VARIABLE = seq_len(nrow(x))))
l3 <- do.call(rbind, l3)
l3
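
For comparison, both steps can probably also be done without split(), using only base R. This is a minimal sketch, not part of the original reply; the names n_per_id, multi and multi2 are made up for illustration, and set.seed() is added only so the example is repeatable.

```r
# Rebuild the sample data from the question
id <- c(1, 2, 2, 3, 3, 3)
time <- c(0, 0, 5, 0, 2, 10)
set.seed(1)  # assumption: fixed seed, not in the original post
x <- rnorm(length(id))
long <- data.frame(id, time, x)

# Number of rows per id, repeated on every row of that id
n_per_id <- ave(seq_along(long$id), long$id, FUN = length)

# Keep only ids that occur more than once (drops id = 1)
multi <- long[n_per_id > 1, ]

# Same filter: a row is kept if its id is duplicated when scanning
# from the top OR from the bottom, so first occurrences survive
multi2 <- long[duplicated(long$id) | duplicated(long$id, fromLast = TRUE), ]

# Running counter within each id
long$NEW_VARIABLE <- ave(seq_along(long$id), long$id, FUN = seq_along)
long
```

duplicated() on its own flags only the second and later occurrences, which is why the attempt in the question also dropped the first row of ids 2 and 3.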


Hope this helps,

Rui Barradas
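
The thread stops before the long-to-wide step itself. One possible continuation, sketched here under the assumption that the within-id counter (called obs below, a hypothetical name) is what should index the wide columns, uses reshape() from base R:

```r
# Rebuild the sample data from the question
id <- c(1, 2, 2, 3, 3, 3)
time <- c(0, 0, 5, 0, 2, 10)
set.seed(1)  # assumption: fixed seed, not in the original post
x <- rnorm(length(id))
long <- data.frame(id, time, x)

# Keep ids with more than one row, then number the rows within each id
long <- long[ave(seq_along(long$id), long$id, FUN = length) > 1, ]
long$obs <- ave(seq_along(long$id), long$id, FUN = seq_along)

# One row per id; 'time' and 'x' are spread into time.1, x.1, time.2, ...
# ids with fewer observations get NA in the trailing columns
wide <- reshape(long, idvar = "id", timevar = "obs", direction = "wide")
wide
```

Using time itself as the timevar would also work, but because the time points differ between ids it would give one column per distinct time value.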

