Hello Everyone,
would you be able to assist with some expertise on how to get the following done
in a way that can be applied to a data set with different dimensions and without
all the line items here?
we have:
# the number of unique IDs will of course differ in the real data set,
# usually on the order of 10000
id<-c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)
# there are fewer than 4000 unique "letters" in the real data set and
# no duplicates within the same ID
letter<-c(sample(c("A","B","C","D","E"),3),sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),2),
          sample(c("A","B","C","D","E"),4),sample(c("A","B","C","D","E"),4))
# there are fewer than 50 unique weights in the real data set and
# no duplicates within the same ID
weight<-c(sample(c(1:30),3),sample(c(1:30),4),sample(c(1:30),2),
          sample(c(1:30),4),sample(c(1:30),4))
data<-data.frame(id=id,letter=letter,weight=weight)
# goal is to get the following transformation, where a column is added for each
# unique letter and the weight is pulled into the column if the letter exists
# within the ID, otherwise NA
# so we would get datatransfer like below, but without the many steps shown here
datatransfer<-data.frame(data,apply(data[2],2,function(x)
ifelse(x=="A",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
ifelse(x=="B",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
ifelse(x=="C",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
ifelse(x=="D",data$weight,NA)))
datatransfer<-data.frame(datatransfer,apply(datatransfer[2],2,function(x)
ifelse(x=="E",data$weight,NA)))
colnames(datatransfer)<-c("id","letter","weight","A","B","C","D","E")
much appreciate the help,
thanks
Andras
Hi!
Maybe this would do the trick:
--- snip ---
library(reshape2) # Use 'reshape2'
library(dplyr) # Use 'dplyr'
datatransfer <- data %>%
  mutate(letter2 = letter) %>%
  dcast(id + letter ~ letter2, value.var = "weight")
--- snip ---
Or did I misunderstand something?
Best,
Kimmo
2019-01-06, 13:16 +0000, Andras Farkas via R-help wrote:
> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with
> different dimensions and without all the line items here?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
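For readers who would rather not load reshape2 and dplyr, the same layout can probably be reached in base R with matrix indexing. This is only a sketch and not part of Kimmo's reply: the seed, the shortened data construction, and the names `sizes`, `lv`, and `wide` are assumptions made for illustration. Each row's weight is written into the column named after that row's letter, and every other cell stays NA.

```r
set.seed(42)                  # assumed seed, only to make the example reproducible
sizes  <- c(3, 4, 2, 4, 4)    # rows per ID, matching the example data
id     <- rep(1:5, times = sizes)
letter <- unlist(lapply(sizes, function(n) sample(c("A", "B", "C", "D", "E"), n)))
weight <- unlist(lapply(sizes, function(n) sample(1:30, n)))
data   <- data.frame(id = id, letter = letter, weight = weight,
                     stringsAsFactors = FALSE)

# One column per distinct letter, one row per row of 'data', all NA to start
lv   <- sort(unique(data$letter))
wide <- matrix(NA_integer_, nrow = nrow(data), ncol = length(lv),
               dimnames = list(NULL, lv))

# A two-column index matrix (row, column) fills exactly one cell per row
wide[cbind(seq_len(nrow(data)), match(data$letter, lv))] <- data$weight

datatransfer <- data.frame(data, wide)
```

Nothing here depends on the number of IDs or letters, but with several thousand distinct letters the result is very wide and mostly NA, so memory use grows with rows times letters.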
Like this (using base R only)?
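dat<-data.frame(id=id,letter=letter,weight=weight) # using your data
ud <- unique(dat$id)
ul <- unique(dat$letter)
## every letter/id combination, one row each
d <- data.frame(
   letter = rep(ul, each = length(ud)),
   id = rep(ud, times = length(ul))
)
## all.y = TRUE keeps combinations that are absent from dat (weight becomes NA)
merge(dat[,c(2,1,3)],d, all.y = TRUE)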
## resulting in:
   letter id weight
1       A  1     25
2       A  2     28
3       A  3     14
4       A  4     27
5       A  5     NA
6       B  1     13
7       B  2     14
8       B  3     NA
9       B  4     15
10      B  5      2
11      C  1     NA
12      C  2     NA
13      C  3     NA
14      C  4     NA
15      C  5     25
16      D  1     24
17      D  2     18
18      D  3     NA
19      D  4     29
20      D  5     27
21      E  1     NA
22      E  2      2
23      E  3     20
24      E  4     25
25      E  5     28
Cheers,
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
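If one row per ID is wanted rather than one row per id/letter pair, the long data can be spread into an id-by-letter matrix in base R. The following is a hedged sketch, not something proposed in the thread: the small data frame and the name `wide` are made up for illustration, and it relies on there being at most one weight per id/letter pair, as stated in the question.

```r
# Made-up example data: at most one weight per id/letter pair
dat <- data.frame(
  id     = c(1, 1, 2, 3, 3),
  letter = c("A", "B", "A", "B", "C"),
  weight = c(25, 13, 28, 7, 9),
  stringsAsFactors = FALSE
)

# tapply() builds an id x letter array; pairs that never occur stay NA
wide <- with(dat, tapply(weight, list(id = id, letter = letter),
                         function(w) w[1]))
wide
```

The result is a matrix with IDs as row names and letters as column names; `as.data.frame(wide)` would turn it into a data frame if that is more convenient.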
... and my reordering of column indices was unnecessary:
merge(dat, d, all.y = TRUE)
will do.
Bert Gunter
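The grid-and-merge idea can also be written with expand.grid(), which builds every id/letter combination in one call. This is a minimal sketch with made-up data; the names `grid` and `full` are assumptions for illustration and do not appear in the messages above.

```r
# Made-up example data, smaller than the data in the question
dat <- data.frame(
  id     = c(1, 1, 2, 3, 3),
  letter = c("A", "B", "A", "B", "C"),
  weight = c(25, 13, 28, 7, 9),
  stringsAsFactors = FALSE
)

# Every combination of the observed ids and letters
grid <- expand.grid(id = unique(dat$id), letter = unique(dat$letter),
                    stringsAsFactors = FALSE)

# merge() joins on the shared columns (id, letter); all.y = TRUE keeps
# combinations that are missing from dat, with weight set to NA
full <- merge(dat, grid, all.y = TRUE)
full <- full[order(full$letter, full$id), ]
full
```

With 3 ids and 3 letters the result has 9 rows, 4 of which carry NA because only 5 combinations occur in `dat`.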
Thanks Bert this will do...
Andras
On Sun, Jan 6, 2019 at 1:09 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
> ... and my reordering of column indices was unnecessary:
> merge(dat, d, all.y = TRUE)
> will do.