Hello Everyone,

would you be able to assist with some expertise on how to get the following done in a way that can be applied to a data set with different dimensions, and without writing out one line per letter as is done below?

We have:

# the number of unique IDs will of course differ in the real data set, usually on the order of 10000
id <- c(1,1,1,2,2,2,2,3,3,4,4,4,4,5,5,5,5)

# fewer than 4000 unique "letters" in the real data set, with no duplicates within the same ID
letter <- c(sample(c("A","B","C","D","E"),3), sample(c("A","B","C","D","E"),4), sample(c("A","B","C","D","E"),2),
            sample(c("A","B","C","D","E"),4), sample(c("A","B","C","D","E"),4))

# fewer than 50 unique weights in the real data set, with no duplicates within the same ID
weight <- c(sample(c(1:30),3), sample(c(1:30),4), sample(c(1:30),2),
            sample(c(1:30),4), sample(c(1:30),4))

data <- data.frame(id=id, letter=letter, weight=weight)

# The goal is the following transformation: a column is added for each unique letter,
# and the weight is pulled into that column if the row has that letter, otherwise NA.
# So we would get datatransfer as below, but without the many steps written out here.

datatransfer <- data.frame(data, apply(data[2], 2, function(x) ifelse(x=="A", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="B", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="C", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="D", data$weight, NA)))
datatransfer <- data.frame(datatransfer, apply(datatransfer[2], 2, function(x) ifelse(x=="E", data$weight, NA)))

# column order follows data: id, letter, weight, then one column per letter
colnames(datatransfer) <- c("id","letter","weight","A","B","C","D","E")

Much appreciate the help,

thanks

Andras
Hi!

Maybe this would do the trick:

--- snip ---
library(reshape2)  # provides dcast()
library(dplyr)     # provides mutate() and %>%

datatransfer <- data %>%
  mutate(letter2 = letter) %>%
  dcast(id + letter ~ letter2, value.var = "weight")
--- snip ---

Or did I misunderstand something?

Best,
Kimmo

2019-01-06, 13:16 +0000, Andras Farkas via R-help wrote:
> Hello Everyone,
>
> would you be able to assist with some expertise on how to get the
> following done in a way that can be applied to a data set with
> different dimensions and without all the line items here?
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
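For comparison, the layout asked for in the original post (one row per input row, one extra column per letter) can also be sketched in base R with matrix indexing. This is only an illustration under assumptions: the seed, the toy data construction, and the helper names letter_chr, letters_u, wide and idx are not from the thread.

```r
set.seed(42)  # assumption: seed added only so the sketch is reproducible

# Toy data in the shape of the original post (group sizes 3, 4, 2, 4, 4)
sizes <- c(3, 4, 2, 4, 4)
data <- data.frame(
  id     = rep(1:5, times = sizes),
  letter = unlist(lapply(sizes, function(n) sample(c("A", "B", "C", "D", "E"), n))),
  weight = unlist(lapply(sizes, function(n) sample(1:30, n)))
)

letter_chr <- as.character(data$letter)  # guards against a factor column
letters_u  <- sort(unique(letter_chr))   # one output column per distinct letter

# Start from an all-NA matrix: one row per input row, one column per letter
wide <- matrix(NA_integer_, nrow = nrow(data), ncol = length(letters_u),
               dimnames = list(NULL, letters_u))

# Matrix indexing: cell (i, column of row i's letter) receives row i's weight
idx <- cbind(seq_len(nrow(data)), match(letter_chr, letters_u))
wide[idx] <- data$weight

# cbind() on a data frame keeps the letter names unchanged as column names
datatransfer <- cbind(data, wide)
print(datatransfer)
```

Note that the intermediate matrix is dense, so with several thousand distinct letters and many rows it may become large; whether that matters depends on the real data.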
Like this (using base R only)?

dat <- data.frame(id = id, letter = letter, weight = weight)  # using your data
ud <- unique(dat$id)
ul <- unique(dat$letter)
# every letter paired with every id
d <- data.frame(letter = rep(ul, each = length(ud)),
                id = rep(ud, length(ul)))
merge(dat[, c(2, 1, 3)], d, all.y = TRUE)

## resulting in:

   letter id weight
1       A  1     25
2       A  2     28
3       A  3     14
4       A  4     27
5       A  5     NA
6       B  1     13
7       B  2     14
8       B  3     NA
9       B  4     15
10      B  5      2
11      C  1     NA
12      C  2     NA
13      C  3     NA
14      C  4     NA
15      C  5     25
16      D  1     24
17      D  2     18
18      D  3     NA
19      D  4     29
20      D  5     27
21      E  1     NA
22      E  2      2
23      E  3     20
24      E  4     25
25      E  5     28

Cheers,
Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
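As a hedged variation on the base R approach above, the grid of all id/letter combinations can also be built with expand.grid(), and the completed long table can then be spread to one row per id with reshape(). The seed, the toy data, and the names full and wide are assumptions made for this sketch, not part of the thread.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Toy data in the shape of the original post (group sizes 3, 4, 2, 4, 4)
sizes <- c(3, 4, 2, 4, 4)
dat <- data.frame(
  id     = rep(1:5, times = sizes),
  letter = unlist(lapply(sizes, function(n) sample(c("A", "B", "C", "D", "E"), n))),
  weight = unlist(lapply(sizes, function(n) sample(1:30, n))),
  stringsAsFactors = FALSE
)

# Every id paired with every letter that occurs in the data
d <- expand.grid(id = unique(dat$id), letter = unique(dat$letter),
                 stringsAsFactors = FALSE)

# Keep all rows of the grid; combinations absent from dat get weight NA
full <- merge(dat, d, all.y = TRUE)

# Optional: one row per id, one weight.<letter> column per letter
wide <- reshape(full, idvar = "id", timevar = "letter", direction = "wide")
print(wide)
```

The long table (full) has one row per id/letter pair, while the wide table has one row per id; which of the two is preferable depends on what is done with the result afterwards.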
... and my reordering of column indices was unnecessary:

merge(dat, d, all.y = TRUE)

will do.

Bert Gunter
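The reason the reordering is not needed is that merge() by default joins on the column names the two data frames share, regardless of column position. A small sketch with made-up values (the data frames x and d2 and the results m1, m2, m2_sorted are hypothetical names used only here):

```r
# Made-up example data: x lacks the (id = 2, letter = "B") combination
x  <- data.frame(id = c(1, 1, 2), letter = c("A", "B", "A"),
                 weight = c(10, 20, 30), stringsAsFactors = FALSE)
d2 <- data.frame(letter = c("A", "B", "A", "B"), id = c(1, 1, 2, 2),
                 stringsAsFactors = FALSE)

m1 <- merge(x, d2, all.y = TRUE)               # columns left in original order
m2 <- merge(x[, c(2, 1, 3)], d2, all.y = TRUE) # columns reordered first

# Only the column order and the row sort order differ, not the content
m2_sorted <- m2[order(m2$id, m2$letter), names(m1)]
print(m1)
print(m2_sorted)
```

Both calls match rows on id and letter; the key columns simply come first in the result, in the order they appear in the first argument.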
Thanks Bert, this will do...

Andras