Xianming Wei
2015-Aug-11 01:05 UTC
[R] replacing then summing by values from another dataframe
[I might have sent the following request to the wrong email address, 'r-help-request at r-project.org'.]

Hi,

I have two data frames, dat1 and dat2.

    dat1 <- data.frame(pid = paste('C', 1:5, sep = ''),
                       m1 = c(2, 2, 1, -1, 0),
                       m2 = c(1, 0, 1, -1, 1),
                       m3 = c(0, 1, 1, -1, 0))

    dat2 <- data.frame(mid = paste('m', 1:3, sep = ''),
                       '0' = c(-19.5482, -.512, -.492),
                       '1' = c(.007, 3.241, -2.256),
                       '2' = c(1.223, -4.490, 1.779))
    names(dat2)[-1] <- c('0', '1', '2')

dat1 contains individuals with their scores on three measurements (-1 represents missing), and dat2 contains the effects of the different levels of those three measurements. What I'd like to do is sum, for each individual, the effects of the three measurements based on the level effects. For C1, for example, I want to get the values from dat2 for m1 at level 2 = 1.223, m2 at level 1 = 3.241 and m3 at level 0 = -0.492, and sum them up as 3.972.

I can only think of a loop to do that at the moment. Because the two actual datasets have much higher dimensions, I need help to come up with an efficient / elegant approach.

Any help is much appreciated.

Regards,
Xianming
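The lookup described in the question can be written out directly, which gives a baseline to check faster solutions against. The following is only a sketch of the loop the poster alludes to, not the poster's own code. It assumes that a missing score (-1) simply contributes nothing to the sum, which the question does not specify; the names `c1` and `total` are illustrative.

```r
dat1 <- data.frame(pid = paste0("C", 1:5),
                   m1 = c(2, 2, 1, -1, 0),
                   m2 = c(1, 0, 1, -1, 1),
                   m3 = c(0, 1, 1, -1, 0))
dat2 <- data.frame(mid = paste0("m", 1:3),
                   "0" = c(-19.5482, -0.512, -0.492),
                   "1" = c(0.007, 3.241, -2.256),
                   "2" = c(1.223, -4.490, 1.779))
names(dat2)[-1] <- c("0", "1", "2")

# Worked check for C1: m1 at level 2, m2 at level 1, m3 at level 0
c1 <- dat2[dat2$mid == "m1", "2"] +
      dat2[dat2$mid == "m2", "1"] +
      dat2[dat2$mid == "m3", "0"]      # 1.223 + 3.241 - 0.492 = 3.972

# Baseline double loop over individuals and measurements
total <- setNames(numeric(nrow(dat1)), as.character(dat1$pid))
for (i in seq_len(nrow(dat1))) {
  for (m in as.character(dat2$mid)) {
    lev <- dat1[i, m]
    if (lev != -1) {                   # assumption: skip missing scores
      total[i] <- total[i] + dat2[dat2$mid == m, as.character(lev)]
    }
  }
}
total
```

The loop touches one cell at a time, so its cost grows with the number of individuals times the number of measurements, which is why a vectorised alternative is being asked for.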
PIKAL Petr
2015-Aug-11 06:24 UTC
[R] replacing then summing by values from another dataframe
Hi

Not sure about elegance/efficiency.

    > library(reshape2)
    > dat3 <- melt(dat2)
    > dat3
      mid variable    value
    1  m1        0 -19.5482
    2  m2        0  -0.5120
    3  m3        0  -0.4920
    4  m1        1   0.0070
    5  m2        1   3.2410
    6  m3        1  -2.2560
    7  m1        2   1.2230
    8  m2        2  -4.4900
    9  m3        2   1.7790

    > dat4 <- melt(dat1)
    Using pid as id variables
    > dat4$value[dat4$value == -1] <- NA
    > names(dat4)[2:3] <- c("mid", "variable")
    > dat4
       pid mid variable
    1   C1  m1        2
    2   C2  m1        2
    3   C3  m1        1
    4   C4  m1       NA
    5   C5  m1        0
    6   C1  m2        1
    7   C2  m2        0
    8   C3  m2        1
    9   C4  m2       NA
    10  C5  m2        1
    11  C1  m3        0
    12  C2  m3        1
    13  C3  m3        1
    14  C4  m3       NA
    15  C5  m3        0

    > dat5 <- merge(dat4, dat3, all.x = TRUE)
    > aggregate(dat5$value, list(dat5$pid), sum, na.rm = TRUE)
      Group.1        x
    1      C1   3.9720
    2      C2  -1.5450
    3      C3   0.9920
    4      C4   0.0000
    5      C5 -16.7992

Cheers
Petr
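The reshape-and-merge idea above can also be sketched in base R alone, with both join keys built explicitly so that the level column is numeric on both sides of the merge. This is a variation for illustration, not Petr's code; the names `long1`, `long2`, `level` and `effect` are made up here, and missing scores are treated as contributing 0, matching the output shown above.

```r
dat1 <- data.frame(pid = paste0("C", 1:5),
                   m1 = c(2, 2, 1, -1, 0),
                   m2 = c(1, 0, 1, -1, 1),
                   m3 = c(0, 1, 1, -1, 0))
dat2 <- data.frame(mid = paste0("m", 1:3),
                   "0" = c(-19.5482, -0.512, -0.492),
                   "1" = c(0.007, 3.241, -2.256),
                   "2" = c(1.223, -4.490, 1.779))
names(dat2)[-1] <- c("0", "1", "2")

# Long form of the scores: one row per (individual, measurement)
long1 <- data.frame(pid   = rep(as.character(dat1$pid), times = ncol(dat1) - 1),
                    mid   = rep(names(dat1)[-1], each = nrow(dat1)),
                    level = unlist(dat1[, -1], use.names = FALSE))
long1$level[long1$level == -1] <- NA   # -1 marks a missing score

# Long form of the effects: one row per (measurement, level)
long2 <- data.frame(mid    = rep(as.character(dat2$mid), times = ncol(dat2) - 1),
                    level  = rep(as.numeric(names(dat2)[-1]), each = nrow(dat2)),
                    effect = unlist(dat2[, -1], use.names = FALSE))

# Left join on (mid, level); unmatched (missing) scores get effect = NA
merged <- merge(long1, long2, by = c("mid", "level"), all.x = TRUE)

# Sum the effects per individual, ignoring the NA rows
res <- tapply(merged$effect, merged$pid, sum, na.rm = TRUE)
res
```

`tapply` is used rather than the formula interface of `aggregate` because the latter drops rows with NA by default, which would remove C4 from the result instead of reporting 0.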
Gerrit Eichner
2015-Aug-11 07:05 UTC
[R] replacing then summing by values from another dataframe
Hello, Xianming,

I have changed your (particular) data structure:

- use matrices, because you have only numeric scores and effects;
- use NA instead of -1 as the missing value (as usual);
- don't use columns for ids or row/column names (except for ease of reading the data structures);
- increase your score values in dat1 by 1 to obtain valid column indices for dat2.

Finally, loop (!) rowwise through your matrix dat1 and construct an index matrix (!) to index dat2 (and sum up the indexed elements).

Hope this does what you want. (See below.) The same remark regarding elegance/efficiency applies as in Petr's solution (but w/o an additional package ;-)).

    dat1 <- cbind( c(2, 2, 1, NA, 0),
                   c(1, 0, 1, NA, 1),
                   c(0, 1, 1, NA, 0))
    # dimnames( dat1) <- list( paste0( 'C', 1:5), paste0( "m", 1:3))

    dat2 <- cbind( c(-19.5482, -.512, -.492),
                   c(.007, 3.241, -2.256),
                   c(1.223, -4.490, 1.779))
    # rownames( dat2) <- paste0( 'm', 1:3)

    # For each row of dat1, pair measurement i with its (shifted) level
    # and sum the corresponding entries of dat2.
    apply( dat1 + 1, 1,
           function( idx, d2) sum( d2[ cbind( seq( nrow( d2)), idx)]),
           d2 = dat2
         )

Hth -- Gerrit
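The index-matrix idea can be pushed one step further so that no row-wise loop is needed at all: build a single two-column index covering every cell of the score matrix and take row sums. The sketch below starts from the data frames as originally posted. It is an illustration rather than code from the thread; the names `lev`, `eff`, `idx`, `picked` and `total` are made up here, and `na.rm = TRUE` is an assumption that follows the convention of reporting 0 for an individual with all scores missing.

```r
dat1 <- data.frame(pid = paste0("C", 1:5),
                   m1 = c(2, 2, 1, -1, 0),
                   m2 = c(1, 0, 1, -1, 1),
                   m3 = c(0, 1, 1, -1, 0))
dat2 <- data.frame(mid = paste0("m", 1:3),
                   "0" = c(-19.5482, -0.512, -0.492),
                   "1" = c(0.007, 3.241, -2.256),
                   "2" = c(1.223, -4.490, 1.779))
names(dat2)[-1] <- c("0", "1", "2")

lev <- as.matrix(dat1[, -1])            # individuals x measurements
lev[lev == -1] <- NA                    # -1 marks a missing score
eff <- as.matrix(dat2[, -1])            # measurements x levels
rownames(eff) <- as.character(dat2$mid)

# One (row, column) pair into eff for every cell of lev, in column-major order:
# row = which measurement the cell belongs to, column = which level it holds.
idx <- cbind(match(colnames(lev)[col(lev)], rownames(eff)),
             match(as.character(lev), colnames(eff)))

# Index rows containing NA yield NA, so missing scores stay missing here
picked <- matrix(eff[idx], nrow = nrow(lev))

total <- setNames(rowSums(picked, na.rm = TRUE), as.character(dat1$pid))
total
```

Matching on names rather than adding 1 to the scores means the rows of dat2 and the columns of dat1 do not have to be in the same order. Dropping `na.rm = TRUE` should give NA instead of 0 for C4.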