Xianming Wei
2015-Aug-11 01:05 UTC
[R] replacing then summing by values from another dataframe
[I might have sent the following request to a wrong email address -
'r-help-request at r-project.org']
Hi,
I have two data frames, dat1 and dat2.
dat1 <- data.frame(pid = paste('C', 1:5, sep = ''),
                   m1 = c(2, 2, 1, -1, 0),
                   m2 = c(1, 0, 1, -1, 1),
                   m3 = c(0, 1, 1, -1, 0))
dat2 <- data.frame(mid = paste('m', 1:3, sep = ''),
                   '0' = c(-19.5482, -.512, -.492),
                   '1' = c(.007, 3.241, -2.256),
                   '2' = c(1.223, -4.490, 1.779))
names(dat2)[-1] <- c('0', '1', '2')
dat1 contains individuals with scores on three measurements (-1 represents
missing), and dat2 contains the effect of each level of the three
measurements. What I'd like to do is sum the effects of the three
measurements based on the level effects. So for C1 I want to get the values
from dat2 for m1 at level 2 = 1.223, m2 at level 1 = 3.241 and m3 at level 0 =
-0.492, and sum them to 3.972.
I can only think of a loop to do that at the moment. Because the two actual
datasets have much higher dimensions, I need help to come up with an
efficient / elegant approach.
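For reference, a loop of the kind mentioned above might look like the following sketch. The names `measures` and `total` are illustrative, and letting a missing score (-1) contribute nothing to the sum is an assumption, since the request does not say how missing values should be treated.

```r
dat1 <- data.frame(pid = paste('C', 1:5, sep = ''),
                   m1 = c(2, 2, 1, -1, 0),
                   m2 = c(1, 0, 1, -1, 1),
                   m3 = c(0, 1, 1, -1, 0))
dat2 <- data.frame(mid = paste('m', 1:3, sep = ''),
                   '0' = c(-19.5482, -.512, -.492),
                   '1' = c(.007, 3.241, -2.256),
                   '2' = c(1.223, -4.490, 1.779))
names(dat2)[-1] <- c('0', '1', '2')

measures <- names(dat1)[-1]                      # "m1" "m2" "m3"
total <- setNames(numeric(nrow(dat1)), as.character(dat1$pid))

for (i in seq_len(nrow(dat1))) {
  for (m in measures) {
    level <- dat1[i, m]
    if (level == -1) next                        # assumption: missing adds nothing
    # Row of dat2 for this measurement, column named after the level
    total[i] <- total[i] + dat2[dat2$mid == m, as.character(level)]
  }
}
total
```

This visits every cell one at a time, which is what makes it slow on large data; it is useful mainly as a baseline to check faster versions against.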
Any help is much appreciated.
Regards,
Xianming
PIKAL Petr
2015-Aug-11 06:24 UTC
[R] replacing then summing by values from another dataframe
Hi
Not sure about elegance/efficiency.
library(reshape2)
dat3 <- melt(dat2)
dat3
mid variable value
1 m1 0 -19.5482
2 m2 0 -0.5120
3 m3 0 -0.4920
4 m1 1 0.0070
5 m2 1 3.2410
6 m3 1 -2.2560
7 m1 2 1.2230
8 m2 2 -4.4900
9 m3 2 1.7790
dat4<-melt(dat1)
Using pid as id variables
dat4$value[dat4$value== -1] <- NA
names(dat4)[2:3] <- c("mid","variable")
dat4
pid mid variable
1 C1 m1 2
2 C2 m1 2
3 C3 m1 1
4 C4 m1 NA
5 C5 m1 0
6 C1 m2 1
7 C2 m2 0
8 C3 m2 1
9 C4 m2 NA
10 C5 m2 1
11 C1 m3 0
12 C2 m3 1
13 C3 m3 1
14 C4 m3 NA
15 C5 m3 0
dat5 <- merge(dat4, dat3, all.x = TRUE)
aggregate(dat5$value, list(dat5$pid), sum, na.rm = TRUE)
Group.1 x
1 C1 3.9720
2 C2 -1.5450
3 C3 0.9920
4 C4 0.0000
5 C5 -16.7992
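In the output above, C4 comes out as 0 because summing with na.rm removes all of its (missing) values, and the sum of nothing is 0. If NA is preferred for an individual with no valid scores, the same long-format idea can be sketched in base R as below. This is only a sketch: the names `long_scores`, `long_effects` and `sum_or_na` are illustrative, and returning NA for all-missing rows is an assumption about what is wanted.

```r
dat1 <- data.frame(pid = paste('C', 1:5, sep = ''),
                   m1 = c(2, 2, 1, -1, 0),
                   m2 = c(1, 0, 1, -1, 1),
                   m3 = c(0, 1, 1, -1, 0))
dat2 <- data.frame(mid = paste('m', 1:3, sep = ''),
                   '0' = c(-19.5482, -.512, -.492),
                   '1' = c(.007, 3.241, -2.256),
                   '2' = c(1.223, -4.490, 1.779))
names(dat2)[-1] <- c('0', '1', '2')

measures <- names(dat1)[-1]
lvls     <- names(dat2)[-1]

# One row per (individual, measurement) pair
long_scores <- data.frame(
  pid   = rep(as.character(dat1$pid), times = length(measures)),
  mid   = rep(measures, each = nrow(dat1)),
  level = unlist(dat1[measures], use.names = FALSE))

# One row per (measurement, level) pair
long_effects <- data.frame(
  mid    = rep(as.character(dat2$mid), times = length(lvls)),
  level  = rep(as.numeric(lvls), each = nrow(dat2)),
  effect = unlist(dat2[lvls], use.names = FALSE))

# Level -1 has no partner in long_effects, so its effect becomes NA
merged <- merge(long_scores, long_effects, by = c("mid", "level"), all.x = TRUE)

# Sum the available effects, but keep NA when nothing is available
sum_or_na <- function(x) if (all(is.na(x))) NA_real_ else sum(x, na.rm = TRUE)
totals <- tapply(merged$effect, merged$pid, sum_or_na)
totals
```

The result is a named vector with one entry per pid; apart from C4, the values should agree with the aggregate() output above.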
Cheers
Petr
Gerrit Eichner
2015-Aug-11 07:05 UTC
[R] replacing then summing by values from another dataframe
Hello, Xianming,
I have changed your (particular) data structure: use matrices because you
have only numeric scores and effects, use NA instead of -1 as the missing
value (as usual), don't use columns for ids or row/column names (except
for ease of reading the data structures), and increase your score values
in dat1 by 1 to obtain valid column indices for dat2. Finally, loop (!)
row-wise through your matrix dat1 and construct an index matrix (!) to
index dat2 (and sum up the indexed elements). Hope this does what you
want. (See below.)
The same remark regarding elegance/efficiency applies as in Petr's
solution (but w/o an additional package ;-)).
dat1 <- cbind( c(2, 2, 1, NA, 0),
               c(1, 0, 1, NA, 1),
               c(0, 1, 1, NA, 0))
# dimnames( dat1) <- list( paste0( 'C', 1:5), paste0( "m", 1:3))
dat2 <- cbind( c(-19.5482, -.512, -.492),
               c(.007, 3.241, -2.256),
               c(1.223, -4.490, 1.779))
# rownames( dat2) <- paste0( 'm', 1:3)
apply( dat1 + 1, 1,
       function( idx, d2)
         sum( d2[ cbind( seq( nrow( d2)), idx)]),
       d2 = dat2
)
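The index-matrix idea can also be applied to all cells at once, which avoids the row-wise apply() loop. The following is a minimal sketch under the same assumptions as above (scores coded 0, 1, 2 with NA for missing); the names `scores`, `effects`, `idx` and `looked_up` are illustrative and not part of the original data.

```r
scores <- cbind(m1 = c(2, 2, 1, NA, 0),
                m2 = c(1, 0, 1, NA, 1),
                m3 = c(0, 1, 1, NA, 0))
rownames(scores) <- paste0('C', 1:5)

effects <- cbind('0' = c(-19.5482, -.512, -.492),
                 '1' = c(.007, 3.241, -2.256),
                 '2' = c(1.223, -4.490, 1.779))
rownames(effects) <- paste0('m', 1:3)

# One (row, column) pair of `effects` per cell of `scores`:
# the row is the measurement (the column number in `scores`),
# the column is the score shifted by 1 (level 0 sits in column 1).
idx <- cbind(as.vector(col(scores)), as.vector(scores) + 1)

# Indexing a matrix by a two-column matrix picks one element per row of idx;
# rows of idx containing NA yield NA.
looked_up <- matrix(effects[idx], nrow = nrow(scores),
                    dimnames = dimnames(scores))

totals <- rowSums(looked_up)   # NA for any individual with a missing score
totals
```

Using rowSums(looked_up, na.rm = TRUE) instead would sum only the available effects, at the cost of reporting 0 for an individual with no valid scores.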
Hth -- Gerrit