thr3ads.net - R help - [R] How to join data.frames and vectors of different length, in an inteligent way? [Jun 2008]


Hvidberg, Martin

2008-Jun-10 13:05 UTC

[R] How to join data.frames and vectors of different length, in an intelligent way?

I have a data set something like this:

 

"YYYY", "Value"

1972 , 117
1984 , 73
1969 , 92
1976 , 113
1999 , 80
1996 , 78
1976 , 98
1984 , 106
1976 , 99

 

It could be created with:
> dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
 

The real dataset is of course much larger, approx. 100,000 samples.

 

I need to adjust each value to remove any tendency of some years generally
having higher values and others lower, since this is an unwanted artifact from
different measuring traditions.

My plan is to generate an average for each year Ay, as well as a global average
Ag. Then each value should be multiplied by Ay/Ag.

 

 

I can make the averages like this:

 
> Ag <- mean(dafSamp[,2])
> Ag
[1] 95.11111

> Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
> Ay
  Group.1        x
1    1969  92.0000
2    1972 117.0000
3    1976 103.3333
4    1984  89.5000
5    1996  78.0000
6    1999  80.0000

 

 

To see how many samples from each year I could write:

 
> Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
> Cy
  Group.1 x
1    1969 1
2    1972 1
3    1976 3
4    1984 2
5    1996 1
6    1999 1

 

 

I would like to create a new vector with the adjusted values (dafSamp[,2] *
Ay(for the relevant year) / Ag)

 

I tried to write:

 

vecAA <- dafSamp[,2] *  Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag

 

but the result is all NAs :-( I might have seen that coming: the two vectors
are not the same length...

 

Question: How do I go about making such calculation?

 

:-) Martin Hvidberg

 

Here is the code in full, if you want to try it...

 

dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),c(117,73,92,113,80,78,98,106,99)))
Ag <- mean(dafSamp[,2])
Ag
Ay <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='mean')
Ay
Cy <- aggregate(x=dafSamp[,2], by=list(dafSamp[,1]), FUN='length')
Cy
vecAA <- dafSamp[,2] *  Ay[which(Ay[,1]==dafSamp[,1]),2] / Ag

 




 
University of Aarhus <http://www.au.dk/en>
Danmarks Miljøundersøgelser <http://www.dmu.dk/>
Hvidberg, Martin <http://www2.dmu.dk/1_Om_DMU/2_medarbejdere/cv/employee2_NH.asp?PersonID=MHV>
Senior Geographer (Climatology, Spatial modeling) <http://www.geogr.ku.dk/>
N 55°41m43.48s E 12°06m05.13s ETRS89
National Environmental Research Inst. <http://www.dmu.dk/International/>
P.O. Box 358
Frederiksborgvej 399
DK-4000 Roskilde
Martin.Hvidberg@dmu.dk
www.dmu.dk/AtmosphericEnvironment/
tel: +45 46 30 11 55
fax: +45 46 30 12 14
	
	
 


Chuck Cleland

2008-Jun-10 14:24 UTC


You could put the group averages back into dafSamp using ave():

dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                  c(117,73,92,113,80,78,98,106,99)))

dafSamp$Ay <- ave(dafSamp$X2, dafSamp$X1, FUN=mean)

dafSamp$vecAA <- dafSamp$X2 * (dafSamp$Ay / mean(dafSamp$X2))

dafSamp
     X1  X2       Ay     vecAA
1 1972 117 117.0000 143.92640
2 1984  73  89.5000  68.69334
3 1969  92  92.0000  88.99065
4 1976 113 103.3333 122.76869
5 1999  80  80.0000  67.28972
6 1996  78  78.0000  63.96729
7 1976  98 103.3333 106.47196
8 1984 106  89.5000  99.74650
9 1976  99 103.3333 107.55841

?ave
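
A note on why the original attempt returned NAs, with an alternative lookup. `Ay[,1]==dafSamp[,1]` compares a 6-element vector with a 9-element one position by position (recycling the shorter one), so it is not a lookup by year. With the sample data only position 9 happens to match, and row 9 of Ay does not exist. The sketch below is one possible alternative to ave(), using match() to find the row of Ay for each row of dafSamp. The names idx, bad and AyPerRow are made up for illustration, and the data frame is built with explicit column names X1 and X2 (the same names the cbind() version produces).

```r
# Sample data from the thread (X1 = year, X2 = value)
dafSamp <- data.frame(
  X1 = c(1972, 1984, 1969, 1976, 1999, 1996, 1976, 1984, 1976),
  X2 = c(117, 73, 92, 113, 80, 78, 98, 106, 99)
)

Ag <- mean(dafSamp$X2)
Ay <- aggregate(x = dafSamp$X2, by = list(dafSamp$X1), FUN = mean)

# The original comparison: element-wise with recycling, not a lookup.
# R warns here because 9 is not a multiple of 6.
bad <- suppressWarnings(which(Ay$Group.1 == dafSamp$X1))

# match() gives, for every row of dafSamp, the row of Ay holding that year
idx <- match(dafSamp$X1, Ay$Group.1)
AyPerRow <- Ay$x[idx]

# Per-row year averages now have the same length as the data,
# so the arithmetic from the question works directly
vecAA <- dafSamp$X2 * AyPerRow / Ag

# Sanity check: identical to what ave() puts back into the data frame
stopifnot(isTRUE(all.equal(AyPerRow, ave(dafSamp$X2, dafSamp$X1, FUN = mean))))
```

ave() is the shorter route when only one grouped statistic is needed. The match() form may be handy when the per-year table already exists or comes from elsewhere.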

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Hvidberg, Martin

2008-Jun-11 06:52 UTC


Thanks Chuck

With your help I managed to write the code as I wanted it.
The result looks like this:

dafSamp <- data.frame(cbind(c(1972,1984,1969,1976,1999,1996,1976,1984,1976),
                  c(117,73,92,113,80,78,98,106,99)))
dafSamp$Ay <- ave(dafSamp$X2, dafSamp$X1, FUN=mean)
dafSamp$AA <- dafSamp$X2 * (mean(dafSamp$X2)/dafSamp$Ay)
dafSamp$My <- ave(dafSamp$X2, dafSamp$X1, FUN=median)
dafSamp$MA <- dafSamp$X2 * (median(dafSamp$X2)/dafSamp$My)
par(mfrow=c(1,2))
boxplot(AA~X1, data=dafSamp, main="Mean mode")
boxplot(MA~X1, data=dafSamp, main="Median mode")

It works like a dream. Thanks for your time.
Martin
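
Note that the final code multiplies each value by (global mean / year mean), which is the inverse of the Ay/Ag factor in the question and in the first reply. A quick way to see that this direction removes the year effect is to check that every year's mean of the adjusted values equals the global mean. The sketch below does that for the sample data. The name yearMeans is made up for illustration, and the data frame is built with explicit column names X1 and X2.

```r
# Sample data from the thread (X1 = year, X2 = value)
dafSamp <- data.frame(
  X1 = c(1972, 1984, 1969, 1976, 1999, 1996, 1976, 1984, 1976),
  X2 = c(117, 73, 92, 113, 80, 78, 98, 106, 99)
)

Ag <- mean(dafSamp$X2)                                  # global average
dafSamp$Ay <- ave(dafSamp$X2, dafSamp$X1, FUN = mean)   # year average, one per row
dafSamp$AA <- dafSamp$X2 * (Ag / dafSamp$Ay)            # adjusted values

# Mean of the adjusted values within each year
yearMeans <- tapply(dafSamp$AA, dafSamp$X1, mean)
print(yearMeans)

# Each year's adjusted mean should equal the global mean (up to rounding)
stopifnot(isTRUE(all.equal(as.vector(yearMeans), rep(Ag, length(yearMeans)))))
```

The same check with median in place of mean does not hold exactly in general, since the median of the whole data set need not behave like the mean under this scaling.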
