thr3ads.net - R help - [R] create a new dataframe with intervals and computing a weighted average for each of its rows [Nov 2013]


Luis Miguel Cerchiaro Barros

2013-Nov-24 09:08 UTC

[R] create a new dataframe with intervals and computing a weighted average for each of its rows

I need your help with this problem. I have a data frame like this:

    BHID <- c(43, 43, 43, 43, 44, 44, 44, 44, 44)
    FROM <- c(50.9, 46.7, 44.2, 43.1, 52.3, 51.9, 49.3, 46.2, 42.38)
    TO   <- c(46.7, 44.2, 43.1, 40.9, 51.9, 49.3, 46.2, 42.38, 36.3)
    AR   <- c(45, 46, 0.0, 38.45, 50.05, 22.9, 0, 25, 9)
    DF <- data.frame(BHID, FROM, TO, AR)
    # add the length
    DF$LENGTH <- DF$FROM - DF$TO

where:

+ BHID: the borehole identification
+ FROM: the start of each interval
+ TO: the end of each interval
+ AR: the value of our variable
+ LENGTH: the distance between FROM and TO

What I want is to create a data frame that is "normalized", meaning that
every interval has the same length and the column **AR** is calculated as a
weighted arithmetic mean of the old **AR** values, with **LENGTH** as the weight.
For more clarity, I am going to show you how the desired data frame should look.

    BHID   FROM   TO     AR       LENGTH
    43     50.9   47.9   45.0     3.0
    43     47.9   44.9   45.6     3.0
    43     44.9   41.9   26.113   3.0
    43     41.9   40.9   38.45    1.0
    44     ....

where:
1. AR is the weighted arithmetic mean
I have to make a clarification about the result.
Here is an example from my Excel table with the calculations:

    ROW_ID  BHID  NEW_FROM  NEW_TO  NEW_AR  OLD_FROM  OLD_TO  WEIGHTS  OLD_AR
    1       43    50.9      47.9    45      50.9      46.7    3.0      45
    2       43    47.9      44.9    45.6    50.9      46.7    1.2      45
    2       43    47.9      44.9            46.7      44.2    1.8      46
    3       43    44.9      41.9    26.113  46.7      44.2    0.7      46
    3       43    44.9      41.9            44.2      43.1    1.1      0
    3       43    44.9      41.9            43.1      40.9    1.2      38.45
    4       43    41.9      40.9    38.45   43.1      40.9    1.0      38.45

As you can see, NEW_AR is the weighted mean of OLD_AR, with the weights given
in the column WEIGHTS.
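
As a quick check of the table above, rows 2 and 3 can be recomputed with base R's weighted.mean(). This is only a sketch that retypes the OLD_AR and WEIGHTS values by hand; the variable names new_ar_2 and new_ar_3 are made up for illustration.

```r
# Values and weights copied from the OLD_AR and WEIGHTS columns of the
# Excel table (ROW_ID 2 and ROW_ID 3).
new_ar_2 <- weighted.mean(x = c(45, 46), w = c(1.2, 1.8))
new_ar_3 <- weighted.mean(x = c(46, 0, 38.45), w = c(0.7, 1.1, 1.2))

round(new_ar_2, 3)  # 45.6
round(new_ar_3, 3)  # 26.113
```

Each weight is the length of the old interval that falls inside the new one, so the weights of a full new interval add up to 3.0.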
If you look at the column LENGTH in the original data frame, you can see that
the values differ. With the "normalization" we try to make LENGTH uniform; in
this case we chose the value 3.0. Of course, the last interval of each borehole
may have a different LENGTH, in this case 1.0.
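
To make the idea of uniform intervals with a shorter last piece concrete, here is a small sketch for borehole 43 only. It assumes depths decrease down the hole, as in the example data; the names start, end, len and brk are illustrative and not part of the original code.

```r
# Borehole 43 from the example data: first FROM is 50.9, last TO is 40.9
start <- 50.9
end <- 40.9
len <- 3.0  # target interval length

# Regular break points going down the hole, then the true bottom appended
brk <- c(seq(start, end, by = -len), end)
# Drop a repeated bottom value if the hole length is an exact multiple of len
brk <- brk[c(TRUE, abs(diff(brk)) > 1e-9)]

new_from <- head(brk, -1)
new_to <- tail(brk, -1)
data.frame(FROM = new_from, TO = new_to, LENGTH = new_from - new_to)
```

This gives four intervals: three of length 3.0 and a final one of length 1.0.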
What I have done to achieve the result
First I have to say that I am not a professional and I am still learning how
to use R. My approach is not elegant: I take the start and end of each borehole
and use the function skeleton, which I wrote, to create a uniform skeleton for
the whole data frame.

    skeleton <- function(DF, LEN) {
      # define function to create a new skeleton
      divide.int <- function(FROM, TO, div) {
        n <- as.integer((FROM - TO) / div) + 1
        from <- seq(FROM, (FROM - (n - 1) * div), -div)
        to <- seq(FROM - (n - (n - 1)) * div, FROM - (n - 1) * div, -div)
        to[n] <- TO
        # create a data.frame class object
        range <- data.frame(BHID = borehole_names[i, 1], FROM = from, TO = to)
        # erase the last value
        range <- range[!(range$FROM == range$TO), ]
      }
      # subset the data set for every borehole
      borehole_names <- unique(DF["BHID"])     # collars id with cores
      borehole_number <- nrow(borehole_names)  # collar number
      # define an empty data.frame
      borehole_Out <- data.frame(BHID = integer(), FROM = numeric(), TO = numeric())
      # initialize the counter
      i <- 1
      # from this point starts the loop---------------
      while (i <= borehole_number) {
        # individual data frame for each borehole
        DFi <- subset(DF, BHID %in% borehole_names[i, 1])
        # take the beginning and end of every BOREHOLE
        startBH <- head(DFi$FROM, 1)
        endBH <- tail(DFi$TO, 1)
        # create the normalized intervals
        borehole_i <- divide.int(FROM = startBH, TO = endBH, div = LEN)
        borehole_Out <- rbind(borehole_Out, borehole_i)
        i <- i + 1
      }
      borehole_Out
    }
    # TEST------------------------------------------
    TEST <- skeleton(DF = DF, LEN = 3.0)
    TEST$LENGTH <- TEST$FROM - TEST$TO

Later I want to use the packages plyr or data.table to calculate the weighted
means in AR, but as I said, I have just started to use R and do not yet
understand how these packages work.
Again, thanks in advance, and sorry for my bumpy English.
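
One possible way to get the weighted means without plyr or data.table is sketched below in base R. It is not an answer from the thread. It assumes the data frame is built with AR as the fourth column, that depths decrease down each borehole, and that the old intervals cover each borehole without gaps (a new interval with no overlap would give NaN). The function name regularize is hypothetical.

```r
BHID <- c(43, 43, 43, 43, 44, 44, 44, 44, 44)
FROM <- c(50.9, 46.7, 44.2, 43.1, 52.3, 51.9, 49.3, 46.2, 42.38)
TO   <- c(46.7, 44.2, 43.1, 40.9, 51.9, 49.3, 46.2, 42.38, 36.3)
AR   <- c(45, 46, 0.0, 38.45, 50.05, 22.9, 0, 25, 9)
DF <- data.frame(BHID, FROM, TO, AR)

regularize <- function(DF, LEN) {
  pieces <- lapply(split(DF, DF$BHID), function(d) {
    # Break points every LEN units from the top, plus the true bottom
    brk <- c(seq(max(d$FROM), min(d$TO), by = -LEN), min(d$TO))
    brk <- brk[c(TRUE, abs(diff(brk)) > 1e-9)]
    new_from <- head(brk, -1)
    new_to <- tail(brk, -1)
    new_ar <- mapply(function(f, t) {
      # Length of each old interval that lies inside the new interval [t, f]
      w <- pmax(0, pmin(d$FROM, f) - pmax(d$TO, t))
      weighted.mean(d$AR, w)
    }, new_from, new_to)
    data.frame(BHID = d$BHID[1], FROM = new_from, TO = new_to,
               AR = new_ar, LENGTH = new_from - new_to)
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

res <- regularize(DF, LEN = 3.0)
res
```

For borehole 43 the AR column comes out as 45, 45.6, 26.113 and 38.45 (rounded to three decimals), which matches the NEW_AR values in the Excel table.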



 		 	   		  
	[[alternative HTML version deleted]]

Bert Gunter

2013-Nov-24 15:19 UTC


This post is complete garbage, and a great example of why not
bothering to read or follow the posting guide will cause a post to be
ignored.

1. It was not posted in plain text as the posting guide asks.

2. dput() was not used to pass example data.

3. It appears the OP has not done due diligence by going through the
Introduction to R or other online tutorials to learn how R works,
although the post was so garbled that I may be wrong about that. My
apology, if so.
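
For readers unfamiliar with point 2: dput() prints an R object as plain R code that can be pasted into a message and re-evaluated by whoever replies. A minimal sketch using the example data from the original post, with the data.frame() call assumed to take AR (the posted code referred to an undefined VALUE); DF2 and txt are illustrative names:

```r
DF <- data.frame(
  BHID = c(43, 43, 43, 43, 44, 44, 44, 44, 44),
  FROM = c(50.9, 46.7, 44.2, 43.1, 52.3, 51.9, 49.3, 46.2, 42.38),
  TO   = c(46.7, 44.2, 43.1, 40.9, 51.9, 49.3, 46.2, 42.38, 36.3),
  AR   = c(45, 46, 0.0, 38.45, 50.05, 22.9, 0, 25, 9)
)

# Prints a structure(list(...)) expression; paste that text into the post
dput(DF)

# A reader rebuilds the object by evaluating the pasted text
txt <- capture.output(dput(DF))
DF2 <- eval(parse(text = txt))
all.equal(DF, DF2)
```

Because the output is ordinary text, it survives plain-text mail without the column scrambling seen in the tables of the original message.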

Cheers,
Bert





-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374
