thr3ads.net - R help - [R] weighted average grouped by variables [Nov 2017]

Thierry Onkelinx

2017-Nov-09 14:17 UTC

[R] weighted average grouped by variables

Dear Massimo,

It seems straightforward to use weighted.mean() in a dplyr context:

library(dplyr)
mydf %>%
  group_by(date_time, type) %>%
  summarise(vel = weighted.mean(speed, n_vehicles))
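
As a minimal, self-contained sketch of the same idea (not part of the original reply): the data frame below is rebuilt by hand from the values in Massimo's post, and the output column names weighted_avg_speed and n_vehicles are chosen to match his mydf_final. The ungroup() call is an optional addition. Note that the summed n_vehicles is computed after the weighted mean, because summarise() lets later expressions see columns created earlier in the same call, and the weights must still be the per-row counts.

```r
library(dplyr)

# Data rebuilt from the original post; 1508238000 is the epoch time used there.
mydf <- data.frame(
  date_time  = as.POSIXct(1508238000, origin = "1970-01-01", tz = "UTC"),
  direction  = c("A", "A", "A", "A", "B", "B", "B"),
  type       = c("car", "light_duty", "heavy_duty", "motorcycle",
                 "car", "light_duty", "heavy_duty"),
  speed      = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
                 36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

result <- mydf %>%
  group_by(date_time, type) %>%
  summarise(
    # weights are the per-row vehicle counts within each group
    weighted_avg_speed = weighted.mean(speed, n_vehicles),
    # total count per group, computed last so it does not mask the weights
    n_vehicles = sum(n_vehicles)
  ) %>%
  ungroup()

result
```

This should reproduce the weighted_avg_speed and n_vehicles values of mydf_final (up to rounding, and possibly in a different row order). The "motorcycle" case needs no special handling: it simply forms a group with a single row, so its weighted mean is its own speed.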

Best regards,



ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Kliniekstraat 25, B-1070 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

From 14 to 19 December 2017 we are moving from our Brussels office to the
Herman Teirlinckgebouw on the Thurn & Taxis site. From then on you are
welcome at the new address: Havenlaan 88 bus 73, 1000 Brussel.
<https://overheid.vlaanderen.be/mobiliteitsplan-herman-teirlinckgebouw>

///////////////////////////////////////////////////////////////////////////////////////////
<https://www.inbo.be>

2017-11-09 14:16 GMT+01:00 Massimo Bressan <massimo.bressan at arpa.veneto.it>:
> Hello
>
> an update about my question: I worked out the following solution (with the
> package "dplyr")
>
> library(dplyr)
>
> mydf %>%
>   mutate(speed_vehicles = n_vehicles * speed) %>%
>   group_by(date_time, type) %>%
>   summarise(
>     sum_n_times_speed = sum(speed_vehicles),
>     n_vehicles = sum(n_vehicles),
>     vel = sum(speed_vehicles) / sum(n_vehicles)
>   )
>
>
> In fact I was hoping to manage everything in "one go", i.e. without the
> need to create the "intermediate" variable called "speed_vehicles" and with
> the use of the function weighted.mean()
>
> any hints for a different approach much appreciated
>
> thanks
>
>
>
> From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
> To: "r-help" <r-help at r-project.org>
> Sent: Thursday, 9 November 2017 12:20:52
> Subject: weighted average grouped by variables
>
> hi all
>
> I have this dataframe (created as a reproducible example)
>
> mydf <- structure(list(
>   date_time = structure(c(1508238000, 1508238000, 1508238000, 1508238000,
>     1508238000, 1508238000, 1508238000), class = c("POSIXct", "POSIXt"),
>     tzone = ""),
>   direction = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"),
>     class = "factor"),
>   type = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L), .Label = c("car",
>     "light_duty", "heavy_duty", "motorcycle"), class = "factor"),
>   avg_speed = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
>     36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
>   n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)),
>   .Names = c("date_time", "direction", "type", "speed", "n_vehicles"),
>   row.names = c(NA, -7L),
>   class = "data.frame")
>
> mydf
>
> and I need to get to this final result
>
> mydf_final <- structure(list(
>   date_time = structure(c(1508238000, 1508238000, 1508238000, 1508238000),
>     class = c("POSIXct", "POSIXt"), tzone = ""),
>   type = structure(c(1L, 2L, 3L, 4L), .Label = c("car", "light_duty",
>     "heavy_duty", "motorcycle"), class = "factor"),
>   weighted_avg_speed = c(36.39029, 38.56521, 37.53333, 36.08696),
>   n_vehicles = c(1153L, 69L, 45L, 23L)),
>   .Names = c("date_time", "type", "weighted_avg_speed", "n_vehicles"),
>   row.names = c(NA, -4L),
>   class = "data.frame")
>
> mydf_final
>
>
> my question:
> how to compute a weighted mean, i.e. "weighted_avg_speed",
> from "speed" (the values whose weighted mean is to be computed) and
> "n_vehicles" (the weights),
> grouped by "date_time" and "type"?
>
> to be noted: the complication of the case "motorcycle" (not present in both
> directions)
>
> any help for that?
>
> thank you
>
> max
>
>
>
> --
>
> ------------------------------------------------------------
> Massimo Bressan
>
> ARPAV
> Agenzia Regionale per la Prevenzione e
> Protezione Ambientale del Veneto
>
> Dipartimento Provinciale di Treviso
> Via Santa Barbara, 5/a
> 31100 Treviso, Italy
>
> tel: +39 0422 558545
> fax: +39 0422 558516
> e-mail: massimo.bressan at arpa.veneto.it
> ------------------------------------------------------------
>
>

Massimo Bressan

2017-Nov-09 14:45 UTC

[R] weighted average grouped by variables

hi thierry 

thanks for your reply 

yes, you are right, your solution is more straightforward 

best 


-- 

------------------------------------------------------------ 
Massimo Bressan 

ARPAV 
Agenzia Regionale per la Prevenzione e 
Protezione Ambientale del Veneto 

Dipartimento Provinciale di Treviso 
Via Santa Barbara, 5/a 
31100 Treviso, Italy 

tel: +39 0422 558545 
fax: +39 0422 558516 
e-mail: massimo.bressan at arpa.veneto.it 
------------------------------------------------------------ 
