Dear All,

I would very much appreciate help with the following. I need to calculate the mean of different lat/long points that should be grouped. However, I would like R to exclude values that differ only in the last decimal. So instead of using 4 values in the group, it would calculate the mean for only 3 (excluding the ones that differ in only one decimal).

# construct the data frame
`TK-QUADRANT` <- c(9161,9162,9163,9164,10152,10154,10161,10163)
LAT <- c(55.07496,55.07496,55.02495,55.02496,54.97496,54.92495,54.97496,54.92496)
LON <- c(8.37477,8.458109,8.37477,8.45811,8.291435,8.291437,8.374774,8.374774)
df <- data.frame(`TK-QUADRANT`=`TK-QUADRANT`,LAT=LAT,LON=LON)

I would like to group the data and calculate means by group, but in a way that excludes every number that differs only in the last decimal.

Please also see the attached PDF example.

Many thanks!
Best wishes,
Sasha

--
Dr Sasha Kosanic
Ecology Lab (Biology Department)
Room M644
University of Konstanz
Universitätsstraße 10
D-78464 Konstanz
Phone: +49 7531 883321 & +49 (0)175 9172503

http://cms.uni-konstanz.de/vkleunen/
https://tinyurl.com/y8u5wyoj
https://tinyurl.com/cgec6tu

[Attachment: dataset example.pdf, application/pdf, 236074 bytes
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20181115/128cc993/attachment.pdf>]
Use round() with the appropriate "digits" argument. Then use unique() to define your groups.

HTH,
B.

> On 2018-11-15, at 11:48, sasa kosanic <sasa.kosanic at gmail.com> wrote:
>
> I would like to group the data and calculate means by group but in a way to
> exclude every number that differs in only last decimal.
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
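A minimal sketch of how the round()/unique() suggestion might look on the example data. It assumes that rounding to 3 decimals is an acceptable tolerance; the digits value and the names LAT_grp and grp_means are illustrative choices, not from the thread. Note that data.frame() converts the column name `TK-QUADRANT` to TK.QUADRANT by default.

```r
# Example data from the original post
df <- data.frame(
  `TK-QUADRANT` = c(9161, 9162, 9163, 9164, 10152, 10154, 10161, 10163),
  LAT = c(55.07496, 55.07496, 55.02495, 55.02496,
          54.97496, 54.92495, 54.97496, 54.92496),
  LON = c(8.37477, 8.458109, 8.37477, 8.45811,
          8.291435, 8.291437, 8.374774, 8.374774)
)

# Assumption: 3 decimals is coarse enough that values differing only in the
# last reported decimal end up in the same group
df$LAT_grp <- round(df$LAT, digits = 3)

# The distinct rounded values define the groups
unique(df$LAT_grp)

# Mean of each column within each latitude group
grp_means <- aggregate(cbind(TK.QUADRANT, LAT, LON) ~ LAT_grp,
                       data = df, FUN = mean)
grp_means
```

Rounding puts two values on either side of a rounding boundary into different groups, so the digits value has to be chosen with the data in mind.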
On Thu, Nov 15, 2018 at 10:40 AM Boris Steipe <boris.steipe at utoronto.ca> wrote:
>
> Use round() with the appropriate "digits" argument. Then use unique() to define your groups.

No.

> round(c(.124,.126),2)
[1] 0.12 0.13

As I understand it, the OP wanted the last decimal to be ignored. The OP also did not specify what the means should be calculated of; I assume TK-QUADRANT. It is also not clear whether the calculations are to be done separately by latitude and longitude, or on both together. I'll assume separately.

In that case, the TK-QUADRANT means, grouped e.g. according to the 4-decimal-digit values of latitude, could be calculated as follows (using the provided example data; ignore all that follows if my interpretation is incorrect):

> with(df, tapply(TK.QUADRANT, floor(1e4*LAT),mean))
 549249  549749  550249  550749
10158.5 10156.5  9163.5  9161.5

## Note that this assumes positive values of latitude, because:
> floor(c(-1.2,1.2))
[1] -2  1

This could easily be modified if both positive and negative values were used, e.g.:

> x <-c(-1.2,1.2)
> sign(x)*floor(abs(x))
[1] -1  1

Confession: I suspect that this multiply-and-floor() procedure might fail with lots of decimal places due to the usual issues with binary representations of decimals. But maybe it fails even here. If so, I would appreciate someone pointing this out and, if possible, providing a better strategy.

Cheers,
Bert
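The concern about binary representation can be illustrated with a short sketch. The helper trunc_key() below is hypothetical (it is not from the thread or from any package): it rounds the scaled value to 8 decimals before truncating, on the assumption that the inputs carry far fewer than 8 meaningful digits beyond the cut-off. It uses trunc(), which truncates toward zero and so behaves like sign(x)*floor(abs(x)) for negative values.

```r
# 0.29 * 100 is stored as slightly less than 29, so floor() drops to 28
floor(100 * 0.29)

# Hypothetical helper: integer grouping key that ignores decimals beyond `digits`.
# Assumption: rounding the scaled value to 8 decimals removes representation
# noise without merging values that are genuinely different
trunc_key <- function(x, digits) {
  scaled <- round(x * 10^digits, 8)
  trunc(scaled)  # truncates toward zero, so negative values are handled too
}

trunc_key(0.29, 2)
trunc_key(c(-1.2, 1.2), 0)

# Applied to the example data from the original post
df <- data.frame(
  `TK-QUADRANT` = c(9161, 9162, 9163, 9164, 10152, 10154, 10161, 10163),
  LAT = c(55.07496, 55.07496, 55.02495, 55.02496,
          54.97496, 54.92495, 54.97496, 54.92496),
  LON = c(8.37477, 8.458109, 8.37477, 8.45811,
          8.291435, 8.291437, 8.374774, 8.374774)
)
tk_means <- tapply(df$TK.QUADRANT, trunc_key(df$LAT, 4), mean)
tk_means
```

For the example latitudes the scaled values are nowhere near an integer boundary, so this sketch and the plain floor(1e4*LAT) call produce the same groups; the extra round() only matters for values such as 0.29 that sit just below a boundary after scaling.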
Anthoni, Peter (IMK)
2018-Nov-16 06:39 UTC
[R] help with grouping data and calculating the means
Hi Sasa,

Those latitudes look equidistant, with a separation of 0.05. I guess you want to calculate the zonal mean along the latitude, right?

# estimate the lower and upper latitude for the cut
lat.dist=0.05 # equidistant spacing along latitude
lat.min=min(df$LAT,na.rm=TRUE)-lat.dist/2
lat.max=max(df$LAT,na.rm=TRUE)+lat.dist/2
cat.lat=cut(df$LAT,breaks=seq(lat.min,lat.max,by=lat.dist));cat.lat

# just show which TK.QUADRANT values are grouped
tapply(df$TK.QUADRANT,cat.lat, paste,collapse=",")

# calculate the mean of whatever column. lat.mean will be NA for any latitude cell where the df column has no data
lat.mean=tapply(df$TK.QUADRANT,cat.lat, mean)

# if you need to remove any potential NAs
lat.mean[!is.na(lat.mean)]

cheers/beste Grüße
Peter

On 15. Nov 2018, at 17:48, sasa kosanic <sasa.kosanic at gmail.com> wrote:

> `TK-QUADRANT` <- c(9161,9162,9163,9164,10152,10154,10161,10163)
> LAT <- c(55.07496,55.07496,55.02495,55.02496,54.97496,54.92495,54.97496,54.92496)
> LON <- c(8.37477,8.458109,8.37477,8.45811,8.291435,8.291437,8.374774,8.374774)
> df <- data.frame(`TK-QUADRANT`=`TK-QUADRANT`,LAT=LAT,LON=LON)
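Under the same assumption of an equidistant 0.05 degree grid, the cells could also be identified by an integer index instead of cut() intervals. This is only a minimal sketch: lat.cell and cell.means are illustrative names, and taking the smallest latitude as the grid origin is an assumption that holds for the example data but may not hold in general.

```r
# Example data from the original post
df <- data.frame(
  `TK-QUADRANT` = c(9161, 9162, 9163, 9164, 10152, 10154, 10161, 10163),
  LAT = c(55.07496, 55.07496, 55.02495, 55.02496,
          54.97496, 54.92495, 54.97496, 54.92496),
  LON = c(8.37477, 8.458109, 8.37477, 8.45811,
          8.291435, 8.291437, 8.374774, 8.374774)
)

lat.dist <- 0.05  # equidistant spacing along latitude, as in the reply above

# Integer cell index: number of grid steps from the smallest latitude
# (assumption: the smallest latitude lies close to a cell centre)
lat.cell <- round((df$LAT - min(df$LAT)) / lat.dist)

# Means of all columns per latitude cell
cell.means <- aggregate(df[c("TK.QUADRANT", "LAT", "LON")],
                        by = list(lat.cell = lat.cell), FUN = mean)
cell.means
```

The LAT and LON columns of cell.means then hold the averaged coordinates of each cell, which may be closer to what the original question asked for than the mean of TK.QUADRANT alone.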