Alessandra Bielli
2020-May-10 00:40 UTC
[R] predicting waste per capita - is a gaussian model correct?
Dear list,

I am new to this list and I hope it is ok to post here even though I already posted this question on Cross Validated.

I am trying to predict the daily amount of waste per person produced in the fishery sector. We surveyed fishing boats at the end of their fishing trip and the variables I have are duration of trip (days), number of fishers, waste category, waste weight (g) and boat ID.

For each fishing trip I calculated grams of waste per person per day, i.e. daily waste per capita. To predict daily waste per capita, I am using a gaussian mixed effect model with log(waste per capita) as response variable (I transformed it because it was not normally distributed, and I'm not sure it's correct to do so). The explanatory variable is waste category and boat ID is a random effect. I use the predict function to estimate daily waste per capita for each category and then back-transform it with exp(...).

My question is: is it correct to transform daily waste per capita to fit a gaussian model?

Thanks so much for your advice!

Alessandra
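For readers following along, the response variable described above can be sketched as follows. This is a minimal illustration in Python, not the poster's code: the records, the field names (`waste_g`, `fishers`, `days`) and the helper `daily_waste_per_capita` are all hypothetical, invented only to make the arithmetic concrete.

```python
import math

# Hypothetical survey records; field names and values are invented for illustration.
trips = [
    {"boat_id": "B1", "category": "plastic", "waste_g": 1200.0, "fishers": 4, "days": 3},
    {"boat_id": "B1", "category": "organic", "waste_g": 900.0, "fishers": 4, "days": 3},
    {"boat_id": "B2", "category": "plastic", "waste_g": 500.0, "fishers": 2, "days": 5},
]


def daily_waste_per_capita(waste_g, fishers, days):
    """Grams of waste per person per day for one trip record."""
    if fishers <= 0 or days <= 0:
        raise ValueError("fishers and days must be positive")
    return waste_g / (fishers * days)


for t in trips:
    t["wpc"] = daily_waste_per_capita(t["waste_g"], t["fishers"], t["days"])
    # log() is undefined at zero, so zero-waste records need separate handling.
    t["log_wpc"] = math.log(t["wpc"]) if t["wpc"] > 0 else None

for t in trips:
    print(t["boat_id"], t["category"], t["wpc"], t["log_wpc"])
```

Note that taking logs silently assumes every record has strictly positive waste; whether that holds in the real survey data is not stated in the post.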
Jeff Newmiller
2020-May-10 01:23 UTC
[R] predicting waste per capita - is a gaussian model correct?
It could possibly be alright, except that:

a) you included no reference to your other post
b) you posted here using HTML format, which can severely corrupt what we see on this plain text only mailing list
c) your question is off topic, as your question is about statistics (theory) rather than R (a syntax and semantics for implementing theory).

So, no, not ok this time.
John C Frain
2020-May-10 21:44 UTC
[R] predicting waste per capita - is a gaussian model correct?
There is no requirement that the dependent variable in a "regression" type estimation follows a gaussian distribution. You need a model of the process and then use an estimation technique to estimate your model. If effects in your model are additive, do not use a log transformation. If effects are multiplicative, then use a log transformation. The choice should be determined by the mechanics of the problem and not by the statistics.

If you do use a log transformation, then trying to reverse the process using an exponential transformation will be biased. The extent of that bias depends on your problem, and it would not be possible to estimate the significance of the bias without a much greater knowledge of the process and data.

I would suggest that you consult a competent statistician.
John C Frain 3 Aranleigh Park Rathfarnham Dublin 14 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:frainj at tcd.ie mailto:frainj at gmail.com
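The back-transformation bias mentioned in this reply can be illustrated with a small simulation. The sketch below is in Python and uses only the standard library. It assumes the log-scale values are exactly normal, with an arbitrarily chosen mean `mu` and standard deviation `sigma`; none of these numbers come from the poster's data. Under that assumption, exp(mean of logs) estimates the median of the original variable, while the mean is exp(mu + sigma^2 / 2).

```python
import math
import random
import statistics

random.seed(42)

# Assumed log-scale parameters, chosen only for illustration.
mu, sigma = 3.0, 1.0
n = 100_000

# Simulate values that are normal on the log scale, then back-transform.
log_y = [random.gauss(mu, sigma) for _ in range(n)]
y = [math.exp(v) for v in log_y]

mean_log = statistics.fmean(log_y)
var_log = statistics.pvariance(log_y)

# Naive back-transform: exponentiate the mean of the logs.
naive = math.exp(mean_log)
# Lognormal correction: add half the log-scale variance before exponentiating.
corrected = math.exp(mean_log + var_log / 2)
# Direct average on the original scale, for comparison.
sample_mean = statistics.fmean(y)

print("naive exp(mean log):", naive)
print("corrected estimate: ", corrected)
print("sample mean of y:   ", sample_mean)
```

The naive value falls well below the sample mean, and the corrected value tracks it closely. In a mixed model the picture is likely more involved: which variance components belong in the correction depends on whether the prediction targets a typical boat or an average over boats, and the correction itself is only as good as the normality assumption on the log scale.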
Abby Spurdle
2020-May-10 22:56 UTC
[R] predicting waste per capita - is a gaussian model correct?
Well, this is 100% off-topic... And I wasn't planning to answer the OP's question. However, I disagree with your answer.

> There is no requirement that the dependent variable in a "regression" type
> estimation follows a gaussian distribution.

False. It depends on what type of "regression" type estimation one uses, among other things.

> You need a model of the
> process and then use an estimation technique to estimate your model. If
> effects in your model are additive do not use a log transformation. If
> effects are multiplicative then use a log transformation.

The main question is: does the model satisfy the *assumptions*?

> The choice
> should be determined by the mechanics of the problem and not by the
> statistics.

While a mechanistic understanding is definitely valuable... If the criterion for a good model vs a bad model were whether the model was consistent with mechanistic theory/understanding, then nearly every statistical model I've seen would be a bad model. I would say a good model is one that is useful...

> If you do use a log transformation then trying to reverse the
> process using an exponential transformation will be biased.
> The extent of
> that bias depends on your problem and it would not be possible to estimate
> the significance of the bias without a much greater knowledge of the
> process and data.

Never heard of this before... But I do note that back-transformation is not trivial, and I'm not an expert on back-transformations.

> I would suggest that you consult a competent
> statistician.

I agree on that part...