gj
2011-Oct-09 11:20 UTC
[R] help with statistics in R - how to measure the effect of users in groups
Hi,

I'm a newbie to R, and my knowledge of statistics is mostly self-taught. My problem is how to measure the effect of users in groups. I can calculate a particular attribute for a user in a group. My hypothesis is that the users' attributes are not independent of each other and that a user's attribute depends on the group, i.e. that a user's behaviour changes with the group.

Let me give an example:

users*Group 1*Group 2*Group 3
u1*10*5*n/a
u2*6*n/a*4
u3*5*2*3

For example, I want to be able to prove that u1's behaviour is different in Group 1 than in the other groups, and that the particular thing about Group 1 is that its users tend to have a higher value of the attribute being measured.

Hence, can I use R to test my hypothesis? I'm willing to learn, so if this is very simple, just point me in the direction of any online resources about it. At the moment, I don't even know how to define this class of problems. That would be a start.

Regards
Gawesh
Petr PIKAL
2011-Oct-10 08:32 UTC
[R] Re: help with statistics in R - how to measure the effect of users in groups
Hi

I do not understand much about your equations. I think you should look at "Practical Regression and Anova using R" by J. Faraway.

Having a data frame DF with columns users, groups and results, you could do

fit <- lm(results ~ groups, data = DF)

Regards
Petr
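A minimal sketch of the one-way fit suggested above, assuming the toy numbers from the original post are typed in directly in long format (one row per observed user/group pair). The column names users, groups and results follow Petr's description; the way DF is built here is an assumption for illustration, not something taken from the thread.

```r
# Toy data from the original post; the two n/a cells are simply left out.
DF <- data.frame(
  users   = factor(c("u1", "u2", "u3", "u1", "u3", "u2", "u3")),
  groups  = factor(c("Group1", "Group1", "Group1",
                     "Group2", "Group2", "Group3", "Group3")),
  results = c(10, 6, 5, 5, 2, 4, 3)
)

# One-way model: does the mean result differ between groups?
fit <- lm(results ~ groups, data = DF)

anova(fit)  # overall F test for a group effect
coef(fit)   # Group1 mean, then Group2 and Group3 as differences from it
```

With only seven observations this has very little power, so it is best read as a template for the real data rather than as evidence either way.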
gj
2011-Oct-10 16:40 UTC
[R] help with statistics in R - how to measure the effect of users in groups
Hi Bert,

The real situation is like what you suggested: user x group interactions. The users can be in more than one group. In fact, the data that I am trying to analyse consist of users and online forums (the groups), and the attribute being measured is the number of posts made by each user in a particular forum. My hypothesis is that the number of posts a user makes to a forum depends on the forum. For example, a user in an active forum contributes more than when he is in a less active forum. I guess there will be some users who contribute the same irrespective of the forum.

I hope this makes sense.

Regards
Gawesh

On Mon, Oct 10, 2011 at 4:50 PM, Bert Gunter <gunter.berton@gene.com> wrote:
> Yes, of course. But then one gets into additional problems with carryover
> effects, etc.
> Also, one then has a repeated measures problem (User is the experimental
> unit) and my previous advice is nonsense.
>
> Like you, I have no idea what his real situation is.
>
> -- Bert
>
> On Mon, Oct 10, 2011 at 8:39 AM, Anupam <anupamtg@gmail.com> wrote:
>
>> It is possible to give multiple treatments, one at a time, to the same pool
>> of patients. You are correct that interactions may be important in this
>> problem. I am only trying to help him frame the problem using an analogy.
>>
>> Anupam.
>>
>> From: Bert Gunter [mailto:gunter.berton@gene.com]
>> Sent: Monday, October 10, 2011 8:21 PM
>> To: Anupam
>> Cc: gj
>> Subject: Re: [R] help with statistics in R - how to measure the effect
>> of users in groups
>>
>> If that is the case, and each user can appear in only one group, there is
>> no group x user interaction, the poster's question was nonsense, and one
>> analyzes the group effect only, as originally shown.
>>
>> -- Bert
>>
>> On Mon, Oct 10, 2011 at 7:43 AM, Anupam <anupamtg@gmail.com> wrote:
>>
>> Groups are different treatments given to Users for your Outcome
>> (measurement) of interest. Take this idea forward and you will have an
>> answer.
>>
>> Anupam.
>> -----Original Message-----
>> From: r-help-bounces@r-project.org [mailto:r-help-bounces@r-project.org]
>> On Behalf Of Bert Gunter
>> Sent: Monday, October 10, 2011 7:36 PM
>> To: gj
>> Cc: r-help@r-project.org
>> Subject: Re: [R] help with statistics in R - how to measure the effect of
>> users in groups
>>
>> Assuming your data are in a data frame, yourdat, as:
>>
>> User Group Value
>> u1   1     10
>> u2   2     5
>> u3   3     NA
>> ...(etc)
>>
>> where Group is **explicitly coerced to be a factor,** then you want the
>> User x Group interaction, obtained from
>>
>> lm(Value ~ Group*User, data = yourdat)
>>
>> However, you'll get some kind of warning message if
>>
>> a) Not all Group x User combinations are present in the data
>>
>> b) Moreover, no statistics can be calculated if there are no replicates of
>> User x Group combinations.
>>
>> If you do not know why either of these are the case, get local help or
>> study any linear models (regression) text or online tutorial, as these last
>> issues have nothing to do with R.
>>
>> -- Bert
>>
>> On Mon, Oct 10, 2011 at 3:48 AM, gj <gawesh@gmail.com> wrote:
>>
>> > Thanks Petr. I will try it on the real data.
>> >
>> > But that will only show whether the groups are different or not.
>> > Is there any way I can test if the users are different when they are
>> > in different groups?
>> >
>> > Regards
>> > Gawesh
>> >
>> > On Mon, Oct 10, 2011 at 11:17 AM, Petr PIKAL <petr.pikal@precheza.cz>
>> > wrote:
>> >
>> > > > Hi Petr,
>> > > >
>> > > > It's not an equation. It's my mistake; the * are meant to be field
>> > > > separators for the example data. I should have just used blank
>> > > > spaces, as follows:
>> > > >
>> > > > users Group1 Group2 Group3
>> > > > u1    10     5      N/A
>> > > > u2    6      N/A    4
>> > > > u3    5      2      3
>> > > >
>> > > > Regards
>> > > > Gawesh
>> > >
>> > > OK. You should transform your data to long format to use lm
>> > >
>> > > library(reshape2)  # provides melt(); the older reshape package has it too
>> > > test <- read.table("clipboard", header=TRUE, na.strings="N/A")
>> > > test.m <- melt(test)
>> > > Using users as id variables
>> > > fit <- lm(value~variable, data=test.m)
>> > > summary(fit)
>> > >
>> > > Call:
>> > > lm(formula = value ~ variable, data = test.m)
>> > >
>> > > Residuals:
>> > >    1    2    3    4    6    8    9
>> > >  3.0 -1.0 -2.0  1.5 -1.5  0.5 -0.5
>> > >
>> > > Coefficients:
>> > >                Estimate Std. Error t value Pr(>|t|)
>> > > (Intercept)       7.000      1.258   5.563  0.00511 **
>> > > variableGroup2   -3.500      1.990  -1.759  0.15336
>> > > variableGroup3   -3.500      1.990  -1.759  0.15336
>> > > ---
>> > > Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>> > >
>> > > Residual standard error: 2.179 on 4 degrees of freedom
>> > >   (2 observations deleted due to missingness)
>> > > Multiple R-squared: 0.525, Adjusted R-squared: 0.2875
>> > > F-statistic: 2.211 on 2 and 4 DF, p-value: 0.2256
>> > >
>> > > No difference among groups, but I am not sure if this is the correct
>> > > way to evaluate.
>> > >
>> > > library(ggplot2)
>> > > p <- ggplot(test.m, aes(x=variable, y=value, colour=users))
>> > > p + geom_point()
>> > >
>> > > There is some sign that u3 has the lowest value in each group.
>> > > However, there is not enough data to include users in the fit.
>> > >
>> > > Regards
>> > > Petr
>
> --
> "Men by nature long to get on to the ultimate truths, and will often be
> impatient with elementary studies or fight shy of them. If it were possible
> to reach the ultimate truths without the elementary studies usually prefixed
> to them, these would not be preparatory studies but superfluous diversions."
>
> -- Maimonides (1135-1204)
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
> 467-7374
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
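The two caveats Bert raises about the User x Group interaction (missing combinations, and no replicates within a combination) can be seen directly on the toy data. The sketch below is an illustration under assumptions, not part of anyone's posted analysis: the data frame uses Bert's column names (User, Group, Value) with the example numbers from the original post, and the additive model at the end is one possible fallback rather than a recommendation made in the thread.

```r
# Toy data in long format; Group is explicitly a factor, as Bert requires.
yourdat <- data.frame(
  User  = factor(c("u1", "u2", "u3", "u1", "u3", "u2", "u3")),
  Group = factor(c(1, 1, 1, 2, 2, 3, 3)),
  Value = c(10, 6, 5, 5, 2, 4, 3)
)

# Which User x Group cells are observed? Zeros are missing combinations,
# and no cell holds more than one observation (no replicates).
xtabs(~ User + Group, data = yourdat)

# Interaction model: 7 observations and 7 estimable coefficients,
# so nothing is left over to estimate the error variance.
fit_int <- lm(Value ~ Group * User, data = yourdat)
df.residual(fit_int)       # 0
sum(is.na(coef(fit_int)))  # 2 coefficients cannot be estimated

# Additive model: Group effect adjusted for User, with 2 residual df.
fit_add <- lm(Value ~ Group + User, data = yourdat)
df.residual(fit_add)       # 2
anova(fit_add)
```

With the real forum data, where each user has posted in several forums, the same check with xtabs() shows how many combinations are actually observed before any interaction term is attempted.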