albechan
2010-Nov-18 16:10 UTC
[R] conditional mean between two data frames with different levels
Hi guys,

I have two data frames: one for 2008 and one for 2009. Their structure is identical; only the data in them differ. I need to create a vector alfa of the same length as the 2009 data frame and fill each element with the mean of 2008$var1 conditional on the subgroup indicated by the factor variable 2009$var2. If the levels were the same in both data frames, it would be easy to use

    alfa[i]<-ave(2008$var1,2008$var2==2009$var2[i],FUN=mean)

The problem is that 2008$var2 and 2009$var2 each contain 20 levels, but only 18 of them are shared. So for those 18 I need the result I'd get by applying the formula above (which in any case doesn't work if the levels are not identical in the two data frames' columns), and for the two levels that appear only in 2009$var2 I need to use the average of the whole column 2008$var1.

Does anybody have some ideas? Please help me... I hope it's clear enough what I need.

Thanks!
alberto
Joshua Wiley
2010-Nov-18 16:50 UTC
[R] conditional mean between two data frames with different levels
Hi Alberto,

It would help if you could provide a small example. I might break the problem down into three parts:

1) create a vector that has the final subgroupings you want
2) find the conditional means by subgroup
3) replicate the means as needed.

My first guess would be to start with "==" or "%in%" to compare or find levels from 2008 %in% 2009, and then

    by(Data, GroupingVar, FUN = mean)

This is probably not the best way, but since "GroupingVar" should be a factor, I would be tempted to do:

    tmp <- factor(GroupingVar, levels = levels(GroupingVar), labels = ResultsofByCall)
    YourMeans <- as.numeric(levels(tmp))[tmp]

as a way to map the means back to their appropriate subcondition, replicated as many times as necessary.

I'm sure you will get more detailed help if you can post a bit of sample data.

HTH,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
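The three steps above can be sketched roughly as follows. This is only an illustration on made-up data: the data frame names d08 and d09 and all the values are assumptions, and it indexes a vector of group means by level name rather than relabelling a factor, which avoids converting the means to text and back.

```r
# Hypothetical toy data (not from the thread); d08 and d09 are assumed names.
d08 <- data.frame(var1 = c(4, 6, 1, 3, 10),
                  var2 = factor(c("x", "x", "y", "y", "z")))
d09 <- data.frame(var2 = factor(c("y", "w", "x", "y")))

# 1) Final subgrouping: flag the 2009 rows whose level also exists in 2008
grp09  <- as.character(d09$var2)
shared <- grp09 %in% levels(d08$var2)

# 2) Conditional means by subgroup in the 2008 data (named by level)
grp_means <- tapply(d08$var1, d08$var2, mean)

# 3) Replicate the means as needed: start from the overall 2008 mean,
#    then overwrite the rows that belong to a shared level
alfa <- rep(mean(d08$var1), nrow(d09))
alfa[shared] <- grp_means[grp09[shared]]

alfa
```

Here level "w" exists only in d09, so its row keeps the overall 2008 mean, while the "x" and "y" rows get their own group means.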
albechan
2010-Nov-18 17:17 UTC
[R] conditional mean between two data frames with different levels
Thank you very much Josh, I guess you're right. So this is an example.

Data frame 1 has 2 columns and 10 rows. The first column is "score", a variable indicating the number of goals scored by a football team:

    score<-c(1,2,0,2,1,1,3,2,1,0)

Column 2 contains the football "teams":

    teams<-c("a","b","c","d","e","a","b","c","d","e")

Data frame 2 has the following variables:

    score<-c(2,3,1,0,0,0,4,2,1,2)
    teams<-c("b","c","d","e","f","b","c","d","e","f")

What I need is to create a vector alfa<-numeric(10) where the first element contains the mean number of goals scored by team b in the previous season, the second element contains the mean number of goals scored by team c in the previous season, and so on. For team f, the element should contain the average of the whole score vector of the previous season. alfa should be

    (2.5, 1, 1.5, 0.5, 1.3, 2.5, 1, 1.5, 0.5, 1.3)

The problem arises because "f" doesn't appear in the first data frame, as it replaced "a". I hope the issue is more understandable now.

Thanks a lot!
alberto
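One possible way to get this vector from the example data is sketched below. It is not an answer posted in the thread: the data frame names season1 and season2 are assumptions, and the lookup uses tapply() for the per-team means and match() to find each current team among last season's teams, falling back to the overall mean where there is no match.

```r
# The example data from the post; season1 and season2 are assumed names.
season1 <- data.frame(score = c(1, 2, 0, 2, 1, 1, 3, 2, 1, 0),
                      teams = c("a", "b", "c", "d", "e", "a", "b", "c", "d", "e"))
season2 <- data.frame(score = c(2, 3, 1, 0, 0, 0, 4, 2, 1, 2),
                      teams = c("b", "c", "d", "e", "f", "b", "c", "d", "e", "f"))

# Mean score per team in the previous season (named by team)
team_means <- tapply(season1$score, season1$teams, mean)

# Position of each current team among last season's teams; NA if absent
idx <- match(season2$teams, names(team_means))

# Look up the per-team mean, then fill unmatched teams (here "f")
# with the mean of the whole previous-season score vector
alfa <- as.vector(team_means)[idx]
alfa[is.na(idx)] <- mean(season1$score)

alfa
```

With the data above this gives 2.5, 1, 1.5, 0.5, 1.3, 2.5, 1, 1.5, 0.5, 1.3, which is the vector described in the post.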