Hi,

I have a very simple question, but I'm obviously not able to solve the problem on my own.

I have a data.frame like

sample(c("A","B","C"),size=20,replace = T)->type
rnorm(20)->value
data.frame(ty=type,val=value)->test

There must be some built-in function that will do some descriptive statistics with tabular output. In the end I would like to have something like

   number of samples   mean   sd   .............
A  5
B  9
C  6

So I need a function that counts the number of occurrences of each factor level in type and then does something like the *summary* function, but factor specific.

I tried:

vector()->Median
vector()->SD
vector()->Mean

as.data.frame(table(type))->int

for (count in c(1:(nrow(int))))
{
  subset(test, ty==as.character(int$type[count])) -> subtest
  median(subtest$val)->Median[count]
  sd(subtest$val)->SD[count]
  mean(subtest$val)->Mean[count]
}

cbind(int,Median,SD,Mean)

This works, but isn't this much too complicated? I bet there is such functionality embedded in the base packages, but I cannot find it.

Maxim
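For reference, the loop above can be condensed with base R's aggregate(). This is only a minimal sketch, assuming the same `test` data frame as in the post; the fixed seed and the names `agg` and `stats_by_group` are illustrative assumptions, not part of the original code.

```r
# Minimal sketch: per-group n, mean, sd and median using base R only.
set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
  ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
  val = rnorm(20)
)

# aggregate() applies FUN to val within each level of ty. Because FUN
# returns a named vector, the val column of the result is a matrix.
agg <- aggregate(val ~ ty, data = test,
                 FUN = function(x) c(n = length(x), mean = mean(x),
                                     sd = sd(x), median = median(x)))

# Flatten the matrix column into ordinary columns (n, mean, sd, median).
stats_by_group <- data.frame(ty = agg$ty, agg$val)
stats_by_group
```

Returning a named vector from FUN keeps all statistics in a single aggregate() call; the extra data.frame() step is only needed because aggregate() stores them as one matrix column.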
There might be some package. But you can also do something like:

results<-tapply(test$val,test$ty,function(x){
  out1<-as.data.frame(length(x))
  out2<-as.data.frame(mean(x))
  out3<-as.data.frame(median(x))
  out4<-as.data.frame(sd(x))
  out<-cbind(out1,out2,out3,out4)
  return(out)
})
for(i in 1:length(results)){
  results[[i]]["Group.Name"]<-names(results)[i]
}
results<-do.call(rbind,results)
results

Dimitri

On Fri, Apr 23, 2010 at 3:48 PM, Maxim <deeepersound at googlemail.com> wrote:
> This works, but: isn't this much too complicated, I bet there is such
> functionality embedded in the base packages, but I cannot find it.
>
> Maxim
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
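The tapply() idea in this reply can also be written more compactly by returning one named vector per group instead of several one-cell data frames. The following is a hedged sketch, not Dimitri's code: the seed and the names `stats` and `results` are assumptions made for illustration.

```r
# Minimal sketch: tapply() with a named vector per group, bound by rows.
set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
  ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
  val = rnorm(20)
)

# Each group yields a named numeric vector; since the result of FUN is
# not a scalar, tapply() returns a list with one element per group.
stats <- tapply(test$val, test$ty, function(x) {
  c(n = length(x), mean = mean(x), median = median(x), sd = sd(x))
})

# rbind the per-group vectors: one row per group, one column per statistic.
results <- as.data.frame(do.call(rbind, stats))
results
```

Naming the vector elements inside the function gives readable column names, so no separate renaming or Group.Name loop is needed in this variant.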
On 04/24/2010 05:48 AM, Maxim wrote:
> This works, but: isn't this much too complicated, I bet there is such
> functionality embedded in the base packages, but I cannot find it.
>
Hi Maxim,
Look at:

describe (psych)
describe (Hmisc)
describe (prettyR)

and you will probably find something useful.

Jim
You can use the fBasics package:

sample(c("A","B","C"),size=20,replace = T) -> type
rnorm(20) -> value
data.frame(ty=type,val=value) -> test

require(fBasics)
nam <- rownames(basicStats(test$val))
result <- do.call("cbind", with(test, tapply(val, ty, basicStats)))
rownames(result) <- nam
result

Bests.

-----
Walmes Zeviani
Master in Statistics and Agricultural Experimentation
walmeszeviani at hotmail.com, Lavras - MG, Brasil
--
View this message in context: http://r.789695.n4.nabble.com/basic-table-statistics-tp2062829p2063832.html
Sent from the R help mailing list archive at Nabble.com.
On Apr 23, 2010, at 3:48 PM, Maxim wrote:
> So I need a function that counts the number of occurrences of
> factors in type and then does something like the *summary* function,
> but factor specific.

> require(Design)  # loads Hmisc which has one of many versions of describe()
> describe(test)
test

 2  Variables      20  Observations
-------------------------------------------------------------------------
ty
      n missing  unique
     20       0       3

A (4, 20%), B (5, 25%), C (11, 55%)
-------------------------------------------------------------------------
val
      n missing  unique    Mean       .05       .10       .25
     20       0      20 0.07383 -0.865776 -0.815317 -0.707465
     .50      .75      .90      .95
0.005735 0.634226 1.270066 1.771820

lowest : -1.7965 -0.8168 -0.8152 -0.8040 -0.7170
highest:  0.6790  1.0680  1.2149  1.7665  1.8729
-------------------------------------------------------------------------

> require(doBy)
> summaryBy(value~ty, test, FUN=list(length, mean, min, max, sd, median))
  ty value.length  value.mean  value.min value.max  value.sd
1  A            4 -0.03442822 -0.8151531  1.766502 1.2258221
2  B            5  0.34541927 -0.8167919  1.214906 0.7647165
3  C           11 -0.01025352 -1.7964684  1.872865 1.0109676
  value.median
1  -0.54453098
2   0.57020532
3  -0.06826249

The by() function, which is an application of tapply, can also be used.

> This works, but: isn't this much too complicated, I bet there is such
> functionality embedded in the base packages, but I cannot find it.
>
> Maxim

David Winsemius, MD
West Hartford, CT
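To illustrate the by() suggestion, here is a minimal sketch using base R only. It assumes the `test` data frame from the original question; the seed and the names `by_group` and `by_table` are hypothetical and chosen for this example.

```r
# Minimal sketch: by() splits a data frame by a factor and applies a
# function to each piece.
set.seed(1)  # assumption: fixed seed so the example is reproducible
test <- data.frame(
  ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
  val = rnorm(20)
)

# The function receives the sub-data-frame for one level of ty.
by_group <- by(test, test$ty, function(d) {
  c(n = nrow(d), mean = mean(d$val), sd = sd(d$val), median = median(d$val))
})

# Bind the per-group vectors into one matrix, one row per group.
by_table <- do.call(rbind, by_group)
by_table
```

Because by() hands the function a whole sub-data-frame rather than a single vector, the same pattern extends to summaries that involve more than one column.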