Hi,
I have a very simple question, but I'm obviously not able to solve the
problem on my own.
I have a data.frame like
sample(c("A","B","C"), size=20, replace=TRUE) -> type
rnorm(20)->value
data.frame(ty=type,val=value)->test
There must be some built-in function that produces descriptive
statistics as tabular output. In the end I would like to have something like

   number of samples   mean   sd   ...
A  5
B  9
C  6

So I need a function that counts the number of occurrences of each factor
level in type and then does something like the *summary* function, but per
factor level.
I tried:
vector()->Median
vector()->SD
vector()->Mean
as.data.frame(table(type))->int
for (count in seq_len(nrow(int)))
{
  subset(test, ty==as.character(int$type[count])) -> subtest
  median(subtest$val)->Median[count]
  sd(subtest$val)->SD[count]
  mean(subtest$val)->Mean[count]
}
cbind(int,Median,SD,Mean)
This works, but isn't it much too complicated? I bet such functionality
is built into the base packages, but I cannot find it.
Maxim
There might be a package for this,
but you can also do something like:
results<-tapply(test$val,test$ty,function(x){
  # one single-row data frame per statistic, bound column-wise
  out1<-as.data.frame(length(x))
  out2<-as.data.frame(mean(x))
  out3<-as.data.frame(median(x))
  out4<-as.data.frame(sd(x))
  out<-cbind(out1,out2,out3,out4)
  return(out)
})
# label each group's row with its group name
for(i in seq_along(results)){
  results[[i]]["Group.Name"]<-names(results)[i]
}
results<-do.call(rbind,results)
results
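A more compact base-R variant of the same idea, as a rough sketch only: split val by group, compute the statistics on each piece, and let sapply() assemble the table. The set.seed() call and the names stats, n, mean, median and sd are assumptions added here for illustration; they are not from the original post.

```r
# Reproducible stand-in for the data frame from the question
# (the seed is an assumption, added so the example can be rerun).
set.seed(1)
test <- data.frame(ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
                   val = rnorm(20))

# split() cuts val into one vector per group; sapply() applies the
# summary function to each and returns a matrix with one column per
# group, so t() turns it into one row per group.
stats <- t(sapply(split(test$val, test$ty), function(x) {
  c(n = length(x), mean = mean(x), median = median(x), sd = sd(x))
}))
stats
```

The result is a numeric matrix with the group names as row names; as.data.frame(stats) converts it if a data frame is preferred.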
Dimitri
On Fri, Apr 23, 2010 at 3:48 PM, Maxim <deeepersound at googlemail.com>
wrote:
> [...]
--
Dimitri Liakhovitski
Ninah.com
Dimitri.Liakhovitski at ninah.com
On 04/24/2010 05:48 AM, Maxim wrote:
> [...]

Hi Maxim,
Look at:
describe (psych)
describe (Hmisc)
describe (prettyR)
and you will probably find something useful.
Jim
You can use the fBasics package:
sample(c("A","B","C"), size=20, replace=TRUE) -> type
rnorm(20) -> value
data.frame(ty=type, val=value) -> test
require(fBasics)
nam <- rownames(basicStats(test$val))
result <- do.call("cbind", with(test, tapply(val, ty, basicStats)))
rownames(result) <- nam
result
Bests.
-----
Walmes Zeviani
Master in Statistics and Agricultural Experimentation
walmeszeviani at hotmail.com, Lavras - MG, Brasil
On Apr 23, 2010, at 3:48 PM, Maxim wrote:
> [...]

> require(Design)  # loads Hmisc, which has one of many versions of describe()
> describe(test)
test

 2  Variables      20  Observations
-------------------------------------------------------------------------
ty
      n missing  unique
     20       0       3

A (4, 20%), B (5, 25%), C (11, 55%)
-------------------------------------------------------------------------
val
      n missing  unique     Mean       .05       .10       .25
     20       0      20  0.07383 -0.865776 -0.815317 -0.707465
      .50       .75       .90       .95
 0.005735  0.634226  1.270066  1.771820

lowest : -1.7965 -0.8168 -0.8152 -0.8040 -0.7170
highest:  0.6790  1.0680  1.2149  1.7665  1.8729
-------------------------------------------------------------------------
> require(doBy)
> summaryBy(value~ty, test, FUN=list(length, mean, min, max, sd, median))
  ty value.length  value.mean  value.min value.max  value.sd
1  A            4 -0.03442822 -0.8151531  1.766502 1.2258221
2  B            5  0.34541927 -0.8167919  1.214906 0.7647165
3  C           11 -0.01025352 -1.7964684  1.872865 1.0109676
  value.median
1  -0.54453098
2   0.57020532
3  -0.06826249

The by() function, which is an application of tapply, can also be used.

David Winsemius, MD
West Hartford, CT
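The by() route mentioned above could look roughly like the following in base R. This is only a sketch: the set.seed() call, the helper names per_group and result, and the choice of statistics are assumptions for illustration, not part of the original reply.

```r
# Reproducible stand-in for the data frame from the question
# (the seed is an assumption, added so the example can be rerun).
set.seed(1)
test <- data.frame(ty  = sample(c("A", "B", "C"), size = 20, replace = TRUE),
                   val = rnorm(20))

# by() passes the rows of each group to the function as a data frame;
# here each group is reduced to a one-row data frame of statistics.
per_group <- by(test, test$ty, function(d) {
  data.frame(ty     = d$ty[1],
             n      = nrow(d),
             mean   = mean(d$val),
             median = median(d$val),
             sd     = sd(d$val))
})

# Stack the one-row data frames into a single table.
result <- do.call(rbind, per_group)
result
```

Because the function receives the whole sub-data-frame, the same pattern extends to statistics that involve more than one column.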