On 1/17/21 12:15 PM, Bernard McGarvey wrote:
> I have a data frame that consists of several factor columns say A, B, C, D,
> and E and several columns containing numerical data, say X1, X2, .... X10. I
> would like to create statistics of some of the numerical columns by some of the
> factor columns. For example,
>
> Calculate the mean, min, and max of variables X1 and X7, by factors A and
> E. The results should look like the table below:
>
> Factor A  Factor E  mean(X1)  min(X1)  max(X1)  mean(X7)  min(X7)  max(X7)  mean(X10)  min(X10)  max(X10)
> A1        E1
> A1        E2
> A1        E3
> A2        E1
> A2        E2
> A2        E3
>
> I would like the results to be returned to a data frame or other object
> that I can write out using the write.csv function. I have looked at the
> summarize and numSummary functions but they do not appear to be flexible enough
> to do the above.
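
The `aggregate` function will do both the subsetting and the function application. In the example below, `dfrm` already holds the two factor columns, and ten columns of random normal data are appended as X1 to X10: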
> dfrm <- cbind(dfrm, matrix(rnorm(600), ncol = 10))
> names(dfrm)[3:12] <- paste0("X", 1:10)
> str(dfrm)
'data.frame':   60 obs. of  12 variables:
 $ Factor_A: Factor w/ 2 levels "A1","A2": 1 1 1 2 2 2 1 1 1 2 ...
 $ Factor_B: Factor w/ 3 levels "E1","E2","E3": 1 2 3 1 2 3 1 2 3 1 ...
 $ X1      : num  -0.02116 -0.00049 0.12875 -0.05412 0.51886 ...
 $ X2      : num  1.6799 -0.0963 -0.5727 -0.3638 -0.322 ...
 $ X3      : num  -0.349 0.267 -0.666 -0.329 0.902 ...
 $ X4      : num  0.1125 -0.5384 0.0924 0.6849 -0.4194 ...
 $ X5      : num  -0.421 0.372 1.316 1.323 -0.03 ...
 $ X6      : num  -0.0767 1.4972 0.1967 -0.7092 -1.0943 ...
 $ X7      : num  0.1771 -0.2136 -1.0818 -0.0671 2.0015 ...
 $ X8      : num  1.456 -0.383 -0.47 0.965 0.569 ...
 $ X9      : num  -1.795 -0.4546 0.0069 1.2245 -0.395 ...
 $ X10     : num  -1.931 1.708 0.274 0.73 -0.995 ...
aggregate(dfrm[, c("X1", "X7", "X10")],        # columns to analyze
          dfrm[c("Factor_A", "Factor_B")],     # classifying columns
          FUN = function(x) c(mn = mean(x), min = min(x), max = max(x)))  # desired "summarizers"
#--- result----
  Factor_A Factor_B        X1.mn       X1.min       X1.max       X7.mn      X7.min      X7.max
1       A1       E1  0.187513792 -0.866094155  2.310960164  0.22489729 -0.91442493  1.94095786
2       A2       E1  0.078361707 -1.515410191  1.382420050 -0.51309155 -1.67026123  0.70869034
3       A1       E2 -0.267416858 -1.995131138  1.392115793 -0.04772929 -2.45426692  2.02225946
4       A2       E2 -0.069807208 -0.703073589  1.879448658 -0.37770923 -2.66221239  2.00152154
5       A1       E3 -0.007800886 -1.297561250  1.216627848 -0.30395411 -1.08181218  1.09764895
6       A2       E3 -0.054466856 -1.577891927  1.674719118  0.35594015 -1.20865279  2.25765422
      X10.mn    X10.min    X10.max
1 -0.3458888 -2.0312811  1.1483179
2 -0.1021727 -1.3230372  0.8045472
3  0.3514645 -3.2334010  1.7075298
4 -0.4988984 -2.1091311  0.5857192
5  0.2297461 -1.1336967  0.8483935
6  0.3700621 -1.5609424  2.2792024
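
Since the goal was a table that can be passed to write.csv, here is a minimal, self-contained sketch of the whole workflow. When FUN returns a vector, `aggregate` stores each summarized variable as a matrix column, so the sketch flattens the result with `do.call(data.frame, ...)` before writing it out. The construction of the factor columns, the seed, and the output file name are assumptions made only so the sketch runs on its own; they are not part of the example above.

```r
# Assumption: the factor columns are built this way only to mimic the
# str() output shown above; the real data frame would be used instead.
set.seed(1)  # hypothetical seed, for reproducibility only
dfrm <- data.frame(
  Factor_A = factor(rep(rep(c("A1", "A2"), each = 3), times = 10)),
  Factor_B = factor(rep(c("E1", "E2", "E3"), times = 20))
)
dfrm <- cbind(dfrm, matrix(rnorm(600), ncol = 10))
names(dfrm)[3:12] <- paste0("X", 1:10)

# Same call as above: three summaries per variable, per factor combination
res <- aggregate(dfrm[, c("X1", "X7", "X10")],
                 dfrm[c("Factor_A", "Factor_B")],
                 FUN = function(x) c(mn = mean(x), min = min(x), max = max(x)))

# res has 5 columns (2 factors + 3 matrix columns); flatten the matrix
# columns into ordinary columns named X1.mn, X1.min, X1.max, ...
flat <- do.call(data.frame, res)

# Hypothetical output location; replace with the desired path
out_file <- file.path(tempdir(), "summary_by_factor.csv")
write.csv(flat, out_file, row.names = FALSE)
```

After flattening, `names(flat)` lists each statistic as its own column, which makes the layout of the CSV explicit.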
--
David
>
> Any help would be appreciated,
>
> Thanks
>
> Bernard McGarvey
> Director, Fort Myers Beach Lions Foundation, Inc.
> Retired (Lilly Engineering Fellow).
>