thr3ads.net - R help - [R] 2^k*r (with replications) experimental design question [Nov 2011]

Giovanni Azua

2011-Nov-14 00:33 UTC

[R] 2^k*r (with replications) experimental design question

Hello,

I have one replication (r=1 in 2^k*r) of a 2^k experimental design for a
performance analysis, i.e. my response variables are Throughput and
Response Time. I used the "aov" function and the results look OK:

> str(throughput)
'data.frame':	286 obs. of  7 variables:
 $ Time          : int  6 7 8 9 10 11 12 13 14 15 ...
 $ Throughput    : int  42 44 33 41 43 40 37 40 42 37 ...
 $ No_databases  : Factor w/ 2 levels "1","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ Partitioning  : Factor w/ 2 levels "sharding","replication": 1 1 1 1 1 1 1 1 1 1 ...
 $ No_middlewares: Factor w/ 2 levels "2","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ Queue_size    : Factor w/ 2 levels "40","100": 1 1 1 1 1 1 1 1 1 1 ...
 $ No_clients    : Factor w/ 1 level "128": 1 1 1 1 1 1 1 1 1 1 ...
> head(throughput)
  Time Throughput No_databases Partitioning No_middlewares Queue_size
1    6         42            1     sharding              2         40
2    7         44            1     sharding              2         40
3    8         33            1     sharding              2         40
4    9         41            1     sharding              2         40
5   10         43            1     sharding              2         40
6   11         40            1     sharding              2         40
> throughput.aov <- aov(Throughput ~ No_databases + Partitioning + No_middlewares + Queue_size, data = throughput)
> summary(throughput.aov)
                Df    Sum Sq  Mean Sq F value    Pr(>F)
No_databases     1  28488651 28488651 53.4981 2.713e-12 ***
Partitioning     1     71687    71687  0.1346  0.713966
No_middlewares   1   5624454  5624454 10.5620  0.001295 **
Queue_size       1     50892    50892  0.0956  0.757443
Residuals      281 149637226   532517
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This is more or less what I expected and I am happy with it: it says that the
Throughput is significantly affected first by the number of database instances
and second by the number of middleware instances.

The problem is that I need to integrate multiple replications of this same 2^k
design so I can also account for experimental error, i.e. the _r_ of 2^k*r, but
I can't see how to integrate the _r_ term into the data and into the arguments
of the aov function. Can anyone advise?

TIA,
Best regards,
Giovanni

Dennis Murphy

2011-Nov-14 01:38 UTC


I'm guessing you have nine replicates of a 2^5 factorial design with a
couple of missing values. If so, define a variable to designate the
replicates and use it as a blocking factor in the ANOVA. If you want
to treat the replicates as a random rather than a fixed factor, then
look into the nlme or lme4 packages.

HTH,
Dennis
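
A minimal sketch of the suggestion above, under stated assumptions: the data are simulated rather than taken from the original post, the column name Replicate, the value r = 3, and the effect sizes are all hypothetical, and the mixed-model part runs only if the nlme package is installed. It shows one way a replicate label can be added to the stacked data and then used as a blocking factor in aov.

```r
# Simulated stand-in for the poster's data: a 2^4 design run r = 3 times.
# All numbers below are made up for illustration only.
set.seed(1)

design <- expand.grid(
  No_databases   = factor(c(1, 4)),
  Partitioning   = factor(c("sharding", "replication")),
  No_middlewares = factor(c(2, 4)),
  Queue_size     = factor(c(40, 100))
)

r <- 3  # number of replications (hypothetical)

# Stack one copy of the design per replication and label each copy.
runs <- do.call(rbind, lapply(seq_len(r), function(i) {
  cbind(design, Replicate = i)
}))
runs$Replicate <- factor(runs$Replicate)

# Fake response: database and middleware effects, a shift per replicate,
# and random noise.
rep_shift <- c(-3, 0, 3)
runs$Throughput <- 40 +
  10 * (runs$No_databases == "4") +
  4 * (runs$No_middlewares == "4") +
  rep_shift[as.integer(runs$Replicate)] +
  rnorm(nrow(runs), sd = 2)

# Replicate enters the model as a fixed blocking factor.
fit_block <- aov(
  Throughput ~ Replicate + No_databases + Partitioning +
    No_middlewares + Queue_size,
  data = runs
)
print(summary(fit_block))

# Replicate as a random effect instead, only if nlme is available.
if (requireNamespace("nlme", quietly = TRUE)) {
  fit_mixed <- nlme::lme(
    Throughput ~ No_databases + Partitioning + No_middlewares + Queue_size,
    random = ~ 1 | Replicate,
    data = runs
  )
  print(anova(fit_mixed))
}
```

With real measurements, the same layout would probably come from stacking the data frame of each run with rbind after adding its own Replicate value. Whether the fixed block or the random effect is the better choice depends on how the replications were carried out.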

