Hello,

One of my colleagues sent me a csv file with 12 columns and a lot of rows. Column1 to Column10 are factors with 2 to 6 levels. Column11 and Column12 are experimental results. I'm a bit lost with all these data.

I would like
- to determine which factors have the most impact, and in which way, on Column11 (which has to be as high as possible) while Column12, at the same time, has to be as low as possible (I hope it is clear for at least one of you ...).
- to find a nice way to plot trends as there are several factors.

Below is a small data.frame from the SixSigma package (4 columns of factors and 2 columns of values). I don't know if it can help you to show me how to "play" with my data. If not, a package name or a tutorial can also be helpful.

Thanks in advance,
Ptit Bleu.

df_test <- read.table(text="pc.col pc.filler pc.batch pc.op pc.volume pc.density
C 1 1 A 16.7533110462178 1.25341925113923
C 2 1 B 18.0143546656987 1.11243453179479
C 3 1 C 15.6448655396281 1.14110454507519
C 1 1 D 18.0281678426422 1.09177192905336
C 2 2 A 13.7831255488576 1.1465474843639
C 3 2 B 16.758396178001 1.12333920013556
C 1 2 C 14.6938147409883 1.34554594406146
C 2 2 D 15.1974804312962 1.18442400447752
C 3 3 A 14.2077591655389 1.45756680703941
C 1 3 B 15.9579675459773 1.18602487934004
C 2 3 C 18.1500426178447 1.27641549258776
C 3 3 D 14.2297691617968 1.28052785172529
B 1 4 A 16.8646535945654 1.30119623444795
B 2 4 B 14.2798441018389 1.11228530554194
B 3 4 C 16.1341256681412 1.27060268503477
B 1 4 D 15.9241734353476 1.34131613229472
B 2 5 A 16.8583005443759 1.19909272287678
B 3 5 B 16.3449003481023 1.19954395487512
B 1 5 C 15.4175473098922 1.54100836814473
B 2 5 D 16.7861703759254 1.31241568978863
B 3 6 A 15.3079007135867 1.14210653222791
B 1 6 B 14.8169564636873 1.21694093094929
B 2 6 C 17.2688507060631 1.2603211001675
B 3 6 D 15.6874484539888 1.32986554107345", header=TRUE)
Richard M. Heiberger
2020-Mar-18 19:04 UTC
[R] How to evaluate impact of factors on parameters
Unfortunately, sending HTML mail scrambled the correct use of dput. Please use "plain text mode" for all R mailing lists. I unscrambled it here:

df_test <- structure(list(
  pc.col = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
                       1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
                     .Label = c("B", "C"), class = "factor"),
  pc.filler = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
                          1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
                        .Label = c("1", "2", "3"), class = "factor"),
  pc.batch = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
                         4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L),
                       .Label = c("1", "2", "3", "4", "5", "6"), class = "factor"),
  pc.op = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L,
                      1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L),
                    .Label = c("A", "B", "C", "D"), class = "factor"),
  pc.volume = c(16.7533110462178, 18.0143546656987, 15.6448655396281,
                18.0281678426422, 13.7831255488576, 16.758396178001,
                14.6938147409883, 15.1974804312962, 14.2077591655389,
                15.9579675459773, 18.1500426178447, 14.2297691617968,
                16.8646535945654, 14.2798441018389, 16.1341256681412,
                15.9241734353476, 16.8583005443759, 16.3449003481023,
                15.4175473098922, 16.7861703759254, 15.3079007135867,
                14.8169564636873, 17.2688507060631, 15.6874484539888),
  pc.density = c(1.25341925113923, 1.11243453179479, 1.14110454507519,
                 1.09177192905336, 1.1465474843639, 1.12333920013556,
                 1.34554594406146, 1.18442400447752, 1.45756680703941,
                 1.18602487934004, 1.27641549258776, 1.28052785172529,
                 1.30119623444795, 1.11228530554194, 1.27060268503477,
                 1.34131613229472, 1.19909272287678, 1.19954395487512,
                 1.54100836814473, 1.31241568978863, 1.14210653222791,
                 1.21694093094929, 1.2603211001675, 1.32986554107345)),
  row.names = c(NA, -24L), class = "data.frame")

Your factor structure has 2x3x4x6 = 144 cells, but there are only 24 data points.

I am plotting both response variables against one factor at a time.
library(lattice)
?xyplot
bwplot(pc.volume + pc.density ~ pc.col + pc.filler + pc.batch + pc.op, data=df_test, outer=TRUE)
bwplot(pc.volume + pc.density ~ pc.col, data=df_test, outer=TRUE)
bwplot(pc.volume + pc.density ~ pc.filler, data=df_test, outer=TRUE)
bwplot(pc.volume + pc.density ~ pc.batch, data=df_test, outer=TRUE)
bwplot(pc.volume + pc.density ~ pc.op, data=df_test, outer=TRUE)

More information about the experiment is needed before anything else can be attempted.

Rich

On Wed, Mar 18, 2020 at 10:11 AM lionel sicot via R-help <r-help at r-project.org> wrote:
>
> Hello,
> One of my colleagues sent me a csv file with 12 columns and a lot of rows. Column1 to Column10 are factors with 2 to 6 levels. Column11 and Column12 are experimental results. I'm a bit lost with all these data.
> I would like
> - to determine which factors have the most impact, and in which way, on Column11 (which has to be as high as possible) while Column12, at the same time, has to be as low as possible (I hope it is clear for at least one of you ...).
> - to find a nice way to plot trends as there are several factors.
>
> Below is a small data.frame from the SixSigma package (4 columns of factors and 2 columns of values). I don't know if it can help you to show me how to "play" with my data. If not, a package name or a tutorial can also be helpful.
> Thanks in advance,
> Ptit Bleu.
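The remark in the reply about 144 cells versus 24 data points can be checked directly. The following is only a minimal sketch: it rebuilds the factor layout of df_test with rep() instead of using the full data, and the names design, n_cells, n_obs, n_run and col_by_batch are made up for this illustration and do not come from the thread.

```r
# Rebuild only the factor layout of df_test (responses are not needed here).
# 'design' is a hypothetical name used for this illustration.
design <- data.frame(
  pc.col    = factor(rep(c("C", "B"), each = 12)),
  pc.filler = factor(rep(1:3, times = 8)),
  pc.batch  = factor(rep(1:6, each = 4)),
  pc.op     = factor(rep(c("A", "B", "C", "D"), times = 6))
)

# Cells in the full crossing: product of the number of levels per factor.
n_cells <- prod(sapply(design, nlevels))

# Number of observations, and number of distinct factor combinations run.
n_obs <- nrow(design)
n_run <- nrow(unique(design))

# Cross-tabulate pc.col against pc.batch to see which batches go with which
# pc.col level.
col_by_batch <- xtabs(~ pc.col + pc.batch, data = design)

print(c(cells = n_cells, observations = n_obs, combinations_run = n_run))
print(col_by_batch)
```

The cross-tabulation shows that each batch occurs with only one level of pc.col, so in this layout the batch and pc.col effects cannot be separated freely. That may be one reason more information about the experiment is needed before fitting a model.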