michael.eisenring at agroscope.admin.ch
2017-Sep-18 08:51 UTC
[R] Data arrangement for PLSDA using the ropls package
Hello,

I would like to run a partial least squares discriminant analysis (PLS-DA) in R using the "ropls" package, which is available from Bioconductor via the R command:

source("https://bioconductor.org/biocLite.R")

I want to use PLS-DA to illustrate the impact of two genders (AP, C) on 5 compounds measured in persons (samples). When I run the PLS-DA I get the warning message:

"Single component model: only 'overview' and 'permutation' (in case of single response (O)PLS(-DA)) plots available"

I assume it has something to do with the way I arrange my data in R. I tried to arrange it in the same way as in the package's example, which uses the sacurine data set (bioconductor.org/packages/release/bioc/vignettes/ropls/inst/doc/ropls-vignette.pdf).

Can somebody tell me how I have to arrange my data in order to perform a PLS-DA using the "ropls" package?

Thank you very much,
Mike

Please find my code and an example data set below:

CODE:

library(ropls)

# Read the data and use "Sample" as row names
dta <- read.csv("Demo.csv", sep = ";", header = TRUE)
rownames(dta) <- dta$Sample
dta

# Remove the non-numeric "Sample" and "Gender" columns and convert to a matrix
dta.exp <- dta[, c(-1, -7)]
dta.matrix <- as.matrix(dta.exp)
str(dta.matrix)
dta.matrix

# Create a factor with "Gender" as the y component
dta.treatments <- dta[, 7]
dta.treatments
dta.factor <- as.factor(dta.treatments)

dta.plsda <- opls(dta.matrix, dta.factor)

DATA:

> dput(dta)
structure(list(Sample = structure(c(1L, 12L, 23L, 34L, 36L, 37L, 38L, 39L, 40L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 31L, 32L, 33L, 35L), .Label = c("sa1", "sa10", "sa11", "sa12", "sa13", "sa14", "sa15", "sa16", "sa17", "sa18", "sa19", "sa2", "sa20", "sa21", "sa22", "sa23", "sa24", "sa25", "sa26", "sa27", "sa28", "sa29", "sa3", "sa30", "sa31", "sa32", "sa33", "sa34", "sa35", "sa36", "sa37", "sa38", "sa39", "sa4", "sa40", "sa5", "sa6", "sa7", "sa8", "sa9"), class = "factor"),
Comp1 = c(1.7686, 0.6873, 1.2322, 1.4874, 1.8986, 1.3484, 1.0959, 0.583, 1.039, 1.6133, 0.9595, 1.6377, 1.4538, 0.8737, 1.3363, 1.7881, 2.3604, 1.1239, 2.1281, 2.037, 0.5314, 0.7147, 0.5917, 0.6671, 0.6645, 0.9865, 1.019, 0.9664, 0.6966, 0.679, 0.7976, 0.8503, 1.2566, 0.5881, 0.8838, 0.6657, 0.7399, 0.5778, 0.7121, 1.1909), Comp2 = c(0.0284, 0.9064, 0, 0.7053, 0.7695, 0.337, 1.0418, 0.8346, 0.3884, 1.9946, 1.3296, 0.119, 0.0106, 0.7872, 1.0174, 0.0704, 0.0854, 0.4259, 0.0395, 0.0549, 2.4471, 1.8418, 2.9805, 1.1181, 0.5403, 2.7181, 1.4835, 0.875, 2.2205, 2.4106, 1.1967, 0.303, 0.1129, 2.5432, 2.328, 0.9839, 2.3583, 1.9589, 1.9918, 1.2232), Comp3 = c(2.9976, 1.6201, 0.7497, 1.371, 2.7035, 0.4533, 0.9927, 1.0973, 1.6702, 1.3696, 0.3392, 1.1489, 2.1086, 1.1586, 1.3645, 1.6008, 2.9567, 1.5721, 2.9633, 2.4623, 0.1103, 0.3137, 0.313, 0.2969, 0.5148, 0.7419, 0.5641, 0.7871, 0.7362, 0.8754, 0.4883, 0.8504, 1.4582, 0.1934, 0.764, 0.7515, 0.7143, 0.2139, 0.5743, 1.7305), Comp4 = c(0, 0, 0.603, 0, 1.6524, 0, 0, 0, 0, 1.1056, 0, 0, 0, 0, 0, 0, 5.7848, 0, 0, 0, 0, 0, 0, 0, 0, 0.7895, 3.4641, 0, 0, 1.7446, 0, 0, 1.5165, 0, 5.9645, 4.1878, 0.7313, 5.7994, 3.0168, 0), Comp5 = c(18.6058, 5.6489, 12.0842, 4.2708, 3.8489, 10.2139, 6.1149, 11.3373, 8.9013, 5.8342, 18.532, 17.9267, 8.7386, 6.9455, 7.3044, 19.0811, 10.8809, 10.7149, 4.7057, 0, 10.3088, 5.1514, 19.1218, 21.1768, 8.3797, 2.7146, 8.7405, 14.4817, 8.6571, 17.4254, 17.5725, 5.1233, 13.7539, 6.7396, 2.1342, 14.4216, 9.2952, 19.9525, 2.2317, 16.501), Gender = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("AP", "C" ), class = "factor")), .Names = c("Sample", "Comp1", "Comp2", "Comp3", "Comp4", "Comp5", "Gender"), class = "data.frame", row.names = c("sa1", "sa2", "sa3", "sa4", "sa5", "sa6", "sa7", "sa8", "sa9", "sa10", "sa11", "sa12", "sa13", "sa14", "sa15", "sa16", "sa17", "sa18", "sa19", 
"sa20", "sa21", "sa22", "sa23", "sa24", "sa25", "sa26", "sa27", "sa28", "sa29", "sa30", "sa31", "sa32", "sa33", "sa34", "sa35", "sa36", "sa37", "sa38", "sa39", "sa40"))

Eisenring Michael, Dr.
Federal Department of Economic Affairs, Education and Research EAER
Agroecology and Environment
Biosafety
Reckenholzstrasse 191, CH-8046 Zürich
Tel. +41 58 468 7181
Fax +41 58 468 7201
michael.eisenring at agroscope.admin.ch
www.agroscope.ch
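One possible reading of the warning, offered as a sketch rather than a confirmed diagnosis: the layout used in the code above (a numeric matrix with samples in rows, plus a two-level factor) looks like what opls() expects, and the message appears when the fitted model keeps only one predictive component. As far as I know, opls() picks the number of components automatically when its predI argument is left at the default NA, so asking for two components explicitly may make the score plots available. The helper names prepare_plsda_input and fit_plsda are hypothetical, and demo is a four-row excerpt of the posted data, used only to show the layout.

```r
# Four-row excerpt of the posted data (sa1, sa2, sa21, sa22), for illustration only.
demo <- data.frame(
  Sample = c("sa1", "sa2", "sa21", "sa22"),
  Comp1  = c(1.7686, 0.6873, 0.5314, 0.7147),
  Comp2  = c(0.0284, 0.9064, 2.4471, 1.8418),
  Comp3  = c(2.9976, 1.6201, 0.1103, 0.3137),
  Comp4  = c(0, 0, 0, 0),
  Comp5  = c(18.6058, 5.6489, 10.3088, 5.1514),
  Gender = c("AP", "AP", "C", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a data frame into a numeric matrix x
# (samples in rows, compounds in columns) and a class factor y.
prepare_plsda_input <- function(dta, id_col = "Sample", class_col = "Gender") {
  value_cols <- setdiff(names(dta), c(id_col, class_col))
  x <- as.matrix(dta[, value_cols, drop = FALSE])
  rownames(x) <- as.character(dta[[id_col]])
  y <- factor(dta[[class_col]])
  stopifnot(is.numeric(x), nrow(x) == length(y))
  list(x = x, y = y)
}

# Hypothetical wrapper: request the number of predictive components
# explicitly instead of relying on the automatic choice (predI = NA).
# Defined here but not called, so the block runs without ropls installed.
fit_plsda <- function(x, y, n_comp = 2) {
  ropls::opls(x, y, predI = n_comp)
}

inp <- prepare_plsda_input(demo)
str(inp$x)
levels(inp$y)
```

With ropls installed and the full 40-sample data frame from the post, the call would be inp <- prepare_plsda_input(dta) followed by dta.plsda <- fit_plsda(inp$x, inp$y). A second component requested this way may not be supported by cross-validation, so the Q2 values in the printed summary are worth checking before interpreting the score plot.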
If this is a Bioconductor package, why do you not post on the Bioconductor list?

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)