Hi,

I need to remove collinear variables from my table of near-infrared spectra.

What package can I use?

Something simple, please, because I am a novice in statistics.

Thank you.

Best regards,
Roberto

--
View this message in context: http://r.789695.n4.nabble.com/Package-to-remove-collinear-variables-tp4639200.html
Sent from the R help mailing list archive at Nabble.com.
On 05.08.2012 05:27, Roberto wrote:
> Hi,
> I need to remove collinear variables from my table of near-infrared spectra.
>
> What package can I use?
>
> Something simple, because I am a novice in statistics.

Remove those where isTRUE(all.equal(cor(x, y), 1)) is TRUE?

Uwe Ligges

> Thank you.
>
> Best regards,
> Roberto
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
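The check suggested above can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the spectra are taken to be a numeric matrix with one column per wavelength, the data and the names `spectra`, `redundant` and `reduced` are made up, and `abs()` is added so that a perfect negative correlation is also caught, which goes slightly beyond the one-liner as written.

```r
# Hypothetical example data: 20 samples, 4 "wavelength" columns.
set.seed(1)
spectra <- matrix(rnorm(80), nrow = 20,
                  dimnames = list(NULL, c("w1", "w2", "w3", "w4")))
# Add a column that is an exact linear function of w1 (assumption for the demo).
spectra <- cbind(spectra, w5 = 2 * spectra[, "w1"] + 3)

# Pairwise correlations between all columns.
r <- cor(spectra)

# Flag a column if it is perfectly correlated with any EARLIER column.
# all.equal() compares with a numerical tolerance rather than exactly.
redundant <- vapply(seq_len(ncol(spectra)), function(j) {
  any(vapply(seq_len(j - 1), function(i) {
    isTRUE(all.equal(abs(r[i, j]), 1))
  }, logical(1)))
}, logical(1))

# Keep only the columns that were not flagged.
reduced <- spectra[, !redundant, drop = FALSE]
colnames(reduced)
```

Note that this only detects pairs of columns. A column that is a linear combination of several others would pass unnoticed, and which member of a redundant pair to keep (here simply the first one) remains a judgment call.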
I am not sure, because I tried the rfe function (backwards feature selection, caret package) to select the wavelengths useful for a prediction model. However, rfe returns a lot of warning messages about collinearity between variables.

So I do not know whether your script would be useful. I also tried VIF regression to select variables, but rfe still gives me the same warning messages.

What do you think about that?

Thank you very much for your help.

Best,
Roberto
There is no "magic bullet" (package) for your problem. You must either learn enough statistics to understand how to analyze your data, or consult with someone who does.

FWIW, collinearity is not in general amenable to automatic removal. However, you can identify which inputs are collinear with each other and omit the redundant ones in the next iteration of your analysis, using (for example) the approach suggested by Uwe. Deciding WHICH of the redundant inputs is most appropriate to keep is the part computers are not so good at; that is where you must be smarter or more creative than the computer.

Also, it would help you get responses if you included the context (earlier discussion) in your replies. Most people here do not use Nabble. Reading and following the requests in the footer of every message will also help.

---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.

Roberto <rmoscetti at unitus.it> wrote:
> I do not know, because I tried to use rfe function (Backwards Feature
> Selection, Caret Package) to select wavelengths useful for a prediction
> model. Otherwise, rfe function give me back a lot of warning messages
> about collinearity between variables.
Hi,

thank you for your help. I know that I need to learn enough statistics to understand how to process my data. The reason I am writing to this list is to ask people how I can learn.

I am a postharvest researcher and statistics is not my main field, so I try to do my best.

Do you know a book (or other literature) that can help me?

Thank you very much for your time and suggestions.

Best regards,
Roberto

On 05/08/2012 12:55, Jeff Newmiller wrote:
> There is no "magic bullet" (package) for your problem. You must either learn
> enough statistics to understand how to analyze your data, or consult with
> someone who does.
On Sat, Aug 4, 2012 at 11:27 PM, Roberto <rmoscetti at unitus.it> wrote:
> Hi,
> I need to remove collinear variables to my Near-Infrared table of spectra.
>
> What package can I use?
>
> Something simple, because I am a novice about statistic.

There are many methods of assessing multicollinearity, but to pick one that has a good help page, try vif in the HH package. (Other packages have also implemented vif or variations of it.)

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
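To make concrete what a variance inflation factor measures, here is a minimal base-R sketch that computes it by hand. It is not the HH package's implementation, only the textbook definition VIF_j = 1 / (1 - R_j^2); the data and the names `X` and `vif_by_hand` are hypothetical.

```r
# Hypothetical example data: x3 is nearly a linear combination of x1 and x2,
# while x4 is unrelated to the other columns.
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 + x2 + rnorm(n, sd = 0.1)
x4 <- rnorm(n)
X  <- data.frame(x1, x2, x3, x4)

# VIF_j = 1 / (1 - R_j^2), where R_j^2 is the R-squared obtained by
# regressing column j on all the other columns.
vif_by_hand <- vapply(names(X), function(v) {
  fit <- lm(reformulate(setdiff(names(X), v), response = v), data = X)
  1 / (1 - summary(fit)$r.squared)
}, numeric(1))

round(vif_by_hand, 1)
```

In this made-up data the columns involved in the near-dependency (x1, x2, x3) get large values, while x4 stays close to 1. A common rule of thumb treats values well above roughly 5 to 10 as a warning sign, but the cutoff is a convention rather than a test. With spectra that have more wavelengths than samples, these regressions are not well defined, so the sketch would not apply directly.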
For background, have a look at http://en.wikipedia.org/wiki/Multicollinearity.

I have also used:

Regression Diagnostics: Identifying Influential Data and Sources of Collinearity (Wiley Series in Probability and Statistics) by David A. Belsley, Edwin Kuh and Roy E. Welsch

Sections 1.9 to 1.12 of Hands-On Intermediate Econometrics Using R: Templates for Extending Dozens of Practical Examples [With CDROM] by Hrishikesh D. Vinod (2008)

Basically, how you proceed depends a lot on what you are trying to achieve.

Best regards,
John

On 5 August 2012 23:04, Roberto Moscetti <rmoscetti at unitus.it> wrote:
> I am a postharvest researcher and statistic is not my main field, so I try
> to do my best.
>
> Do you know a book (or literature) than can help me?

--
John C Frain
Economics Department
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:frainj at tcd.ie
mailto:frainj at gmail.com
Thank you very much for your help. Really appreciated.

Best regards,
Roberto

On 07/08/2012 12:18, John C Frain wrote:
> For background have a look at http://en.wikipedia.org/wiki/Multicollinearity.

--
Roberto Moscetti
PhD Student
University of Tuscia
Viterbo, Italy
----------------
Mobile +39 346 8041267
Phone +39 0761 357415