Marco Saerens
2002-Oct-10 12:36 UTC
[R] Correspondence analysis/optimal scaling with ordinal variable
Dear R specialists,

I have a multivariate statistics question that I want to submit to the R community (which has very good statistical knowledge).

I need to perform an optimal scaling based on a discrete variable and an ordinal variable. The discrete variable, Area, defines a geographical area. The ordinal variable, EducationLevel, describes the education level of individuals (the ordered levels are "VeryLow", "Low", "Medium", "Large", "VeryLarge").

I have a data set specifying, for each area (rows), the number of individuals in this area having a given education level (columns). It looks like:

Area  VeryLow  Low  Medium  Large  VeryLarge
A1          6   21      15     11          0
A2          2    4       8     17          9
etc.

Meaning that in area A1 there are 6 individuals with very low education level, 21 with low education level, etc.

I need to compute a score for each area that reflects the education level in this area. This can be done by using correspondence analysis: the scores on the first factor represent an optimal scaling in a certain sense (see, for instance, Greenacre (1984), "Theory and Applications of Correspondence Analysis"). In other words, I have to transform my ordinal variable "EducationLevel" into a continuous variable "EducationScore".

However, this procedure does not account for the fact that one of my variables (EducationLevel) is ordinal. For instance, the weights obtained from the correspondence analysis (the weights used to compute the projection on the first factor) need not be monotonically increasing.

In summary, the question is:

(1) Are there statistical procedures that account for the ordinal nature of the Level variable (that is, order constraints on the weights, so that they are monotonically increasing)?

(2) Are these procedures implemented in R or S-Plus?

Please feel free to reply to "saerens at ulb.ac.be".

Many thanks!

Marco Saerens

--
Prof. Marco Saerens
Information Systems Research Unit (ISYS)
IAG, Université Catholique de Louvain
Place des Doyens 1, B-1348 Louvain-la-Neuve, BELGIUM
Tel: +32(0)10.47.92.46. Fax: +32(0)10.47.83.24.
Email: saerens at isys.ucl.ac.be

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
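The concern raised in the question can be made concrete with a small sketch. The code below is only an illustration, not anything posted in the thread: it uses reciprocal averaging, one simple iterative way to obtain first-axis correspondence analysis scores, written in plain Python. The function name `reciprocal_averaging` and the fixed iteration count are assumptions made for this example, and the table holds only the two rows shown in the post, because the remaining areas ("etc.") are not given.

```python
# Counts copied from the two example rows in the post; the remaining
# areas ("etc.") are not given, so only A1 and A2 appear here.
areas = ["A1", "A2"]
levels = ["VeryLow", "Low", "Medium", "Large", "VeryLarge"]
counts = [
    [6, 21, 15, 11, 0],
    [2, 4, 8, 17, 9],
]


def reciprocal_averaging(table, iterations=200):
    """Return (row_scores, col_scores) for the first correspondence-analysis axis.

    Hypothetical helper for illustration; the iteration count is an assumption.
    """
    n_cols = len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[j] for row in table) for j in range(n_cols)]
    total = sum(row_tot)

    # Arbitrary starting values: the level ranks 0, 1, 2, ...
    col = [float(j) for j in range(n_cols)]
    for _ in range(iterations):
        # Each area's score is the mean level score of its individuals.
        row = [sum(n * s for n, s in zip(r, col)) / t
               for r, t in zip(table, row_tot)]
        # Each level's score is the mean area score of its individuals.
        col = [sum(r[j] * x for r, x in zip(table, row)) / col_tot[j]
               for j in range(n_cols)]
        # Remove the trivial constant solution and fix the scale
        # (weighted mean 0, weighted variance 1).
        mean = sum(w * s for w, s in zip(col_tot, col)) / total
        col = [s - mean for s in col]
        norm = (sum(w * s * s for w, s in zip(col_tot, col)) / total) ** 0.5
        col = [s / norm for s in col]
    # Final area scores from the converged level scores.
    row = [sum(n * s for n, s in zip(r, col)) / t
           for r, t in zip(table, row_tot)]
    return row, col


row_scores, col_scores = reciprocal_averaging(counts)
for name, score in zip(areas, row_scores):
    print(name, round(score, 3))
for name, score in zip(levels, col_scores):
    print(name, round(score, 3))
```

On these two rows alone, the score for "Low" comes out below the score for "VeryLow", because a larger share of the "Low" individuals (21 of 25) than of the "VeryLow" individuals (6 of 8) lives in A1. That is exactly the kind of non-monotone weighting the question is about. The sign of the axis is arbitrary; the starting values used here happen to orient it so that A2 scores higher than A1.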
Christian Hennig
2002-Oct-10 14:02 UTC
[R] Correspondence analysis/optimal scaling with ordinal variable
Dear Marco,

although I am not really an expert in this, it may be helpful to consider the Wiley book by A. Gifi, "Nonlinear Multivariate Analysis" (1990), or perhaps the paper by Michailidis and de Leeuw about the "Gifi system":

http://citeseer.nj.nec.com/cache/papers/cs/14429/http:zSzzSzwww.stat.ucla.eduzSzpaperszSzpreprintszSz204zSz204.pdf/michailidis98gifi.pdf

The keyword should be "nonlinear principal components".
I do not know about implementations in R.

Best,
Christian

--
***********************************************************************
Christian Hennig
Seminar fuer Statistik, ETH-Zentrum (LEO), CH-8092 Zuerich (current)
and Fachbereich Mathematik-SPST/ZMS, Universitaet Hamburg
hennig at stat.math.ethz.ch, http://stat.ethz.ch/~hennig/
hennig at math.uni-hamburg.de, http://www.math.uni-hamburg.de/home/hennig/
#######################################################################
I recommend www.boag.de
Jan de Leeuw
2002-Oct-10 15:43 UTC
[R] Correspondence analysis/optimal scaling with ordinal variable
The code at ftp://gifi.stat.ucla.edu/pub/homalsR.tar.gz does "correspondence analysis" for an arbitrary number of variables (not just 2), each of which can be treated as numerical, ordinal, or nominal. In fact, it can do much more. It has a nice Tcl/Tk interface, and the documentation is abysmal.

==
Jan de Leeuw; Professor and Chair, UCLA Department of Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical Software
US mail: 9432 Boelter Hall, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550; fax (310)-206-5658; email: deleeuw at stat.ucla.edu
homepage: http://gifi.stat.ucla.edu
-------------------------------------------------------------------------
No matter where you go, there you are. --- Buckaroo Banzai
http://gifi.stat.ucla.edu/sounds/nomatter.au
-------------------------------------------------------------------------
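Treating a variable as ordinal usually means restricting its category scores to be monotone. A minimal sketch of that idea follows, under stated assumptions: it is not the homals code and makes no claim about how that program works internally. It takes plain reciprocal averaging and adds a weighted pool-adjacent-violators step (isotonic regression) that forces the level scores to be non-decreasing at every iteration. The helper names `pava` and `monotone_scores` are hypothetical, the scheme is a heuristic alternating procedure, and the table again contains only the two rows quoted in the original question.

```python
# Counts for the two example rows from the original question (A1, A2).
counts = [
    [6, 21, 15, 11, 0],
    [2, 4, 8, 17, 9],
]


def pava(values, weights):
    """Weighted non-decreasing (isotonic) fit by pooling adjacent violators."""
    blocks = []  # each block: [weighted mean, total weight, number of items]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the previous block lies above the last one.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


def monotone_scores(table, iterations=200):
    """Reciprocal averaging with a monotone constraint on the column scores.

    Hypothetical helper: a heuristic sketch, not a reference implementation.
    """
    n_cols = len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[j] for row in table) for j in range(n_cols)]
    total = sum(row_tot)

    col = [float(j) for j in range(n_cols)]  # start from the level ranks
    for _ in range(iterations):
        row = [sum(n * s for n, s in zip(r, col)) / t
               for r, t in zip(table, row_tot)]
        col = [sum(r[j] * x for r, x in zip(table, row)) / col_tot[j]
               for j in range(n_cols)]
        # Order constraint: project onto non-decreasing scores,
        # weighting each level by its column total.
        col = pava(col, col_tot)
        # Center and rescale; both operations preserve the ordering.
        mean = sum(w * s for w, s in zip(col_tot, col)) / total
        col = [s - mean for s in col]
        norm = (sum(w * s * s for w, s in zip(col_tot, col)) / total) ** 0.5
        if norm == 0:
            raise ValueError("all level scores were pooled into one value")
        col = [s / norm for s in col]
    row = [sum(n * s for n, s in zip(r, col)) / t
           for r, t in zip(table, row_tot)]
    return row, col


area_scores, level_scores = monotone_scores(counts)
print([round(s, 3) for s in area_scores])
print([round(s, 3) for s in level_scores])
```

On these two rows, the unconstrained averaging step places "Low" below "VeryLow", so the pooling step ties the two levels at a common score and the remaining levels increase from there. Ties of this kind are the usual price of an order constraint: the fitted scores are non-decreasing rather than strictly increasing.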
Jan de Leeuw
2002-Oct-10 16:59 UTC
[R] Correspondence analysis/optimal scaling with ordinal variable
The paper by Michailidis and de Leeuw that Christian mentions is in Statistical Science, 1998, 13, 307-336.

The homals program in R (ftp://gifi.stat.ucla.edu/pub/homalsR.tar.gz) has, in the Gifi terminology, homals and princals and overals (and a few new extensions). It will be packaged soon, I hope.
Andrew Criswell
2002-Oct-11 04:25 UTC
[R] Correspondence analysis/optimal scaling with ordinal variable
Dear Marco:

Alan Agresti, in "Categorical Data Analysis," pp. 291-3, describes the use of correspondence analysis and provides an example. Laura Thompson, in "Splus Manual to Accompany Agresti's Categorical Data Analysis," implements Agresti's example on pp. 45-6. Some slight modifications will make the example run in R. Her manual can be found at

http://math.cl.uh.edu/~thompsonla/5537/Splusdiscrete.PDF

This might be of benefit to your work.

Best wishes,
ANDREW

Andrew Criswell
Professor of Finance
Graduate School, Bangkok University