Michael Friendly
2011-Mar-22 15:27 UTC
[R] how to convert a data.frame to a list of dist objects for individual differences MDS?
I have a 45 x 16 data frame consisting of dissimilarities among 10 colors, giving in each column the 45 = 10*9/2 pairwise judgments for one of 16 subjects. The rownames identify each pair of colors, e.g., "AC" = ("A","C"), and the pairs are ordered by columns in the lower triangle of each distance matrix.

> helm.raw <- read.table("http://euclid.psych.yorku.ca/datavis/Private/mdshelm.dat", header=TRUE, row.names=1)
> head(helm.raw)
     N1   N2   N3   N4   N5  N6a  N6b   N7   N8   N9  N10  CD1 CD2a CD2b  CD3  CD4
AC  6.8  5.9  7.1  7.5  6.6  5.2  5.8  6.2  7.5  6.0  9.2 11.5  9.3  9.0 10.4  9.9
AE 12.5 11.1 10.2 10.3 10.5  9.4 10.5 10.8  9.1  9.4 10.8 13.1 10.7 10.0 12.4 13.2
AG 13.8 18.8 11.1 10.7 10.2 11.4 13.4  9.9 10.2  9.5  9.7 12.6 10.7 10.4 12.8 12.3
AI 14.2 17.3 12.5 11.6  9.6 13.3 14.0 11.1 12.1  9.5 10.1 10.6 11.9 10.0 13.7 11.1
AK 12.5 16.6 11.8 10.6 10.8 12.0 13.2 10.3 12.5  9.8 10.3 10.6 11.0  9.3 11.8  8.7
AM 11.0 16.5  9.9  9.7  9.7 12.3 11.7  8.8  9.7  8.7  9.7 10.8  9.8  8.6  4.3  5.6
> row.names(helm.raw)
 [1] "AC" "AE" "AG" "AI" "AK" "AM" "AO" "AQ" "AS" "CE" "CG" "CI" "CK" "CM" "CO" "CQ" "CS" "EG" "EI" "EK"
[21] "EM" "EO" "EQ" "ES" "GI" "GK" "GM" "GO" "GQ" "GS" "IK" "IM" "IO" "IQ" "IS" "KM" "KO" "KQ" "KS" "MO"
[41] "MQ" "MS" "OQ" "OS" "QS"
>

To analyse this (with individual differences MDS, e.g., smacofDiff()), I need to:

(a) convert this to a list of objects of class "dist", one for each column of helm.raw
(b) rename the 1-letter codes to color name abbreviations as row/col labels for each distance matrix, according to:
    'A'='RPur'
    'C'='Red'
    'E'='Yel'
    'G'='Gy1'
    'I'='Gy2'
    'K'='Green'
    'M'='Blue'
    'O'='BlP'
    'Q'='Pur1'
    'S'='Pur2'

I've done this in SAS, but I don't know how to do it in R because neither dist() nor as.dist() seems to be able to work with data in this format. I could try brute-force, but maybe there is an easier way. Can someone help?

As a distance matrix, the column helm.raw$CD1 for subject CD1 should appear something like shown below (without the Obs column, where stim is the rowname)

--------------------------------- Subject=CD1 ----------------------------------

Obs  stim   RPur   Red   Yel   Gy1   Gy2 Green  Blue   BlP  Pur1  Pur2
  1  RPur      .     .     .     .     .     .     .     .     .     .
  2  Red    11.5     .     .     .     .     .     .     .     .     .
  3  Yel    13.1   6.0     .     .     .     .     .     .     .     .
  4  Gy1    12.6   7.9   6.2     .     .     .     .     .     .     .
  5  Gy2    10.6   8.4   8.4   5.2     .     .     .     .     .     .
  6  Green  10.6   9.4   9.9   6.5   4.1     .     .     .     .     .
  7  Blue   10.8  10.2  10.3   8.8   7.0   6.4     .     .     .     .
  8  BlP     7.3  11.3  12.7  11.2  10.4   9.9   4.2     .     .     .
  9  Pur1    5.4  11.5  12.9  11.7  10.8   9.4   8.4   4.5     .     .
 10  Pur2    5.0  11.5  10.7  10.2  10.6  10.1   8.1   6.4     3     .

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    Web:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA
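The row labels and their ordering described above can be reproduced without the data file, which may help when trying code from this thread offline. The following is only a minimal sketch: `codes`, `pairs`, and `toy` are illustrative names that do not appear in the post, and the values in `toy` are random placeholders, not the actual judgments.

```r
# The ten one-letter stimulus codes from the post: A, C, E, ..., S
codes <- LETTERS[seq(1, 19, by = 2)]

# combn() enumerates pairs as (1,2), (1,3), ..., (1,10), (2,3), ...,
# i.e. down the columns of the lower triangle, matching the row names shown
pairs <- combn(codes, 2, paste, collapse = "")

# Placeholder data frame with the same shape (45 pairs x 16 subjects).
# ASSUMPTION: random numbers stand in for the real data; column names
# S1..S16 are made up for illustration.
set.seed(1)
toy <- as.data.frame(matrix(round(runif(45 * 16, 3, 15), 1), nrow = 45,
                            dimnames = list(pairs, paste0("S", 1:16))))

dim(toy)
head(rownames(toy))
```

Any code written against helm.raw can be exercised on `toy` first, since the two share row names and dimensions.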
Phil Spector
2011-Mar-22 16:47 UTC
[R] how to convert a data.frame to a list of dist objects for individual differences MDS?
Michael -
   I think this does what you want:

helm.raw <- read.table("http://euclid.psych.yorku.ca/datavis/Private/mdshelm.dat",header=TRUE, row.names=1)
# lookup table from 1-letter codes to color labels
trans = c('A'='RPur','C'='Red','E'='Yel','G'='Gy1','I'='Gy2','K'='Green','M'='Blue','O'='BlP','Q'='Pur1','S'='Pur2')
# 45 x 2 matrix holding the two letters of each row name ...
cnames = do.call(rbind,strsplit(rownames(helm.raw), ""))
# ... translated to the color labels
cnames = apply(cnames,2,function(x)trans[x])
uu = unique(as.vector(cnames))
onecol = function(col){
   themat = matrix(NA,10,10)
   dimnames(themat) = list(uu,uu)
   # index by (row label, column label) pairs; this fills the upper triangle
   themat[cnames] = col
   # transpose so the values are in the lower triangle, which as.dist() uses
   as.dist(t(themat))
}
result = lapply(as.data.frame(helm.raw),onecol)

> result$CD1
       RPur   Red   Yel   Gy1   Gy2 Green  Blue   BlP  Pur1
Red    11.5
Yel    13.1   6.0
Gy1    12.6   7.9   6.2
Gy2    10.6   8.4   8.4   5.2
Green  10.6   9.4   9.9   6.5   4.1
Blue   10.8  10.2  10.3   8.8   7.0   6.4
BlP     7.3  11.3  12.7  11.2  10.4   9.9   4.2
Pur1    5.4  11.5  12.9  11.7  10.8   9.4   8.4   4.5
Pur2    5.0  11.5  10.7  10.2  10.6  10.1   8.1   6.4   3.0

                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spector at stat.berkeley.edu

On Tue, 22 Mar 2011, Michael Friendly wrote:

> I have a 45 x 16 data frame consisting of dissimilarities among 10 colors, giving in each
> column the 45 = 10*9/2 pairwise judgments for one of 16 subjects. The rownames
> identify each pair of colors, e.g, "AC" = ("A","C"), and the pairs are ordered by columns
> in the lower triangle of each distance matrix.
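Since the 45 rows are already ordered down the columns of the lower triangle, which is the order in which R stores the values of a "dist" object, each column could probably also be wrapped as a dist object directly by attaching the attributes described in ?dist (Size, Labels, Diag, Upper), without building a full matrix first. The following is a minimal sketch under that assumption: `col_to_dist` is a hypothetical helper name, and the 4-stimulus example uses the first six CD1 values from the matrix in the original post.

```r
# Hypothetical helper (not from the thread): wrap a vector of lower-triangle
# values, given in column order, as a "dist" object with the supplied labels
col_to_dist <- function(x, labels) {
  n <- length(labels)
  stopifnot(length(x) == n * (n - 1) / 2)
  structure(as.numeric(x), Size = n, Labels = labels,
            Diag = FALSE, Upper = FALSE, class = "dist")
}

# Small example: pairs AC, AE, AG, CE, CG, EG for subject CD1
labs <- c("RPur", "Red", "Yel", "Gy1")
x <- c(11.5, 13.1, 12.6, 6.0, 7.9, 6.2)
d <- col_to_dist(x, labs)
d

# Cross-check against the matrix-indexing approach used in the reply
m <- matrix(NA, 4, 4, dimnames = list(labs, labs))
m[t(combn(labs, 2))] <- x   # fills the upper triangle by (row, col) label
all.equal(as.vector(d), as.vector(as.dist(t(m))))
```

With the full data, something like `lapply(helm.raw, col_to_dist, labels = unname(trans))` should give the same list as `result` in the reply, provided the rows really are in the order shown by row.names(helm.raw). The matrix-indexing version is more robust if the row order ever differs, because it places each value by name rather than by position.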