Hi All:

I was recently working with a dataset on arsenic poisoning. Among the variables in the dataset, I used the following three to produce crosstabulations: FOLSTAT, GENDER and ASBIN. All three are categorical. FOLSTAT denotes follow-up status for the subjects and has seven levels, GENDER denotes sex (two levels: male, female), and ASBIN denotes binarized arsenic concentrations (two levels: "<0.05" and ">0.05", denoting less than 0.05 mg/L and more than 0.05 mg/L respectively).

To illustrate, I used the following code for the crosstabulation:

    x <- table(FOLSTAT,GENDER,ASBIN)

From the results, I then wanted to subset a table for the ASBIN value ">0.05". I used the following code to subset the table:

    y <- x[,,ASBIN=">0.05"]

When I do this, R throws an error message stating "subscript out of range". However, it runs fine if I change the labels for my ASBIN variable from "<0.05" and ">=0.05" to words like "Nonexposed" and "Exposed" respectively.

I searched the archives and the documentation for this, but could not find a solution. I understand that sometimes it is more expressive to use expressions like "<0.05" (or something similar) as headings in cross-tabulations. What was I doing incorrectly?

Would greatly appreciate your insight.

I use:
R version: 1.7.1
OS: Windows XP Home

TIA,
Arin Basu

arinbasu at softhome.net wrote:

> Hi All:
> I was recently working with a dataset on arsenic poisoning. Among the
> variables in the dataset, I used the following three variables to
> produce crosstabulations (variable names: FOLSTAT, GENDER, ASBIN; all
> three were categorical variables, FOLSTAT denoted follow up status for
> the subjects and had seven levels, GENDER denoted sex (two levels:
> male,female), and ASBIN denoted binarized arsenic concentrations (two
> levels: "<0.05", ">0.05" denoting less than 0.05 mg/L and more than 0.05
> mg/L respectively).
> To illustrate, I used the following code for crosstabulation:
> x <- table(FOLSTAT,GENDER,ASBIN)
> # from the results, I then wanted to subset a table for the ASBIN
> value ">0.05"
> I used the following code to subset the table:
> y <- x[,,ASBIN=">0.05"]

Two errors.

1) Your logical index won't work. For the second level of ASBIN, use x[,,2]. Since the third dimension of the table x has only 2 elements (one for each level of ASBIN), sending it a logical vector that is as long as your number of subjects (N) is only going to confuse it. It's going to run out of levels of the table. And it did - "subscript out of ..."

2) Is this a cut-and-paste error?

> y <- x[,,ASBIN=">0.05"]

It won't work anyway, but for future reference, logical "equals" is ==, not =. In other words, "==" is a question, "=" is an assignment.

Cheers

Jason
--
Indigo Industrial Controls Ltd.
64-21-343-545
jasont at indigoindustrial.co.nz

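The x[,,2] suggestion above can be tried on made-up data. The following is only a sketch: the data are synthetic, and the label ">=0.05" is an assumption (the original post mentions both ">0.05" and ">=0.05", so check dimnames(x) on the real table). It selects the second ASBIN level by position and by its exact label, and shows where a logical comparison with == would normally go, namely on the raw vectors before tabulating.

```r
# Synthetic stand-in for the poster's data (hypothetical values, not the real study).
set.seed(1)
n <- 200
FOLSTAT <- factor(sample(1:7, n, replace = TRUE), levels = 1:7)
GENDER  <- factor(sample(c("male", "female"), n, replace = TRUE))
ASBIN   <- factor(sample(c("<0.05", ">=0.05"), n, replace = TRUE),
                  levels = c("<0.05", ">=0.05"))

x <- table(FOLSTAT, GENDER, ASBIN)

# Check which labels the third dimension actually carries.
dimnames(x)$ASBIN

# Select the second ASBIN level by position.
y_pos <- x[, , 2]

# Select the same slice by its exact label (assumed here to be ">=0.05").
y_lab <- x[, , ">=0.05"]

# A logical comparison (==) is applied to the raw data, before tabulating.
keep  <- ASBIN == ">=0.05"
y_raw <- table(FOLSTAT[keep], GENDER[keep])
```

All three routes should give the same 7 x 2 table of counts; indexing by position avoids any dependence on how the level is spelled.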
On Thu, 21 Aug 2003 arinbasu at softhome.net wrote:

> I was recently working with a dataset on arsenic poisoning. Among the
> variables in the dataset, I used the following three variables to produce
> crosstabulations (variable names: FOLSTAT, GENDER, ASBIN; all three were
> categorical variables, FOLSTAT denoted follow up status for the subjects and
> had seven levels, GENDER denoted sex (two levels: male,female), and ASBIN
> denoted binarized arsenic concentrations (two levels: "<0.05", ">0.05"
> denoting less than 0.05 mg/L and more than 0.05 mg/L respectively).
>
> To illustrate, I used the following code for crosstabulation:
>
> x <- table(FOLSTAT,GENDER,ASBIN)
>
> # from the results, I then wanted to subset a table for the ASBIN
> value ">0.05"
>
> I used the following code to subset the table:
>
> y <- x[,,ASBIN=">0.05"]
>
> # When I do this, R throws an error message stating "subscript out of
> range". However, it runs fine if I change the labels for my ASBIN variable
> from "<0.05" and ">=0.05" to words like "Nonexposed" and "Exposed"
> respectively.

I've tried to reproduce this behavior using R-1.7.1 on redhat 8.0 linux. I can come close: I can get an error message, "subscript out of bounds", not "out of range".

Seems to me that the variable name and equals sign are ignored. I can use any variable name I like, even ones that aren't in my workspace. Seems that matching is done by position in the string of commas, not by the variable name, and I can only get the error message "subscript out of bounds" when I've goofed up the matching by position - for example, when I try to subscript the third dimension using a string value from dimnames(x)[[2]] or dimnames(x)[[1]].

Please try this again and see whether all the commas are there for matching by position. I think R ignores the presence of "ASBIN=" in your example above.

- tom blackwell - u michigan medical school - ann arbor -

> I searched the archives and the documentations for this, but could not find
> a solution. I understand that sometimes it is more expressive to use
> expressions like "<0.05" (or something similar) as headings in
> cross-tabulations. What was I doing incorrectly?
>
> Would greatly appreciate your insight.
>
> I use:
> R version: 1.7.1
> OS: Windows XP Home
>
> TIA,
> Arin Basu
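
The positional-matching behaviour described in the reply above can be checked with a small sketch. The array below is hypothetical (made-up counts and labels, with ">=0.05" assumed as the second ASBIN label). It shows that the name written before "=" inside the brackets has no effect, and that a string which is not among the labels of the third dimension raises an error instead of returning a slice.

```r
# Small hypothetical table; the counts and labels are made up for illustration.
x <- array(1:8, dim = c(2, 2, 2),
           dimnames = list(FOLSTAT = c("a", "b"),
                           GENDER  = c("male", "female"),
                           ASBIN   = c("<0.05", ">=0.05")))

# The name before "=" plays no role: the index is matched by position.
a <- x[, , ASBIN = ">=0.05"]
b <- x[, , anything = ">=0.05"]
identical(a, b)

# A label that is not in dimnames(x)[[3]] raises an error; capture its message.
res <- tryCatch(x[, , ">0.05"], error = function(e) conditionMessage(e))
res
```

If the real table's labels are "<0.05" and ">=0.05", then indexing with ">0.05" would fail in just this way, so comparing the string against dimnames(x)[[3]] is a quick first check.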