On Aug 15, 2010, at 4:32 AM, Nicola Spotorno wrote:
> Dear all,
> I'm quite new to R and I have a problem with the function which.
> When I use it to select a subset of a data frame it works well, but
> somewhere R keeps track of the original data frame, and this creates
> problems with subsequent operations.
> For example:
>
> sentences <- read.xls("frasi.tot.march.3.xls", header=TRUE)
>
> head(sentences)
> fam subjID Cond Code reg total first second
> 1 f 30 an fDan1 1 0.2812500 0.2812500 0.0000000
> 2 f 30 an fDan1 2 1.7851562 0.5390625 1.2460938
> 3 f 30 an fDan1 3 1.2304688 0.6679688 0.5625000
> 4 f 30 an fDan1 4 0.6289062 0.4375000 0.1914062
> 5 f 30 an fDan2 1 0.1367188 0.1367188 0.0000000
> 6 f 30 an fDan2 2 0.8632812 0.6679688 0.1953125
>
> str(sentences)
> 'data.frame': 4799 obs. of 8 variables:
> $ fam : Factor w/ 2 levels "f","uf": 1 1 1 1 1 1 1 1 1 1 ...
> $ subjID: int 30 30 30 30 30 30 30 30 30 30 ...
> $ Cond : Factor w/ 4 levels "an","fi","le",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ Code : Factor w/ 126 levels "fAan1","fAan2",..: 72 72 72 72 73 73 73 73 74 74 ...
> $ reg : int 1 2 3 4 1 2 3 4 1 2 ...
> $ total : num 0.281 1.785 1.23 0.629 0.137 ...
> $ first : num 0.281 0.539 0.668 0.438 0.137 ...
> $ second: num 0 1.246 0.562 0.191 0 ...
>
> # If you look at the variable "Cond" you see that it has 4 levels
>
> sentences_trial <- sentences[which(sentences$Cond!= "an"),]
>
> > str(sentences)
> 'data.frame': 4799 obs. of 8 variables:
> $ fam : Factor w/ 2 levels "f","uf": 1 1 1 1 1 1 1 1 1 1 ...
> $ subjID: int 30 30 30 30 30 30 30 30 30 30 ...
> $ Cond : Factor w/ 4 levels "an","fi","le",..: 1 1 1 1 1 1 1 1 1 1 ...
> $ Code : Factor w/ 126 levels "fAan1","fAan2",..: 72 72 72 72 73 73 73 73 74 74 ...
> $ reg : int 1 2 3 4 1 2 3 4 1 2 ...
> $ total : num 0.281 1.785 1.23 0.629 0.137 ...
> $ first : num 0.281 0.539 0.668 0.438 0.137 ...
> $ second: num 0 1.246 0.562 0.191 0 ...
>
> # Now the variable "Cond" still has 4 levels, even though I have
> excluded one level with which!
You showed us two copies of str(sentences). How can we possibly know
what sentences_trial looks like?
> # If I apply interaction.plot at this point, the graph
> still considers all 4 levels of Cond.
If you want to remove unused factor levels from a column, just use
factor() on that column again:
sentences_trial$Cond <- factor(sentences_trial$Cond)
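Or, to short-circuit that two-step process, wrap the subset in
droplevels() (available in recent versions of R). Note that drop=TRUE
in subset() only affects dimensions; it does not remove unused factor
levels:
sentences_trial <- droplevels(subset(sentences, Cond != "an"))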
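Since no reproducible example was posted, here is a minimal sketch of the behaviour on made-up data. The data frame toy, its values, and the fourth level name "xx" are invented for illustration (the real fourth level of Cond is not shown in the post). It only demonstrates that row subsetting keeps unused levels and that factor() or droplevels() removes them; droplevels() is assumed to be available in your R version.

```r
# Toy data standing in for the poster's spreadsheet (all values made up;
# "xx" is a hypothetical fourth level name).
toy <- data.frame(
  Cond  = factor(c("an", "fi", "le", "xx", "fi", "le")),
  total = c(0.28, 1.79, 1.23, 0.63, 0.14, 0.86)
)

# Row subsetting removes the "an" rows but keeps "an" as a level.
kept <- toy[which(toy$Cond != "an"), ]
nlevels(kept$Cond)            # still counts the unused level

# Fix 1: rebuild the factor column inside the data frame.
fix1 <- kept
fix1$Cond <- factor(fix1$Cond)

# Fix 2: drop unused levels from every factor column at once.
fix2 <- droplevels(kept)

levels(fix1$Cond)
levels(fix2$Cond)
```

Either fix leaves the data rows untouched; only the levels attribute of the factor changes, which is what plotting and tabulating functions look at.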
>
> attach(sentences_trial)
> x11()
> interaction.plot(Cond,fam,total)
>
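As a hedged aside on the plotting step: the sketch below uses with() instead of attach(), so that the columns clearly come from the subsetted data frame and not from an earlier attached copy. The data frame toy and its values are invented; the column names fam, Cond and total merely mirror the post. The plot is sent to a temporary PDF file so the code also runs without a display.

```r
# Toy data (made-up values): two factors and one numeric response.
toy <- data.frame(
  fam   = factor(rep(c("f", "uf"), each = 6)),
  Cond  = factor(rep(c("an", "fi", "le"), times = 4)),
  total = c(0.3, 1.8, 1.2, 0.6, 0.1, 0.9, 0.5, 1.1, 0.7, 0.4, 1.3, 0.8)
)

# Subset, then drop the now-unused "an" level.
trial <- droplevels(toy[toy$Cond != "an", ])

# with() evaluates the expression inside 'trial'; nothing is attached.
cell_means <- with(trial, tapply(total, list(Cond, fam), mean))
cell_means                    # one row per remaining Cond level

# Draw the interaction plot into a throwaway PDF file.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)
with(trial, interaction.plot(Cond, fam, total))
dev.off()
```

With the unused level dropped, the x axis of the plot should show only the conditions that are actually present in trial.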
> # Where is the problem?
>
I think I identified it, but without a reproducible example it
remains only an attractive theory.
David Winsemius, MD
West Hartford, CT