Hello all,

I am trying to perform an ANOVA on some data in a data frame, but when I try to use lm(), I get the following error:

Error in storage.mode(y) <- "double" :
  invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
  using type="numeric" with a factor response will be ignored

Here is a subset of my data:

island_id  refseq_better  total  refseq_proportion  fiveprime  threeprime  island_size
1a         29             57     0.508771929824561  11127      201378      190251
1c         27             90     0.3                6303879    6600994     297115
1d         33             115    0.28695652173913   7001283    7421591     420308
2b         11             42     0.261904761904762  4381375    4520137     138762
2d         27             81     0.333333333333333  5203929    5497271     293342
3a         44             141    0.312056737588652  28792      552044      523252

When I use sapply(dataframe,mode) to check the modes of the columns, this is my output:

   island_id  refseq_better       total  refseq_proportion
   "numeric"      "numeric"   "numeric"          "numeric"
   fiveprime     threeprime island_size
   "numeric"      "numeric"   "numeric"

How do I need to change the format of my data frame so that I do not get the above error?

Thanks!

Alison Callahan
PhD candidate
Department of Biology
Carleton University

On Oct 8, 2010, at 12:46 PM, Alison Callahan wrote:

> When I use sapply(dataframe,mode) to check the modes of the columns,
> this is my output:
>
>    island_id  refseq_better       total  refseq_proportion
>    "numeric"      "numeric"   "numeric"          "numeric"
>    fiveprime     threeprime island_size
>    "numeric"      "numeric"   "numeric"

So?

> mode(factor(1:10))
[1] "numeric"

Storage mode is not the right question. "class" is the correct question. Best would be to try is.factor()

> How do I need to change the format of my data frame so that I do not
> get the above error?

Figure out which of your columns are factors and apply the FAQ 7.12

--
David.

David Winsemius, MD
West Hartford, CT
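
To illustrate the point about mode versus class, here is a minimal sketch. The data frame below is a hypothetical reconstruction of three rows from the posted subset, and it assumes (as the reply suspects, but the thread has not yet confirmed) that refseq_proportion was read in as a factor.

```r
# Hypothetical reconstruction: refseq_proportion is ASSUMED to be a factor.
props_df <- data.frame(
  island_id         = c("1a", "1c", "1d"),
  refseq_proportion = factor(c("0.508771929824561", "0.3", "0.28695652173913")),
  fiveprime         = c(11127, 6303879, 7001283),
  stringsAsFactors  = TRUE  # mimic the old read.table() default
)

# mode() reports "numeric" for factors too, because factors are
# stored as integer codes, so it cannot tell the columns apart.
modes <- sapply(props_df, mode)

# class() and is.factor() do distinguish factors from true numerics.
classes <- sapply(props_df, class)
factors <- sapply(props_df, is.factor)

print(modes)
print(classes)
print(names(props_df)[factors])  # the columns that need converting
```

Running sapply() with class or is.factor on the real data frame should show directly which columns are factors.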

See below.

On Fri, Oct 8, 2010 at 1:01 PM, David Winsemius <dwinsemius at comcast.net> wrote:

> So?
>
>> mode(factor(1:10))
> [1] "numeric"
>
> Storage mode is not the right question. "class" is the correct question.
> Best would be to try is.factor()

Looking at the error message, the problem does appear to be with storage mode ... is this affected by whether the variables are factors? Note this part of the error:

Error in storage.mode(y) <- "double" :
  invalid to change the storage mode of a factor

Thanks,

Alison
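
On the question of how storage mode and factors interact, the following sketch may help. It uses a small made-up factor f (not from the thread) to show that a factor is a vector of integer codes plus a levels attribute, which is why mode() says "numeric" while R still refuses to treat the object as plain numbers.

```r
# A small illustrative factor (hypothetical values).
f <- factor(c("0.3", "0.5", "0.3"))

typeof(f)        # the underlying storage type of the codes
storage.mode(f)  # same information as typeof() here
mode(f)          # "numeric", even though f is a factor
class(f)         # "factor"
unclass(f)       # the integer codes with a "levels" attribute

# Attempting the assignment that lm() made internally in the thread.
# On the R version discussed above this raises the error quoted in the
# original post; tryCatch() captures the message instead of stopping.
res <- tryCatch({
  storage.mode(f) <- "double"
  "no error"
}, error = function(e) conditionMessage(e))
print(res)
```

So the error mentions storage mode only because that is the operation that failed; the underlying cause is that the response is a factor.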

On Oct 8, 2010, at 1:09 PM, Alison Callahan wrote:

>>> I am trying to perform an ANOVA on some data in a data frame, but
>>> when I try to use lm(), I get the following error:

The question that should have been posed was how did you use lm()?

It could be that the inappropriate use of a factor as a dependent variable was the cause. Did you use something like:

lm(island_id ~ refseq_better, data=dfrm)  # ??

Example:

> dfrm <- data.frame(a=factor(rnorm(10)), b=rnorm(10))
> lm(a ~ b, data=dfrm)
Error in storage.mode(y) <- "double" :
  invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
  using type="numeric" with a factor response will be ignored

David Winsemius, MD
West Hartford, CT
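
One way to check for this situation before fitting is sketched below. The helper response_is_factor() is a hypothetical name introduced here for illustration, not a base R function; it builds the model frame for a formula and tests whether the response (the left-hand side) is a factor. The exact message lm() produces for a factor response may differ between R versions, so the sketch checks the data rather than relying on the error text.

```r
# Hypothetical helper: TRUE if the left-hand side of `formula`
# evaluates to a factor in `data`.
response_is_factor <- function(formula, data) {
  mf <- model.frame(formula, data = data)
  is.factor(model.response(mf))
}

set.seed(1)  # make the example reproducible
dfrm <- data.frame(a = factor(rnorm(10)), b = rnorm(10))

bad  <- response_is_factor(a ~ b, dfrm)  # factor on the left-hand side
good <- response_is_factor(b ~ a, dfrm)  # numeric on the left-hand side
print(c(bad = bad, good = good))
```

A factor is fine as a predictor on the right-hand side; it is only as the response that lm() cannot use it.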

Hi again

On Fri, Oct 8, 2010 at 1:13 PM, David Winsemius <dwinsemius at comcast.net> wrote:

> The question that should have been posed was how did you use lm()?
>
> It could be that the inappropriate use of a factor as a dependent variable
> was the cause. Did you use something like:
>
> lm(island_id ~ refseq_better, data=dfrm)  # ??

My usage of lm() was as follows:

lmout <- lm(refseq_proportion~fiveprime,data=props_df)

On Oct 8, 2010, at 1:17 PM, Alison Callahan wrote:

> My usage of lm() was as follows:
>
> lmout <- lm(refseq_proportion~fiveprime,data=props_df)

I think all the relevant questions (and answers) are on the table, which means the final question is: what does this return?

is.factor(props_df$refseq_proportion)  # I predict TRUE, i.e. it is a factor

--
David.

David Winsemius, MD
West Hartford, CT
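
If is.factor() does return TRUE, the usual remedy is to convert the factor back to numbers through its labels rather than its codes. The sketch below rebuilds the six posted rows as a hypothetical props_df, with refseq_proportion deliberately stored as a factor (an assumption, since the thread does not show the answer), then converts it and refits the model.

```r
# Hypothetical reconstruction of the posted subset; refseq_proportion
# is ASSUMED to have ended up as a factor.
props_df <- data.frame(
  island_id = c("1a", "1c", "1d", "2b", "2d", "3a"),
  refseq_proportion = factor(c("0.508771929824561", "0.3",
                               "0.28695652173913", "0.261904761904762",
                               "0.333333333333333", "0.312056737588652")),
  fiveprime = c(11127, 6303879, 7001283, 4381375, 5203929, 28792)
)

# Pitfall: as.numeric() on a factor returns the integer level codes,
# not the values printed in the column.
codes <- as.numeric(props_df$refseq_proportion)

# Convert via the labels instead. An equivalent form is
# as.numeric(levels(f))[f].
props_df$refseq_proportion <-
  as.numeric(as.character(props_df$refseq_proportion))

# With a numeric response, the original call can be fitted.
lmout <- lm(refseq_proportion ~ fiveprime, data = props_df)
print(coef(lmout))
```

If the conversion produces NA values with a warning, some entry in the column is not a valid number, which would also explain why the column became a factor when the data were read in.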