Sammy Zee
2011-Nov-12 19:47 UTC
[R] With an example - Re: rbind.data.frame drops attributes for factor variables
When I use rbind() or rbind.data.frame() to add a row to an existing data frame, it appears that custom attributes on columns of type "factor" are dropped. See the example below to reproduce the problem. Please suggest how I can fix this.

Thanks,
Sammy

a = c("Male", "Male", "Female", "Male")
b = c(1, 2, 3, 4)
c = c("great", "bad", "good", "bad")
dataset <- data.frame(gender = a, count = b, answer = c)

> dataset
  gender count answer
1   Male     1  great
2   Male     2    bad
3 Female     3   good
4   Male     4    bad

> attributes(dataset$answer)
$levels
[1] "bad"   "good"  "great"

$class
[1] "factor"

Now adding some custom attributes to column dataset$answer:

attributes(dataset$answer) <- c(attributes(dataset$answer), list(newattr1 = "custom-attr1"))
attributes(dataset$answer) <- c(attributes(dataset$answer), list(newattr2 = "custom-attr2"))

> attributes(dataset$answer)
$levels
[1] "bad"   "good"  "great"

$class
[1] "factor"

$newattr1
[1] "custom-attr1"

$newattr2
[1] "custom-attr2"

However, as soon as I add a row to this data frame ("dataset") with rbind(), it loses the custom attributes ("newattr1" and "newattr2") I have just added:

newrow = c(gender = "Female", count = 5, answer = "great")
dataset <- rbind(dataset, newrow)

> attributes(dataset$answer)
$levels
[1] "bad"   "good"  "great"

$class
[1] "factor"

The two custom attributes are dropped! Any suggestion why this is happening?

On Fri, Nov 11, 2011 at 11:44 AM, Jeff Newmiller <jdnewmil@dcn.davis.ca.us> wrote:

> As the doctor says, if it hurts "don't do that".
>
> A factor is a sequence of integers with a corresponding list of character
> strings. Factors in two separate vectors can and usually do map the same
> integer to different strings, and R cannot tell how you want that resolved.
>
> Convert these columns to character before combining them, and only convert
> to factor when you have all of your possibilities present (or you specify
> them in the creation of the factor vector).
> ---------------------------------------------------------------------------
> Jeff Newmiller
> DCN: <jdnewmil@dcn.davis.ca.us>
> Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> Sammy Zee <szee2007@gmail.com> wrote:
>
> >Hi all,
> >
> >When I use rbind() or rbind.data.frame() to add a row to an existing
> >dataframe, it appears that attributes for the column of type "factor" are
> >dropped. I see the following post with the same problem. However, I did not
> >see any reply to the following posting offering a solution. Could someone
> >please help.
> >
> >http://r.789695.n4.nabble.com/rbind-data-frame-drops-attributes-for-factor-variables-td919575.html
> >
> >Thanks,
> >Sammy
> >
> >______________________________________________
> >R-help@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
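Jeff Newmiller's advice quoted above (combine the columns as character, convert to factor only at the end) might be sketched as follows. This is only an illustration under assumptions: the object names newrow and combined are made up here, stringsAsFactors = FALSE is written out explicitly (it is the default from R 4.0.0 on, but was not in 2011), and the new row is built as a one-row data frame because c() would coerce the mixed values to a single character vector.

```r
# Sketch: keep text columns as character while combining rows.
dataset <- data.frame(
  gender = c("Male", "Male", "Female", "Male"),
  count  = c(1, 2, 3, 4),
  answer = c("great", "bad", "good", "bad"),
  stringsAsFactors = FALSE
)

# A one-row data frame keeps 'count' numeric; c() would turn it into "5".
newrow <- data.frame(gender = "Female", count = 5, answer = "great",
                     stringsAsFactors = FALSE)

combined <- rbind(dataset, newrow)

# Convert to factor only once all values are present.
# Levels for 'answer' are given explicitly, as the advice suggests.
combined$answer <- factor(combined$answer, levels = c("bad", "good", "great"))
combined$gender <- factor(combined$gender)
```

Any custom attributes would then be attached after this final factor() call, so there is nothing for rbind() to drop.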
David Winsemius
2011-Nov-12 23:17 UTC
[R] With an example - Re: rbind.data.frame drops attributes for factor variables
On Nov 12, 2011, at 2:47 PM, Sammy Zee wrote:

> When I use rbind() or rbind.data.frame() to add a row to an existing
> dataframe, it appears that attributes for the column of type "factor" are
> dropped. See the sample example below to reproduce the problem. Please
> suggest How I can fix this.
>
> Thanks,
> Sammy
>
> a=c("Male", "Male", "Female", "Male")
> b=c(1,2,3,4)
> c=c("great", "bad", "good", "bad")
> dataset<- data.frame (gender = a, count = b, answer = c)
>
>> dataset
>
>   gender count answer
> 1   Male     1  great
> 2   Male     2    bad
> 3 Female     3   good
> 4   Male     4    bad
>
>> attributes(dataset$answer)
> $levels
> [1] "bad"   "good"  "great"
>
> $class
> [1] "factor"
>
> Now adding some custom attributes to column dataset$answer
>
> attributes(dataset$answer)<-c(attributes(dataset$answer),list(newattr1="custom-attr1"))
> attributes(dataset$answer)<-c(attributes(dataset$answer),list(newattr2="custom-attr2"))

If you look through the code of rbind.data.frame you see that column values are processed with the 'factor' function.

> attributes(dataset$answer)
$levels
[1] "bad"   "good"  "great"

$class
[1] "factor"

$newattr1
[1] "custom-attr1"

$newattr2
[1] "custom-attr2"

> attributes(factor(dataset$answer))
$levels
[1] "bad"   "good"  "great"

$class
[1] "factor"

So I think you are out of luck. You will need to restore the "special attributes" yourself.

--
David.
David Winsemius, MD
West Hartford, CT
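The suggestion in the reply above, restoring the custom attributes yourself, could look roughly like the sketch below. It is not code from the thread: the helper name rbind_keep_attrs is hypothetical, the new row is passed as a one-row data frame rather than a c() vector, and stringsAsFactors = TRUE is spelled out because R 4.0.0 changed the data.frame() default to FALSE.

```r
# Hypothetical helper: rbind() two data frames, then copy each column's
# custom attributes from the first data frame onto the result.
rbind_keep_attrs <- function(df, newrow) {
  # Keep only custom attributes; 'levels', 'class' and 'names' are left
  # for rbind() to rebuild.
  saved <- lapply(df, function(col) {
    a <- attributes(col)
    a[setdiff(names(a), c("levels", "class", "names"))]
  })

  out <- rbind(df, newrow)

  # Re-attach the saved attributes column by column.
  for (nm in names(saved)) {
    for (an in names(saved[[nm]])) {
      attr(out[[nm]], an) <- saved[[nm]][[an]]
    }
  }
  out
}

dataset <- data.frame(
  gender = c("Male", "Male", "Female", "Male"),
  count  = c(1, 2, 3, 4),
  answer = c("great", "bad", "good", "bad"),
  stringsAsFactors = TRUE
)
attr(dataset$answer, "newattr1") <- "custom-attr1"
attr(dataset$answer, "newattr2") <- "custom-attr2"

newrow <- data.frame(gender = "Female", count = 5, answer = "great",
                     stringsAsFactors = TRUE)

result <- rbind_keep_attrs(dataset, newrow)
attributes(result$answer)
```

The helper copies attributes verbatim, so it only suits attributes that describe the column as a whole; an attribute holding one value per row would no longer match the new length.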