Sammy Zee
2011-Nov-12 19:47 UTC
[R] With an example - Re: rbind.data.frame drops attributes for factor variables
When I use rbind() or rbind.data.frame() to add a row to an existing
data frame, it appears that the attributes of columns of type "factor"
are dropped. See the example below to reproduce the problem. Please
suggest how I can fix this.
Thanks,
Sammy
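a <- c("Male", "Male", "Female", "Male")
b <- c(1, 2, 3, 4)
c <- c("great", "bad", "good", "bad")
# stringsAsFactors = TRUE is needed on R >= 4.0.0 to get factor columns
dataset <- data.frame(gender = a, count = b, answer = c, stringsAsFactors = TRUE)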
> dataset
  gender count answer
1   Male     1  great
2   Male     2    bad
3 Female     3   good
4   Male     4    bad
> attributes(dataset$answer)
$levels
[1] "bad" "good" "great"
$class
[1] "factor"
Now adding some custom attributes to column dataset$answer
attributes(dataset$answer)<-c(attributes(dataset$answer),list(newattr1="custom-attr1"))
attributes(dataset$answer)<-c(attributes(dataset$answer),list(newattr2="custom-attr2"))
> attributes(dataset$answer)
$levels
[1] "bad" "good" "great"
$class
[1] "factor"
$newattr1
[1] "custom-attr1"
$newattr2
[1] "custom-attr2"
However, as soon as I add a row to this data frame ("dataset") with
rbind(), it loses the custom attributes ("newattr1" and "newattr2")
I have just added:
newrow = c(gender="Female", count = 5, answer = "great")
dataset <- rbind(dataset, newrow)
> attributes(dataset$answer)
$levels
[1] "bad" "good" "great"
$class
[1] "factor"
The two custom attributes are dropped! Any suggestions as to why this is
happening?
On Fri, Nov 11, 2011 at 11:44 AM, Jeff Newmiller <jdnewmil@dcn.davis.ca.us> wrote:
> As the doctor says, if it hurts "don't do that".
>
> A factor is a sequence of integers with a corresponding list of character
> strings. Factors in two separate vectors can and usually do map the same
> integer to different strings, and R cannot tell how you want that resolved.
>
> Convert these columns to character before combining them, and only convert
> to factor when you have all of your possibilities present (or you specify
> them in the creation of the factor vector).
>
> Sammy Zee <szee2007@gmail.com> wrote:
>
> >Hi all,
> >
> >When I use rbind() or rbind.data.frame() to add a row to an existing
> >dataframe, it appears that attributes for the column of type "factor"
> >are dropped. I see the following post with same problem. However i did
> >not see any reply to the following posting offering a solution. Could
> >someone please help.
> >
> >http://r.789695.n4.nabble.com/rbind-data-frame-drops-attributes-for-factor-variables-td919575.html
> >
> >Thanks,
> >Sammy
> >
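The advice quoted above (convert to character, combine, then build the
factor once all values are known) can be sketched as follows. This is only
an illustration: the data frames d1 and d2 and the level "awful" are made
up for the example and do not come from the thread, and the sketch deals
with factor levels only, not with custom attributes.

```r
# Two data frames whose factor columns have different level sets.
d1 <- data.frame(answer = factor(c("great", "bad")))
d2 <- data.frame(answer = factor(c("good", "awful")))

# Convert to character before combining ...
d1$answer <- as.character(d1$answer)
d2$answer <- as.character(d2$answer)
combined <- rbind(d1, d2)

# ... and build the factor only once all values are present, naming the
# levels explicitly so the integer codes are unambiguous.
combined$answer <- factor(combined$answer,
                          levels = c("awful", "bad", "good", "great"))
levels(combined$answer)
```

Specifying the levels explicitly is one possible choice; calling factor()
without a levels argument would instead derive them from the sorted unique
values.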
David Winsemius
2011-Nov-12 23:17 UTC
[R] With an example - Re: rbind.data.frame drops attributes for factor variables
On Nov 12, 2011, at 2:47 PM, Sammy Zee wrote:

> Now adding some custom attributes to column dataset$answer
>
> attributes(dataset$answer)<-c(attributes(dataset$answer),list(newattr1="custom-attr1"))
> attributes(dataset$answer)<-c(attributes(dataset$answer),list(newattr2="custom-attr2"))

If you look through the code of rbind.data.frame you see that column
values are processed with the 'factor' function.

> attributes(dataset$answer)
$levels
[1] "bad" "good" "great"

$class
[1] "factor"

$newattr1
[1] "custom-attr1"

$newattr2
[1] "custom-attr2"

> attributes(factor(dataset$answer))
$levels
[1] "bad" "good" "great"

$class
[1] "factor"

So I think you are out of luck. You will need to restore the "special
attributes" yourself.

--
David.
David Winsemius, MD
West Hartford, CT
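One way to "restore the special attributes yourself", as suggested in the
reply above, is to save the non-standard attributes of each column before
calling rbind() and copy them back afterwards. The sketch below is a
minimal illustration under stated assumptions: rbind_keep_attrs is a
hypothetical helper name, not a base R function, and treating "levels",
"class" and "names" as the only standard attributes is an assumption that
may need adjusting for other column types. newrow is built as a one-row
data frame so that count stays numeric; this differs from the original
post, which used a character vector.

```r
# Hypothetical helper (not part of base R): rbind a row, then copy back any
# non-standard column attributes that were present before combining.
rbind_keep_attrs <- function(df, newrow) {
  standard <- c("levels", "class", "names")  # assumed "standard" set
  # Remember the custom attributes of every column before combining.
  saved <- lapply(df, function(col) {
    a <- attributes(col)
    a[setdiff(names(a), standard)]
  })
  out <- rbind(df, newrow)
  # Re-attach them one by one; levels and class stay as rbind() set them.
  for (nm in names(saved)) {
    for (an in names(saved[[nm]])) {
      attr(out[[nm]], an) <- saved[[nm]][[an]]
    }
  }
  out
}

dataset <- data.frame(gender = c("Male", "Male", "Female", "Male"),
                      count  = c(1, 2, 3, 4),
                      answer = c("great", "bad", "good", "bad"),
                      stringsAsFactors = TRUE)
attr(dataset$answer, "newattr1") <- "custom-attr1"
attr(dataset$answer, "newattr2") <- "custom-attr2"

newrow <- data.frame(gender = "Female", count = 5, answer = "great")
dataset <- rbind_keep_attrs(dataset, newrow)
names(attributes(dataset$answer))
```

The helper copies attributes from the first argument only; attributes
carried by the new row are ignored, which seems reasonable when appending
rows to an existing data frame but is a design choice, not a requirement.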