drflxms
2011-Dec-29 09:44 UTC
[R] sorting a data.frame (df) by a vector (which is not contained in the df) - unexpected behaviour of match and factor
Dear R colleagues,
consider my data.frame named "df" with 3 columns - being level,
prevalence and sensitivity - and 7 rows of data (see dump below).
df <-
structure(list(level = structure(1:7, .Label = c("0", "1", "10",
"100", "1010", "11", "110"), class = "factor"), prevalence = structure(c(4L,
2L, 3L, 5L, 6L, 1L, 7L), .Label = c("0.488", "0.5", "0.754",
"0.788", "0.803", "0.887", "0.905"), class = "factor"), sensitivity = structure(c(6L,
1L, 5L, 4L, 3L, 2L, 1L), .Label = c("0", "0.05", "0.091", "0.123",
"0.327", "0.933"), class = "factor")), .Names = c("level", "prevalence",
"sensitivity"), class = "data.frame", row.names = c(NA, -7L))
I'd like to order df by a vector which is NOT contained in the
data.frame. Let's call this vector desiredOrder (see dump below).
desiredOrder <- c("0", "1", "10", "100", "11", "110", "1010")
So after sorting, the level column (df$level) should be in
the order of the vector desiredOrder (as well as the associated data in
the other columns).
I know that this is not an easy task to achieve by order(...) as the
order of desiredOrder isn't a natural one. But I would expect both of
the following to work:
## using match
df[match(df$level,desiredOrder),]
## using factor
df[factor(df$level,levels=desiredOrder),]
Unfortunately the result isn't what I expected: I get a data.frame with
the level column in the order 0,1,10,100,110,1010,11 instead of the
order in desiredOrder (0,1,10,100,11,110,1010).
Does anybody see what I am doing wrong?
I'd appreciate any kind of help very much!
Best regards, Felix
Jeff Newmiller
2011-Dec-29 09:58 UTC
[R] sorting a data.frame (df) by a vector (which is not contained in the df) - unexpected behaviour of match and factor
Your desiredOrder vector is a vector of strings. Convert it to numeric and it
should work.
Berend Hasselman
2011-Dec-29 10:21 UTC
[R] sorting a data.frame (df) by a vector (which is not contained in the df) - unexpected behaviour of match and factor
drflxms wrote
> Does anybody see, what I am doing wrong?

Try this:

df[match(desiredOrder,df$level),]

Berend
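
For readers of the archive, here is a minimal sketch of why the argument order in match() matters. match(x, table) returns, for each element of x, its position in table. Row indices for df therefore have to be positions in df$level, which means desiredOrder must be the first argument. Indexing with a factor appears to fail for the same reason as the first attempt: the factor's integer codes are used as row numbers. The data frame below is a simplified stand-in for the one in the question (row_id is a hypothetical helper column, not part of the original data), and the order(factor(...)) variant is one possible alternative, not something proposed in the thread.

```r
# Simplified stand-in for the data frame in the question.
# row_id is a hypothetical column added only to make the reordering visible.
df <- data.frame(
  level = factor(c("0", "1", "10", "100", "1010", "11", "110"),
                 levels = c("0", "1", "10", "100", "1010", "11", "110")),
  row_id = 1:7
)
desiredOrder <- c("0", "1", "10", "100", "11", "110", "1010")

# Attempt from the question: positions of df$level inside desiredOrder.
# These are ranks, not row numbers, so using them as a row index scrambles the rows.
idx_wrong <- match(df$level, desiredOrder)
idx_wrong                               # 1 2 3 4 7 5 6
as.character(df[idx_wrong, "level"])    # "0" "1" "10" "100" "110" "1010" "11"

# Suggested fix: positions of desiredOrder inside df$level.
# These are the row numbers to pick, in the desired sequence.
idx_right <- match(desiredOrder, df$level)
idx_right                               # 1 2 3 4 6 7 5
sorted_match <- df[idx_right, ]
as.character(sorted_match$level)        # "0" "1" "10" "100" "11" "110" "1010"

# Alternative sketch: re-level the factor, then sort with order().
# order() turns the ranks into row numbers, which was the missing step above.
sorted_factor <- df[order(factor(df$level, levels = desiredOrder)), ]
as.character(sorted_factor$level)       # "0" "1" "10" "100" "11" "110" "1010"
```

The order() variant also copes with repeated values in df$level, whereas match(desiredOrder, df$level) returns only the first matching row for each entry of desiredOrder.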