On Sun, 22-Mar-2015 at 08:06AM -0800, John Kane wrote:

|> Well, first off, you have no variable called "Name". You have lost
|> the state names as they are rownames in the matrix state.x77 and
|> not a variable.
|>
|> Try this. It's ugly and I have no idea why I had to do a cbind()

You don't have to use cbind

|> but it seems to work. Personally I find subset easier to read than
|> the indexing approach.

|> state <- rownames(state.x77)
|> all.states <- as.data.frame(state.x77)
|> all.states <- cbind(state, all.states) ### ?????

You don't have to use cbind()

all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))

but I think cbind is simpler to read.

|>
|> coldstates <- subset(all.states, all.states$Frost > 50,
|>                      select = c("state","Frost") )

Tidier, even more so than subset():

require(dplyr)
coldstates <- all.states %>% filter(Frost > 150) %>% select(state, Frost)

Or, easier to see what's happening:

coldstates <- all.states %>%
  filter(Frost > 150) %>%
  select(state, Frost)

|>
|> John Kane
|> Kingston ON Canada
|>
|> > -----Original Message-----
|> > From: yoursurrogategod at gmail.com
|> > Sent: Sun, 22 Mar 2015 10:39:03 -0400
|> > To: r-help at r-project.org
|> > Subject: [R] Why can't I access this type?
|> >
|> > Hi, I'm just learning my way around R. I got a bunch of states and would
|> > like to access to get all of the ones where it's cold. But when I do the
|> > following, I will get the following error:
|> >
|> >> all.states <- as.data.frame(state.x77)
|> >> cold.states <- all.states[all.states$Frost > 150, c("Name", "Frost")]
|> > Error in `[.data.frame`(all.states, all.states$Frost > 150, c("Name", :
|> > undefined columns selected
|> >
|> > I don't get it. When I look at all.states, this is what I see:
|> >
|> >> str(all.states)
|> > 'data.frame': 50 obs. of 8 variables:
|> >  $ Population: num 3615 365 2212 2110 21198 ...
|> >  $ Income    : num 3624 6315 4530 3378 5114 ...
|> >  $ Illiteracy: num 2.1 1.5 1.8 1.9 1.1 0.7 1.1 0.9 1.3 2 ...
|> >  $ Life Exp  : num 69 69.3 70.5 70.7 71.7 ...
|> >  $ Murder    : num 15.1 11.3 7.8 10.1 10.3 6.8 3.1 6.2 10.7 13.9 ...
|> >  $ HS Grad   : num 41.3 66.7 58.1 39.9 62.6 63.9 56 54.6 52.6 40.6 ...
|> >  $ Frost     : num 20 152 15 65 20 166 139 103 11 60 ...
|> >  $ Area      : num 50708 566432 113417 51945 156361 ...
|> >
|> > What am I messing up?
|> >
|> > [[alternative HTML version deleted]]
|> >
|> > ______________________________________________
|> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
|> > https://stat.ethz.ch/mailman/listinfo/r-help
|> > PLEASE do read the posting guide
|> > http://www.R-project.org/posting-guide.html
|> > and provide commented, minimal, self-contained, reproducible code.

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___    Patrick Connolly
 {~._.~}                   Great minds discuss ideas
 _( Y )_                 Average minds discuss events
(:_~*~_:)                  Small minds discuss people
 (_)-(_)                              ..... Eleanor Roosevelt
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
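The fix discussed in this message can be sketched with base R only. This is a minimal illustration rather than any poster's exact code: it stores the row names in a column called `state` (the name John Kane used above) and then picks out the cold states by indexing and by `subset()`. The object names `cold.idx` and `cold.sub` are made up for the example.

```r
# state.x77 ships with R (package 'datasets') as a numeric matrix.
# The state names are its row names, so as.data.frame() does not
# turn them into a column.
all.states <- as.data.frame(state.x77)
all.states$state <- rownames(state.x77)  # add the names as an ordinary column

# Indexing approach: logical row selector, character column selector
cold.idx <- all.states[all.states$Frost > 150, c("state", "Frost")]

# subset() approach: columns can be referred to by bare name
cold.sub <- subset(all.states, Frost > 150, select = c(state, Frost))

print(cold.idx)
```

Both approaches pick the same rows here; which one reads better is the matter of taste debated in the thread.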
On 2015-03-25 09:40, Patrick Connolly wrote:
> On Sun, 22-Mar-2015 at 08:06AM -0800, John Kane wrote:
>
> |> Well, first off, you have no variable called "Name". You have lost
> |> the state names as they are rownames in the matrix state.x77 and
> |> not a variable.
> |>
> |> Try this. It's ugly and I have no idea why I had to do a cbind()
>
> You don't have to use cbind
>
> |> but it seems to work. Personally I find subset easier to read than
> |> the indexing approach.
>
> |> state <- rownames(state.x77)
> |> all.states <- as.data.frame(state.x77)
> |> all.states <- cbind(state, all.states) ### ?????
>
> You don't have to use cbind()
>
> all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))
>
> but I think cbind is simpler to read.
>
> |>
> |> coldstates <- subset(all.states, all.states$Frost > 50,
> |>                      select = c("state","Frost") )

I find the indexing approach

coldstates <- all.states[all.states$Frost > 150, c("state","Frost")]

to be the most direct and obvious solution.

> Tidier, even more so than subset():
>
> require(dplyr)
> coldstates <- all.states %>% filter(Frost > 150) %>% select(state, Frost)
>
> Or, easier to see what's happening:
>
> coldstates <- all.states %>%
>   filter(Frost > 150) %>%
>   select(state, Frost)

Well...  Opinions may perhaps differ, but apart from '%>%' being
butt-ugly it's also fairly slow:

> library("microbenchmark")
> microbenchmark(
+   subset(all.states, all.states$Frost > 150, select = c("state","Frost")),
+   all.states[all.states$Frost > 150, c("state","Frost")],
+   all.states %>% filter(Frost > 150) %>% select(state, Frost),
+   times = 1000L
+ )
Unit: microseconds
                                                                      expr
 subset(all.states, all.states$Frost > 150, select = c("state", "Frost"))
                  all.states[all.states$Frost > 150, c("state", "Frost")]
              all.states %>% filter(Frost > 150) %>% select(state, Frost)
      min       lq      mean    median        uq      max neval cld
  139.112  148.673  163.3960  159.1760  170.7895 1763.200  1000  b
  104.039  111.973  127.2138  120.4395  128.6640 1381.809  1000 a
 1010.076 1033.519 1133.1469 1107.8480 1175.1800 2932.206  1000   c

Of course, this doesn't matter for interactive one-off use.  But
lately I've seen examples of the '%>%' operator creeping into
functions in packages.  However, it would be nice to see a fast pipe
operator as part of base R.


Henric Winell
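A comparison along the lines of the benchmark above can be roughed out without the microbenchmark package. The sketch below is an assumption-laden stand-in, not the code that produced the numbers in this message: `time_reps` is a hypothetical helper built on base R's `system.time()`, which is much coarser than microbenchmark, and the dplyr variant is timed only if dplyr happens to be installed.

```r
all.states <- as.data.frame(state.x77)
all.states$state <- rownames(state.x77)

# Hypothetical helper: total elapsed seconds for n repetitions of f().
time_reps <- function(f, n = 1000L) {
  system.time(for (i in seq_len(n)) f())[["elapsed"]]
}

timings <- c(
  subset = time_reps(function()
    subset(all.states, Frost > 150, select = c(state, Frost))),
  index = time_reps(function()
    all.states[all.states$Frost > 150, c("state", "Frost")])
)

# dplyr is optional here; skip the pipe variant if it is not installed.
if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  timings["dplyr"] <- time_reps(function()
    all.states %>% filter(Frost > 150) %>% select(state, Frost))
}

print(timings)  # seconds per 1000 repetitions; values depend on the machine
```

Absolute values will differ from the table above, since they depend on hardware, R version and package versions; only the relative ordering is of interest.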
Patrick Connolly
2015-Mar-26 06:48 UTC
[R] Using and abusing %>% (was Re: Why can't I access this type?)
On Wed, 25-Mar-2015 at 03:14PM +0100, Henric Winell wrote:

...

|> Well...  Opinions may perhaps differ, but apart from '%>%' being
|> butt-ugly it's also fairly slow:

Beauty, it is said, is in the eye of the beholder.  I'm impressed by
the way using %>% reduces or eliminates complicated nested brackets.

In this tiny example it's not obvious, but it becomes very clear when
the objective is to sort the dataframe by three or four columns, do
various lots of aggregation, and then return a largish number of
consecutive columns, omitting the rest.  It's very easy to see what's
going on without the need for intermediate objects.

|>
|> .....
|> Unit: microseconds
|>                                                                      expr
|> subset(all.states, all.states$Frost > 150, select = c("state", "Frost"))
|>                  all.states[all.states$Frost > 150, c("state", "Frost")]
|>              all.states %>% filter(Frost > 150) %>% select(state, Frost)
|>      min       lq      mean    median        uq      max neval cld
|>  139.112  148.673  163.3960  159.1760  170.7895 1763.200  1000  b
|>  104.039  111.973  127.2138  120.4395  128.6640 1381.809  1000 a
|> 1010.076 1033.519 1133.1469 1107.8480 1175.1800 2932.206  1000   c

It's no surprise that instructing a computer in something closer to
human language is an order of magnitude slower.  I'm sure you'd get
something even quicker using machine code.

I spend 3 or 4 orders of magnitude more time writing code than running
it.  It's much more important to me to be able to read and modify the
code than it is to have it run at optimum speed.

|>
|> Of course, this doesn't matter for interactive one-off use.  But
|> lately I've seen examples of the '%>%' operator creeping into
|> functions in packages.

That could indicate that %>% is seductively easy to use.  It's
probably true that there are places where it should be done the hard
way.

|> However, it would be nice to see a fast pipe operator as part of
|> base R.
|>
|>
|> Henric Winell
|>

-- 
Patrick Connolly
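On the wish for a pipe in base R: R 4.1.0 and later provide a native pipe operator, written `|>` (not to be confused with the `|>` quote marker used in this thread). A minimal sketch, assuming R >= 4.1.0 and reusing the `state` column name from earlier in the thread:

```r
# Requires R >= 4.1.0, where the native pipe |> was introduced.
all.states <- as.data.frame(state.x77)
all.states$state <- rownames(state.x77)

# The left-hand side becomes the first argument of the call on the right.
coldstates <- all.states |>
  subset(Frost > 150, select = c(state, Frost))

# The parser rewrites the pipe into an ordinary nested call, so there is
# no additional function call at run time.
quote(x |> f(y))
# f(x, y)
```

Because the rewriting happens at parse time, the native pipe avoids the run-time overhead of an operator implemented as a function, which is the cost measured for `%>%` above.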
> On Mar 25, 2015, at 10:14, Henric Winell <nilsson.henric at gmail.com> wrote:
>
>> On 2015-03-25 09:40, Patrick Connolly wrote:
>>
>> On Sun, 22-Mar-2015 at 08:06AM -0800, John Kane wrote:
>>
>> |> Well, first off, you have no variable called "Name". You have lost
>> |> the state names as they are rownames in the matrix state.x77 and
>> |> not a variable.
>> |>
>> |> Try this. It's ugly and I have no idea why I had to do a cbind()
>>
>> You don't have to use cbind
>>
>> |> but it seems to work. Personally I find subset easier to read than
>> |> the indexing approach.
>>
>> |> state <- rownames(state.x77)
>> |> all.states <- as.data.frame(state.x77)
>> |> all.states <- cbind(state, all.states) ### ?????
>>
>> You don't have to use cbind()
>>
>> all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))
>>
>> but I think cbind is simpler to read.
>>
>> |>
>> |> coldstates <- subset(all.states, all.states$Frost > 50,
>> |>                      select = c("state","Frost") )
>
> I find the indexing approach
>
> coldstates <- all.states[all.states$Frost > 150, c("state","Frost")]
>
> to be the most direct and obvious solution.

I agree with you on the indexing approach. But even after using within, I still get the same error.
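The failing code is not shown in this message, so the cause can only be guessed at. One possibility (an assumption) is that the column is still being requested as "Name": the `within()` call quoted above creates a column called `state`, so asking for "Name" raises the same error as before. A small sketch for checking this, where `res` is just an illustrative name:

```r
all.states <- within(as.data.frame(state.x77), state <- rownames(state.x77))

names(all.states)              # lists the columns; the new one is "state"
"Name" %in% names(all.states)  # FALSE: there is still no column "Name"

# Requesting a column that does not exist reproduces the error;
# tryCatch() captures the message instead of stopping.
res <- tryCatch(
  all.states[all.states$Frost > 150, c("Name", "Frost")],
  error = function(e) conditionMessage(e)
)
res

# Using the column name that actually exists works.
cold.states <- all.states[all.states$Frost > 150, c("state", "Frost")]
```

If `names(all.states)` does not show `state` at all, the result of `within()` was probably not assigned back to `all.states`.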