Thanks for all the answers. I think ggplot2 also requires data.frames. If you want to add a variable to a data.frame, you have to use attach and detach, right? Are there any more links that discuss those two different approaches?

Alex

On Wednesday, September 14, 2016 5:34 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> This is partially a matter of subjective opinion, and so pointless; but
> I would point out that data frames are the canonical structure for a
> great many of R's modeling and graphics functions, e.g. lm, xyplot,
> etc.
>
> As for mutate() etc., that's about UIs and user friendliness, and
> imho my ho is meaningless.
>
> Best,
> Bert
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
>
> On Wed, Sep 14, 2016 at 6:01 AM, Alaios via R-help <r-help at r-project.org> wrote:
>> Hi all,
>> I have seen data.frames and operations such as mutate() (from the
>> dplyr package) getting really popular. Over the last few years I have
>> been using lists extensively. Is there any reason not to use lists,
>> and to use other data types for data manipulation and storage instead?
>> Is there any article that describes their differences?
>> Thank you in advance for your reply.
>> Regards,
>> Alex
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
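As a hedged illustration of the two points in this exchange (a sketch in base R only; the data and the names `dat`, `y` and `x_sq` are made up for the example): a column can be added by plain assignment, with no attach()/detach() involved, and the resulting data frame can be handed to a modeling function such as lm() through its `data` argument. transform() is shown as a base-R counterpart to the mutate() style of adding columns.

```r
# Hypothetical data; the names dat, x, y and x_sq are illustrative only.
dat <- data.frame(x = 1:10)

# Add a variable by plain assignment (no attach/detach needed).
dat$y <- 2 * dat$x + 1

# transform() adds or changes columns in one call, in the spirit of mutate().
dat <- transform(dat, x_sq = x^2)

# Modeling functions take the data frame through the data argument.
fit <- lm(y ~ x, data = dat)
coefs <- coef(fit)   # intercept and slope of the fitted line
```

Because `y` was built as an exact linear function of `x`, the fitted intercept and slope simply recover the values used to construct it.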
> If you want to add variable to data.frame you have to use attach,
> detach. Right?

Not quite. Use it like a list to add a variable to a data.frame, e.g.

    df = list()
    df$var1 = 1:10             # add a component to a plain list
    df = as.data.frame(df)     # the list becomes a one-column data.frame
    df$var2 = 1:10             # add a column with $
    df[["var3"]] = 1:10        # add a column with [[ ]]
    df
    df = as.list(df)           # back to a list
    df$var4 = 1:10
    as.data.frame(df)

Ironically, the primary reason to use a data.frame, in my head, is to signal that you are thinking of your data as row-oriented tabular storage. "Ironic" because, technically, that is not a requirement for being a data.frame, but when I reflect on the typical way a seasoned R programmer approaches lists and data.frames, that is basically what they are communicating.

I was going to post that a reason to use data.frames is to take advantage of the optimizations and syntactic sugar for data.frames, but in reality, if code does not assume a row-oriented structure in a data.frame, there is not much I can think of in the way of optimization. For example, we could point to subset() and say that it is a reason to use data.frames rather than lists, but that only works if you use the data.frame in the conventional way.

In the end, my advice is: if it is a table, make it a data.frame; if it is not easily thought of as a table or row-oriented data structure, keep it as a list.

Thanks,
Jeremiah
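The "table versus not a table" advice can be sketched as follows. This is only an illustration under assumed data: `measurements`, `ragged` and the other names are hypothetical and do not come from the thread.

```r
# A table: every column has one value per row, so a data frame fits.
measurements <- data.frame(
  id    = 1:4,
  group = c("a", "a", "b", "b"),
  value = c(2.5, 3.1, 4.0, 4.4)
)
high <- subset(measurements, value > 3)   # keeps the rows with value > 3

# Not a table: components differ in length and meaning, so a list fits.
ragged <- list(
  a    = c(2.5, 3.1),
  b    = c(4.0, 4.4, 5.2),
  note = "b has an extra reading"
)
component_lengths <- lengths(ragged)

# A data frame is still a list underneath, so list tools work on its columns.
column_classes <- sapply(measurements, class)
```

subset() relies on the rows lining up across columns, which is why it makes sense for `measurements` but has no natural meaning for `ragged`.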
There is also this syntax for adding variables:

    df[, "var5"] = 1:10

and the syntactic sugar for row-oriented storage:

    df[1:5, ]

On Wed, Sep 14, 2016 at 11:40 AM, jeremiah rounds <roundsjeremiah at gmail.com> wrote:
> Not quite. Use it like a list to add a variable to a data.frame
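A small sketch of the matrix-style indexing mentioned above, under made-up data (the values and the names `first_five`, `big`, `one_col` and `one_col_df` are illustrative assumptions):

```r
df <- data.frame(var1 = 1:10)

# New column via matrix-style indexing.
df[, "var5"] <- (1:10)^2

# Row-oriented selection: first five rows, all columns.
first_five <- df[1:5, ]

# Rows that satisfy a condition.
big <- df[df$var5 > 50, ]

# Selecting a single column this way returns a plain vector by default;
# drop = FALSE keeps the result as a data frame.
one_col    <- df[, "var5"]
one_col_df <- df[, "var5", drop = FALSE]
```

The `drop = FALSE` detail is worth knowing because code that expects a data frame can otherwise receive a vector when only one column is selected.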
On 14/09/2016 2:40 PM, jeremiah rounds wrote:
> Ironically, the primary reason to use a data.frame, in my head, is to
> signal that you are thinking of your data as row-oriented tabular
> storage. "Ironic" because, technically, that is not a requirement for
> being a data.frame, but when I reflect on the typical way a seasoned R
> programmer approaches lists and data.frames, that is basically what
> they are communicating.

I believe it is intended to be a requirement. You can construct things with class "data.frame" that don't have that structure, but lots of stuff will go wrong if you do.

Duncan Murdoch
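To make this point concrete, here is a minimal sketch (the object `bad` and the other names are hypothetical). The usual constructor refuses columns of unequal length, while bypassing it with structure() produces an object of class "data.frame" that is internally inconsistent. Which downstream functions then misbehave is not shown here.

```r
# The constructor enforces the tabular shape: unequal column lengths
# are rejected with an error.
res <- tryCatch(data.frame(a = 1:3, b = 1:2), error = function(e) e)
constructor_refused <- inherits(res, "error")

# Bypassing the constructor yields an object with class "data.frame"
# whose claimed row count disagrees with one of its columns.
bad <- structure(list(a = 1:3, b = 1:2),
                 class = "data.frame",
                 row.names = c(NA, -3L))
claimed_rows <- nrow(bad)        # taken from the row.names attribute
actual_b     <- length(bad$b)    # actual length of column b
```

Any code that trusts nrow() to describe every column is working from a false premise with an object like `bad`.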
> If you want to add variable to data.frame you have to use attach,
> detach. Right?

I'd have said "not at all", not "not quite". attach and detach have almost exactly nothing to do with adding to a data frame. You can add to a data frame using

    dfrm$newvar <- <something>
    dfrm['newvar'] <- <something>
    cbind(dfrm, newvar=<something>)   # adds a new variable called 'newvar'
    rbind                             # to add rows
    merge                             # to add columns and/or rows from another data frame

... and a few other things.

The only relevance of attach/detach is to do with the behaviour of attached objects, not to do with adding to data frames. If you have attach()ed something, changing the original object does not automatically update the copy of its variables on the search path, or vice versa, because attach(), as documented, creates a _copy_. So _if_ you have attach()ed a data frame (or a list), you can't change the copy by changing the original object, and you can't change the original object by changing the copy. Only if you need to change both do you need to detach and reattach.

As a rule, I generally avoid attach() for that and other reasons (most of which are listed in ?attach). attach() is only sensible if you have already completed all the manipulation needed on the attached object first. Even then, using with() is safer.

S Ellison
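The placeholders above can be filled in with a concrete sketch. All data and names here (`dfrm`, `other`, `height`, and so on) are made up for illustration. The second half shows the copy behaviour of attach() described in the post, and with() as the alternative.

```r
dfrm  <- data.frame(id = 1:3, height = c(150, 160, 170))
other <- data.frame(id = c(1L, 3L), site = c("north", "south"))

# Adding columns: none of these involve attach() or detach().
dfrm$weight <- c(55, 62, 70)                          # with $
dfrm["age"] <- c(31, 45, 28)                          # with [
dfrm <- cbind(dfrm, smoker = c(FALSE, TRUE, FALSE))   # with cbind()

# Adding a row.
dfrm <- rbind(dfrm, data.frame(id = 4L, height = 180, weight = 80,
                               age = 52, smoker = FALSE))

# Adding columns from another data frame, matched on id.
merged <- merge(dfrm, other, by = "id")

# attach() works on a copy: changing dfrm afterwards does not change
# what the bare name 'height' refers to.
attach(dfrm)
dfrm$height <- dfrm$height / 100    # original now in metres
attached_height <- height           # attached copy is still in centimetres
detach(dfrm)

# with() evaluates against the current contents of dfrm instead.
mean_height <- with(dfrm, mean(height))
```

With merge() used as above, only the ids present in both data frames are kept, so `merged` has two rows.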
On 15/09/16 14:04, S Ellison wrote:
> As a rule, I generally avoid attach() for that and other reasons
> (most of which are listed in ?attach). attach() is only sensible if
> you have already completed all the manipulation needed on the
> attached object first. Even then, using with() is safer.

Extremely well and clearly put. This is one of those "I wish *I* had said that!" posts.

cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276