thr3ads.net - R help - [R] Splitting a dataframe by character vector [Apr 2012]

Ben Neal

2012-Apr-21 01:52 UTC

[R] Splitting a dataframe by character vector

I am just trying to split a dataframe of 750 observations of 29 variables by
"Site", which is a vector in the dataframe with five text names (ex.
PtaCaracol). I just want to generate summary statistics for my other variables
for each site individually.

I know this should be simple, and I did read up on options, choosing to use
"subset", but I am honestly confounded that this does not result in a
new dataframe split by this factor. I tried "subset" with an argument
based on my various other numeric vectors, and it worked just fine. My very
simple code is below, and thank you for any responses. I apologize for asking
such a simple question, but I wrote this out exactly as I found it in an exactly
similar question, which indicated this should work. Now I am confused! Thank you
for any assistance. Ben Neal

# Load data from CSV file
Cover = read.csv ("/Users/benjaminneal/Documents/1110_Panama/Transect series /1210_BocasTransectSummary.csv",
                  header=T)
# Divide dataframe by Site names
Site1 <- subset(Cover, Site = "PtaCaracol")

PS The csv loads fine, gives normal summary stats, and does not seem to be the
issue. I thought about just renaming by hand all the sites in the file (i.e.
"PtaCaracol" = 1 and so on), but this seems like a weak solution.

Rolf Turner

2012-Apr-21 04:16 UTC


On 21/04/12 13:52, Ben Neal wrote:
> I am just trying to split a dataframe of 750 observations of 29 variables
> by "Site", which is a vector in the dataframe with five text names
> (ex. PtaCaracol). I just want to generate summary statistics for my other
> variables for each site individually.
>
> I know this should be simple, and I did read up on options, choosing to use
> "subset", but I am honestly confounded that this does not result in a
> new dataframe split by this factor. I tried "subset" with an argument
> based on my various other numeric vectors, and it worked just fine. My very
> simple code is below, and thank you for any responses. I apologize for asking
> such a simple question, but I wrote this out exactly as found it in an exactly
> similar question, which indicated this should work. Now I am confused! Thank you
> for any assistance. Ben Neal
>
> # Load data from CSV file
> Cover = read.csv ("/Users/benjaminneal/Documents/1110_Panama/Transect series /1210_BocasTransectSummary.csv",
>                   header=T)
> # Divide dataframe by Site names
> Site1 <- subset(Cover, Site = "PtaCaracol")
>
> PS The csv loads fine, gives normal summary stats, and does not seem to be
> the issue. I thought about just renaming by hand all the sites in the file (i.e.
> "PtaCaracol"=1 . . but this seems like a weak solution).

What about a ***reproducible*** example?

What isn't working?  What (if any) error message did you get?

I am at least 99% confident that numeric versus character values in
Cover$Site is completely irrelevant.

My ***guess*** (and it can only be a guess, given the vagueness and
lack of detail in your question) is that you need a double ("logical")
equals sign.  I.e. I suspect that

Site1<- subset(Cover, Site == "PtaCaracol")

will work.

That being said --- since you want to *split* your data frame by "Site" ---
why the <expletive deleted> don't you use split()???

E.g.:

     Splitz <- split(Cover,Cover$Site)
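
For readers without the original CSV, here is a minimal sketch of what split() gives back. The data frame below is a made-up stand-in: every site name other than "PtaCaracol" and the Coral column are assumptions for illustration only, not the poster's data.

```r
# Toy stand-in for the real data; "SiteB", "SiteC" and the Coral column
# are invented for illustration.
Cover <- data.frame(
  Site  = c("PtaCaracol", "PtaCaracol", "SiteB", "SiteB", "SiteC"),
  Coral = c(10, 20, 30, 50, 70)
)

# split() returns a named list with one data frame per distinct Site value
Splitz <- split(Cover, Cover$Site)

names(Splitz)             # one list element per site
nrow(Splitz$PtaCaracol)   # rows belonging to PtaCaracol only
```

Each element of the list is an ordinary data frame, so anything that works on Cover should also work on Splitz$PtaCaracol.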

     cheers,

         Rolf Turner

Ben Neal

2012-Apr-21 11:50 UTC


Thank you Jorge. 

I did get subset to work on a character vector last night, the only difference
being I used two equals signs:

(not working) > Site1 <- subset(Cover, Site = "PtaCaracol")
(working)     > Site1 <- subset(Cover, Site == "PtaCaracol")

I am unsure why this is, but it worked. I believe this was the error, causing my
returned dataframes to contain all the original entries.

Thanks again; I will explore the plyr package as well. Ben 


-----Original Message-----
From: Jorge I Velez [mailto:jorgeivanvelez at gmail.com]
Sent: Fri 4/20/2012 8:17 PM
To: Ben Neal
Subject: Re: [R] Splitting a dataframe by character vector
 
Dear Ben,

Check ?tapply, ?aggregate, ?ave for some ways to accomplish this using the
base package.  Also, check the plyr package at
http://plyr.had.co.nz/ along with the examples and accompanying paper.
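
As a rough illustration of the three base functions named above, the sketch below computes per-site means on a made-up data frame. The site names other than "PtaCaracol" and the Coral and Algae columns are assumptions, chosen only so the example runs on its own.

```r
# Toy stand-in for the real data; column names and values are invented.
Cover <- data.frame(
  Site  = c("PtaCaracol", "PtaCaracol", "SiteB", "SiteB", "SiteC"),
  Coral = c(10, 20, 30, 50, 70),
  Algae = c(1, 3, 5, 7, 9)
)

# tapply: one statistic of one variable per group (a named 1-d array)
site_means <- tapply(Cover$Coral, Cover$Site, mean)

# aggregate: the same statistic for several variables at once (a data frame)
site_table <- aggregate(cbind(Coral, Algae) ~ Site, data = Cover, FUN = mean)

# ave: the group statistic repeated on every row, aligned with the original rows
Cover$CoralSiteMean <- ave(Cover$Coral, Cover$Site, FUN = mean)

print(site_means)
print(site_table)
```

Roughly speaking, tapply suits a single variable, aggregate suits a table of results, and ave suits cases where the per-site value has to sit next to each observation.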

HTH,
Jorge.-


S Ellison

2012-Apr-23 10:19 UTC


> -----Original Message-----
> I am just trying to split a dataframe of 750 observations of
> 29 variables by "Site", which is a vector in the dataframe
> with five text names (ex. PtaCaracol).

A couple of methods.

i) First, look up ?split, which chops your data frame into a list of five data
frames. Then use lapply on the list to get a list of summaries, or sapply to get
something that will (if the results of your summary are a simple number or
vector) look like an array.

ii) look up ?by, which will do something like lapply (and returns a list with
extra features) and will print a tidier result.
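
The two methods above could look something like the sketch below. It runs on a made-up data frame: the site names other than "PtaCaracol" and the Coral and Algae columns are assumptions for illustration, and mean() and colMeans() stand in for whatever summary is actually wanted.

```r
# Toy stand-in for the real data; column names and values are invented.
Cover <- data.frame(
  Site  = c("PtaCaracol", "PtaCaracol", "SiteB", "SiteB", "SiteC"),
  Coral = c(10, 20, 30, 50, 70),
  Algae = c(1, 3, 5, 7, 9)
)

# i) split into a list of per-site data frames, then summarise each one
by_site <- split(Cover, Cover$Site)
site_summaries <- lapply(by_site, summary)                   # list of summary tables
coral_means <- sapply(by_site, function(d) mean(d$Coral))    # named numeric vector

# ii) by() does the split-and-apply in one call and prints a labelled result
by_result <- by(Cover[, c("Coral", "Algae")], Cover$Site, colMeans)
print(by_result)
```

sapply only simplifies when every site returns a result of the same shape, which is why the full summary() tables are left in a list here.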

> I tried "subset" ....
> # Divide dataframe by Site names
> Site1 <- subset(Cover, Site = "PtaCaracol")

Check your syntax again; that should have been

 Site1 <- subset(Cover, Site == "PtaCaracol")

Inside a function call, '=' binds a value to a named argument (elsewhere it
is an assignment operator); it is not the equality test that subset would be
looking for.
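
A small sketch of the difference, on a made-up data frame (the site names other than "PtaCaracol" and the Coral column are assumptions). As far as I can tell, with a single '=' the call passes an argument named Site that subset() for data frames never uses, so no row condition is supplied and every row is returned, which would match the behaviour reported earlier in the thread.

```r
# Toy stand-in for the real data; column names and values are invented.
Cover <- data.frame(
  Site  = c("PtaCaracol", "PtaCaracol", "SiteB", "SiteB", "SiteC"),
  Coral = c(10, 20, 30, 50, 70)
)

# Single '=': Site is passed as an unused named argument, no row filter is
# supplied, so all rows come back.
all_rows <- subset(Cover, Site = "PtaCaracol")

# Double '==': a logical test evaluated per row, so only matching rows are kept.
one_site <- subset(Cover, Site == "PtaCaracol")

nrow(all_rows)   # same as nrow(Cover)
nrow(one_site)   # only the PtaCaracol rows
```

Note that the single-'=' version raises no error or warning, which is what makes the mistake easy to miss.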

Steve E
