Marco Visser
2005-Nov-27 17:49 UTC
[R] Counting the occurence of each unique "charecter string"
LS,

I would really like to know how to count the frequency/occurrence of character strings inside a dataset. I am working with extremely large datasets of forest inventory data containing a large variety of different species.

Each row in the dataframe represents one individual tree, and the simplified dataframe looks something like this:

num species dbh
1   sp1     30
2   sp1     20
3   sp2     30
4   sp1     40

I need to be able to count the number of individuals per species, so I need a command that will return, for each unique species, its number of occurrences in the dataframe:

[sp1] 3
[sp2] 1

After a long search through help.search() and the web I found very little, and any alternative such as exporting the dataset to another program (Excel) is not really an option because the dataset is far too large.

I am using R 2.2.0 on Windows, and if anyone knows a solution please help!

Many sincere thanks in advance,

Marco
ronggui
2005-Nov-27 18:02 UTC
[R] Counting the occurence of each unique "charecter string"
Use table() to get what you want. See ?table.
Chuck Cleland
2005-Nov-27 18:06 UTC
[R] Counting the occurence of each unique "charecter string"
?table

table(mydata$species)
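As a minimal sketch of the call above, assuming the data frame is named mydata (as in this reply) and rebuilding the four example rows from the original question:

```r
# Rebuild the four-row example from the question.
# The name `mydata` follows the reply above; the values are the poster's sample rows.
mydata <- data.frame(
  num     = 1:4,
  species = c("sp1", "sp1", "sp2", "sp1"),
  dbh     = c(30, 20, 30, 40)
)

# Count the number of rows (individual trees) per unique species
counts <- table(mydata$species)
print(counts)

# A single count can be looked up by species name
counts[["sp1"]]
```

table() makes one pass over the column, so nothing has to be exported to another program to get the per-species counts.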
Ted Harding
2005-Nov-27 18:27 UTC
[R] Counting the occurence of each unique "charecter string"
Does the following help? (It uses an artificial example a bit more complicated than yours.) The dataframe "trees" consists of a list of species names under "Species", and values of a numeric variable under "X".

> trees
            Species   X
1     Larix decidua 203
2  Pinus sylvestris 303
3     Larix decidua 202
4  Pinus sylvestris 301
5       Picea abies 102
6       Picea abies 103
7  Pinus sylvestris 302
8       Picea abies 101
9     Larix decidua 201
10      Picea abies 104
11      Picea abies 105
12 Pinus sylvestris 304

> freqs <- as.data.frame(table(trees$Species))
> colnames(freqs) <- c("Species", "Counts")
> freqs
           Species Counts
1    Larix decidua      3
2      Picea abies      5
3 Pinus sylvestris      4

> mean(freqs$Counts)
[1] 4
> sd(freqs$Counts)
[1] 1

Just using table() would give you the same information, but converting it to a dataframe makes that information more readily accessible by familiar methods.

Hoping this helps,
Ted.
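To illustrate what "more readily accessible by familiar methods" can mean in practice, here is a rough sketch that recreates the example and then sorts and filters the counts with ordinary data frame indexing. The names freqs_sorted and common, and the threshold of 4, are chosen only for illustration.

```r
# Recreate the example data frame from this post
trees <- data.frame(
  Species = c("Larix decidua", "Pinus sylvestris", "Larix decidua",
              "Pinus sylvestris", "Picea abies", "Picea abies",
              "Pinus sylvestris", "Picea abies", "Larix decidua",
              "Picea abies", "Picea abies", "Pinus sylvestris"),
  X = c(203, 303, 202, 301, 102, 103, 302, 101, 201, 104, 105, 304)
)

# Frequency table converted to a data frame, as in the post
freqs <- as.data.frame(table(trees$Species))
colnames(freqs) <- c("Species", "Counts")

# Order species from most to least common (hypothetical helper name)
freqs_sorted <- freqs[order(freqs$Counts, decreasing = TRUE), ]
print(freqs_sorted)

# Keep only species with at least 4 individuals (arbitrary illustrative threshold)
common <- freqs[freqs$Counts >= 4, ]
print(common)
```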
Gabor Grothendieck
2005-Nov-27 19:16 UTC
[R] Counting the occurence of each unique "charecter string"
Or, using the iris dataset that comes with R and making use of as.data.frame.table, we can shorten Ted's code slightly to just:

as.data.frame.table(table(Species = iris$Species), responseName = "Count")

Incidentally, I just noticed that there is an inconsistency between as.data.frame and as.data.frame.table. It makes it impossible to shorten as.data.frame.table to as.data.frame in the above, because the responseName= argument is not referenced in the generic.

> args(as.data.frame)
function (x, row.names = NULL, optional = FALSE)
NULL
> args(as.data.frame.table)
function (x, row.names = NULL, optional = FALSE, responseName = "Freq")
NULL
> R.version.string # Windows
[1] "R version 2.2.0, 2005-10-24"
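For reference, a small sketch of what the one-liner returns. It relies only on the built-in iris data; the variable name counts is introduced here for illustration.

```r
# iris ships with R; its Species column is a factor with three levels
counts <- as.data.frame.table(table(Species = iris$Species),
                              responseName = "Count")

# Naming the table dimension (Species = ...) and passing responseName
# sets both column names in one call, so no colnames<- step is needed
print(counts)
names(counts)
```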