Please let me know if this is or is not the right place to ask these types of questions.

Warning: I am new to R by two days.

I have a simple dataset. I have loaded the dataset successfully using the following code:

    filepath = "C:\\temp\\pilot\\dataset1.txt"
    Pilot = read.table(filepath, header=TRUE)

Dataset1.txt is delimited and looks like this:

    Date        illness  count
    2006/01/01  derm     17
    2006/01/01  derm     35
    2006/01/02  derm     24
    2006/01/02  derm     80
    .
    .
    .

There are approximately 18,000 records like this in total.

I would like to use the aggregate function to sum the count by date and illness, so it should look like this after the aggregate:

    2006/01/01  derm     52
    2006/01/02  derm     104
    .
    .
    .

After the derm records, the illness changes to fever with the same pattern. I would like to aggregate each illness by date in the same fashion.

A nudge in the right direction would be appreciated.

Thanks.

Ken Hall
Computer Scientist
Division of Healthcare Information (DHI) (proposed)
Public Health Surveillance Program Office (proposed)
Office of Surveillance, Epidemiology, & Laboratory Services (OSELS) (proposed)
Centers for Disease Control & Prevention (CDC)
kha6@cdc.gov
Mobile: 404-993-3311
Office: 404-498-6839
Ken -

Try

    aggregate(Pilot$count, list(Date=Pilot$Date, illness=Pilot$illness), sum)

If you don't want to keep typing "Pilot", use

    with(Pilot, aggregate(count, list(Date=Date, illness=illness), sum))

Notice that the aggregated variable will be called "x" in the output data frame from aggregate.

- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spector at stat.berkeley.edu

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
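A minimal sketch of the aggregate() call suggested above, run on a toy data frame that mirrors the sample rows from the original post. The single "fever" row is made up purely for illustration, and renaming the "x" column to "count" is an optional extra step, not part of the original suggestion.

```r
# Toy data mirroring the posted sample; the fever row is hypothetical.
Pilot <- data.frame(
  Date    = c("2006/01/01", "2006/01/01", "2006/01/02", "2006/01/02", "2006/01/01"),
  illness = c("derm", "derm", "derm", "derm", "fever"),
  count   = c(17, 35, 24, 80, 5)
)

# Sum count within each Date/illness combination.
res <- aggregate(Pilot$count,
                 list(Date = Pilot$Date, illness = Pilot$illness),
                 sum)

# The summed column is called "x" by default; rename it if desired.
names(res)[names(res) == "x"] <- "count"
print(res)
```

Only the Date/illness combinations that actually occur in the data appear in the result.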
On Aug 19, 2010, at 4:45 PM, Hall, Ken (CDC/OSELS/NCPHI) wrote:

> I would like to use the aggregate function to sum the count by similar
> date and illness, so it should look like this after the aggregate

Perhaps:

    with(Pilot, tapply(count, list(Date, illness), sum, na.rm=TRUE))

If you need it as a dataframe, then pass the result to:

    ?as.data.frame.table

> 2006/01/01 derm 52
> 2006/01/02 derm 104
> .
> .
> .
>
> And, the illness changes to fever with the same pattern.

Don't understand what that means.

> I would like to
> aggregate the same illnesses by date in the same fashion.

I thought that was what you asked for above.

> A nudge in the right direction would be appreciated.

Always interested in helping the CDC but I think you may need to be more expansive in your problem descriptions.

David Winsemius, MD
West Hartford, CT
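The tapply() route can be sketched as follows, assuming a data frame with columns named Date, illness and count as in the posted sample. The fever row is hypothetical, and naming the list elements and dropping the empty cells are illustrative choices rather than part of the suggestion above.

```r
# Toy data mirroring the posted sample; the fever row is hypothetical.
Pilot <- data.frame(
  Date    = c("2006/01/01", "2006/01/01", "2006/01/02", "2006/01/02", "2006/01/01"),
  illness = c("derm", "derm", "derm", "derm", "fever"),
  count   = c(17, 35, 24, 80, 5)
)

# tapply() returns a Date-by-illness matrix of sums; combinations that
# never occur in the data show up as NA cells.
tab <- with(Pilot, tapply(count, list(Date = Date, illness = illness),
                          sum, na.rm = TRUE))
print(tab)

# Convert the matrix to long format, one row per cell.
long <- as.data.frame.table(tab, responseName = "count")

# Drop the cells that had no records.
long <- long[!is.na(long$count), ]
print(long)
```

Unlike aggregate(), tapply() produces a cell for every Date/illness pair, which is why the empty cells are filtered out after conversion.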
On Thu, Aug 19, 2010 at 4:45 PM, Hall, Ken (CDC/OSELS/NCPHI) <kha6 at cdc.gov> wrote:
> I would like to use the aggregate function to sum the count by similar
> date and illness, so it should look like this after the aggregate
>
> 2006/01/01  derm  52
> 2006/01/02  derm  104

Try:

    aggregate(Pilot[3], Pilot[1:2], sum)

or

    aggregate(count ~ ., Pilot, sum)
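A short sketch comparing the two forms above on toy data. It assumes the columns are in the order Date, illness, count, which is what Pilot[3] and Pilot[1:2] rely on; the fever row is hypothetical.

```r
# Toy data mirroring the posted sample; the fever row is hypothetical.
Pilot <- data.frame(
  Date    = c("2006/01/01", "2006/01/01", "2006/01/02", "2006/01/02", "2006/01/01"),
  illness = c("derm", "derm", "derm", "derm", "fever"),
  count   = c(17, 35, 24, 80, 5)
)

# Positional form: column 3 is summed, columns 1 and 2 define the groups.
by_index <- aggregate(Pilot[3], Pilot[1:2], sum)

# Formula form: count is summed, "." means "group by all other columns".
by_formula <- aggregate(count ~ ., Pilot, sum)

print(by_index)
print(by_formula)
```

Both forms keep the original column name "count" in the result. The formula method drops rows containing NA by default (na.action = na.omit), which may matter on real data.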
David pointed out that it would have helped to include the results of str(Pilot) in my original post. I ran str(Pilot) and, sure enough, it revealed the problem (my error): the column name should be "date1". Also, the editor in Outlook capitalized "Date", so that was another problem.

Now all the suggestions from all three posts for this aggregation work perfectly.

Thanks to all.

Ken

-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: Friday, August 20, 2010 11:12 AM
To: Hall, Ken (CDC/OSELS/NCPHI)
Subject: Re: [R] Aggregate Help

On Aug 20, 2010, at 10:42 AM, Hall, Ken (CDC/OSELS/NCPHI) wrote:

> David,
>
> Thanks for your quick response.
>
> I received the following error message with
>
>     with(Pilot, tapply(count, list(Date, illness), sum, na.rm=TRUE))
>
>     Error in unique.default(x) : unique() applies only to vectors

When I post replies that cannot possibly have been tested due to lack of a reproducible example, I try to remember to preface my offerings with "Perhaps" because of the many potential pitfalls in offering code when you cannot see the structure of the objects. Had you offered the results of str(Pilot), it might have been clearer what the nature of the columns in that data.frame actually were.

If you execute traceback() immediately after an error you can (sometimes) determine what the problem was.

Best I can offer without looking at the data.

Regards;
David.

David Winsemius, MD
West Hartford, CT
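A hedged sketch of how str() exposes this kind of naming problem. It assumes the file's header really was "date1 illness count", as Ken reports, and it guesses that the failing call used the lowercase name `date`; that guess is not confirmed by the thread. Under that assumption, R finds the base function date() instead of a column, which is one way to get the "unique() applies only to vectors" error.

```r
# Inline stand-in for dataset1.txt, using the column name Ken reports.
Pilot <- read.table(header = TRUE, text = "
date1 illness count
2006/01/01 derm 17
2006/01/01 derm 35
2006/01/02 derm 24
2006/01/02 derm 80
")

# str() lists the actual column names and types.
str(Pilot)

# Assumption: grouping by `date` (not a column) picks up base::date,
# a function, which cannot be turned into a grouping factor.
err <- tryCatch(
  with(Pilot, tapply(count, list(date, illness), sum, na.rm = TRUE)),
  error = function(e) e
)
print(conditionMessage(err))

# With the real column name the call works.
fixed <- with(Pilot, tapply(count, list(date1, illness), sum, na.rm = TRUE))
print(fixed)
```

Checking names(Pilot) or str(Pilot) before aggregating is a cheap way to catch a mismatch between the names used in code and those in the file header.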