I have a table that looks like this:

measurement   date   door     color
0.93529385    513    open     red
0.97419293    420    open     red
0.962053514   513    closed   red
0.963909937   1230   open     blue
0.97652034    1230   open     green
0.989310795   1230   closed   blue
0.9941022     917    closed   yellow

I would like to create a table that has: Open measurement, Closed measurement, date, color. For every date/color combination, there should be two columns to represent the door open/closed measurement.

If there are multiple datapoints with a given door/date/color combination, then they should be averaged. I would also like to make two columns to represent the number of datapoints that were averaged in determining the open/closed measurements.

Jeffrey
Install and load the "plyr" package and try something like:

> ddply(d, .(date, color), summarize,
+       meanOpen=mean(measurement[door=="open"]), nOpen=sum(door=="open"),
+       meanClosed=mean(measurement[door=="closed"]), nClosed=sum(door=="closed"))
  date  color  meanOpen nOpen meanClosed nClosed
1  420    red 0.9741929     1        NaN       0
2  513    red 0.9352938     1  0.9620535       1
3  917 yellow       NaN     0  0.9941022       1
4 1230   blue 0.9639099     1  0.9893108       1
5 1230  green 0.9765203     1        NaN       0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jeffrey Joh
> Sent: Monday, February 06, 2012 4:28 PM
> To: r-help at r-project.org
> Subject: [R] Table rearranging
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
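For readers who want to run this without plyr, here is a minimal sketch that rebuilds the posted table as a data frame and computes the same per-group summary in base R. The object name d comes from the ddply() call; the column types, the names groups and summary_tbl, and the split()/lapply() approach are assumptions made for illustration and are not part of Bill's reply. Row order may differ from the ddply() output.

```r
# Rebuild the posted table; 'd' is the object name used in the ddply() call.
# Column types are an assumption (date kept as a plain number).
d <- data.frame(
  measurement = c(0.93529385, 0.97419293, 0.962053514, 0.963909937,
                  0.97652034, 0.989310795, 0.9941022),
  date  = c(513, 420, 513, 1230, 1230, 1230, 917),
  door  = c("open", "open", "closed", "open", "open", "closed", "closed"),
  color = c("red", "red", "red", "blue", "green", "blue", "yellow"),
  stringsAsFactors = FALSE
)

# Split the rows into date/color groups, dropping combinations with no rows.
groups <- split(d, list(d$date, d$color), drop = TRUE)

# Summarise each group: mean and count for open and for closed rows.
# mean() of an empty vector is NaN, which marks a missing open/closed side.
summary_tbl <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    date       = g$date[1],
    color      = g$color[1],
    meanOpen   = mean(g$measurement[g$door == "open"]),
    nOpen      = sum(g$door == "open"),
    meanClosed = mean(g$measurement[g$door == "closed"]),
    nClosed    = sum(g$door == "closed"),
    stringsAsFactors = FALSE
  )
}))
rownames(summary_tbl) <- NULL
print(summary_tbl)
```

Groups with only one door state show NaN for the missing mean and 0 for its count, as in the ddply() output.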
Thank you for your help, Bill.

From the original table (not the plyr output), I would like to remove all the lines that do not have a corresponding open/closed measurement. For example, if there is a Closed yellow measurement on 0917, but not an Open yellow 0917 measurement, then the Closed yellow should be deleted.

How can I make this change?

Jeffrey

----------------------------------------
> From: wdunlap at tibco.com
> To: johjeffrey at hotmail.com; r-help at r-project.org
> Subject: RE: [R] Table rearranging
> Date: Tue, 7 Feb 2012 00:43:25 +0000
On Feb 7, 2012, at 4:21 AM, Jeffrey Joh wrote:

> Thank you for your help, Bill.
>
> From the original table (not the plyr output), I would like to
> remove all the lines that do not have a corresponding open/closed
> measurement. For example, if there is a Closed yellow measurement
> on 0917, but not an Open yellow 0917 measurement, then the Closed
> yellow should be deleted.
>
> How can I make this change?

In R you need to assign the result of a function to an object name, so your code would look like:

modified_data <- ddply(d, .(date, color), summarize,
                       meanClosed=mean(measurement[door=="closed"]),
                       nClosed=sum(door=="closed"))

-- 
David

David Winsemius, MD
West Hartford, CT
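One way a stored summary might be put to work on the row-removal question is sketched below in base R. This is an illustration under assumptions, not David's method: the names flags, counts, paired, and kept are hypothetical, and aggregate() plus merge() stand in for the plyr call. The idea is to save the per-group counts in an object, pick the date/color groups that have both door states, and merge those groups back against the original rows.

```r
# The seven-row table from the original post; 'd' matches the thread's name.
d <- data.frame(
  measurement = c(0.93529385, 0.97419293, 0.962053514, 0.963909937,
                  0.97652034, 0.989310795, 0.9941022),
  date  = c(513, 420, 513, 1230, 1230, 1230, 917),
  door  = c("open", "open", "closed", "open", "open", "closed", "closed"),
  color = c("red", "red", "red", "blue", "green", "blue", "yellow"),
  stringsAsFactors = FALSE
)

# Flag each row, then count open and closed rows per date/color group.
# Assigning the result to 'counts' is what makes it available afterwards.
flags  <- transform(d, nOpen = door == "open", nClosed = door == "closed")
counts <- aggregate(cbind(nOpen, nClosed) ~ date + color,
                    data = flags, FUN = sum)

# Groups that have at least one open and at least one closed measurement.
paired <- counts[counts$nOpen > 0 & counts$nClosed > 0, c("date", "color")]

# Keep only the original rows that belong to one of those groups.
kept <- merge(d, paired, by = c("date", "color"))
print(kept)
```

Note that merge() puts the key columns first and may reorder the rows, so kept is not in the original row order.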
Hi David,

I am not sure how ddply/summarize solves my issue. I have the following table:

ID   measurement   date   door     color
1    0.93529385    513    open     red
2    0.97419293    420    open     red
3    0.962053514   513    closed   red
4    0.963909937   1230   open     blue
5    0.97652034    1230   open     green
6    0.989310795   1230   closed   blue
7    0.9941022     917    closed   yellow
8    0.8945757     1230   open     blue

I only want to keep the lines that have corresponding open/closed measurements. For example, I want to keep lines 4, 6, and 8 because for the "1230 blue" condition, there exist both open and closed measurements.

However, the "513 red" condition has an open measurement, but no closed measurement. Therefore, line 1 should be deleted.

Jeffrey

----------------------------------------
> CC: r-help at r-project.org; wdunlap at tibco.com
> From: dwinsemius at comcast.net
> To: johjeffrey at hotmail.com
> Subject: Re: [R] Table rearranging
> Date: Tue, 7 Feb 2012 09:08:00 -0500
On Feb 7, 2012, at 8:43 PM, Jeffrey Joh wrote:

> Hi David, I am not sure how ddply/summarize solves my issue. I have
> the following table:
>
> ID   measurement   date   door     color
> 1    0.93529385    513    open     red
> 2    0.97419293    420    open     red
> 3    0.962053514   513    closed   red
> 4    0.963909937   1230   open     blue
> 5    0.97652034    1230   open     green
> 6    0.989310795   1230   closed   blue
> 7    0.9941022     917    closed   yellow
> 8    0.8945757     1230   open     blue
>
> I only want to keep the lines that have corresponding open/closed
> measurements. For example, I want to keep lines 4,6,8 because for
> the "1230 blue" condition, there exists both open and closed
> measurements.
>
> However, the "513 red" condition has an open measurement, but no
> closed measurement.

Huh? What about line 3?

-- 
David

> Therefore, line 1 should be deleted.
>
> Jeffrey

David Winsemius, MD
West Hartford, CT
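The thread stops before the filtering question gets a direct answer. A minimal base R sketch of the stated rule (keep a row only if its date/color group has at least one open and at least one closed measurement) follows. The use of ave() and the names n_open, n_closed, and kept are assumptions for illustration; nobody in the thread proposed this code.

```r
# The eight-row table from Jeffrey's second message.
d <- data.frame(
  ID = 1:8,
  measurement = c(0.93529385, 0.97419293, 0.962053514, 0.963909937,
                  0.97652034, 0.989310795, 0.9941022, 0.8945757),
  date  = c(513, 420, 513, 1230, 1230, 1230, 917, 1230),
  door  = c("open", "open", "closed", "open", "open", "closed",
            "closed", "open"),
  color = c("red", "red", "red", "blue", "green", "blue", "yellow", "blue"),
  stringsAsFactors = FALSE
)

# For every row, count the open and the closed rows in its date/color group.
# ave() returns a vector as long as its input, so the counts line up with d.
n_open   <- ave(as.numeric(d$door == "open"),   d$date, d$color, FUN = sum)
n_closed <- ave(as.numeric(d$door == "closed"), d$date, d$color, FUN = sum)

# Keep only rows whose group contains at least one of each door state.
kept <- d[n_open > 0 & n_closed > 0, ]
print(kept$ID)  # 1 3 4 6 8
```

With this rule, lines 1 and 3 are both kept, because the "513 red" group does have a closed measurement on line 3, which is the point of David's question. Lines 2, 5, and 7 are dropped because their groups have only one door state.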