Hello all,

I have not been able to find an answer to this problem. I feel like it might be so simple that it might not get a response.

Suppose I have a dataframe like the one I have copied below (minus the 'calib' column). I wish to create a column like 'calib', where I subtract the 'Count' for which 'stain' is 'none' from all other 'Count' data, separately for every value of 'rep'. This is sort of analogous to putting a $ in front of the number that identifies a cell in a spreadsheet environment. Specifically, I need something like this:

mydataframe$calib <- Count - (Count when stain = none for each value rep)

Any thoughts on how I might accomplish this?

Thanks in advance.

Sam

Note: I've already calculated the calib column in gnumeric for clarity.

rep  Count    stain    calib
1    1522     none     0
1    147      syto     -1375
1    544.8    sytolec  -977.2
1    2432.6   sytolec  910.6
1    234.6    sytolec  -1287.4
2    5699.8   none     0
2    265.6    syto     -5434.2
2    329.6    sytolec  -5370.2
2    383      sytolec  -5316.8
2    968.8    sytolec  -4731
3    2466.8   none     0
3    1303     syto     -1163.8
3    1290.6   sytolec  -1176.2
3    110.2    sytolec  -2356.6
3    15086.8  sytolec  12620

--
*****************************************************
Sam Albers
Geography Program
University of Northern British Columbia
3333 University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*****************************************************
Hello,

On Fri, Mar 12, 2010 at 3:27 PM, Sam Albers <tonightsthenight@gmail.com> wrote:
> Specifically I need some like this:
>
> mydataframe$calib <- Count - (Count when stain = none for each value rep)
>
> Any thoughts on how I might accomplish this?

Here's one way:

b <- a[(a$stain == "none"), "Count"]
a$calib <- a$Count - b[a$rep]

Note that it only works if the values of rep are integers starting with 1 and increasing sequentially (1, 2, 3, ...).

Jonathan

> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
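The restriction on rep in the lookup above comes from indexing b by position. A minimal sketch of a variant that looks the baseline up by value with match() instead, so rep need not be 1, 2, 3, ...; it still assumes exactly one 'none' row per rep. The data frame a is rebuilt here from the values in the original post, and the name base is only illustrative.

```r
# Rebuild the example data from the original post.
a <- data.frame(
  rep   = rep(1:3, each = 5),
  Count = c(1522, 147, 544.8, 2432.6, 234.6,
            5699.8, 265.6, 329.6, 383, 968.8,
            2466.8, 1303, 1290.6, 110.2, 15086.8),
  stain = rep(c("none", "syto", "sytolec", "sytolec", "sytolec"), times = 3),
  stringsAsFactors = FALSE
)

# One baseline row per rep: the rows where stain is "none".
# (Assumption: each rep has exactly one such row.)
base <- a[a$stain == "none", c("rep", "Count")]

# match() gives, for every row of a, the position of its rep in base$rep,
# so the baseline is found by value rather than by position.
a$calib <- a$Count - base$Count[match(a$rep, base$rep)]

print(a)
```

Because the lookup goes through match(), the same lines should also work when rep is a character label or when the rows are not sorted by rep.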
It depends how you stored your missing values. If they are listed as NAs, then

mydataframe$calib <- with(mydataframe, Count - is.na(stain) * Count)

or

mydataframe$calib <- with(mydataframe, Count * (1 - is.na(stain)))

The trick is that the logical value TRUE equals 1 in numeric calculations.

--
View this message in context: http://n4.nabble.com/Vertical-subtraction-in-dataframes-tp1591223p1591351.html
Sent from the R help mailing list archive at Nabble.com.
Hi:

Here's a ddply solution to your problem (package plyr):

library(plyr)

# within-group function to apply; assumes that 'none' is the first obs in each group
# x will be the Count variable in the call...
subt2 <- function(x) x - x[1]

head(df)
  rep  Count   stain
1   1 1522.0    none
2   1  147.0    syto
3   1  544.8 sytolec
4   1 2432.6 sytolec
5   1  234.6 sytolec
6   2 5699.8    none

ddply(df, .(rep), transform, calib = subt2(Count))
   rep   Count   stain   calib
1    1  1522.0    none     0.0
2    1   147.0    syto -1375.0
3    1   544.8 sytolec  -977.2
4    1  2432.6 sytolec   910.6
5    1   234.6 sytolec -1287.4
6    2  5699.8    none     0.0
7    2   265.6    syto -5434.2
8    2   329.6 sytolec -5370.2
9    2   383.0 sytolec -5316.8
10   2   968.8 sytolec -4731.0
11   3  2466.8    none     0.0
12   3  1303.0    syto -1163.8
13   3  1290.6 sytolec -1176.2
14   3   110.2 sytolec -2356.6
15   3 15086.8 sytolec 12620.0

HTH,
Dennis
On Mar 12, 2010, at 5:27 PM, Sam Albers wrote:
> Specifically I need some like this:
>
> mydataframe$calib <- Count - (Count when stain = none for each value rep)
>
> Any thoughts on how I might accomplish this?

This method does not depend on the ordering, which I believe both solutions so far require (but it may fail if more than one row satisfies the stain == "none" test). It is an example of what Spector calls split-apply-bind logic.
See below:

> dfrm$calib2 <- unlist( lapply(split(dfrm, dfrm$rep),
+      function(x) x$calib <- x$Count - x[x$stain == "none", "Count"]) )
> dfrm
   repp   Count   stain   calib  calib2
1     1  1522.0    none     0.0     0.0
2     1   147.0    syto -1375.0 -1375.0
3     1   544.8 sytolec  -977.2  -977.2
4     1  2432.6 sytolec   910.6   910.6
5     1   234.6 sytolec -1287.4 -1287.4
6     2  5699.8    none     0.0     0.0
7     2   265.6    syto -5434.2 -5434.2
8     2   329.6 sytolec -5370.2 -5370.2
9     2   383.0 sytolec -5316.8 -5316.8
10    2   968.8 sytolec -4731.0 -4731.0
11    3  2466.8    none     0.0     0.0
12    3  1303.0    syto -1163.8 -1163.8
13    3  1290.6 sytolec -1176.2 -1176.2
14    3   110.2 sytolec -2356.6 -2356.6
15    3 15086.8 sytolec 12620.0 12620.0
> dfrm[3,3] <- "none"
> dfrm$calib2 <- unlist( lapply(split(dfrm, dfrm$rep),
+      function(x) x$calib <- x$Count - x[x$stain == "none", "Count"]) )
Warning message:
In x$Count - x[x$stain == "none", "Count"] :
  longer object length is not a multiple of shorter object length

--
David Winsemius, MD
West Hartford, CT
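Two things seem worth guarding against with the split approach: a group with more than one "none" row (the warning shown in the transcript), and rows that are not already sorted by rep, since unlist() returns values in group order. The following is only a sketch under those assumptions: it checks the "none" count with tapply() and uses unsplit() to put results back in the original row order. The small data frame is a shuffled subset of the thread's values, and the names n_none and pieces are illustrative.

```r
# Rows deliberately not sorted by rep (a subset of the thread's values).
dfrm <- data.frame(
  rep   = c(2, 1, 1, 2, 1),
  Count = c(265.6, 1522, 147, 5699.8, 544.8),
  stain = c("syto", "none", "syto", "none", "sytolec"),
  stringsAsFactors = FALSE
)

# Guard: every rep must contain exactly one "none" row.
n_none <- tapply(dfrm$stain == "none", dfrm$rep, sum)
stopifnot(all(n_none == 1))

# Split by rep and subtract each group's baseline Count.
pieces <- lapply(split(dfrm, dfrm$rep),
                 function(x) x$Count - x$Count[x$stain == "none"])

# unsplit() reverses split(), returning values in the original row order.
dfrm$calib <- unsplit(pieces, dfrm$rep)

print(dfrm)
```

With the guard in place, a second "none" row in any rep stops the script with an error rather than producing a recycling warning.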
Try this ave solution. Summing Count * (stain == "none") over each group gives the Count in that group's stain == "none" row, and ave causes that Count value to be repeated for every row of the group, so we get a vector that can be subtracted from Count:

transform(DF, calib = Count - ave(Count * (stain == "none"), rep, FUN = sum))

and here is an SQL solution which works in roughly the same way:

library(sqldf)
sqldf("select rep, Count, stain, Count - none calib
       from DF natural join
         (select rep, sum(Count * (stain = 'none')) none
          from DF group by rep)")
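To make the ave reasoning easier to follow, here is a small sketch that computes the intermediate vectors one step at a time. It uses only the first few rows of the thread's data, and the names masked and baseline are illustrative, not part of the solution above.

```r
# First rows of the thread's data, enough to cover two groups.
DF <- data.frame(
  rep   = c(1, 1, 1, 2, 2),
  Count = c(1522, 147, 544.8, 5699.8, 265.6),
  stain = c("none", "syto", "sytolec", "none", "syto"),
  stringsAsFactors = FALSE
)

# Step 1: zero out every Count except those in "none" rows
# (in arithmetic, TRUE acts as 1 and FALSE as 0).
masked <- DF$Count * (DF$stain == "none")

# Step 2: ave() sums within each rep and repeats that sum on every row of the rep.
baseline <- ave(masked, DF$rep, FUN = sum)

# Step 3: subtract the per-row baseline from Count.
DF$calib <- DF$Count - baseline

print(cbind(DF, masked, baseline))
```

One caveat that follows from the sum: if a rep has two "none" rows, their Counts are added together silently, so it may be worth checking for that beforehand.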