Hello everybody,

I am beginning with loops and functions and would be glad to have help with the following question. If I have a data frame like this:

Site  Prof  H
1     1     24
1     1     16
1     1     67
1     2     23
1     2     56
1     2     45
2     1     67
2     1     46

I would like to create a new column that subtracts the minimum of H from H, but for S1 and P1 (Site 1, Prof 1), only the minimum of the data points falling into this category should be taken. So, for example, the first three numbers of the new column are 24-16, 16-16, 67-16, and the following numbers, referring to Site 1 and Prof 2, are 23-23, 56-23, 45-23.

I think that with two loops, one referring to the Site and the other to the Prof, it should be possible to create the new column automatically.

Thanks a lot for any help.

--
View this message in context: http://r.789695.n4.nabble.com/Simple-loop-tp3492819p3492819.html
Sent from the R help mailing list archive at Nabble.com.
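For reference, the two-loop idea described in the question could look roughly like the sketch below. The data frame name `dat` and the column name `Hadj` are assumptions made for illustration; neither appears in the original post.

```r
# Example data from the question (the name `dat` is an assumption).
dat <- data.frame(
  Site = c(1, 1, 1, 1, 1, 1, 2, 2),
  Prof = c(1, 1, 1, 2, 2, 2, 1, 1),
  H    = c(24, 16, 67, 23, 56, 45, 67, 46)
)

dat$Hadj <- NA_real_                              # new column, filled group by group
for (s in unique(dat$Site)) {                     # outer loop: sites
  for (p in unique(dat$Prof[dat$Site == s])) {    # inner loop: profiles present at this site
    rows <- dat$Site == s & dat$Prof == p         # rows belonging to this Site/Prof pair
    dat$Hadj[rows] <- dat$H[rows] - min(dat$H[rows])
  }
}
dat$Hadj
```

The inner loop only visits profiles that actually occur at the current site, so `min()` is never called on an empty group.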
Hi.

There is no need to do this in a for loop. Here is one approach:

    x <- read.table(textConnection("Site Prof H
    1 1 24
    1 1 16
    1 1 67
    1 2 23
    1 2 56
    1 2 45
    2 1 67
    2 1 46"), header = TRUE)
    closeAllConnections()
    x
    cbind(x, newCol = unlist(tapply(x[, 3], paste(x[, 1], x[, 2], sep = ""),
                                    function(x) x - min(x))))

        Site Prof  H newCol
    111    1    1 24      8
    112    1    1 16      0
    113    1    1 67     51
    121    1    2 23      0
    122    1    2 56     33
    123    1    2 45     22
    211    2    1 67     21
    212    2    1 46      0

Andrija

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
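One caveat about the tapply() approach may be worth illustrating: unlist(tapply(...)) returns values group by group, so it lines up with the rows only when the data are already ordered by group, as they are in the example. The sketch below uses a reordered copy of the data (the name `shuffled` and the row permutation are assumptions for illustration) and compares against ave(), which returns values in row order.

```r
x <- data.frame(
  Site = c(1, 1, 1, 1, 1, 1, 2, 2),
  Prof = c(1, 1, 1, 2, 2, 2, 1, 1),
  H    = c(24, 16, 67, 23, 56, 45, 67, 46)
)

# Same rows, but with the groups interleaved (hypothetical ordering).
shuffled <- x[c(4, 1, 7, 2, 5, 8, 3, 6), ]
key <- paste(shuffled$Site, shuffled$Prof, sep = "")

# Values come back in group order, not in row order.
by_tapply <- unname(unlist(tapply(shuffled$H, key, function(h) h - min(h))))

# ave() writes each group's result back into the original row positions.
aligned <- ave(shuffled$H, key, FUN = function(h) h - min(h))

cbind(shuffled, by_tapply, aligned)
```

On the sorted example data both columns agree; on the reordered copy only `aligned` matches each row's own group.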
It is actually possible and preferable to do this with no loops. Assuming your data is in a data frame called dat:

    idx <- with(dat, Site == 1 & Prof == 1)
    dat <- within(dat, {
        # rows in Site 1/Prof 1 use their own minimum; all other rows share one
        new = H - ifelse(Site == 1 & Prof == 1, min(H[idx]), min(H[!idx]))
    })
    dat

which also serves to illuminate the difference between with and within as a bonus.

HTH,
Jon

--
Jon Daily
Technician
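To make the with/within distinction mentioned above concrete, here is a minimal sketch. The column name `Hcentered` is a hypothetical example and not part of the original reply.

```r
dat <- data.frame(
  Site = c(1, 1, 1, 1, 1, 1, 2, 2),
  Prof = c(1, 1, 1, 2, 2, 2, 1, 1),
  H    = c(24, 16, 67, 23, 56, 45, 67, 46)
)

# with(): evaluates the expression using dat's columns and returns its value.
# Here the value is a logical vector; dat itself is not changed.
idx <- with(dat, Site == 1 & Prof == 1)

# within(): evaluates the expression the same way, but returns a modified
# copy of the data frame containing any variables created in the expression.
dat2 <- within(dat, Hcentered <- H - mean(H))

str(idx)
names(dat2)
```

In short, with() is for computing a value from the columns, while within() is for producing an updated data frame.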
Hi:

Here are two more candidates, using the packages plyr and data.table. Your toy data frame is called dd below.

    library(plyr)
    ddply(dd, .(Site, Prof), transform, Hadj = H - min(H))

      Site Prof  H Hadj
    1    1    1 24    8
    2    1    1 16    0
    3    1    1 67   51
    4    1    2 23    0
    5    1    2 56   33
    6    1    2 45   22
    7    2    1 67   21
    8    2    1 46    0

    library(data.table)
    dt <- data.table(dd, key = c("Site", "Prof"))
    dt[, list(H = H, Hadj = H - min(H)), by = c("Site", "Prof")]
    <same output as above>

HTH,
Dennis
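Both calls above follow a split-apply-combine pattern. As a rough illustration of that pattern only (this is not how plyr or data.table are implemented), the same steps can be written out in base R; the intermediate names `groups` and `res` are assumptions.

```r
dd <- data.frame(
  Site = c(1, 1, 1, 1, 1, 1, 2, 2),
  Prof = c(1, 1, 1, 2, 2, 2, 1, 1),
  H    = c(24, 16, 67, 23, 56, 45, 67, 46)
)

f <- list(dd$Site, dd$Prof)

# split: one small data frame per Site/Prof pair that actually occurs
groups <- split(dd, f, drop = TRUE)

# apply: add the adjusted column within each piece
groups <- lapply(groups, function(g) transform(g, Hadj = H - min(H)))

# combine: put the pieces back in the original row order
res <- unsplit(groups, f, drop = TRUE)
res
```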
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Petr Savicky
> Sent: Wednesday, May 04, 2011 12:51 AM
> To: r-help at r-project.org
> Subject: Re: [R] Simple loop
>
> On Tue, May 03, 2011 at 12:04:47PM -0700, William Dunlap wrote:
> [...]
> > ave() can deal with that problem:
> >     cbind(x, newCol2 = with(x, ave(H, Site, Prof,
> >         FUN=function(y)y-min(y))))
> >       Site Prof  H newCol2
> >     1    1    1 24       8
> >     2    1    1 16       0
> >     3    1    1 67      51
> >     4    1    2 23       0
> >     5    1    2 56      33
> >     6    1    2 45      22
> >     7    2    1 67      21
> >     8    2    1 46       0
> >     Warning message:
> >     In min(y) : no non-missing arguments to min; returning Inf
> > The warning is unfortunate: ave() calls FUN even when
> > there is no data for a particular group (Site=2, Prof=2 in this
> > case).
>
> The warning may be avoided using min(y, Inf) instead of min().

Yes, but the fact remains that ave() wastes time and causes unnecessary warnings and errors by calling FUN when it knows it will do nothing with the result (because there are no entries in x with a given combination of the factor levels in the ... arguments).

Using paste(Site,Prof) when calling ave() is ugly, in that it forces you to consider implementation details that you expect ave() to take care of (how does paste convert various types to strings?). It also courts errors, since paste("A B", "C") and paste("A", "B C") give the same result but represent different Site/Prof combinations.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>     cbind(x, newCol2 = with(x, ave(H, Site, Prof,
>         FUN=function(y)y-min(y,Inf))))
>
>       Site Prof  H newCol2
>     1    1    1 24       8
>     2    1    1 16       0
>     3    1    1 67      51
>     4    1    2 23       0
>     5    1    2 56      33
>     6    1    2 45      22
>     7    2    1 67      21
>     8    2    1 46       0
>
> Another approach is to combine Site, Prof to a single column
> in any way suitable for the application. For example
>
>     cbind(x, newCol2 = with(x, ave(H, paste(Site, Prof),
>         FUN=function(y)y-min(y))))
>
> Petr Savicky.
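The points raised in this exchange can be checked in isolation. The sketch below is a small illustration and not part of the original messages; the variable names `g`, `counts`, `empty_min` and `same_key` are assumptions.

```r
x <- data.frame(
  Site = c(1, 1, 1, 1, 1, 1, 2, 2),
  Prof = c(1, 1, 1, 2, 2, 2, 1, 1),
  H    = c(24, 16, 67, 23, 56, 45, 67, 46)
)

# interaction() keeps every combination of levels, including
# Site = 2 / Prof = 2, which has no rows in the example data.
g <- interaction(x$Site, x$Prof)
counts <- table(g)
counts

# min() of an empty vector warns and returns Inf; passing Inf as an
# extra argument gives the same value without the warning.
empty_min <- min(numeric(0), Inf)

# With the default separator, paste() maps two different pairs to one key.
same_key <- identical(paste("A B", "C"), paste("A", "B C"))
same_key
```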
On Wed, May 04, 2011 at 08:52:07AM -0700, William Dunlap wrote:
[...]
> Yes, but the fact remains that ave() wastes time and causes
> unnecessary warnings and errors by calling FUN when it knows
> it will do nothing with the result (because there are no entries
> in x with a given combination of the factor levels in the ...
> arguments).

I agree. For the original question, avoiding the warning is preferable. The general question belongs more on R-devel.

> Using paste(Site,Prof) when calling ave() is ugly, in that it
> forces you to consider implementation details that you expect
> ave() to take care of (how does paste convert various types
> to strings?). It also courts errors since paste("A B", "C")
> and paste("A", "B C") give the same result but represent different
> Site/Prof combinations.

Thank you for this remark. I used the formulation "combine ... in any way suitable for the application" with this effect in mind, but let us be more specific. For numbers, in particular integers, paste() seems to be good enough.
For character vectors, a possible approach is paste(X, Y, sep="\r"), since the character "\r" is unlikely to be used in character vectors. A similar approach is used, for example, in unique.matrix(). I did not like it much, but it also has advantages.

Petr Savicky.
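A small sketch of the separator idea, using the colliding pair from earlier in the thread. The vectors `X` and `Y` here are made-up values for illustration.

```r
# Two different (X, Y) pairs that collide under the default separator.
X <- c("A B", "A")
Y <- c("C", "B C")

default_keys <- paste(X, Y)               # separator is a single space
cr_keys      <- paste(X, Y, sep = "\r")   # separator is a carriage return

length(unique(default_keys))   # both pairs map to one key
length(unique(cr_keys))        # the pairs stay distinct
```

This only helps as long as "\r" does not occur in the data itself, which is the assumption stated above.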
On May 4, 2011, at 17:52, William Dunlap wrote:
[...]
> Yes, but the fact remains that ave() wastes time and causes
> unnecessary warnings and errors by calling FUN when it knows
> it will do nothing with the result (because there are no entries
> in x with a given combination of the factor levels in the ...
> arguments).
>
> Using paste(Site,Prof) when calling ave() is ugly, in that it
> forces you to consider implementation details that you expect
> ave() to take care of (how does paste convert various types
> to strings?). It also courts errors since paste("A B", "C")
> and paste("A", "B C") give the same result but represent different
> Site/Prof combinations.

Well, ave() uses interaction(...) and interaction() has a "drop" argument, so

    with(x, ave(H, Site, Prof, drop=TRUE, FUN=function(y)y-min(y)))
    [1]  8  0 51  0 33 22 21  0

(I suppose ?ave should be a bit more explicit about passing "...")

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
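The effect of the "drop" argument can be seen by calling interaction() directly. This is a minimal sketch on the example data; the names `g_all`, `g_used` and `res` are assumptions, and the manual split() step is only an illustration of grouping on the used combinations, not a copy of what ave() does internally.

```r
x <- data.frame(
  Site = c(1, 1, 1, 1, 1, 1, 2, 2),
  Prof = c(1, 1, 1, 2, 2, 2, 1, 1),
  H    = c(24, 16, 67, 23, 56, 45, 67, 46)
)

g_all  <- interaction(x$Site, x$Prof)               # all 2 x 2 combinations
g_used <- interaction(x$Site, x$Prof, drop = TRUE)  # only combinations that occur

nlevels(g_all)    # includes the empty Site 2 / Prof 2 cell
nlevels(g_used)   # the empty cell is gone

# Grouping on the dropped factor never hands an empty vector to min().
res <- x$H
split(res, g_used) <- lapply(split(x$H, g_used), function(y) y - min(y))
res
```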