Dear R helpers,

I wish to use the "sum" operator for each row of a data frame. However, it appears that the operator acts on the entire data frame, over all columns. What is the best way to obtain row-wise operation?

The following code shows my attempts so far, and their problems:-

test1=array(rbinom(120,1,0.5),c(20,3))
test1[,3]=NA
sum(test1[,1:2])
test1[,3][sum(test1[,1:2])>=2]=1
test1[,3][sum(test1[,1:2])]
test1

test2=array(rbinom(120,1,0.5),c(20,3))
test2[,3]=NA
sum(test2[,1:2])
test2[,3][(test2[,1]+test2[,2])>=2]=1
test2[,3][sum(test2[,1:2])]
test2

In the 1st section, I try to use "sum" to add the first two columns of a data frame. Here, sum(test1[,1:2]) evaluates to a single integer but this modifies *all* the rows of test1.

In the 2nd section, specifying addition of the first two columns with a '+' acts row-by-row, as I want. This is OK for this demonstration, but would be impractical in the program I am trying to write (where the columns I wish to sum are numerous and change from time to time).

I would be very grateful to know if it is possible to get operators to act on rows of a data frame, and if so, how.

I am running R1.8.1 on Windows NT.

Jonathan Williams
OPTIMA
Radcliffe Infirmary
Woodstock Road
OXFORD OX2 6HE
Tel +1865 (2)24356
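A minimal sketch of why the first attempt changes every row, assuming a small hand-made matrix `m` (hypothetical, standing in for the poster's random data). sum() collapses its argument to one number, so the comparison yields a single TRUE or FALSE, and a length-one logical index is recycled over the whole column.

```r
# Hypothetical 4 x 3 matrix standing in for test1; column 3 starts as NA.
m <- cbind(c(1, 0, 1, 0), c(1, 1, 0, 0), NA)

total <- sum(m[, 1:2])   # one number for the whole block: 4
flag  <- total >= 2      # a single TRUE

by_sum <- m
by_sum[, 3][flag] <- 1   # the lone TRUE is recycled, so every row is set

by_row <- m
by_row[, 3][(m[, 1] + m[, 2]) >= 2] <- 1   # one logical value per row

print(by_sum[, 3])   # 1 1 1 1
print(by_row[, 3])   # 1 NA NA NA
```

Only the second form compares each row's own total with the threshold, which is the behaviour the question asks for.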
Hi

On 11 Mar 2004 at 12:13, Jonathan Williams wrote:

> Dear R helpers,
> I wish to use the "sum" operator for each row of a data frame.
> However, it appears that the operator acts on the entire data
> frame, over all columns. What is the best way to obtain row-wise
> operation?
>
> The following code shows my attempts so far, and their problems:-
>
> test1=array(rbinom(120,1,0.5),c(20,3))
> test1[,3]=NA
> sum(test1[,1:2])

Try rowSums()

rowSums(test1, na.rm=TRUE)

> test1[,3][sum(test1[,1:2])>=2]=1
> test1[,3][sum(test1[,1:2])]
> test1
>
> test2=array(rbinom(120,1,0.5),c(20,3))
> test2[,3]=NA

This can be done by

test2[rowSums(test2, na.rm=TRUE)>=2,3]<-1

Cheers
Petr

> [...]
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html

Petr Pikal
petr.pikal at precheza.cz
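Since the columns to be summed are said to be numerous and to change from time to time, the rowSums() suggestion above might be combined with a vector of column names. This is only a sketch: the matrix `d`, its column names and the selection vector `cols` are hypothetical and do not come from the original post.

```r
# Hypothetical data: three 0/1 indicator columns plus a result column.
d <- cbind(a = c(1, 0, 1, 1), b = c(1, 1, 0, 1), c = c(0, 0, 0, 1), flag = NA)

# Hypothetical selection; edit this vector when the set of columns changes.
cols <- c("a", "b", "c")

# drop = FALSE keeps a matrix even when only one column is selected,
# which rowSums() requires.
row_totals <- rowSums(d[, cols, drop = FALSE], na.rm = TRUE)

# Mark the rows whose total reaches the threshold.
d[row_totals >= 2, "flag"] <- 1

print(row_totals)    # 2 1 1 3
print(d[, "flag"])   # 1 NA NA 1
```

Selecting the columns explicitly also keeps the result column out of the sum, instead of relying on na.rm to discard it.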
The generic solution is apply(myArray, 1, myfun) - see ?apply (while you're there see ?lapply and ?sapply as well!). For your particular case, you can use rowSums().

There is a distinction between `array`, `matrix` and `data.frame` - these are not interchangeable terms. As it happens, both apply() and rowSums() will coerce the first argument to an array if possible - thus they might work on a data.frame. But be very sure your data.frame does not contain any factors or character values.

> -----Original Message-----
> From: Jonathan Williams
> [mailto:jonathan.williams at pharmacology.oxford.ac.uk]
> Sent: 11 March 2004 12:14
> To: Ethz. Ch
> Subject: [R] making operators act on rows of a data frame
>
> [...]

Simon Fear
Senior Statistician
Syne qua non Ltd
Tel: +44 (0) 1379 644449
Fax: +44 (0) 1379 644445
email: Simon.Fear at synequanon.com
web: http://www.synequanon.com
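The apply() form and the coercion caveat in the message above can be sketched as follows. The matrix `m`, the function `row_range` and the data frame `df` are hypothetical illustrations, not part of the thread.

```r
# Hypothetical numeric matrix; each row is one case.
m <- matrix(c(1, 5, 2,
              4, 4, 4), nrow = 2, byrow = TRUE)

# Hypothetical row-wise function: spread of the values in one row.
row_range <- function(x) max(x) - min(x)

row_spreads <- apply(m, 1, row_range)   # MARGIN = 1 means "by row"
print(row_spreads)                      # 4 0

# A data frame with a character column is coerced to a character matrix,
# so numeric row operations on the whole frame no longer make sense.
df <- data.frame(x = c(1, 2), y = c(3, 4), label = c("a", "b"))
print(is.character(as.matrix(df)))      # TRUE

# Selecting only the numeric columns first avoids the problem.
num_cols <- sapply(df, is.numeric)
numeric_totals <- rowSums(df[, num_cols, drop = FALSE])
print(unname(numeric_totals))           # 4 6
```

Checking the column types before calling apply() or rowSums() on a data frame is one way to act on the warning about factors and character values.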
Jonathan -

Petr Pikal suggests rowSums() for your specific task. A more general solution is given by apply() and sapply(). apply() will apply any R function (even one that you would write yourself) along any dimension of an array or matrix, or any combination of dimensions (in the context of a three or four-dimensional [or more] array); a data frame is coerced to a matrix first. sapply() does the same for the elements of a list or vector, which for a data frame are its columns.

See help("apply"), help("sapply").

- tom blackwell - u michigan medical school - ann arbor -

On Thu, 11 Mar 2004, Jonathan Williams wrote:

> Dear R helpers,
> I wish to use the "sum" operator for each row of a data frame.
> However, it appears that the operator acts on the entire data
> frame, over all columns. What is the best way to obtain row-wise
> operation?
>
> [...]
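A small sketch of applying a function over one dimension, or a combination of dimensions, of a higher-dimensional array, with sapply() shown over the columns of a data frame. The array `a` and the data frame `df` are hypothetical examples chosen so the totals are easy to check by hand.

```r
# Hypothetical 2 x 3 x 2 array filled with 1..12.
a <- array(1:12, dim = c(2, 3, 2))

# One total per index of dimension 1.
by_first <- apply(a, 1, sum)
print(by_first)            # 36 42

# One total per (dim 1, dim 2) cell, summing over dimension 3.
by_first_second <- apply(a, c(1, 2), sum)
print(by_first_second)     # a 2 x 3 matrix

# sapply() works over list elements; a data frame's elements are its columns.
df <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))
col_totals <- sapply(df, sum)
print(col_totals)          # x = 6, y = 60
```

The second argument of apply() names the dimensions that are kept; the function is applied over everything else.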