Dear all,

I have a problem for which I could not find a solution. (I have also attached the problem as a text file because column spacing is sometimes lost in mail.)

I have a file (file.txt) attached to this mail. I am reading it with this code to make a data frame (file)-

file=read.table("file.txt",fill=T,colClasses = "character",header=T)

file looks like this-

Chr  Pos        CaseA      CaseC      CaseG      CaseT
10   135344110   0.000000  24.000000   0.000000   0.000000
10   135344110   0.000000   0.000000  24.000000   0.000000
10   135344110   0.000000   0.000000  24.000000   0.000000
10   135344113   0.000000   0.000000  24.000000   0.000000
10   135344114  24.000000   0.000000   0.000000   0.000000
10   135344114  24.000000   0.000000   0.000000   0.000000
10   135344116   0.000000   0.000000   0.000000  24.000000
10   135344118   0.000000  24.000000   0.000000   0.000000
10   135344118   0.000000   0.000000   0.000000  24.000000
10   135344122  24.000000   0.000000   0.000000   0.000000
10   135344122   0.000000  24.000000   0.000000   0.000000
10   135344123   0.000000  24.000000   0.000000   0.000000
10   135344123   0.000000  24.000000   0.000000   0.000000
10   135344123   0.000000   0.000000   0.000000  24.000000
10   135344126   0.000000   0.000000  24.000000   0.000000

Some of the values in column Pos are the same. For rows with the same position, I want to add up the values in columns 3:6.
I will explain with an example. The first row of the output should be-

Chr  Pos        CaseA      CaseC      CaseG      CaseT
10   135344110   0.000000  24.000000  48.000000   0.000000

so the whole output for the above input should be-

Chr  Pos        CaseA      CaseC      CaseG      CaseT
10   135344110   0.000000  24.000000  48.000000   0.000000
10   135344113   0.000000   0.000000  24.000000   0.000000
10   135344114  48.000000   0.000000   0.000000   0.000000
10   135344116   0.000000   0.000000   0.000000  24.000000
10   135344118   0.000000  24.000000   0.000000  24.000000
10   135344122  24.000000  24.000000   0.000000   0.000000
10   135344123   0.000000  48.000000   0.000000  24.000000
10   135344126   0.000000   0.000000  24.000000   0.000000

Can you please help me?
Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

Attachments (scrubbed by the list archive):
file.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20110714/d0532f56/attachment.txt>
question.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20110714/d0532f56/attachment-0001.txt>
?tapply (in base R)
?aggregate
?by (wrapper for tapply)
?ave (in base R -- based on tapply)

Also package plyr (and several others, undoubtedly).

Also, a Google search for "R summarize data by groups" or similar gets many relevant hits.

-- Bert

2011/7/14 Bansal, Vikas <vikas.bansal at kcl.ac.uk>:
> [snip]

--
"Men by nature long to get on to the ultimate truths, and will often be impatient with elementary studies or fight shy of them. If it were possible to reach the ultimate truths without the elementary studies usually prefixed to them, these would not be preparatory studies but superfluous diversions."

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
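As an illustration of the first pointer above, here is a minimal sketch of a grouped sum with tapply(). It is not from the thread: the data frame `dat` is a hand-typed copy of the first six rows of the posted table with numeric columns, and the names `dat`, `case_cols`, `caseG_by_pos` and `sums_by_pos` are made up for the example.

```r
# Toy data mirroring the first six rows of the posted table.
# Assumption: the columns are numeric here, unlike the character
# columns produced by the read.table() call in the question.
dat <- data.frame(
  Chr   = c(10, 10, 10, 10, 10, 10),
  Pos   = c(135344110, 135344110, 135344110, 135344113, 135344114, 135344114),
  CaseA = c(0, 0, 0, 0, 24, 24),
  CaseC = c(24, 0, 0, 0, 0, 0),
  CaseG = c(0, 24, 24, 24, 0, 0),
  CaseT = c(0, 0, 0, 0, 0, 0)
)

# tapply() sums one column within each distinct value of Pos.
caseG_by_pos <- tapply(dat$CaseG, dat$Pos, sum)

# Repeating the same call for every Case column gives a matrix with
# one row per position and one column per Case variable.
case_cols <- c("CaseA", "CaseC", "CaseG", "CaseT")
sums_by_pos <- sapply(dat[case_cols], function(col) tapply(col, dat$Pos, sum))

print(sums_by_pos)
```

The result is a matrix whose row names are the positions, so the Chr and Pos columns would have to be reattached by hand; aggregate() keeps the grouping columns in the result and may be more convenient for that reason.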
I have checked these but did not get any results. Is there a way I can do it? I would be very thankful.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: Bert Gunter [gunter.berton at gene.com]
Sent: Thursday, July 14, 2011 4:54 PM
To: Bansal, Vikas
Cc: r-help at r-project.org
Subject: Re: [R] Adding rows based on column value

[snip]
Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: Bansal, Vikas
Sent: Thursday, July 14, 2011 6:07 PM
To: Bert Gunter
Subject: RE: [R] Adding rows based on column value

Yes sir. I am trying. I am using this-

aggregate(x = file[,3:6], by = list(file[,2]), FUN = "sum")

but I think this is not the right way, because we cannot use "sum" to add. That is why I was asking for help.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London
________________________________________
From: Bert Gunter [gunter.berton at gene.com]
Sent: Thursday, July 14, 2011 6:01 PM
To: Bansal, Vikas
Subject: Re: [R] Adding rows based on column value

Not from me -- I don't believe you've made an honest effort. Maybe someone else will help you. You might try posting reproducible code that shows your efforts -- as the posting guide requests.

-- Bert

On Thu, Jul 14, 2011 at 9:46 AM, Bansal, Vikas <vikas.bansal at kcl.ac.uk> wrote:
> [snip]
Bansal, Vikas <vikas.bansal <at> kcl.ac.uk> writes:

> I am using this-
>
> aggregate(x = file[,3:6], by = list(file[,2]), FUN = "sum")

Better, although still not reproducible (please *do* read the posting guide -- it is listed at the bottom of every R list post and is the *first* google hit for "posting guide" (!); search for "Examples").

What about removing the quotation marks around "sum"?

aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)

> but I think this is not a right way.
> Because we cannot use "sum" to add. That is
> why I was asking for help.
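A possible explanation for why "sum" cannot be used, offered as a sketch rather than as anything stated in the thread: the data were read with colClasses = "character", so every column is a character vector, and sum() raises an error on character input. aggregate() resolves FUN with match.fun(), so the quoted and unquoted forms should behave the same. The example below assumes that converting the Case columns to numeric is acceptable; `txt`, `dat`, `case_cols` and `result` are illustrative names, and the inline text stands in for the attached file.txt.

```r
# Rebuild a few rows the way read.table(..., colClasses = "character")
# delivers them: every column is a character vector.
txt <- "Chr Pos CaseA CaseC CaseG CaseT
10 135344110 0.000000 24.000000 0.000000 0.000000
10 135344110 0.000000 0.000000 24.000000 0.000000
10 135344110 0.000000 0.000000 24.000000 0.000000
10 135344113 0.000000 0.000000 24.000000 0.000000
10 135344114 24.000000 0.000000 0.000000 0.000000
10 135344114 24.000000 0.000000 0.000000 0.000000"
dat <- read.table(text = txt, header = TRUE, colClasses = "character")

# sum() does not accept character vectors, so convert the count columns first.
case_cols <- c("CaseA", "CaseC", "CaseG", "CaseT")
dat[case_cols] <- lapply(dat[case_cols], as.numeric)

# Group by both Chr and Pos so that both identifier columns are kept
# in the result, one row per distinct (Chr, Pos) pair.
result <- aggregate(dat[case_cols], by = dat[c("Chr", "Pos")], FUN = sum)

print(result)
```

Grouping on Chr as well as Pos is a design choice made here so that identical positions on different chromosomes would not be merged; with the posted data, where Chr is always 10, grouping on Pos alone gives the same sums.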
Dear all,

I have one problem and did not find any solution. I have also attached the question as a text file because spacing is sometimes lost in mail.

I have a file (file.txt) attached to this mail. I am reading it with this code to make a data frame (file)-

file=read.table("file.txt",fill=T,colClasses = "character",header=T)

file looks like this-

Chr  Pos        CaseA      CaseC      CaseG      CaseT
10   135344110   0.000000  24.000000   0.000000   0.000000
10   135344110   0.000000   0.000000  24.000000   0.000000
10   135344110   0.000000   0.000000  24.000000   0.000000
10   135344113   0.000000   0.000000  24.000000   0.000000
10   135344114  24.000000   0.000000   0.000000   0.000000
10   135344114  24.000000   0.000000   0.000000   0.000000
10   135344116   0.000000   0.000000   0.000000  24.000000
10   135344118   0.000000  24.000000   0.000000   0.000000
10   135344118   0.000000   0.000000   0.000000  24.000000
10   135344122  24.000000   0.000000   0.000000   0.000000
10   135344122   0.000000  24.000000   0.000000   0.000000
10   135344123   0.000000  24.000000   0.000000   0.000000
10   135344123   0.000000  24.000000   0.000000   0.000000
10   135344123   0.000000   0.000000   0.000000  24.000000
10   135344126   0.000000   0.000000  24.000000   0.000000

Some of the values in column Pos are the same. For rows with the same position, I want to add up the values in columns 3:6.
I will explain with an example. The first row of the output should be-

Chr  Pos        CaseA      CaseC      CaseG      CaseT
10   135344110   0.000000  24.000000  48.000000   0.000000

because the first three rows have the same value in the Pos column.

So the whole output for the above input should be-

Chr  Pos        CaseA      CaseC      CaseG      CaseT
10   135344110   0.000000  24.000000  48.000000   0.000000
10   135344113   0.000000   0.000000  24.000000   0.000000
10   135344114  48.000000   0.000000   0.000000   0.000000
10   135344116   0.000000   0.000000   0.000000  24.000000
10   135344118   0.000000  24.000000   0.000000  24.000000
10   135344122  24.000000  24.000000   0.000000   0.000000
10   135344123   0.000000  48.000000   0.000000  24.000000
10   135344126   0.000000   0.000000  24.000000   0.000000

Can you please help me?
Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

Attachments (scrubbed by the list archive):
question.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20110715/ce1d2191/attachment.txt>
file.txt: <https://stat.ethz.ch/pipermail/r-help/attachments/20110715/ce1d2191/attachment-0001.txt>