Here is one way to do it:
> y <- textConnection("UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
+ 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
+ 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
+ 1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
+ 1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
+ 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133")
> x <- read.table(y, header=TRUE)
> closeAllConnections()
> # split and then aggregate so we can carry through some data
> z <- split(x, x$UniGene)
> z.l <- lapply(z, function(.data){
+ .agg <- colMeans(.data[, c(1,4:8)], na.rm=TRUE)
+ data.frame(.data[1, 2], .data[1, 3], lapply(.agg, unlist))
+ })
> do.call(rbind, z.l)
          .data.1..2. .data.1..3.  UNIQID    X1_SL    X2_SL    X17_SL   X18_SL   X38_SL
Hs.10095     Hs.10095       MLLT1 1175390 -0.00595  0.62315  0.853150  1.11215 -0.19500
Hs.10101     Hs.10101    C1orf166 1175392 -0.49450 -0.04025  0.129900 -0.00575 -0.18240
Hs.101014   Hs.101014       CEP57 1190438  0.22230  0.04415 -0.458975 -0.43815 -0.14735
Hs.1011       Hs.1011        PROZ 1173756 -0.72110 -0.68895  0.465100  0.30815  0.11330

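For comparison, here is a minimal sketch of the same averaging done with aggregate(), the function pointed to earlier in this thread. It is only one possible variant, not a replacement for the split/lapply approach above. The names txt and avg are made up for illustration, and read.table(text = ..., check.names = FALSE) is my own choice so that the sample columns keep their original names (1_SL rather than X1_SL). UNIQID is left out in this sketch, since the mean of two probe IDs is not itself a probe ID.

```r
# Same sample data as in the session above.
# check.names = FALSE keeps the sample names (1_SL, ...) as they are.
txt <- "UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133"
x <- read.table(text = txt, header = TRUE, check.names = FALSE)

# Average the five sample columns (columns 4 to 8) within each
# UniGene/Gene pair; na.rm = TRUE is passed through to mean() so a
# missing value in one replicate does not wipe out the average.
avg <- aggregate(x[4:8], by = x[c("UniGene", "Gene")],
                 FUN = mean, na.rm = TRUE)
avg
```

The two CEP57 probes should collapse into a single row whose values are the per-sample means of the two replicates, matching the CEP57 row in the output above.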
On Wed, Jul 23, 2008 at 5:08 PM, Kaposi-Novak, Pal
<kaposinovakp at upmc.edu> wrote:
>
> ________________________________________
> From: Kaposi-Novak, Pal
> Sent: Wednesday, July 23, 2008 5:07 PM
> To: jim holtman
> Subject: RE: [R] average replicate probe values
>
> Dear Dr Holtman,
>
> Thank you very much for your response.
>
> What I want is to average the data points in a data.frame for probes which represent the same gene (i.e. have the same UniGene ID).
>
> For example, in the table below the probe sets in rows 3 and 4 both represent the CEP57 gene.
>
> UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
> 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
> 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
> 1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
> 1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
> 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133
>
> I would like to make R find the matching UniGene IDs and average the expression values for each sample.
> The result would look like the table below:
>
> UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
> 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
> 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
> 1199466 Hs.101014 CEP57 0.2223 0.04415 -0.458975 -0.43815 -0.14735
> 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133
>
> I am sorry for the naivety of my question, but I am not a trained biostatistician; I just need to analyze data.
>
> Sincerely,
>
> Pal Kaposi-Novak MD PhD
> PIRT Fellow
> University of Pittsburgh
> Department of Pathology
> BST S408, 200 Lothrop Str
> Pittsburgh, PA , 15261
> Tel: (412) 383-7748
> kaposinovakp at umpc.edu
> ________________________________________
> From: jim holtman [jholtman at gmail.com]
> Sent: Wednesday, July 23, 2008 7:15 AM
> To: Kaposi-Novak, Pal
> Cc: r-help at r-project.org
> Subject: Re: [R] average replicate probe values
>
> It would be helpful if you included a sample of the data so that we
> could understand what you would like to do with it (before/after
> pictures).
>
> ?aggregate
>
> On Tue, Jul 22, 2008 at 9:57 PM, Kaposi-Novak, Pal
> <kaposinovakp at upmc.edu> wrote:
>> Hi,
>>
>> Could somebody tell me how I can average expression values of replicate probe sets in a data frame?
>>
>> Thanks
>>
>> Pal Kaposi-Novak MD PhD
>> PIRT Fellow
>> University of Pittsburgh
>> Department of Pathology
>> BST S408, 200 Lothrop Str
>> Pittsburgh, PA , 15261
>> Tel: (412) 383-7748
>> kaposinovakp at umpc.edu
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
>
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem you are trying to solve?