Hi Vivek,
I removed the rows with missing values and also the duplicated rows. Now it
looks like it is working.
x<-read.table("RP_matrix_FPKM_PGTvsPDGT.txt",header=T,sep="\t")
x1<-read.table("RP_plaise_FPKM_PGTvsPDGT.txt",header=T,sep="\t")
str(x1)
#'data.frame':    19680 obs. of  6 variables:
# $ ID    : Factor w/ 19678 levels "XLOC_000001",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ PGT.1 : num  112.47 13.76 62.13 4.16 0 ...
# $ PGT.0 : num  118.83 14.88 94.29 3.49 0 ...
# $ PGT.2 : num  179.324 22.677 117.368 6.36 0.385 ...
# $ PDGT.0: num  301.154 39.165 242.685 9.119 0.126 ...
# $ PDGT.1: num  144.5 30 161.2 3.5 0 ...
str(x)
#'data.frame':    28599 obs. of  6 variables:
# $ gene  : Factor w/ 28599 levels "XLOC_000001",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ PGT.1 : num  71.25 8.71 14.6 1.99 0 ...
# $ PGT.0 : num  68.36 8.16 9.75 2.4 0 ...
# $ PGT.2 : num  108.17 13.35 18.29 3.64 0 ...
# $ PDGT.0: num  195.01 24.76 40.59 5.61 0 ...
# $ PDGT.1: num  93.06 18.88 26.83 2.14 0 ...
length(unique(x[,1]))
#[1] 28599
length(unique(x1[,1]))
#[1] 19679
# keep the first occurrence of each ID (also safe when there are no duplicates)
x2<- x1[!duplicated(x1[,1]),]
dim(x2)
#[1] 19679     6
x3<- na.omit(x2)
dim(x3)
#[1] 19678     6
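
The two cleaning steps above can be sketched on a toy data frame (made-up IDs and values, not the real FPKM files; all object names here are hypothetical). It also illustrates a general pitfall: dropping duplicates with a negative `which()` index removes every row when there are no duplicates at all, whereas a logical `!duplicated()` index does not.

```r
# Toy stand-in for x1: one duplicated ID and one row with a missing value.
toy <- data.frame(
  ID    = c("XLOC_000001", "XLOC_000002", "XLOC_000002", "XLOC_000003"),
  PGT.1 = c(112.47, 13.76, 13.76, NA),
  PGT.0 = c(118.83, 14.88, 14.88, 3.49),
  stringsAsFactors = FALSE
)

# Keep the first occurrence of each ID, then drop rows containing NA.
toy_unique <- toy[!duplicated(toy$ID), ]
toy_clean  <- na.omit(toy_unique)

# Pitfall: with no duplicates, which() returns integer(0), and negative
# indexing with integer(0) selects zero rows instead of all rows.
no_dups     <- toy_clean
dropped_all <- no_dups[-which(duplicated(no_dups$ID)), ]
kept_all    <- no_dups[!duplicated(no_dups$ID), ]

dim(toy_unique)
dim(toy_clean)
nrow(dropped_all)
nrow(kept_all)
```

Checking `anyDuplicated()` and `anyNA()` on the ID column before calling RPadvance would be another way to catch both problems early.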
cl<-c(rep(0,3),rep(1,2))
origin<-c(rep(1,5))
library(RankProd)
RP.out <- RPadvance(x3[,-1],cl,origin,gene.names=as.character(x3[,1]),num.perm=200)
A.K.
________________________________
From: Vivek Das <vd4mmind at gmail.com>
To: arun <smartpink111 at yahoo.com>
Sent: Tuesday, August 6, 2013 9:38 AM
Subject: Re: Problem with t-test
No, I have tried it again on other files and the error does not occur; it works
fine. It is a new file that I created. I am sending you the script and the files
I am using. It is a simple script that I wrote and have run many times with
other files. I am sending you two different input files: with one it works, with
the other it does not. With the "plaise" file it does not work, but with the
other input file it does.
library(RankProd)
x<-read.table("RP_matrix_RF_PGTvsPDGT.txt",header=T,sep="\t")
cl<-c(rep(0,3),rep(1,2))
origin<-c(rep(1,5))
RP.out <- RPadvance(x[,-1],cl,origin,gene.names=x[,1],num.perm=200)
topGene(RP.out,cutoff = 0.1)
#plotRP(RP.out, cutoff = 0.1)
table=topGene(RP.out,cutoff=0.1,method="pfp")
t1<-table$Table1
t2<-table$Table2
ind1<-which(t1[,4]<0.1)
ind2<-which(t2[,4]<0.1)
up<-t1[ind1,]
down<-t2[ind2,]
degs<-rbind(up,down)
----------------------------------------------------------
Vivek Das
PhD Student in Computational Biology
Giuseppe Testa's Lab
European School of Molecular Medicine
IFOM-IEO Campus
Via Adamello, 16
Milan, Italy
emails: vivek.das at ieo.eu
        vchris_05 at yahoo.co.in
        vd4mmind at gmail.com
On Tue, Aug 6, 2013 at 3:17 PM, arun <smartpink111 at yahoo.com> wrote:
>Hi Vivek,
>I have never used RankProd before, so I can't guarantee that I can sort out
>the problem. But you can send me the file and the script, and I will try it
>later.
>You mentioned that RankProd worked before: was that on the same file or on a
>different file? If it is the latter, then try running it on that file and see
>if the error repeats.
>
>________________________________
>From: Vivek Das <vd4mmind at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Tuesday, August 6, 2013 9:09 AM
>Subject: Re: Problem with t-test
>
>Yes, I know this, but I am worried about the consistency of the data, as it
>will remove a lot of observations and so the results will not be good. In
>fact, I tested it and I am not getting the p-values I expected. Anyway, I am
>running another test, using the RankProd package in R, and I am encountering
>a problem there. I have used this package many times but have never seen
>this. Do you have any idea when the error below occurs?
>
>Error in `row.names<-.data.frame`(`*tmp*`, value = value) :
>  duplicate 'row.names' are not allowed
>In addition: Warning message:
>non-unique values when setting 'row.names': ?? in rankprod.
>
>
>I cannot understand the duplicate 'row.names' complaint. These are gene
>locations on the rows, with expression values, and some locations are
>duplicated 2-3 times or more. I have used such data frames earlier to compute
>RankProd and they worked, but now I am getting this error. I can share the
>script and the file with you if you need them, as the RankProd pipeline is
>very easy to execute.
>
>If you can give me some idea about the error, that would be good.
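
The error quoted above can be reproduced in base R without RankProd. This is a minimal sketch on made-up data; it assumes (a guess, not checked against the RankProd source) that the gene names end up being assigned as row names of a data frame somewhere inside the package. The object names are hypothetical.

```r
# Toy expression table with a repeated gene ID (made-up values).
expr <- data.frame(PGT.1 = c(1.2, 3.4, 5.6), PDGT.1 = c(2.1, 4.3, 6.5))
ids  <- c("XLOC_000001", "XLOC_000002", "XLOC_000002")

# Assigning non-unique row names to a data frame fails; capture the message.
msg <- tryCatch(
  suppressWarnings({ rownames(expr) <- ids; "no error" }),
  error = function(e) conditionMessage(e)
)
msg

# Find the offending IDs before they reach the analysis.
dup_ids <- unique(ids[duplicated(ids)])
dup_ids

# One option, if the repeated rows must be kept: make the names unique.
unique_ids <- make.unique(ids)
unique_ids
```

Whether to drop the repeated rows or rename them depends on what the duplicates mean biologically; the sketch only shows how to detect them.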
>
>
>On Tue, Aug 6, 2013 at 3:01 PM, arun <smartpink111 at yahoo.com> wrote:
>
>>Hi Vivek,
>>No problem.
>>?t.test
>>na.action: a function which indicates what should happen when the data
>>          contain 'NA's.  Defaults to 'getOption("na.action")'.
>>
>>In my system,
>>
>>getOption("na.action")
>>#[1] "na.omit"
>>
>>
>>So, it removes the NAs by default and reduces the number of observations.
>>
>>
>>
>>________________________________
>>From: Vivek Das <vd4mmind at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Tuesday, August 6, 2013 8:52 AM
>>Subject: Re: Problem with t-test
>>
>>
>>
>>
>>Yes, actually I just tested a few conditions and found that there are NaN
>>values, and that is why this problem occurs. I cannot proceed with this test
>>and will have to change the pipeline to use some other R package for my
>>analysis. Thanks for your input.
>>
>>
>>On Tue, Aug 6, 2013 at 2:42 PM, arun <smartpink111 at yahoo.com> wrote:
>>
>>>Hi Vivek,
>>>It looks like the numbers of observations in each test are 2 (PDGT) and 3
>>>(PGT), respectively. It is possible that some of the entries are NA, so that
>>>the number of usable observations is too low, which produces the error. It's
>>>just a guess, as this is not a reproducible example.
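
That guess can be checked on a toy example. `t.test()` drops missing values (including `NaN`) before testing, and its default unequal-variance test needs at least two observations per group, so a two-value group with one `NaN` fails. The numbers below are made up, and the variable names are hypothetical.

```r
pgt  <- c(0.12, 0.00, 0.39)   # three PGT values for one gene
pdgt <- c(0.13, NaN)          # two PDGT values, one of them NaN

# PDGT is passed as 'x', as in the loop from the original question.
msg <- tryCatch(
  t.test(pdgt, pgt)$p.value,
  error = function(e) conditionMessage(e)
)
msg

# Count the usable observations per group before testing.
n_pdgt <- sum(!is.na(pdgt))
n_pgt  <- sum(!is.na(pgt))
c(PDGT = n_pdgt, PGT = n_pgt)
```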
>>>
>>>________________________________
>>>From: Vivek Das <vd4mmind at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Tuesday, August 6, 2013 4:29 AM
>>>Subject: Problem with t-test
>>>
>>>
>>>
>>>
>>>data<-read.table("/Users/vdas/Documents/RNA-Seq_Smaples_Udine_08032013/GBM_29052013/UD_RP_25072013/filteredFPKM_matrix.txt",sep="",header=TRUE,stringsAsFactors=FALSE)
>>>> head(data)
>>>           ID Sample_118p Sample_118rp3 Sample_118rz Sample_118z Sample_132p1
>>>1 XLOC_000001   112.47400     166.17900     81.52270   44.778700   301.154000
>>>2 XLOC_000002    13.76090      17.76730     11.91100    6.290600    39.164800
>>>3 XLOC_000003    62.13010     102.16200    748.31300  273.520000   242.685000
>>>4 XLOC_000004     4.16261       5.71899      4.55739    2.486340     9.119170
>>>5 XLOC_000010     0.00000       0.00000      0.29217    0.270976     0.126338
>>>6 XLOC_000011     3.59279       9.09855      2.57678    1.593230    16.936300
>>>  Sample_132p2 Sample_132p3 Sample_132rp1 Sample_132rp3 Sample_132rp4
>>>1    118.82700    144.47000    170.407000    406.899000    189.131000
>>>2     14.88320     30.02390     42.717200     88.814600     23.310500
>>>3     94.28880    161.22800    225.243000    497.011000    160.376000
>>>4      3.49082      3.49611      4.975020     12.598600      6.387530
>>>5      0.00000      0.00000      0.464747      0.596984      0.199851
>>>6      4.47379      6.87020      6.922430     21.762200      7.461560
>>>  Sample_132rz1 Sample_132rz2 Sample_132z Sample_141p1 Sample_141p2
>>>1     97.183400     72.739000   386.81000     86.96600    85.703100
>>>2     15.440800      7.475080    40.35110     12.61660    12.737300
>>>3    896.121000    465.496000  2330.57000     72.35270    73.962600
>>>4      4.949830      4.818980    18.22750      3.22435     2.074460
>>>5      0.892021      0.863341     2.91729      0.00000     0.226087
>>>6      4.420570      3.341780    15.43730      5.21231     3.854980
>>>  Sample_141p3 Sample_141p4 Sample_141z Sample_183p1 Sample_183p2 Sample_183p3
>>>1     53.01000    158.31400   145.84300   219.667000   240.231000    127.42000
>>>2     10.96970     28.26550    22.65940    27.217700    27.832800     18.21300
>>>3     71.36860    203.20100  1048.81000   172.241000   183.260000     98.11680
>>>4      1.97518      4.05074     8.86568     5.118540     6.414700      4.65076
>>>5      0.00000      0.00000     2.16320     0.356073     0.655415      0.00000
>>>6      2.53136      6.18972     4.83315     6.908790    12.524200      5.96035
>>>  Sample_183z Sample_91p Sample_91rp1 Sample_91rp3 Sample_91rp4 Sample_91rz
>>>1    78.58140 179.324000   297.395000   203.550000    251.53800  110.898000
>>>2     7.88030  22.676900    28.945600    18.749300     22.76070   15.679000
>>>3   473.46400 117.368000   174.073000   119.605000    122.66100  754.735000
>>>4     4.37495   6.360260     9.227550     6.656250      8.82010    7.172210
>>>5     1.15980   0.385098     0.718336     0.187613      0.34955    0.498937
>>>6     3.40959   8.604070    15.908700     8.162870      9.35126    6.013790
>>>> PGT<-cbind(data[,2],data[,7],data[,24])
>>>> head(PGT)
>>>          [,1]      [,2]       [,3]
>>>[1,] 112.47400 118.82700 179.324000
>>>[2,]  13.76090  14.88320  22.676900
>>>[3,]  62.13010  94.28880 117.368000
>>>[4,]   4.16261   3.49082   6.360260
>>>[5,]   0.00000   0.00000   0.385098
>>>[6,]   3.59279   4.47379   8.604070
>>>> PDGT<-cbind(data[,6],data[,8])
>>>
>>>pval2<-NULL
>>>> for(i in 1:length(PGT[,1])){
>>>+ pval2<-c(pval2,t.test(as.numeric(PDGT[i,]),as.numeric(PGT[i,]))$p.value)
>>>+ print(i)
>>>+ }
>>>
>>>Error:
>>>Error in t.test.default(as.numeric(PDGT[i, ]), as.numeric(PGT[i, ])) :
>>>  not enough 'x' observations
>>>
>>>I cannot understand what went wrong with the vector. Can you please tell
>>>me? I have not been able to figure it out.
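
One way to keep such a loop from stopping at the first problematic gene is to preallocate the result and return `NA` whenever a group has too few usable values. This is a hedged sketch on made-up matrices that stand in for PGT and PDGT; `safe_pval` is a hypothetical helper, not part of base R or any package.

```r
# Made-up stand-ins for PGT (3 samples) and PDGT (2 samples); row 2 has a NaN.
PGT  <- cbind(c(112.5, 13.8, 62.1), c(118.8, 14.9, 94.3), c(179.3, 22.7, 117.4))
PDGT <- cbind(c(301.2, NaN, 242.7), c(144.5, 30.0, 161.2))

# Hypothetical helper: t-test p-value, or NA if the test cannot be run
# (too few non-missing values, or any other error raised by t.test).
safe_pval <- function(x, y) {
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  if (length(x) < 2 || length(y) < 2) return(NA_real_)
  tryCatch(t.test(x, y)$p.value, error = function(e) NA_real_)
}

# vapply fills a result of known length instead of growing it with c().
pval2 <- vapply(
  seq_len(nrow(PGT)),
  function(i) safe_pval(PDGT[i, ], PGT[i, ]),
  numeric(1)
)
pval2
```

Genes that come back as `NA` can then be inspected separately with `which(is.na(pval2))` rather than aborting the whole run.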