thr3ads.net - R help - [R] reading data [Feb 2013]


arun

2013-Feb-15 17:18 UTC

[R] reading data

Hi,
#working directory data1 #changed name data to data1. Added some files in each of sub directories a1, a2, etc.
indx1<- indx[indx!=""]
lapply(indx1,function(x) list.files(x))
#[[1]]
#[1] "a1.txt"        "mmmmm11kk.txt"

#[[2]]
#[1] "a2.txt"        "mmmmm11kk.txt"

#[[3]]
#[1] "a3.txt"        "mmmmm11kk.txt"

#[[4]]
#[1] "b1.txt"        "mmmmm11kk.txt"

#[[5]]
#[1] "b2.txt"        "b3.txt"        "mmmmm11kk.txt"

#[[6]]
#[1] "c1.txt"        "c2.txt"        "c3.txt"        "c4.txt"
#[5] "mmmmm11kk.txt"

# it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
head(res,2)
#$a1
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$a2
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
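
For readers new to this pattern, the one-liner above can be unpacked into separate steps. The following is only a minimal sketch, not the code from the thread: the temporary folder `data1_demo`, its toy files and the names `root`, `files` and `res_demo` are made up for illustration, and `dirname()` stands in for the `gsub()` call that extracts the folder name.

```r
# Hypothetical demo layout: a temporary folder with sub-folders a1, a2, b1,
# each holding one small file named "mmmmm11kk.txt"
root <- file.path(tempdir(), "data1_demo")
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(Id = c("aAA", "aA"), x = c(739L, 1L)),
              file.path(root, d, "mmmmm11kk.txt"),
              row.names = FALSE, quote = FALSE)
}

# Step 1: find every file whose name contains "mmmmm11kk", searching sub-folders too
files <- list.files(root, pattern = "mmmmm11kk", recursive = TRUE)

# Step 2: read each file into a data frame (fill = TRUE pads short rows with NA)
res_demo <- lapply(file.path(root, files), read.table,
                   header = TRUE, stringsAsFactors = FALSE, fill = TRUE)

# Step 3: name each data frame after the folder that holds it
names(res_demo) <- dirname(files)
names(res_demo)
```

Because each file is read directly into one flat list, no `do.call(c, ...)` step is needed in this variant.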


If you want the names to be group_a, group_b etc.
names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
res[grep("group_b",names(res))]
#$group_b
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$group_b
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
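
Since the question mentions being new to regular expressions, here is a small sketch of what the two patterns used above do. The vectors `folders` and `paths` are made-up sample values; the slash is written unescaped here because `/` has no special meaning in a regular expression.

```r
# Made-up folder names, as they would appear in names(res) before renaming
folders <- c("a1", "a2", "a3", "b1", "b2", "c1")

# "\\d+" matches one or more digits; replacing them with "" leaves the letter
groups <- paste("group_", gsub("\\d+", "", folders), sep = "")
groups
# "group_a" "group_a" "group_a" "group_b" "group_b" "group_c"

# Made-up relative paths, as returned by list.files(recursive = TRUE)
paths <- c("a1/mmmmm11kk.txt", "b2/mmmmm11kk.txt")

# "^(.*)/.*" captures everything before the last slash; "\\1" keeps only that part
sub("^(.*)/.*", "\\1", paths)
# "a1" "b2"
```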

A.K.







----- Original Message -----
From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
To: smartpink111 at yahoo.com
Cc: 
Sent: Friday, February 15, 2013 9:15 AM
Subject: reading data

Hi,
I posted yesterday and you helped me. I have a small problem.

First of all, I have never worked with regular expressions...

The code that you gave me is OK, but my files are inside the folders
a1, a2, a3. I will try to explain better.

I have one folder named "data". Inside this folder I have some other
folders named "a1", "a2", "b1", "b2", ... and
inside each of them I have some files. I want only the file
"mmmmmm.txt" (every folder has one file with this name).
The name of the folder gives me the name of the group, but I need to read the file
inside. Afterwards I want to have "group_a", "group_b", ... because I
need to work with this data grouped (and know the name of the group).

Thank you.

arun

2013-Feb-15 17:36 UTC


[R] reading data

HI,

Just to add:

# it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
res[grep("group_b",names(res))]

I am not sure how you want the grouped data to look. If you want something
like this:
res1<-do.call(rbind,res)
res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x)
{row.names(x)<-1:nrow(x);x})
res2
#$group_a
#      Id  M mm    x         b  u  k  j    y        p    v
#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$group_b
#      Id  M mm    x         b  u  k  j    y        p    v
#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$group_c
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734


#or if you want it like this:
res2<-split(res,names(res))

res2[["group_b"]]
#$group_b
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$group_b
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
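
The two layouts above (one stacked data frame per group, or one list of data frames per group) can also be reached without relying on row-name prefixes. This is a sketch under assumptions: `dfs` is a made-up stand-in for `res` with toy columns, and `grp`, `as_lists` and `stacked` are hypothetical names.

```r
# Made-up stand-in for res: one small data frame per folder
dfs <- list(
  a1 = data.frame(Id = c("aAA", "aA"), x = c(739L, 1L), stringsAsFactors = FALSE),
  a2 = data.frame(Id = "aAAA", x = 3660L, stringsAsFactors = FALSE),
  b1 = data.frame(Id = "AA", x = 1972L, stringsAsFactors = FALSE)
)

# Group label for each data frame: drop the digits, add the "group_" prefix
grp <- paste0("group_", gsub("\\d+", "", names(dfs)))

# Layout 1: a list of data frames per group
as_lists <- split(dfs, grp)

# Layout 2: one stacked data frame per group
stacked <- lapply(as_lists, function(g) do.call(rbind, unname(g)))

names(stacked)
nrow(stacked$group_a)
```

Computing `grp` separately keeps the original folder names in `names(dfs)`, so they remain available after grouping.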

Hope this helps.
A.K.




arun

2013-Feb-15 18:05 UTC


[R] reading data

HI,
No problem.
c() concatenates its arguments into a vector or a list.
If I use do.call(cbind,...) or do.call(rbind,...) instead:

do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
#a1 List,11 List,11 List,11 List,11 List,11 List,11


do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
#     a1
#[1,] List,11
#[2,] List,11
#[3,] List,11
#[4,] List,11
#[5,] List,11
#[6,] List,11
i.e. without do.call(c,...) the result is a list within a list:

restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
str(restrial)
#List of 6
# $ :List of 1
#  ..$ a1:'data.frame':    6 obs. of  11 variables:
#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
#-----------------------------------------------------------------
str(res)
#List of 6
# $ a1:'data.frame':    6 obs. of  11 variables:
#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
#  ..$ mm: int [1:6] 2 2 1 2 3 2
#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
#-----------------------------------------------------------------
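
The difference between the two str() outputs can be reproduced on a toy example. This is only a sketch with made-up values (`nested`, `flat` and `flat2` are hypothetical names); it shows that do.call(c, x) calls c() with the elements of x as its arguments, which removes one level of nesting.

```r
# A list of one-element lists, like the result of the outer lapply()
nested <- list(list(a1 = 1:3), list(a2 = 4:6))

# do.call(c, nested) is the same call as c(list(a1 = 1:3), list(a2 = 4:6))
flat <- do.call(c, nested)

str(nested)  # list of 2, each itself a list of 1
str(flat)    # list of 2 named elements, a1 and a2

# The function can also be given by name, as a character string
flat2 <- do.call("c", nested)
identical(flat, flat2)
# TRUE
```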

You mentioned naming these "group_a", "group_b", etc.
names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
res2<-split(res,names(res))

res3<- lapply(res2,function(x)
{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
res3$group_a
#$a1
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$a2
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734

#$a3
#     Id  M mm    x         b  u  k  j    y        p    v
#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
A.K.
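
The renaming step that produces res3 can be tried on a toy list. This is a sketch with made-up values (`grouped` and `renamed` are hypothetical names); it uses seq_along() in place of 1:length(x), which behaves the same for non-empty lists.

```r
# Made-up stand-in for res2: within each group every element carries the group name
grouped <- list(
  group_a = list(group_a = "first", group_a = "second"),
  group_b = list(group_b = "only")
)

# ".*_" matches everything up to the underscore, leaving the group letter;
# the position within the group is then appended
renamed <- lapply(grouped, function(x) {
  names(x) <- paste0(sub(".*_", "", names(x)), seq_along(x))
  x
})

names(renamed$group_a)
# "a1" "a2"
names(renamed$group_b)
# "b1"
```

Note that the number is the position within the group, not the original folder number: folders a1 and a3 alone would come out as a1 and a2.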
________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Friday, February 15, 2013 12:39 PM
Subject: Re: reading data


Thank you very much, and sorry for my questions.

But this code isn't grouping by letter, is it? I mean, a1, a2, a3 are the same
group (the first letter gives me the name of the group).

Another question: in do.call, you wrote do.call(c,.....). What is c?

Sorry




arun

2013-Feb-17 00:16 UTC


[R] reading data

Hi,
Try putting quotes around the function name, i.e.
res<- do.call("c",...)
A.K.
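
The thread does not say why the unquoted form failed. One possible cause, offered here purely as an assumption, is an object named c in the workspace that is not a function: do.call(c, ...) then receives that object instead of the base function, whereas do.call("c", ...) looks the function up by name. A minimal sketch (the function name `demo` is hypothetical):

```r
demo <- function() {
  c <- 5  # a non-function object that happens to be named c (assumed scenario)

  # Unquoted: the value 5 is passed as 'what', so do.call() raises an error
  err <- tryCatch(do.call(c, list(1, 2)),
                  error = function(e) conditionMessage(e))

  # Quoted: the function named "c" is looked up, skipping the non-function object
  ok <- do.call("c", list(1, 2))

  list(err = err, ok = ok)
}

r <- demo()
r$err  # the error message text
r$ok   # the combined vector
```

If this is the cause, removing the stray object with rm(c) would also let the unquoted call work again.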








________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Saturday, February 16, 2013 7:10 PM
Subject: Re: reading data


Thank you.
On my system I get the error "'what' must be a character string or a
function".
I need to do the equivalent on my system.
Thank you, and sorry once more.
On 16 Feb 2013 at 23:53, "arun" <smartpink111 at yahoo.com> wrote:

>Hi,
>You didn't mention what the error message was, or whether you are reading
>file names which are not "mmmmm11kk.txt".
>
>It is working on my system when I run it again.
>c() combines values into a vector or list.
>
>sessionInfo()
>R version 2.15.1 (2012-06-22)
>Platform: x86_64-pc-linux-gnu (64-bit)
>
>locale:
> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
> [7] LC_PAPER=C                 LC_NAME=C
> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>
>attached base packages:
>[1] stats     graphics  grDevices utils     datasets  methods   base
>
>other attached packages:
>[1] stringr_0.6.2  reshape2_1.2.2
>
>loaded via a namespace (and not attached):
>[1] plyr_1.8
>
>
>#code
>
>
># it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>lapply(x,function(y)
>read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>res2<-split(res,names(res))
>res3<- lapply(res2,function(x)
>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>#result
>
>res3
>#$group_a
>#$group_a$a1
>     Id  M mm    x         b  u  k  j    y        p    v
>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>$group_a$a2
>     Id  M mm    x         b  u  k  j    y        p    v
>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>$group_a$a3
>     Id  M mm    x         b  u  k  j    y        p    v
>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>
>$group_b
>$group_b$b1
>     Id  M mm    x         b  u  k  j    y        p    v
>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>$group_b$b2
>     Id  M mm    x         b  u  k  j    y        p    v
>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>
>$group_c
>$group_c$c1
>     Id  M mm    x         b  u  k  j    y        p    v
>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>
>
>A.K.
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Saturday, February 16, 2013 6:32 PM
>Subject: Re: reading data
>
>
>Sorry again... In:
>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>What is this c in do.call(c,...)? When I enter this line in R, I get an error.
>Thank you
>On 15 Feb 2013 at 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>
>>Hi,
>>No problem.
>>
>>BTW, these questions are not stupid.
>>Arun
>>
>>
>>
>>
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Friday, February 15, 2013 1:08 PM
>>Subject: Re: reading data
>>
>>
>>Thank you very much.
>>
>>I will try to apply it and afterwards I will tell you if it is OK :-)
>>
>>Thank you, and sorry about these questions (sometimes stupid questions).
>>
>>
>>
>>

arun

2013-Feb-17 14:25 UTC


[R] reading data

HI Vera,

No problem. I am cc:ing r-help.
A.K.






________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Sunday, February 17, 2013 5:44 AM
Subject: Re: reading data


Hi. Thank you. It works now:-) 
And yes, I use windows.
Thank you very much.
On 17 Feb 2013 at 00:44, "arun" <smartpink111 at yahoo.com> wrote:

>Hi Vera,
>
>Have you tried the suggestion?
>
>Are you using Windows?
>Thanks,
>Arun
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Saturday, February 16, 2013 7:10 PM
>Subject: Re: reading data
>
>
>Thank you.
>In mine, I get the error "'what' must be a character string or a
>function".
>I need to do the equivalent on my system.
>Thank you, and sorry once more.
>On 16 Feb 2013 at 23:53, "arun" <smartpink111 at yahoo.com> wrote:
>
>>Hi,
>>You didn't mention what the error message was, or whether you are
>>reading files with names other than "mmmmm11kk.txt".
>>
>>It works on my system when I run it again.
>>c() combines values into a vector or list.
>>
>>sessionInfo()
>>R version 2.15.1 (2012-06-22)
>>Platform: x86_64-pc-linux-gnu (64-bit)
>>
>>locale:
>> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>> [7] LC_PAPER=C                 LC_NAME=C
>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>
>>attached base packages:
>>[1] stats     graphics  grDevices utils     datasets  methods   base
>>
>>other attached packages:
>>[1] stringr_0.6.2  reshape2_1.2.2
>>
>>loaded via a namespace (and not attached):
>>[1] plyr_1.8
>>
>>
>>#code
>>
>>
>>res <- do.call(c, lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))], function(x) {
>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>}))  # it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>names(res) <- paste("group_", gsub("\\d+", "", names(res)), sep="")
>>res2 <- split(res, names(res))
>>res3 <- lapply(res2, function(x) {
>>  names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep="")
>>  x
>>})
>>#result
>>
>>res3
>>#$group_a
>>#$group_a$a1
>>#     Id  M mm    x         b  u  k  j    y        p    v
>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>
>>#$group_a$a2
>>#     Id  M mm    x         b  u  k  j    y        p    v
>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>
>>#$group_a$a3
>>#     Id  M mm    x         b  u  k  j    y        p    v
>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>
>>
>>#$group_b
>>#$group_b$b1
>>#     Id  M mm    x         b  u  k  j    y        p    v
>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>
>>#$group_b$b2
>>#     Id  M mm    x         b  u  k  j    y        p    v
>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>
>>
>>#$group_c
>>#$group_c$c1
>>#     Id  M mm    x         b  u  k  j    y        p    v
>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>
>>
>>A.K.
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Saturday, February 16, 2013 6:32 PM
>>Subject: Re: reading data
>>
>>
>>Sorry again... In:
>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>what is this c in do.call(c, ...)? When I run this line in R, I get
>>an error.
>>Thank you
>>On 15 Feb 2013 at 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>>
>>>Hi,
>>>No problem.
>>>
>>>BTW, these questions are not stupid..
>>>Arun
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Friday, February 15, 2013 1:08 PM
>>>Subject: Re: reading data
>>>
>>>
>>>Thank you very much.
>>>
>>>I will try to apply it and then tell you whether it works :-)
>>>
>>>Thank you, and sorry about these questions (sometimes stupid
>>>questions).
>>>
>>>
>>>
>>>
>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>
>>>>Hi,
>>>>No problem.
>>>>c() concatenates its arguments into a vector or list.
>>>>If I use do.call(cbind, ...) or do.call(rbind, ...) instead:
>>>>
>>>>do.call(cbind, lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))], function(x) {
>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>}))
>>>>#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>
>>>>
>>>>do.call(rbind, lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))], function(x) {
>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>}))
>>>>#     a1
>>>>#[1,] List,11
>>>>#[2,] List,11
>>>>#[3,] List,11
>>>>#[4,] List,11
>>>>#[5,] List,11
>>>>#[6,] List,11
>>>>i.e., a list within a list.
>>>>
>>>>restrial <- lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))], function(x) {
>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>})
>>>>str(restrial)
>>>>#List of 6
>>>># $ :List of 1
>>>>#  ..$ a1:'data.frame':    6 obs. of  11 variables:
>>>>#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>#-----------------------------------------------------------------
>>>>str(res)
>>>>#List of 6
>>>># $ a1:'data.frame':    6 obs. of  11 variables:
>>>>#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>#  ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>#-----------------------------------------------------------------
>>>>
>>>>You mentioned naming these "group_a", "group_b", etc.:
>>>>
>>>>names(res) <- paste("group_", gsub("\\d+", "", names(res)), sep="")
>>>>res2 <- split(res, names(res))
>>>>
>>>>res3 <- lapply(res2, function(x) {
>>>>  names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep="")
>>>>  x
>>>>})
>>>>res3$group_a
>>>>#$a1
>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>
>>>>#$a2
>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>
>>>>#$a3
>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>A.K.
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Friday, February 15, 2013 12:39 PM
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>
>>>>Thank you very much, and sorry for my questions.
>>>>
>>>>But this code isn't grouping by letter, is it? I mean, a1, a2, a3
>>>>are the same group (the first letter gives me the name of the group).
>>>>
>>>>Another question: in do.call, you wrote do.call(c, ...). What is c?
>>>>
>>>>Sorry
>>>>
>>>>
>>>>
>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>
>>>>>Hi,
>>>>>
>>>>>Just to add:
>>>>>
>>>>>
>>>>>res <- do.call(c, lapply(list.files(recursive=TRUE)[grep("mmmmm11kk", list.files(recursive=TRUE))], function(x) {
>>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>>}))  # it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>
>>>>>names(res) <- paste("group_", gsub("\\d+", "", names(res)), sep="")
>>>>>res[grep("group_b", names(res))]
>>>>>
>>>>>I am not sure how you want the grouped data to look.  If you want
>>>>>something like this:
>>>>>res1 <- do.call(rbind, res)
>>>>>res2 <- lapply(split(res1, gsub("[.0-9]", "", row.names(res1))), function(x) {
>>>>>  row.names(x) <- 1:nrow(x)
>>>>>  x
>>>>>})
>>>>>res2
>>>>>#$group_a
>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>
>>>>>#$group_b
>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>#$group_c
>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>
>>>>>#or if you want it like this:
>>>>>res2 <- split(res, names(res))
>>>>>
>>>>>res2[["group_b"]]
>>>>>
>>>>>#$group_b
>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>#$group_b
>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>Hope this helps.
>>>>>
>>>>>A.K.
>>>>>
>>>>>
>>>>>
>>>>>----- Original Message -----
>>>>>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>>>>>To: smartpink111 at yahoo.com
>>>>>Cc:
>>>>>Sent: Friday, February 15, 2013 9:15 AM
>>>>>Subject: reading data
>>>>>
>>>>>Hi,
>>>>>I posted yesterday and you helped me. I have a small problem.
>>>>>
>>>>>First, I have never worked with regular expressions...
>>>>>
>>>>>The code that you gave me is OK, but my files are inside the
>>>>>folders a1, a2, a3. I will try to explain better.
>>>>>
>>>>>I have one folder named "data". Inside this folder I have some
>>>>>other folders named "a1", "a2", "b1", "b2", ..., and inside each
>>>>>of them I have some files. I want only the file "mmmmmm.txt"
>>>>>(every folder has one file with this name).
>>>>>The name of the folder gives me the name of the group, but I need
>>>>>to read the file inside. Afterwards I want to have "group_a",
>>>>>"group_b", ..., because I need to work with this data grouped (and
>>>>>know the name of each group).
>>>>>
>>>>>Thank you.
>>>
>>
>

arun

2013-Feb-18 18:30 UTC

[R] reading data

Hi Vera,

Not sure I understand your question.


Your statement:
"In my lista I can't merge rows to have the group, because the idea is,
for each file, to count the frequencies of mm when b<0.01. After that I
want a graph like the one attached."

files <- paste("MSMS_", 23, "PepInfo.txt", sep="")
# read one file and name the result after its parent folder (a1, a2, ...)
read.data <- function(x) {
  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
  lapply(x, function(y) read.table(y, header=TRUE, sep="\t",
                                   stringsAsFactors=FALSE, fill=TRUE))
}
lista <- do.call("c", lapply(list.files(recursive=TRUE)[grep(files, list.files(recursive=TRUE))], read.data))
names(lista) <- paste("group_", gsub("\\d+", "", names(lista)), sep="")

# one list element per group; files numbered within each group
res2 <- split(lista, names(lista))
res3 <- lapply(res2, function(x) {
  names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep="")
  x
})

# keep the rows with b < 0.01 and stack the files of each group
res4 <- lapply(seq_along(res3), function(i)
  do.call(rbind, lapply(res3[[i]], function(x) x[x[["b"]] < 0.01, ])))
names(res4) <- names(res2)
res4
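
A side note on the do.call("c", ...) call used to build lista above. The sketch below uses made-up toy data frames (not the real MSMS files) to show what the call does, and one way the "'what' must be a character string or a function" error can arise. That the error on the Windows machine had this cause is only an assumption; the thread does not confirm it.

```r
# Toy stand-ins for what read.data() returns per folder: one-element
# named lists (hypothetical data, not the real MSMS files).
per_folder <- list(
  list(a1 = data.frame(mm = c(2, 3), b = c(0.001, 0.2))),
  list(a2 = data.frame(mm = c(2, 2), b = c(0.004, 0.003))),
  list(b1 = data.frame(mm = c(3, 1), b = c(0.002, 0.5)))
)

# Splice the inner lists together into one flat list named a1, a2, b1.
flat <- do.call("c", per_folder)
print(names(flat))

# If a non-function object called `c` exists, do.call(c, ...) receives
# that object instead of the function and stops with an error, while
# the quoted name is still resolved to the function.
c <- 5
err <- try(do.call(c, per_folder), silent = TRUE)
flat_again <- do.call("c", per_folder)
rm(c)
print(inherits(err, "try-error"))
print(identical(flat, flat_again))
```

Passing the function name as a string is therefore the more defensive spelling when the workspace may contain leftover objects.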


lapply(res4,function(x) table(x$mm))

#$group_a

#2 3 
#9 3 

#$group_b

#2 3 
#6 2 

#$group_c

#2 3 
#3 1 
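
To see the grouping and counting steps in isolation, here is a minimal sketch on made-up in-memory data (the data frames and their values are hypothetical). It derives the group names from the folder names, so the groups do not need to be known in advance, then keeps the rows with b < 0.01 and counts mm per group.

```r
# Hypothetical data: one data frame per folder, named after the folder.
toy <- list(
  a1 = data.frame(mm = c(2, 2, 3, 1), b = c(0.001, 0.004, 0.002, 0.2)),
  a2 = data.frame(mm = c(2, 3, 3, 2), b = c(0.003, 0.001, 0.5, 0.3)),
  b1 = data.frame(mm = c(2, 2, 1, 3), b = c(0.002, 0.009, 0.02, 0.001))
)

# Drop the digits from the folder name to get the group: a1, a2 -> group_a.
group <- paste0("group_", gsub("\\d+", "", names(toy)))
by_group <- split(toy, group)

# Per group: keep rows with b < 0.01 in every file, stack them, count mm.
counts_by_group <- lapply(by_group, function(files) {
  kept <- do.call(rbind, lapply(files, function(x) x[x$b < 0.01, ]))
  table(kept$mm)
})
print(counts_by_group)
```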


If you want the separate counts per a1, a2, a3 within each group:
res4 <- lapply(seq_along(res3), function(i)
  do.call(rbind, lapply(res3[[i]], function(x) table(x$mm[x[["b"]] < 0.01]))))
names(res4) <- names(res2)
res4
#$group_a
#   2 3
#a1 3 1
#a2 3 1
#a3 3 1

#$group_b
#   2 3
#b1 3 1
#b2 3 1

#$group_c
#   2 3
#c1 3 1
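
One caveat: rbind() on the table() results lines up only if every file contains the same set of mm values. A minimal sketch, on made-up data, of fixing the levels first (the same idea as the factor(..., levels=cab) step in your own code), and then turning the counts into relative frequencies per file:

```r
# Hypothetical files of one group; a2 has an mm value (4) that a1 lacks.
toy_files <- list(
  a1 = data.frame(mm = c(2, 2, 3, 1), b = c(0.001, 0.004, 0.002, 0.2)),
  a2 = data.frame(mm = c(2, 4, 3, 2), b = c(0.003, 0.001, 0.5, 0.3))
)

# mm values that pass the b < 0.01 filter, per file
kept <- lapply(toy_files, function(x) x$mm[x$b < 0.01])

# Common set of levels, so every file yields the same columns
lev <- sort(unique(unlist(kept)))
counts <- do.call(rbind, lapply(kept, function(v) table(factor(v, levels = lev))))

# Relative frequencies: each row divided by its own total
rel <- counts / rowSums(counts)
print(counts)
print(rel)
```

A matrix like rel can then be passed to barplot(rel, beside=TRUE), as in the plotting code quoted below.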


I haven't gone through the rest of the code, as I was not sure what
you want.


A.K.

________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, February 18, 2013 10:27 AM
Subject: Re: reading data


Hi!!!

I have a new question.

I want a function to do my statistics. I started with what you had sent me:

z.plot <- function(directory,number) {
  setwd(directory)
  indx <- gsub("[./]","",list.dirs())
  indx1 <- indx[indx!=""]
  print(indx1)
  files <- paste("MSMS_",number,"PepInfo.txt",sep="")
  read.data <- function(x) {
    names(x) <- gsub("^(.*)\\/.*","\\1",x)
    lapply(x,function(y) read.table(y,header=TRUE,sep="\t",stringsAsFactors=FALSE,fill=TRUE))
  }
  lista <- do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
  print(lista)
  #names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
}
z.plot("C:/Users/Vera Costa/Desktop/dados.lixo",23)


In my lista I can't merge rows to have the group, because the idea is,
for each file, to count the frequencies of mm when b<0.01. After that I
want a graph like the one attached.


When I had two groups and knew their names, I wrote the code below (but
now I have more groups and may not know their names in advance):

z.plot <- function(directory,number) {
  #reading data
  setwd(directory)
  direct <- dir(directory, pattern = paste("MSMS_",number,"PepInfo.txt",sep=""),
                full.names = FALSE, recursive = TRUE)
  directT <- direct[grepl("^t", direct)]
  directC <- direct[grepl("^c", direct)]

  lista <- lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
  listaC <- lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
  listaT <- lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))

  #count different z values
  cab <- vector()
  for (i in 1:length(lista)) {
    dc <- lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
    dc <- table(dc$z)
    cab <- c(cab, names(dc))
  }

  #Relative freqs to construct the graph
  cab <- unique(cab)
  d <- matrix(ncol=length(cab))
  dci <- d[-1,]
  dcf <- d[-1,]
  dti <- d[-1,]
  dtf <- d[-1,]

  for (i in 1:length(listaC)) {

    #Relative freq of all data
    dcc <- listaC[[i]]
    dcc <- table(factor(dcc$z, levels=cab))
    dci <- rbind(dci, dcc)
    rownames(dci) <- rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")


    #Relative freq of data with FDR<0.01
    dcc1 <- listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
    dcc1 <- table(factor(dcc1$z, levels=cab))
    dcf <- rbind(dcf,dcc1)
    rownames(dcf) <- rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
  }

  for (i in 1:length(listaT)) {

    #Relative freq of all data
    dct <- listaT[[i]]
    dct <- table(factor(dct$z, levels=cab))
    dti <- rbind(dti, dct)
    rownames(dti) <- rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")


    #Relative freq of data with FDR<0.01
    dct1 <- listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
    dct1 <- table(factor(dct1$z, levels=cab))
    dtf <- rbind(dtf,dct1)
    rownames(dtf) <- rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
  }
  freq.i <- rbind(dci,dti)
  freq.f <- rbind(dcf,dtf)
  freq.rel.i <- freq.i/apply(freq.i,1,sum)
  freq.rel.f <- freq.f/apply(freq.f,1,sum)

  #Graph plot
  colour <- sample(rainbow(nrow(freq.rel.i)))
  par(mfrow=c(1,2))
  barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",
          ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
  barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",
          ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))
  #average of the group (except c1&t1)
  freqs <- rbind(dcf[-1,], dtf[-1,])
  average <- apply(freqs,2,mean)

  #chisquare test function
  chisq.test <- function(x,y){
    somax <- sum(x)
    somay <- sum(y)
    nj. <- x+y
    nj <- sum(nj.)
    ejx <- (nj./nj)*somax
    ejy <- (nj./nj)*somay
    ETx <- ((x-ejx)^2)/ejx
    ETy <- ((y-ejy)^2)/ejy
    ETobs <- sum(ETx)+sum(ETy)
    pvalue <- 1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
    return(pvalue)
  }

  #pvalues of the chisquare test between sample and average
  #(H0: the two samples have the same distribution)
  pvalues <- c()
  for (i in 1:(nrow(freqs))){
    a <- chisq.test(freqs[i,],average)
    pvalues <- c(pvalues,a)
  }
  #data frame with final p-values
  dataframe <- data.frame(c(rownames(freqs)), c(pvalues))
  colnames(dataframe) <- c("sample name","pvalue")
  print(dataframe)
}
z.plot("C:/Users/Vera/Desktop/data",23)



Thank you again



2013/2/17 arun <smartpink111 at yahoo.com>

HI Vera,>
>No problem.? I am cc:ing to r-help.
>
>A.K.
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Sunday, February 17, 2013 5:44 AM
>Subject: Re: reading data
>
>
>
>Hi. Thank you. It works now:-)
>And yes, I use windows.
>Thank you very much.
>No dia 17 de Fev de 2013 00:44, "arun" <smartpink111 at
yahoo.com> escreveu:
>
>Hi Vera,
>>
>>Have you tried the suggestion?
>>
>>Are you using Windows?
>>Thanks,
>>Arun
>>
>>
>>
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Saturday, February 16, 2013 7:10 PM
>>Subject: Re: reading data
>>
>>
>>Thank you.
>>In mine, I have an error " 'what' must be a character
string or a function".
>>I need to do equivalent in my system.
>>Thank you and sorry one more time.
>>No dia 16 de Fev de 2013 23:53, "arun" <smartpink111 at
yahoo.com> escreveu:
>>
>>Hi,
>>>You didn't mention what the error message or whether you are
reading file names which are? not "mmmmm11kk.txt".
>>>
>>>It is workiing on my system as I run it again.
>>>?c() combine values into a vector or list.
>>>
>>>?sessionInfo()
>>>R version 2.15.1 (2012-06-22)
>>>Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>>locale:
>>>?[1] LC_CTYPE=en_CA.UTF-8?????? LC_NUMERIC=C?????????????
>>>?[3] LC_TIME=en_CA.UTF-8??????? LC_COLLATE=en_CA.UTF-8???
>>>?[5] LC_MONETARY=en_CA.UTF-8??? LC_MESSAGES=en_CA.UTF-8??
>>>?[7] LC_PAPER=C???????????????? LC_NAME=C????????????????
>>>?[9] LC_ADDRESS=C?????????????? LC_TELEPHONE=C???????????
>>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C??????
>>>
>>>attached base packages:
>>>[1] stats???? graphics? grDevices utils???? datasets? methods??
base????
>>>
>>>other attached packages:
>>>[1] stringr_0.6.2? reshape2_1.2.2
>>>
>>>loaded via a namespace (and not attached):
>>>[1] plyr_1.8
>>>
>>>
>>>#code
>>>
>>>
>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))? #it seems like
one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>?names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>res2<-split(res,names(res))
>>>res3<- lapply(res2,function(x)
{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>#result
>>>
>>>res3
>>>#$group_a
>>>#$group_a$a1
>>>???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>
>>>$group_a$a2
>>>???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>
>>>$group_a$a3
>>>???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>
>>>
>>>$group_b
>>>$group_b$b1
>>>???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>
>>>$group_b$b2
>>>???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>
>>>
>>>$group_c
>>>$group_c$c1
>>>???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>
>>>
>>>A.K.
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Saturday, February 16, 2013 6:32 PM
>>>Subject: Re: reading data
>>>
>>>
>>>Sorry again... In:
>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>>What is this c? In do.call(c,?? When I put this row im R, I have an
error.
>>>Thank you
>>>No dia 15 de Fev de 2013 18:11, "arun" <smartpink111 at
yahoo.com> escreveu:
>>>
>>>Hi,
>>>>No problem.
>>>>
>>>>BTW, these questions are not stupid..
>>>>Arun
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Friday, February 15, 2013 1:08 PM
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>Thank you very much.
>>>>
>>>>I will try to apply and after I tell you if it is ok :-)
>>>>
>>>>Thank you and sorry about this questions (sometimes stupid
questions).
>>>>
>>>>
>>>>
>>>>
>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>
>>>>HI,
>>>>>No problem.
>>>>>?c() for concatenate to vector or list().
>>>>>If I use do.call(cbind,..) or do.call(rbind,...)
>>>>>
>>>>>do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))?
>>>>>#?? [,1]??? [,2]??? [,3]??? [,4]??? [,5]??? [,6]??
>>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>>
>>>>>
>>>>>?do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))?
>>>>>#???? a1????
>>>>>#[1,] List,11
>>>>>#[2,] List,11
>>>>>#[3,] List,11
>>>>>#[4,] List,11
>>>>>#[5,] List,11
>>>>>#[6,] List,11
>>>>>ie.
>>>>>list within in a list
>>>>>
>>>>>?restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
>>>>>?str(restrial)
>>>>>#List of 6
>>>>># $ :List of 1
>>>>>? #..$ a1:'data.frame':??? 6 obs. of? 11 variables:
>>>>>? .#. ..$ Id: chr [1:6] "aAA" "aAAAA"
"aA" "aAA" ...
>>>>>? #.. ..$ M : chr [1:6] "1" "1"
"2" "1" ...
>>>>>? #. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>? #. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>?
-----------------------------------------------------------------
>>>>>str(res)
>>>>>#List of 6
>>>>># $ a1:'data.frame':??? 6 obs. of? 11 variables:
>>>>>?# ..$ Id: chr [1:6] "aAA" "aAAAA"
"aA" "aAA" ...
>>>>>? #..$ M : chr [1:6] "1" "1"
"2" "1" ...
>>>>>?# ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>?# ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>-----------------------------------------------------------------
>>>>>
>>>>>You mentioned about naming this to
"group_a","group_b". etc..
>>>>>
>>>>>?names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>res2<-split(res,names(res))
>>>>>
>>>>>res3<- lapply(res2,function(x)
{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>?res3$group_a
>>>>>$a1
>>>>>
>>>>>#???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>#1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>#2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>#3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>#4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>#5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>#6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>
>>>>>#$a2
>>>>>
>>>>>#???? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>#1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>#2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>#3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>#4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>#5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>#6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>
>>>>>#$a3
>>>>>
>>>>>?# ?? Id? M mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>#1?? aAA? 1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>#2 aAAAA? 1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>#3??? aA? 2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>#4?? aAA? 1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>#5? aAAA? 1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>#6??? AA na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>A.K.
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>Sent: Friday, February 15, 2013 12:39 PM
>>>>>Subject: Re: reading data
>>>>>
>>>>>
>>>>>
>>>>>Thank you very much and sorry my questions.
>>>>>
>>>>>But this code isn't grouping for letters sure? I mean,
a1,a2,a3 is the same group, (the first letter give me the name of the group)
>>>>>
>>>>>Another question, in do.call, you did do.call (c,.....)
.What is c?
>>>>>
>>>>>Sorry
>>>>>
>>>>>
>>>>>
>>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>>
>>>>>HI,
>>>>>>
>>>>>>Just to add:
>>>>>>
>>>>>>
>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y)
read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))? #it seems like
one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>
>>>>>>?names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>res[grep("group_b",names(res))]
>>>>>>
>>>>>>I am not sure how you want the grouped data to look. If you want something like this:
>>>>>>res1<-do.call(rbind,res)
>>>>>>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x)
>>>>>>{row.names(x)<-1:nrow(x);x})
>>>>>>res2
>>>>>>#$group_a
>>>>>>
>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>
>>>>>>
>>>>>>#$group_b
>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>
>>>>>>#$group_c
>>>>>>
>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>
>>>>>>
>>>>>>#or if you want it like this:
>>>>>>res2<-split(res,names(res))
>>>>>>
>>>>>>res2[["group_b"]]
>>>>>>
>>>>>>#$group_b
>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>
>>>>>>#$group_b
>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>
>>>>>>Hope this helps.
>>>>>>
>>>>>>A.K.
>>>>>>
>>>>>>
>>>>>>
>>>>>>----- Original Message -----
>>>>>>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>>>>>>To: smartpink111 at yahoo.com
>>>>>>Cc:
>>>>>>Sent: Friday, February 15, 2013 9:15 AM
>>>>>>Subject: reading data
>>>>>>
>>>>>>Hi,
>>>>>>I posted yesterday and you helped me. I have a small problem.
>>>>>>
>>>>>>First, I have never worked with regular expressions...
>>>>>>
>>>>>>The code that you gave me is OK, but my files are inside the folders a1, a2, a3. I will try to explain better.
>>>>>>
>>>>>>I have one folder named "data". Inside this folder I have some other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>>>>>>The name of the folder gives me the name of the group, but I need to read the file inside. Afterwards I want to have "group_a", "group_b", ... because I need to work with this data grouped (and know the name of the group).
>>>>>>
>>>>>>Thank you.

arun

2013-Feb-18 19:41 UTC


[R] reading data

Hi,
I am not able to open your graph. I am using Linux.

Also, the code in the function is not reproducible:
 directT <- direct[grepl("^t", direct)]
 directC <- direct[grepl("^c", direct)]

It takes double the time to figure out what is going on.
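
A small self-contained setup makes this kind of question easier to answer. The sketch below builds a throwaway folder tree under tempdir() with one tab-separated file per folder. The folder names, the file name MSMS_23PepInfo.txt and the columns z and FDR mirror the thread; the helper make_toy_data() and all values are made up for illustration.

```r
# Build a toy directory tree so the reading code can be run by anyone.
# All values below are invented; only the layout mirrors the thread.
make_toy_data <- function(root = file.path(tempdir(), "toy_data"),
                          folders = c("a1", "a2", "b1")) {
  unlink(root, recursive = TRUE)               # start from a clean tree
  for (f in folders) {
    dir.create(file.path(root, f), recursive = TRUE)
    toy <- data.frame(z   = c(2, 2, 3, 1),
                      FDR = c(0.001, 0.02, 0.005, 0.3))
    write.table(toy, file.path(root, f, "MSMS_23PepInfo.txt"),
                sep = "\t", row.names = FALSE, quote = FALSE)
  }
  root
}

root  <- make_toy_data()
paths <- list.files(root, pattern = "PepInfo\\.txt$",
                    recursive = TRUE, full.names = TRUE)
toy_list <- lapply(paths, read.table, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE)
names(toy_list) <- basename(dirname(paths))    # folder names: a1, a2, b1
```

Posting something like this together with the function lets others run the code without needing the original files.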

dir()
#[1] "a1" "a2" "a3" "b1" "b2" "c1"

direct<- list.files(recursive=TRUE)[grepl("^a|^b",dir())]

direct
#[1] "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
#[4] "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
directA<- list.files(recursive=TRUE)[grepl("^a",dir())]
directB<- list.files(recursive=TRUE)[grepl("^b",dir())]
lista<- lapply(direct,function(x)
read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))

listaA<-lapply(directA, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
listaB<-lapply(directB, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))

#here I am changing the names listaT, z, etc..

#count different mm values
cab <- vector()
for (i in 1:length(lista)) {
  dc<-lista[[i]][ifelse(lista[[i]]$b<0.01, TRUE, FALSE),]
  dc<-table(dc$mm)
  cab <- c(cab, names(dc))
}

#Relative freqs to construct the graph
cab <- unique(cab)
d <- matrix(ncol=length(cab))
dci <- d[-1,]
dcf <- d[-1,]
dti <- d[-1,]
dtf <- d[-1,]

########################################
for (i in 1:length(listaA)) {

  #Relative freq of all data
  dcc<-listaA[[i]]
  dcc<-table(factor(dcc$mm, levels=cab))
  dci<- rbind(dci, dcc)
  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "a")


  #Relative freq of data with FDR<0.01
  dcc1<-listaA[[i]][ifelse(listaA[[i]]$FDR<0.01, TRUE, FALSE),]
  dcc1<-table(factor(dcc1$mm, levels=cab))
  dcf<- rbind(dcf,dcc1)
  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "a")
}

for (i in 1:length(listaB)) {

  #Relative freq of all data
  dct<-listaB[[i]]
  dct<-table(factor(dct$mm, levels=cab))
  dti<- rbind(dti, dct)
  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "b")


  #Relative freq of data with FDR<0.01
  dct1<-listaB[[i]][ifelse(listaB[[i]]$FDR<0.01, TRUE, FALSE),]
  dct1<-table(factor(dct1$mm, levels=cab))
  dtf<- rbind(dtf,dct1)
  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "b")
}
freq.i<-rbind(dci,dti)
freq.f<-rbind(dcf,dtf)
freq.rel.i<-freq.i/apply(freq.i,1,sum)
freq.rel.f<-freq.f/apply(freq.f,1,sum)


freq.i
#   2 3
#a1 4 1
#a2 4 1
#a3 4 1
#b1 4 1
#b2 4 1
#b3 4 1
#b4 4 1

#result from my code
files<-paste("MSMS_",23,"PepInfo.txt",sep="")
read.data<-function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")

res2<-split(lista,names(lista))
res3<- lapply(res2,function(x)
{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],
function(x) table(x$mm[x[["b"]]<0.01]))))
names(res4)<- names(res2)


res4
#$group_a
#   2 3
#a1 3 1
#a2 3 1
#a3 3 1

#$group_b
#   2 3
#b1 3 1
#b2 3 1

#$group_c
#   2 3
#c1 3 1

There is a difference between the output of freq.i and res4: freq.i lists rows b1 to b4, but there were only two files under `group_b`. So, check your code.
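
One way to cross-check how many files belong to each group is to derive the group from the folder name rather than hard-coding "^a" or "^b". The sketch below works on a hypothetical vector of relative paths (no files are read) and assumes the group is the folder name with its trailing digits removed.

```r
# Derive the group from the folder name instead of hard-coding "^a" or "^b".
# The paths are hypothetical; no files are read here.
paths <- c("a1/MSMS_23PepInfo.txt", "a2/MSMS_23PepInfo.txt",
           "a3/MSMS_23PepInfo.txt", "b1/MSMS_23PepInfo.txt",
           "b2/MSMS_23PepInfo.txt", "c1/MSMS_23PepInfo.txt")

folder <- basename(dirname(paths))                  # "a1" "a2" ... "c1"
group  <- paste0("group_", sub("[0-9]+$", "", folder))
by_group <- split(folder, group)                    # folders in each group

lengths(by_group)
```

Because the groups come from the data, the same lines keep working when a new letter appears or when the group names are not known in advance.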
A.K.






________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, February 18, 2013 10:27 AM
Subject: Re: reading data


Hi!!!

I'm coming back with a new question.

I want a function to do my statistics. I started with what you had sent me:

z.plot <- function(directory,number) {
 setwd(directory)
 indx<-gsub("[./]","",list.dirs())
 indx1<- indx[indx!=""]
 print(indx1)
 files<-paste("MSMS_",number,"PepInfo.txt",sep="")
 read.data<-function(x)
 {names(x)<-gsub("^(.*)\\/.*","\\1",x);
 lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
 lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
 print(lista)
 #names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="") ve = TRUE)
}
z.plot("C:/Users/Vera Costa/Desktop/dados.lixo",23)


In my lista I can't merge rows to form the groups, because the idea is to count, for each file, the frequencies of mm when b<0.01. After that I want a graph like the one attached.


When I had 2 groups and knew the names of the groups, I used the code below (but now I have more groups and, maybe, I don't know the names of the groups):

z.plot <- function(directory,number) {
 #reading data
 setwd(directory)
 direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""),
 full.names = FALSE, recursive = TRUE)
 directT <- direct[grepl("^t", direct)]
 directC <- direct[grepl("^c", direct)]

 lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
 listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
 listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))

 #count different z values
 cab <- vector()
 for (i in 1:length(lista)) {
  dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
  dc<-table(dc$z)
  cab <- c(cab, names(dc))
 }

 #Relative freqs to construct the graph
 cab <- unique(cab)
 d <- matrix(ncol=length(cab))
 dci <- d[-1,]
 dcf <- d[-1,]
 dti <- d[-1,]
 dtf <- d[-1,]

 for (i in 1:length(listaC)) {

  #Relative freq of all data
  dcc<-listaC[[i]]
  dcc<-table(factor(dcc$z, levels=cab))
  dci<- rbind(dci, dcc)
  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")


  #Relative freq of data with FDR<0.01
  dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
  dcc1<-table(factor(dcc1$z, levels=cab))
  dcf<- rbind(dcf,dcc1)
  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
 }

 for (i in 1:length(listaT)) {

  #Relative freq of all data
  dct<-listaT[[i]]
  dct<-table(factor(dct$z, levels=cab))
  dti<- rbind(dti, dct)
  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")


  #Relative freq of data with FDR<0.01
  dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
  dct1<-table(factor(dct1$z, levels=cab))
  dtf<- rbind(dtf,dct1)
  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
 }
 freq.i<-rbind(dci,dti)
 freq.f<-rbind(dcf,dtf)
 freq.rel.i<-freq.i/apply(freq.i,1,sum)
 freq.rel.f<-freq.f/apply(freq.f,1,sum)

#Graph plot
colour<-sample(rainbow(nrow(freq.rel.i)))
par(mfrow=c(1,2))
barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",
col=colour,legend.text = rownames(freq.rel.i))
barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",
col=colour,legend.text = rownames(freq.rel.f))
#average of the group (except c1&t1)
freqs<-rbind(dcf[-1,], dtf[-1,])
average<-apply(freqs,2,mean)

#chisquare test function
chisq.test<-function(x,y){
 somax<-sum(x)
 somay<-sum(y)
 nj.<-x+y
 nj<-sum(nj.)
 ejx<-(nj./nj)*somax
 ejy<-(nj./nj)*somay
 ETx<-((x-ejx)^2)/ejx
 ETy<-((y-ejy)^2)/ejy
 ETobs<-sum(ETx)+sum(ETy)
 pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
 return(pvalue)
}

#pvalues of the chisquare test between sample and average (H0: two samples have the same distribution)
pvalues<-c()
for (i in 1:(nrow(freqs))){
a<-chisq.test(freqs[i,],average)
pvalues<-c(pvalues,a)
}
#data frame with final p-values
dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
colnames(dataframe)<-c("sample name","pvalue")
print(dataframe)
}
z.plot("C:/Users/Vera/Desktop/data",23)



Thank you again



2013/2/17 arun <smartpink111 at yahoo.com>

>Hi Vera,
>No problem. I am cc:ing to r-help.
>
>A.K.
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Sunday, February 17, 2013 5:44 AM
>Subject: Re: reading data
>
>
>
>Hi. Thank you. It works now :-)
>And yes, I use Windows.
>Thank you very much.
>On 17 Feb 2013 00:44, "arun" <smartpink111 at yahoo.com> wrote:
>
>Hi Vera,
>>
>>Have you tried the suggestion?
>>
>>Are you using Windows?
>>Thanks,
>>Arun
>>
>>
>>
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Saturday, February 16, 2013 7:10 PM
>>Subject: Re: reading data
>>
>>
>>Thank you.
>>On my system I get the error " 'what' must be a character string or a function".
>>I need to do the equivalent on my system.
>>Thank you, and sorry one more time.
>>On 16 Feb 2013 23:53, "arun" <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>You didn't mention what the error message was, or whether you are reading file names other than "mmmmm11kk.txt".
>>>
>>>It is working on my system when I run it again.
>>>c() combines values into a vector or list.
>>>
>>>sessionInfo()
>>>R version 2.15.1 (2012-06-22)
>>>Platform: x86_64-pc-linux-gnu (64-bit)
>>>
>>>locale:
>>> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>>> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>>> [7] LC_PAPER=C                 LC_NAME=C
>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>>
>>>attached base packages:
>>>[1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>>other attached packages:
>>>[1] stringr_0.6.2  reshape2_1.2.2
>>>
>>>loaded via a namespace (and not attached):
>>>[1] plyr_1.8
>>>
>>>
>>>#code
>>>
>>>
>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>lapply(x,function(y)
>>>read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>res2<-split(res,names(res))
>>>res3<- lapply(res2,function(x)
>>>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>#result
>>>
>>>res3
>>>#$group_a
>>>#$group_a$a1
>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>
>>>$group_a$a2
>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>
>>>$group_a$a3
>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>
>>>
>>>$group_b
>>>$group_b$b1
>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>
>>>$group_b$b2
>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>
>>>
>>>$group_c
>>>$group_c$c1
>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>
>>>
>>>A.K.
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Saturday, February 16, 2013 6:32 PM
>>>Subject: Re: reading data
>>>
>>>
>>>Sorry again... In:
>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>>what is this c in do.call(c, ...? When I run this line in R, I get an error.
>>>Thank you
>>>On 15 Feb 2013 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>>>
>>>Hi,
>>>>No problem.
>>>>
>>>>BTW, these questions are not stupid..
>>>>Arun
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Friday, February 15, 2013 1:08 PM
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>Thank you very much.
>>>>
>>>>I will try to apply it and afterwards I will tell you if it is OK :-)
>>>>
>>>>Thank you, and sorry about these questions (sometimes stupid questions).
>>>>
>>>>
>>>>
>>>>
>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>
>>>>HI,
>>>>>No problem.
>>>>>c() concatenates its arguments into a vector or list.
>>>>>If I use do.call(cbind,..) or do.call(rbind,...) instead:
>>>>>
>>>>>do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>lapply(x,function(y)
>>>>>read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
>>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>>
>>>>>
>>>>>do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>lapply(x,function(y)
>>>>>read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>#     a1
>>>>>#[1,] List,11
>>>>>#[2,] List,11
>>>>>#[3,] List,11
>>>>>#[4,] List,11
>>>>>#[5,] List,11
>>>>>#[6,] List,11
>>>>>i.e., a list within a list:
>>>>>
>>>>>restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>lapply(x,function(y)
>>>>>read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
>>>>>str(restrial)
>>>>>#List of 6
>>>>># $ :List of 1
>>>>>#  ..$ a1:'data.frame':    6 obs. of  11 variables:
>>>>>#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>-----------------------------------------------------------------
>>>>>str(res)
>>>>>#List of 6
>>>>># $ a1:'data.frame':    6 obs. of  11 variables:
>>>>>#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>#  ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>-----------------------------------------------------------------
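
The role of c in do.call(c, ...) can be seen on a tiny example. The sketch below uses two invented one-column data frames in place of the files read in the thread: do.call(c, nested) behaves like c(nested[[1]], nested[[2]]) and joins the inner lists into one flat, named list.

```r
# Each element of 'nested' is itself a one-element named list,
# like the value returned by the anonymous function in the thread.
nested <- list(list(a1 = data.frame(mm = 1:2)),
               list(a2 = data.frame(mm = 3:4)))

# do.call(c, nested) is the same as c(nested[[1]], nested[[2]]):
# it joins the inner lists into one flat, named list.
flat <- do.call(c, nested)

names(nested)   # NULL: the names sit one level down
names(flat)     # the folder names are now at the top level
```

With the flat list, an element can be reached directly as flat$a1 instead of nested[[1]]$a1.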
>>>>>
>>>>>You mentioned naming these "group_a", "group_b", etc.
>>>>>
>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>res2<-split(res,names(res))
>>>>>
>>>>>res3<- lapply(res2,function(x)
>>>>>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>res3$group_a
>>>>>#$a1
>>>>>
>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>#$a2
>>>>>
>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>
>>>>>#$a3
>>>>>
>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>A.K.
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>Sent: Friday, February 15, 2013 12:39 PM
>>>>>Subject: Re: reading data
>>>>>
>>>>>
>>>>>
>>>>>Thank you very much and sorry my questions.
>>>>>
>>>>>But this code isn't grouping for letters sure? I mean,
a1,a2,a3 is the same group, (the first letter give me the name of the group)
>>>>>
>>>>>Another question, in do.call, you did do.call (c,.....)
.What is c?
>>>>>
>>>>>Sorry
>>>>>
>>>>>
>>>>>

arun

2013-Feb-20 00:29 UTC


[R] reading data

Hi,
Try this:


files<-paste("MSMS_",23,"PepInfo.txt",sep="")
read.data<-function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y) read.table(y,header=TRUE,sep =
"\t",stringsAsFactors=FALSE,fill=TRUE))}
lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
res2<-split(lista,names(lista))
res3<- lapply(res2,function(x)
{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
#Freq whole data
res4<-lapply(seq_along(res3),function(i)
do.call(rbind,lapply(res3[[i]],function(x)
as.data.frame(table(factor(x$z,levels=1:3))))))
names(res4)<- names(res2)
library(reshape2)
freq.i1<-do.call(rbind,lapply(res4,function(x)
dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
freq.i1
#          id 1  2 3
#group_a   a1 1 12 6
#group_c.1 c1 0 10 3
#group_c.2 c2 0 12 3
#group_c.3 c3 0 13 4
#group_t.1 t1 0 10 4
#group_t.2 t2 1 12 6
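
The factor(x$z, levels=1:3) step is what lets rows from different files line up: a plain table() drops values that do not occur, while fixed levels keep them as zero counts. A minimal sketch with two invented charge vectors (z1 and z2 are not from the thread's data):

```r
# Two invented charge vectors; the second has no z == 1.
z1 <- c(1, 2, 2, 3)
z2 <- c(2, 2, 3)

# Plain table() drops absent values, so the two results differ in length.
length(table(z1))
length(table(z2))

# Fixing the levels keeps the zero count, so rows line up under rbind().
counts <- rbind(s1 = table(factor(z1, levels = 1:3)),
                s2 = table(factor(z2, levels = 1:3)))
counts
```

This is why the zero entries for charge 1 appear in the table above instead of the column going missing for some files.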

freq.rel.i1<- as.matrix(freq.i1[,-1]/rowSums(freq.i1[,-1]) )
freq.rel.i1
#                   1         2         3
#group_a   0.05263158 0.6315789 0.3157895
#group_c.1 0.00000000 0.7692308 0.2307692
#group_c.2 0.00000000 0.8000000 0.2000000
#group_c.3 0.00000000 0.7647059 0.2352941
#group_t.1 0.00000000 0.7142857 0.2857143
#group_t.2 0.05263158 0.6315789 0.3157895
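
Dividing a count matrix by its row sums gives row proportions; prop.table() with margin 1 is an equivalent spelling. The sketch below uses three rows of counts copied from the freq.i1 output in this thread (the short row names are chosen here for illustration) and checks that both forms agree.

```r
# Counts copied from the freq.i1 output shown in the thread.
freq <- matrix(c(1, 12, 6,
                 0, 10, 3,
                 0, 12, 3),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("a1", "c1", "c2"), c("1", "2", "3")))

rel1 <- freq / rowSums(freq)   # the divisor is recycled down the rows
rel2 <- prop.table(freq, 1)    # margin 1 = proportions within each row

all.equal(rel1, rel2)
round(rel2["a1", ], 4)
```

Either form works for a matrix; prop.table() just makes the intended margin explicit.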



#Freq with FDR< 0.01
res5<-lapply(seq_along(res3),function(i)
do.call(rbind,lapply(res3[[i]],function(x)
as.data.frame(table(factor(x$z[x[["FDR"]]<0.01],levels=1:3))))))
names(res5)<- names(res2)

freq.f1<- do.call(rbind,lapply(res5,function(x)
dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))

freq.f1
#          id 1  2 3
#group_a   a1 1 10 5
#group_c.1 c1 0  7 2
#group_c.2 c2 0  8 2
#group_c.3 c3 0  6 4
#group_t.1 t1 0  7 4
#group_t.2 t2 1 10 5


freq.rel.f1<- as.matrix(freq.f1[,-1]/rowSums(freq.f1[,-1]))

colour<-sample(rainbow(nrow(freq.rel.i1)))
par(mfrow=c(1,2))
barplot(freq.rel.i1,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",
col=colour,legend.text = rownames(freq.rel.i1))
barplot(freq.rel.f1,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",
col=colour,legend.text = rownames(freq.rel.f1))
#change the legend position
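
For the legend position, barplot() passes anything in its args.legend argument on to legend(). The sketch below is only an illustration on invented relative frequencies; it draws to a null PDF device so that it runs without opening a window.

```r
# Invented relative frequencies: rows are samples, columns are charges.
rel <- matrix(c(0.05, 0.63, 0.32,
                0.00, 0.77, 0.23),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("a1", "c1"), c("1", "2", "3")))

pdf(NULL)   # null device; drop this line and dev.off() to see the plot
mids <- barplot(rel, beside = TRUE, ylim = c(0, 1),
                xlab = "Charge", ylab = "Relative Frequencies",
                col = c("grey30", "grey80"),
                legend.text = rownames(rel),
                args.legend = list(x = "topleft", bty = "n"))
invisible(dev.off())
```

Other keywords accepted by legend(), such as "topright" or "top", can be substituted for "topleft" depending on where the bars leave room.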

Also, I didn't check the rest of the code, from the chi-square test onward.
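
One thing that can be checked independently is the hand-written test itself: defining a function called chisq.test hides the one in the stats package. The sketch below recomputes the same homogeneity statistic under a different, hypothetical name (homogeneity_p) and compares it with stats::chisq.test() on a 2-row table. The counts are invented, and the comparison assumes that no category is zero in both samples.

```r
# Invented counts for one sample and for a comparison profile.
x <- c(10, 5, 3)
y <- c(8, 6, 4)

# Hand-rolled homogeneity statistic, as in the thread's helper,
# but under a name that does not mask stats::chisq.test().
homogeneity_p <- function(x, y) {
  col_tot <- x + y
  ex <- col_tot / sum(col_tot) * sum(x)   # expected counts for x
  ey <- col_tot / sum(col_tot) * sum(y)   # expected counts for y
  stat <- sum((x - ex)^2 / ex) + sum((y - ey)^2 / ey)
  pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
}

p_manual  <- homogeneity_p(x, y)
# Small expected counts trigger a warning about the approximation.
p_builtin <- suppressWarnings(stats::chisq.test(rbind(x, y))$p.value)
all.equal(p_manual, p_builtin)
```

If the two p-values agree on the real data as well, the built-in function could replace the custom one.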
A.K.
________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Tuesday, February 19, 2013 4:19 PM
Subject: Re: reading data


Here is the code and some outputs.

z.plot <- function(directory,number) {
?#reading data
??setwd(directory)
?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
?directT <- direct[grepl("^t", direct)]
?directC <- direct[grepl("^c", direct)]

?lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep =
"\t"))
?listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep =
"\t"))
?listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep =
"\t"))

?#count different z values
?cab <- vector()
??? for (i in 1:length(lista)) {
???????? dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
??????? dc<-table(dc$z)
??????? cab <- c(cab, names(dc))
??}

  #Relative freqs to construct the graph
  cab <- unique(cab)
  print(cab)

###[1] "2" "3" "1"

  d <- matrix(ncol=length(cab))
  dci <- d[-1,]
  dcf <- d[-1,]
  dti <- d[-1,]
  dtf <- d[-1,]

  for (i in 1:length(listaC)) {

    #Relative freq of all data
    dcc<-listaC[[i]]
    dcc<-table(factor(dcc$z, levels=cab))
    dci<- rbind(dci, dcc)
    rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")

    #Relative freq of data with FDR<0.01
    dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
    dcc1<-table(factor(dcc1$z, levels=cab))
    dcf<- rbind(dcf,dcc1)
    rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
  }


  for (i in 1:length(listaT)) {

    #Relative freq of all data
    dct<-listaT[[i]]
    dct<-table(factor(dct$z, levels=cab))
    dti<- rbind(dti, dct)
    rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")

    #Relative freq of data with FDR<0.01
    dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
    dct1<-table(factor(dct1$z, levels=cab))
    dtf<- rbind(dtf,dct1)
    rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
  }

  freq.i<-rbind(dci,dti)
  freq.f<-rbind(dcf,dtf)
  freq.rel.i<-freq.i/apply(freq.i,1,sum)
  freq.rel.f<-freq.f/apply(freq.f,1,sum)

  print(freq.i)
#    2 3 1
#c1 10 3 0
#c2 12 3 0
#c3 13 4 0
#t1 10 4 0
#t2 12 6 1

  print(freq.f)
#    2 3 1
#c1  7 2 0
#c2  8 2 0
#c3  6 4 0
#t1  7 4 0
#t2 10 5 1

  print(freq.rel.i)
#           2         3          1
#c1 0.7692308 0.2307692 0.00000000
#c2 0.8000000 0.2000000 0.00000000
#c3 0.7647059 0.2352941 0.00000000
#t1 0.7142857 0.2857143 0.00000000
#t2 0.6315789 0.3157895 0.05263158

  print(freq.rel.f)
#           2         3      1
#c1 0.7777778 0.2222222 0.0000
#c2 0.8000000 0.2000000 0.0000
#c3 0.6000000 0.4000000 0.0000
#t1 0.6363636 0.3636364 0.0000
#t2 0.6250000 0.3125000 0.0625

#Graph plot
colour<-sample(rainbow(nrow(freq.rel.i)))
par(mfrow=c(1,2))
barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",
ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",
ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))

#average of the group (except c1&t1)
freqs<-rbind(dcf[-1,], dtf[-1,])
average<-apply(freqs,2,mean)
print(average)

#        2         3         1
#8.0000000 3.6666667 0.3333333

#chisquare test function
chisq.test<-function(x,y){
  somax<-sum(x)
  somay<-sum(y)
  nj.<-x+y
  nj<-sum(nj.)
  ejx<-(nj./nj)*somax
  ejy<-(nj./nj)*somay
  ETx<-((x-ejx)^2)/ejx
  ETy<-((y-ejy)^2)/ejy
  ETobs<-sum(ETx)+sum(ETy)
  pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
  return(pvalue)
}

#pvalues of the chisquare test between sample and average
#(H0: the two samples have the same distribution)
pvalues<-c()
for (i in 1:(nrow(freqs))){
a<-chisq.test(freqs[i,],average)
pvalues<-c(pvalues,a)
}


#data frame with final p-values 
dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
colnames(dataframe)<-c("sample name","pvalue")
print(dataframe)

#  sample name    pvalue
#1          c2 0.7235907
#2          c3 0.7963287
#3             0.9079200
}
z.plot("C:/Users/Vera Costa/Desktop/dados",23)

###and two barplots..
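
The hand-written chisq.test() in the function above computes Pearson's statistic for a 2-by-k table built from the two count vectors, with k - 1 degrees of freedom. As a hedged cross-check, the sketch below rewrites it under a different name (chisq.homog, a hypothetical name chosen so that stats::chisq.test is not masked) and compares it with the built-in test on made-up counts. The vectors x and y are illustrative only, not data from the thread.

```r
# Two-sample chi-square homogeneity test, following the thread's formulas.
# chisq.homog is a hypothetical name; it avoids masking stats::chisq.test.
chisq.homog <- function(x, y) {
  col.tot <- x + y                       # column totals of the 2 x k table
  n <- sum(col.tot)                      # grand total
  ex <- col.tot / n * sum(x)             # expected counts for sample x
  ey <- col.tot / n * sum(y)             # expected counts for sample y
  stat <- sum((x - ex)^2 / ex) + sum((y - ey)^2 / ey)
  # upper tail directly, instead of 1 - pchisq(..., lower.tail = TRUE)
  pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
}

# Made-up counts for two samples over three categories
x <- c(12, 40, 25)
y <- c(9, 44, 30)

p.manual  <- chisq.homog(x, y)
p.builtin <- stats::chisq.test(rbind(x, y), correct = FALSE)$p.value
print(c(manual = p.manual, builtin = p.builtin))
```

Both p-values should agree here. Note that a category with zero counts in both vectors gives an expected count of zero and hence NaN in either version, so such columns would need to be dropped first.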


Here, I removed the group a1.

Thank you



2013/2/19 arun <smartpink111 at yahoo.com>

>Hi,
>
>Could you send the results for the folder that was sent to me? That will make
>it easier for me.
>
>Arun
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Tuesday, February 19, 2013 3:47 PM
>
>Subject: Re: reading data
>
>
>Oh sorry, I changed the folder.
>
>I am sending the results for your folder.
>
>
>
>2013/2/19 arun <smartpink111 at yahoo.com>
>
>Hello,
>>
>>
>>Regarding the results, are they from the same folder that you sent to me?
>>I am getting different results when I run your steps.
>>
>>
>>direct<- list.files(recursive=TRUE)
>>direct
>>#[1] "a1/MSMS_23PepInfo.txt" "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt"
>>#[4] "c3/MSMS_23PepInfo.txt" "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>
>>directT<- list.files(recursive=TRUE)[grepl("^t",dir())]
>>
>>directT
>>#[1] "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>
>>
>>directC<- list.files(recursive=TRUE)[grepl("^c",dir())]
>>
>>directC
>>#[1] "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt" "c3/MSMS_23PepInfo.txt"
>>
>>
>>
>>lista<- lapply(direct,function(x)
read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>?
>>listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep =
"\t",fill=TRUE))
>>listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep =
"\t",fill=TRUE))
>>
>>?#count different z values
>>?cab <- vector()
>>??? for (i in 1:length(lista)) {
>>???????? dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>??????? dc<-table(dc$z)
>>??????? cab <- c(cab, names(dc))
>>? }
>>?
>>?#Relative freqs to construct the graph
>>??? cab <- unique(cab)
>>?print(cab)
>>
>>#[1] "1" "2" "3"  #Here the results are not correct
>>
>>
>>d <- matrix(ncol=length(cab))
>>?dci<- d[-1,]
>>??? dcf <- d[-1,]
>>?dti <- d[-1,]
>>?dtf <- d[-1,]
>>
>>??? for (i in 1:length(listaC)) {
>>
>>??#Relative freq of all data
>>??dcc<-listaC[[i]]
>>??dcc<-table(factor(dcc$z, levels=cab))
>>??dci<- rbind(dci, dcc)
>>??rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix =
"c")
>>
>>
>>??#Relative freq of data with FDR<0.01
>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>??dcc1<-table(factor(dcc1$z, levels=cab))
>>??dcf<- rbind(dcf,dcc1)
>>??rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix =
"c")
>>??????? }
>>?print(dci) #here too.
>>
>>#?? 1? 2 3
>>#c1 0 10 3
>>#c2 0 12 3
>>#c3 0 13 4
>>
>>
>>It is important to clear this up before I make any changes to the script.
>>Please send me the output for the same data folder, so that I can understand
>>what is going on.
>>
>>
>>Arun
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Tuesday, February 19, 2013 9:24 AM
>>
>>Subject: Re: reading data
>>
>>
>>Ok.
>>
>>Here is the code and some outputs.
>>
>>z.plot <- function(directory,number) {
>>?#reading data
>>??setwd(directory)
>>?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>?directT <- direct[grepl("^t", direct)]
>>?directC <- direct[grepl("^c", direct)]
>>
>>?lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep =
"\t"))
>>?listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep =
"\t"))
>>?listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep =
"\t"))
>>
>>?#count different z values
>>?cab <- vector()
>>??? for (i in 1:length(lista)) {
>>???????? dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>??????? dc<-table(dc$z)
>>??????? cab <- c(cab, names(dc))
>>??}
>>
>>?#Relative freqs to construct the graph
>>??? cab <- unique(cab)
>>?print(cab)
>>
>>###[1] "1" "2" "3" "4" "5"
>>
>>
>>
>>??? d <- matrix(ncol=length(cab))
>>?dci<- d[-1,]
>>??? dcf <- d[-1,]
>>?dti <- d[-1,]
>>?dtf <- d[-1,]
>>
>>??? for (i in 1:length(listaC)) {
>>
>>??#Relative freq of all data
>>??dcc<-listaC[[i]]
>>??dcc<-table(factor(dcc$z, levels=cab))
>>??dci<- rbind(dci, dcc)
>>??rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix =
"c")
>>
>>
>>??#Relative freq of data with FDR<0.01
>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>??dcc1<-table(factor(dcc1$z, levels=cab))
>>??dcf<- rbind(dcf,dcc1)
>>??rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix =
"c")
>>??????? }
>>?print(dci)
>>
>>###???? 1???? 2??? 3?? 4? 5
>>#c1? 93? 8356 3621 450 55
>>#c2 108 13513 6859 793 73
>>#c3? 97 13526 6724 739 82
>>#c4 101 13417 6574 761 62
>>
>>?print(dcf)
>>
>>###??? 1??? 2??? 3?? 4? 5
>>#c1 10 4576 2100 199 17
>>#c2? 7 7831 4039 314 23
>>#c3 16 7887 4087 286 22
>>#c4 20 7824 4045 311 20
>>
>>?for (i in 1:length(listaT)) {
>>
>>??#Relative freq of all data
>>??dct<-listaT[[i]]
>>??dct<-table(factor(dct$z, levels=cab))
>>??dti<- rbind(dti, dct)
>>??rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix =
"t")
>>
>>
>>??#Relative freq of data with FDR<0.01
>>??dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
>>??dct1<-table(factor(dct1$z, levels=cab))
>>??dtf<- rbind(dtf,dct1)
>>??rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix =
"t")
>>??????? }
>>
>>?print(dti)
>>
>>###???? 1???? 2??? 3?? 4? 5
>>#t1? 32? 8640 4098 429 36
>>#t2 128 13209 6723 788 75
>>#t3? 85 13043 6691 754 82
>>#t4 139 13750 7036 807 84
>>
>>?print(dtf)
>>
>>
>>####??? 1??? 2??? 3?? 4? 5
>>#t1? 5 4885 2571 196? 8
>>#t2 12 7752 4209 360 28
>>#t3 19 7563 4086 336 18
>>#t4 14 8108 4218 312 26
>>
>>
>>??freq.i<-rbind(dci,dti)
>>??freq.f<-rbind(dcf,dtf)
>>??freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>??freq.rel.f<-freq.f/apply(freq.f,1,sum)?
>>?print(freq.i)
>>##???? 1???? 2??? 3?? 4? 5
>>#c1? 93? 8356 3621 450 55
>>#c2 108 13513 6859 793 73
>>#c3? 97 13526 6724 739 82
>>#c4 101 13417 6574 761 62
>>#t1? 32? 8640 4098 429 36
>>#t2 128 13209 6723 788 75
>>#t3? 85 13043 6691 754 82
>>#t4 139 13750 7036 807 84
>>
>>?print(freq.f)
>>??###? 1??? 2??? 3?? 4? 5
>>#c1 10 4576 2100 199 17
>>#c2? 7 7831 4039 314 23
>>#c3 16 7887 4087 286 22
>>#c4 20 7824 4045 311 20
>>#t1? 5 4885 2571 196? 8
>>#t2 12 7752 4209 360 28
>>#t3 19 7563 4086 336 18
>>#t4 14 8108 4218 312 26
>>
>>?print(freq.rel.i)
>>###???????????? 1???????? 2???????? 3????????? 4?????????? 5
>>#c1 0.007395626 0.6644930 0.2879523 0.03578529 0.004373757
>>#c2 0.005059496 0.6330460 0.3213248 0.03714982 0.003419844
>>#c3 0.004582389 0.6389834 0.3176493 0.03491119 0.003873772
>>#c4 0.004829070 0.6415013 0.3143199 0.03638537 0.002964380
>>#t1 0.002417832 0.6528145 0.3096335 0.03241405 0.002720060
>>#t2 0.006117670 0.6313148 0.3213210 0.03766190 0.003584572
>>#t3 0.004115226 0.6314694 0.3239409 0.03650448 0.003969983
>>#t4 0.006371470 0.6302714 0.3225156 0.03699120 0.003850385
>>?print(freq.rel.f)
>>
>>###????????????? 1???????? 2???????? 3????????? 4?????????? 5
>>#c1 0.0014488554 0.6629962 0.3042596 0.02883222 0.002463054
>>#c2 0.0005731128 0.6411495 0.3306861 0.02570820 0.001883085
>>#c3 0.0013010246 0.6413238 0.3323305 0.02325581 0.001788909
>>#c4 0.0016366612 0.6402619 0.3310147 0.02545008 0.001636661
>>#t1 0.0006523157 0.6373125 0.3354207 0.02557078 0.001043705
>>#t2 0.0009707952 0.6271337 0.3405064 0.02912386 0.002265189
>>#t3 0.0015804359 0.6290967 0.3398769 0.02794876 0.001497255
>>#t4 0.0011042751 0.6395330 0.3327023 0.02460956 0.002050797
>>
>>#Graph plot
>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>par(mfrow=c(1,2))
>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>barplot(freq.rel.f,beside=T,main=("Sample with
FDR<0.01"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>
>>#average of the group (except c1&t1)
>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>average<-apply(freqs,2,mean)
>>print(average)
>>
>>#        1          2          3          4          5
>># 14.66667 7827.50000 4114.00000  319.83333   22.83333
>>
>>#chisquare test function
>>chisq.test<-function(x,y){
>>?somax<-sum(x)
>>?somay<-sum(y)
>>?nj.<-x+y
>>?nj<-sum(nj.)
>>?ejx<-(nj./nj)*somax
>>?ejy<-(nj./nj)*somay
>>?ETx<-((x-ejx)^2)/ejx
>>?ETy<-((y-ejy)^2)/ejy
>>?ETobs<-sum(ETx)+sum(ETy)
>>?pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>?return(pvalue)
>>?}
>>
>>#pvalues of the chisquare test between sample and average
>>#(H0: the two samples have the same distribution)
>>pvalues<-c()
>>for (i in 1:(nrow(freqs))){
>>a<-chisq.test(freqs[i,],average)
>>pvalues<-c(pvalues,a)
>>}
>>print(pvalues)
>>##[1] 0.5307206 0.6849480 0.8332661 0.3474956 0.5546527 0.9387602
>>
>>#data frame with final p-values
>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>colnames(dataframe)<-c("sample name","pvalue")
>>print(dataframe)
>>
>>###? sample name??? pvalue
>>#1????????? c2 0.5307206
>>#2????????? c3 0.6849480
>>#3????????? c4 0.8332661
>>#4????????? t2 0.3474956
>>#5????????? t3 0.5546527
>>#6????????? t4 0.9387602
>>}
>>z.plot("C:/Users/Vera Costa/Desktop/dados",23)
>>
>>###and two barplots...
>>
>>Thank you
>>
>>
>>
>>
>>2013/2/19 arun <smartpink111 at yahoo.com>
>>
>>Got it.
>>>
>>>So, if I run the code that you sent yesterday, will I get the correct results
>>>for relative frequency etc.? It would also be great if you could send me the
>>>output generated by your code (on two groups, as you showed yesterday). That
>>>will let me check the results much faster than running your code to see
>>>whether it gives that result (because I have to adjust your code to run on
>>>Linux, especially the dir() call).
>>>
>>>I may be able to run it only later.
>>>
>>>Arun
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Tuesday, February 19, 2013 8:53 AM
>>>
>>>Subject: Re: reading data
>>>
>>>
>>>I sent it in the second email.
>>>
>>>But I am sending it again.
>>>
>>>
>>>
>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>
>>>
>>>>
>>>>Your attachment didn't come through.
>>>>
>>>>Arun
>>>>
>>>>
>>>>
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Tuesday, February 19, 2013 8:47 AM
>>>>
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>Sorry about all the questions.
>>>>
>>>>I attach a small part of my real data (I have a lot of rows).
>>>>
>>>>My main objective is to construct two graphs. The first shows the
>>>>relative frequencies of each sample (c1, c2, c3, ...). The second shows the
>>>>same frequencies, but only for rows with FDR<0.01.
>>>>
>>>>After that I need to compute the average in each group (but without
>>>>the first sample of each group: c1, t1, a1, ...) and do the chi-square test
>>>>to see if the groups have the same distribution. Do you understand?
>>>>
>>>>At first I had only two groups, and I wrote the code that I sent
>>>>you. But I need general code, not for two groups whose names I know, but
>>>>for all groups (sometimes I can have 7, 8 or 9 groups).
>>>>
>>>>Is my explanation better now? :-)
>>>>My English also isn't very good :-)
>>>>
>>>>Please do not publish this data in the forum...
>>>>
>>>>Thank you
>>>>
>>>>
>>>>
>>>>
>>>>2013/2/18 arun <smartpink111 at yahoo.com>
>>>>
>>>>Hi,
>>>>>
>>>>>I ran the code to understand what was going on.
>>>>>
>>>>>I didn't fully understand it, as you wrote the
>>>>>code for your original dataset and not for the 'data' directory you sent to
>>>>>me.
>>>>>
>>>>>A.K.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>Sent: Monday, February 18, 2013 4:02 PM
>>>>>
>>>>>Subject: Re: reading data
>>>>>
>>>>>
>>>>>Thank you.
>>>>>I don't need exactly the same, just something equivalent. I will try your
>>>>>suggestions.
>>>>>Thank you.
>>>>>On 18 Feb 2013 at 19:41, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>
>>>>>Hi,
>>>>>>I am not able to open your graph. I am using Linux.
>>>>>>
>>>>>>Also, the code in the function is not reproducible:
>>>>>>directT <- direct[grepl("^t", direct)]
>>>>>>directC <- direct[grepl("^c", direct)]
>>>>>>
>>>>>>It takes twice as long to work out what is going on.
>>>>>>
>>>>>>dir()
>>>>>>#[1] "a1" "a2" "a3" "b1" "b2" "c1"
>>>>>>
>>>>>>direct<- list.files(recursive=TRUE)[grepl("^a|^b",dir())]
>>>>>>
>>>>>>direct
>>>>>>#[1] "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
>>>>>>#[4] "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
>>>>>>directA<- list.files(recursive=TRUE)[grepl("^a",dir())]
>>>>>>directB<- list.files(recursive=TRUE)[grepl("^b",dir())]
>>>>>>lista<- lapply(direct,function(x)
>>>>>>read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>
>>>>>>listaA<-lapply(directA, function(x)
>>>>>>read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>listaB<-lapply(directB, function(x)
>>>>>>read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>
>>>>>>#here I am changing the names listaT, z, etc.
>>>>>>
>>>>>>#count different mm values
>>>>>>cab <- vector()
>>>>>>for (i in 1:length(lista)) {
>>>>>>  dc<-lista[[i]][ifelse(lista[[i]]$b<0.01, TRUE, FALSE),]
>>>>>>  dc<-table(dc$mm)
>>>>>>  cab <- c(cab, names(dc))
>>>>>>}
>>>>>>
>>>>>>?#Relative freqs to construct the graph
>>>>>>??? cab <- unique(cab)
>>>>>>??? d <- matrix(ncol=length(cab))
>>>>>>?dci<- d[-1,]
>>>>>>??? dcf <- d[-1,]
>>>>>>?dti <- d[-1,]
>>>>>>?dtf <- d[-1,]
>>>>>>
>>>>>>??? ########################################
>>>>>>?for (i in 1:length(listaA)) {
>>>>>>
>>>>>>? #Relative freq of all data
>>>>>>? dcc<-listaA[[i]]
>>>>>>? dcc<-table(factor(dcc$mm, levels=cab))
>>>>>>? dci<- rbind(dci, dcc)
>>>>>>? rownames(dci)<-rownames(1:(nrow(dci)), do.NULL =
FALSE, prefix = "a")
>>>>>>
>>>>>>
>>>>>>? #Relative freq of data with FDR<0.01
>>>>>>? dcc1<-listaA[[i]][ifelse(listaA[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>? dcc1<-table(factor(dcc1$mm, levels=cab))
>>>>>>? dcf<- rbind(dcf,dcc1)
>>>>>>? rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL =
FALSE, prefix = "a")
>>>>>>??????? }
>>>>>>
>>>>>>?for (i in 1:length(listaB)) {
>>>>>>
>>>>>>? #Relative freq of all data
>>>>>>? dct<-listaB[[i]]
>>>>>>? dct<-table(factor(dct$mm, levels=cab))
>>>>>>? dti<- rbind(dti, dct)
>>>>>>? rownames(dti)<-rownames(1:(nrow(dti)), do.NULL =
FALSE, prefix = "b")
>>>>>>
>>>>>>
>>>>>>? #Relative freq of data with FDR<0.01
>>>>>>? dct1<-listaB[[i]][ifelse(listaB[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>? dct1<-table(factor(dct1$mm, levels=cab))
>>>>>>? dtf<- rbind(dtf,dct1)
>>>>>>? rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL =
FALSE, prefix = "b")
>>>>>>??????? }
>>>>>>? freq.i<-rbind(dci,dti)
>>>>>>? freq.f<-rbind(dcf,dtf)
>>>>>>? freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>? freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>
>>>>>>
>>>>>>?freq.i
>>>>>>#?? 2 3
>>>>>>#a1 4 1
>>>>>>#a2 4 1
>>>>>>#a3 4 1
>>>>>>#b1 4 1
>>>>>>#b2 4 1
>>>>>>#b3 4 1
>>>>>>#b4 4 1
>>>>>>#result from my code.
>>>>>>files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>
>>>>>>res2<-split(lista,names(lista))
>>>>>>res3<- lapply(res2,function(x)
>>>>>>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>res4<-lapply(seq_along(res3),function(i)
>>>>>>do.call(rbind,lapply(res3[[i]], function(x)
>>>>>>table(x$mm[x[["b"]]<0.01]))))
>>>>>>names(res4)<- names(res2)
>>>>>>
>>>>>>
>>>>>>res4
>>>>>>#$group_a
>>>>>>#   2 3
>>>>>>#a1 3 1
>>>>>>#a2 3 1
>>>>>>#a3 3 1
>>>>>>
>>>>>>#$group_b
>>>>>>#   2 3
>>>>>>#b1 3 1
>>>>>>#b2 3 1
>>>>>>
>>>>>>#$group_c
>>>>>>#   2 3
>>>>>>#c1 3 1
>>>>>>
>>>>>>There is a difference in output between freq.i and res4.
>>>>>>There were only two files under 'group_b'. So, check your code.
>>>>>>A.K.
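
One likely source of such mismatches is indexing list.files(recursive=TRUE) with a logical vector built from dir(): the two listings need not have the same length or order. A minimal sketch of an alternative, assuming each file sits in a directory named with a letter prefix plus a number, derives the group from each file's own path. The paths below are hypothetical and no files are read.

```r
# Paths as list.files(recursive = TRUE) might return them (hypothetical)
paths <- c("a1/MSMS_23PepInfo.txt", "a2/MSMS_23PepInfo.txt",
           "a3/MSMS_23PepInfo.txt", "b1/MSMS_23PepInfo.txt",
           "b2/MSMS_23PepInfo.txt", "c1/MSMS_23PepInfo.txt")

# Sample id is the directory part of each path; the group is that id
# with the trailing digits removed
sample.id <- dirname(paths)
group <- sub("[0-9]+$", "", sample.id)

# One character vector of paths per group, whatever the group names are
by.group <- split(paths, group)
n.files <- sapply(by.group, length)
print(n.files)
```

Because the group label comes from the same vector that is later read, a group can never pick up files from another directory, and no group names have to be known in advance.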
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>________________________________
>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>Sent: Monday, February 18, 2013 10:27 AM
>>>>>>Subject: Re: reading data
>>>>>>
>>>>>>
>>>>>>Hi!!!
>>>>>>
>>>>>>I'm back with a new question.
>>>>>>
>>>>>>I want a function to do my statistics. I started with what you
>>>>>>had sent me:
>>>>>>
>>>>>>z.plot <- function(directory,number) {
>>>>>>??setwd(directory)
>>>>>>?indx<-gsub("[./]","",list.dirs())
>>>>>>?indx1<- indx[indx!=""]
>>>>>>?print(indx1)
>>>>>>?files<-paste("MSMS_",number,"PepInfo.txt",sep="")
>>>>>>?read.data<-function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y) read.table(y,header=TRUE,sep =
"\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>?lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>?print(lista)
>>>>>>?#names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")?ve
= TRUE)
>>>>>>?}
>>>>>>z.plot("C:/Users/Vera
Costa/Desktop/dados.lixo",23)
>>>>>>
>>>>>>
>>>>>>In my lista I can't merge rows to form the group,
>>>>>>because the idea is to count, for each file, the frequencies of mm when
>>>>>>b<0.01. After that I want a graph like the one attached.
>>>>>>
>>>>>>
>>>>>>When I had 2 groups and knew the names of the groups, I
>>>>>>wrote the code below (but now I have more groups and may not know the
>>>>>>names of the groups):
>>>>>>
>>>>>>z.plot <- function(directory,number) {
>>>>>>?#reading data
>>>>>>??setwd(directory)
>>>>>>?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>>>>>?directT <- direct[grepl("^t", direct)]
>>>>>>?directC <- direct[grepl("^c", direct)]
>>>>>>
>>>>>>?lista<-lapply(direct, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>>?listaC<-lapply(directC, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>>?listaT<-lapply(directT, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>>
>>>>>>?#count different z values
>>>>>>?cab <- vector()
>>>>>>??? for (i in 1:length(lista)) {
>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>??????? dc<-table(dc$z)
>>>>>>??????? cab <- c(cab, names(dc))
>>>>>>??}
>>>>>>
>>>>>>?#Relative freqs to construct the graph
>>>>>>??? cab <- unique(cab)
>>>>>>??? d <- matrix(ncol=length(cab))
>>>>>>?dci<- d[-1,]
>>>>>>??? dcf <- d[-1,]
>>>>>>?dti <- d[-1,]
>>>>>>?dtf <- d[-1,]
>>>>>>
>>>>>>??? for (i in 1:length(listaC)) {
>>>>>>
>>>>>>??#Relative freq of all data
>>>>>>??dcc<-listaC[[i]]
>>>>>>??dcc<-table(factor(dcc$z, levels=cab))
>>>>>>??dci<- rbind(dci, dcc)
>>>>>>??rownames(dci)<-rownames(1:(nrow(dci)), do.NULL =
FALSE, prefix = "c")
>>>>>>
>>>>>>
>>>>>>??#Relative freq of data with FDR<0.01
>>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>??dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL =
FALSE, prefix = "c")
>>>>>>??????? }
>>>>>>
>>>>>>?for (i in 1:length(listaT)) {
>>>>>>
>>>>>>??#Relative freq of all data
>>>>>>??dct<-listaT[[i]]
>>>>>>??dct<-table(factor(dct$z, levels=cab))
>>>>>>??dti<- rbind(dti, dct)
>>>>>>??rownames(dti)<-rownames(1:(nrow(dti)), do.NULL =
FALSE, prefix = "t")
>>>>>>
>>>>>>
>>>>>>??#Relative freq of data with FDR<0.01
>>>>>>??dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>??dct1<-table(factor(dct1$z, levels=cab))
>>>>>>??dtf<- rbind(dtf,dct1)
>>>>>>??rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL =
FALSE, prefix = "t")
>>>>>>??????? }
>>>>>>??freq.i<-rbind(dci,dti)
>>>>>>??freq.f<-rbind(dcf,dtf)
>>>>>>??freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>??freq.rel.f<-freq.f/apply(freq.f,1,sum)?
>>>>>>
>>>>>>#Graph plot
>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>par(mfrow=c(1,2))
>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>barplot(freq.rel.f,beside=T,main=("Sample with
FDR<0.01"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>#average of the group (except c1&t1)
>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>average<-apply(freqs,2,mean)
>>>>>>
>>>>>>#chisquare test function
>>>>>>chisq.test<-function(x,y){
>>>>>>?somax<-sum(x)
>>>>>>?somay<-sum(y)
>>>>>>?nj.<-x+y
>>>>>>?nj<-sum(nj.)
>>>>>>?ejx<-(nj./nj)*somax
>>>>>>?ejy<-(nj./nj)*somay
>>>>>>?ETx<-((x-ejx)^2)/ejx
>>>>>>?ETy<-((y-ejy)^2)/ejy
>>>>>>?ETobs<-sum(ETx)+sum(ETy)
>>>>>>?pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>?return(pvalue)
>>>>>>?}
>>>>>>
>>>>>>#pvalues of the chisquare test between sample and
average (H0: two samples has the same distribution)
>>>>>>pvalues<-c()
>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>pvalues<-c(pvalues,a)
>>>>>>}
>>>>>>#data frame with final p-values
>>>>>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>>colnames(dataframe)<-c("sample
name","pvalue")
>>>>>>print(dataframe)
>>>>>>}
>>>>>>z.plot("C:/Users/Vera/Desktop/data",23)
>>>>>>
>>>>>>
>>>>>>
>>>>>>Thank you again
>>>>>>
>>>>>>
>>>>>>
>>>>>>2013/2/17 arun <smartpink111 at yahoo.com>
>>>>>>
>>>>>>HI Vera,
>>>>>>>
>>>>>>>No problem.? I am cc:ing to r-help.
>>>>>>>
>>>>>>>A.K.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>________________________________
>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>Sent: Sunday, February 17, 2013 5:44 AM
>>>>>>>Subject: Re: reading data
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>Hi. Thank you. It works now:-)
>>>>>>>And yes, I use windows.
>>>>>>>Thank you very much.
>>>>>>>On 17 Feb 2013 at 00:44, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>
>>>>>>>Hi Vera,
>>>>>>>>
>>>>>>>>Have you tried the suggestion?
>>>>>>>>
>>>>>>>>Are you using Windows?
>>>>>>>>Thanks,
>>>>>>>>Arun
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>________________________________
>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>Sent: Saturday, February 16, 2013 7:10 PM
>>>>>>>>Subject: Re: reading data
>>>>>>>>
>>>>>>>>
>>>>>>>>Thank you.
>>>>>>>>On my machine I get the error "'what' must be a character string or a
>>>>>>>>function".
>>>>>>>>I need to do the equivalent on my system.
>>>>>>>>Thank you, and sorry once more.
>>>>>>>>On 16 Feb 2013 at 23:53, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>
>>>>>>>>Hi,
>>>>>>>>>You didn't mention what the error message was, or whether you are
>>>>>>>>>reading files whose names are not "mmmmm11kk.txt".
>>>>>>>>>
>>>>>>>>>It works on my system when I run it again.
>>>>>>>>>c() combines values into a vector or list.
>>>>>>>>>
>>>>>>>>>sessionInfo()
>>>>>>>>>R version 2.15.1 (2012-06-22)
>>>>>>>>>Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>>>>>
>>>>>>>>>locale:
>>>>>>>>> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>>>>>>>> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>>>>>>>>> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>>>>>>>>> [7] LC_PAPER=C                 LC_NAME=C
>>>>>>>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>>>>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>>>>>>>>
>>>>>>>>>attached base packages:
>>>>>>>>>[1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>>>
>>>>>>>>>other attached packages:
>>>>>>>>>[1] stringr_0.6.2  reshape2_1.2.2
>>>>>>>>>
>>>>>>>>>loaded via a namespace (and not attached):
>>>>>>>>>[1] plyr_1.8
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>#code
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>>>>lapply(x,function(y)
>>>>>>>>>read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>res3<- lapply(res2,function(x)
>>>>>>>>>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>#result
>>>>>>>>>
>>>>>>>>>res3
>>>>>>>>>#$group_a
>>>>>>>>>#$group_a$a1
>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>
>>>>>>>>>#$group_a$a2
>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>
>>>>>>>>>#$group_a$a3
>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>#$group_b
>>>>>>>>>#$group_b$b1
>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>
>>>>>>>>>#$group_b$b2
>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>#$group_c
>>>>>>>>>#$group_c$c1
>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>A.K.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>________________________________
>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>Sent: Saturday, February 16, 2013 6:32 PM
>>>>>>>>>Subject: Re: reading data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Sorry again... In:
>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>>>>>>>>what is this c in do.call(c, ...)? When I run
>>>>>>>>>this line in R, I get an error.
>>>>>>>>>Thank you
>>>>>>>>>On 15 Feb 2013 at 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>
>>>>>>>>>Hi,
>>>>>>>>>>No problem.
>>>>>>>>>>
>>>>>>>>>>BTW, these questions are not stupid..
>>>>>>>>>>Arun
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>________________________________
>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>Sent: Friday, February 15, 2013 1:08 PM
>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Thank you very much.
>>>>>>>>>>
>>>>>>>>>>I will try to apply it and afterwards I will tell you
>>>>>>>>>>if it is ok :-)
>>>>>>>>>>
>>>>>>>>>>Thank you, and sorry about these questions
>>>>>>>>>>(sometimes stupid questions).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>>>>>>>
>>>>>>>>>>HI,
>>>>>>>>>>>No problem.
>>>>>>>>>>>c() concatenates its arguments into a vector or
>>>>>>>>>>>list.
>>>>>>>>>>>If I use do.call(cbind,..) or
>>>>>>>>>>>do.call(rbind,...) instead:
>>>>>>>>>>>
>>>>>>>>>>>do.call(cbind, lapply(list.files(recursive=T)[grep("mmmmm11kk", list.files(recursive=T))], function(x) {
>>>>>>>>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>>>>>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>>>>>>>>}))
>>>>>>>>>>>#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
>>>>>>>>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>do.call(rbind, lapply(list.files(recursive=T)[grep("mmmmm11kk", list.files(recursive=T))], function(x) {
>>>>>>>>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>>>>>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>>>>>>>>}))
>>>>>>>>>>>#     a1
>>>>>>>>>>>#[1,] List,11
>>>>>>>>>>>#[2,] List,11
>>>>>>>>>>>#[3,] List,11
>>>>>>>>>>>#[4,] List,11
>>>>>>>>>>>#[5,] List,11
>>>>>>>>>>>#[6,] List,11
>>>>>>>>>>>i.e., a list within a list:
>>>>>>>>>>>
>>>>>>>>>>>restrial <- lapply(list.files(recursive=T)[grep("mmmmm11kk", list.files(recursive=T))], function(x) {
>>>>>>>>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>>>>>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>>>>>>>>})
>>>>>>>>>>>str(restrial)
>>>>>>>>>>>#List of 6
>>>>>>>>>>># $ :List of 1
>>>>>>>>>>>#  ..$ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>-----------------------------------------------------------------
>>>>>>>>>>>str(res)
>>>>>>>>>>>#List of 6
>>>>>>>>>>># $ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>#  ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>-----------------------------------------------------------------
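The difference between the nested and the flattened result can be reproduced without any files. The following is a minimal sketch only; `read_one` and the folder names are hypothetical stand-ins for the read.table() calls above.

```r
# Hypothetical stand-in for reading one file: returns a one-element list
# named after the folder, like the inner lapply() above.
read_one <- function(folder) {
  setNames(list(data.frame(Id = c("aAA", "aA"), x = c(739L, 1L))), folder)
}
folders <- c("a1", "a2", "b1")

nested <- lapply(folders, read_one)  # list of 3, each a list holding 1 data frame
flat   <- do.call(c, nested)         # same as c(nested[[1]], nested[[2]], nested[[3]])

names(nested)        # NULL: the outer list is unnamed
names(flat)          # "a1" "a2" "b1"
class(flat[["a1"]])  # "data.frame"
```

do.call(c, ...) passes the three inner lists to c() as separate arguments, so c() joins them into one list whose elements are the data frames themselves.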
>>>>>>>>>>>
>>>>>>>>>>>You mentioned naming these "group_a", "group_b", etc.
>>>>>>>>>>>
>>>>>>>>>>>names(res) <- paste("group_", gsub("\\d+", "", names(res)), sep="")
>>>>>>>>>>>res2 <- split(res, names(res))
>>>>>>>>>>>
>>>>>>>>>>>res3 <- lapply(res2, function(x) {
>>>>>>>>>>>  names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep="")
>>>>>>>>>>>  x
>>>>>>>>>>>})
>>>>>>>>>>>res3$group_a
>>>>>>>>>>>#$a1
>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>
>>>>>>>>>>>#$a2
>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>
>>>>>>>>>>>#$a3
>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
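The grouping step can also be tried on a toy list. This is a sketch under assumptions: the list `res` below is made up, and the group label is built separately so that the original folder names survive inside each group (a small variation on the renaming used above).

```r
# Hypothetical flat list named after the folders (a1, a2, b1).
res <- list(a1 = data.frame(x = 1), a2 = data.frame(x = 2), b1 = data.frame(x = 3))

# Drop the digits and prefix "group_": a1, a2 -> group_a; b1 -> group_b.
grp <- paste0("group_", gsub("\\d+", "", names(res)))

# split() on a list groups its elements by the labels in grp; the element
# names (a1, a2, ...) are kept inside each group.
res3 <- split(res, grp)

names(res3)          # "group_a" "group_b"
names(res3$group_a)  # "a1" "a2"
```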
>>>>>>>>>>>A.K.
>>>>>>>>>>>
>>>>>>>>>>>________________________________
>>>>>>>>>>>From: Vera Costa <veracosta.rt at
gmail.com>
>>>>>>>>>>>To: arun <smartpink111 at
yahoo.com>
>>>>>>>>>>>Sent: Friday, February 15, 2013
12:39 PM
>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Thank you very much, and sorry for my questions.
>>>>>>>>>>>
>>>>>>>>>>>But this code isn't grouping by letter, is it? I mean, a1, a2, a3 are the same group (the first letter gives me the name of the group).
>>>>>>>>>>>
>>>>>>>>>>>Another question: in do.call, you wrote do.call(c, ...). What is c?
>>>>>>>>>>>
>>>>>>>>>>>Sorry
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>2013/2/15 arun <smartpink111 at
yahoo.com>
>>>>>>>>>>>
>>>>>>>>>>>HI,
>>>>>>>>>>>>
>>>>>>>>>>>>Just to add:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>res <- do.call(c, lapply(list.files(recursive=T)[grep("mmmmm11kk", list.files(recursive=T))], function(x) {
>>>>>>>>>>>>  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
>>>>>>>>>>>>  lapply(x, function(y) read.table(y, header=TRUE, stringsAsFactors=FALSE, fill=TRUE))
>>>>>>>>>>>>}))
>>>>>>>>>>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>
>>>>>>>>>>>>names(res) <- paste("group_", gsub("\\d+", "", names(res)), sep="")
>>>>>>>>>>>>res[grep("group_b", names(res))]
>>>>>>>>>>>>
>>>>>>>>>>>>I am not sure how you want the grouped data to look. If you want something like this:
>>>>>>>>>>>>res1 <- do.call(rbind, res)
>>>>>>>>>>>>res2 <- lapply(split(res1, gsub("[.0-9]", "", row.names(res1))), function(x) {
>>>>>>>>>>>>  row.names(x) <- 1:nrow(x)
>>>>>>>>>>>>  x
>>>>>>>>>>>>})
>>>>>>>>>>>>res2
>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>
>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>#or if you want it like this:
>>>>>>>>>>>>res2 <- split(res, names(res))
>>>>>>>>>>>>
>>>>>>>>>>>>res2[["group_b"]]
>>>>>>>>>>>>
>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>
>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>
>>>>>>>>>>>>Hope this helps.
>>>>>>>>>>>>
>>>>>>>>>>>>A.K.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>>From: "veracosta.rt at
gmail.com" <veracosta.rt at gmail.com>
>>>>>>>>>>>>To: smartpink111 at yahoo.com
>>>>>>>>>>>>Cc:
>>>>>>>>>>>>Sent: Friday, February 15, 2013
9:15 AM
>>>>>>>>>>>>Subject: reading data
>>>>>>>>>>>>
>>>>>>>>>>>>Hi,
>>>>>>>>>>>>I posted yesterday and you helped me. I have a small problem.
>>>>>>>>>>>>
>>>>>>>>>>>>First, I have never worked with regular expressions...
>>>>>>>>>>>>
>>>>>>>>>>>>The code that you gave me is fine, but my files are inside the folders a1, a2, a3. I will try to explain better.
>>>>>>>>>>>>
>>>>>>>>>>>>I have one folder named "data". Inside this folder I have some other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>>>>>>>>>>>>The name of the folder gives me the name of the group, but I need to read the file inside. Afterwards I want "group_a", "group_b", ... because I need to work with this data grouped (and know the name of each group).
>>>>>>>>>>>>
>>>>>>>>>>>>Thank you.
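The folder layout described above can be mimicked in a temporary directory to see how the wanted file is picked out of each sub-folder. This is a sketch only: the directory `data_demo`, the extra file `other.txt` and the file contents are made up for illustration.

```r
# Build a throwaway directory tree that mimics the layout described above.
root <- file.path(tempdir(), "data_demo")
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(Id = "aAA", x = 1L), file.path(root, d, "mmmmmm.txt"),
              row.names = FALSE, quote = FALSE)
  writeLines("ignored", file.path(root, d, "other.txt"))
}

# Find only the wanted file in every sub-folder.
paths <- list.files(root, pattern = "^mmmmmm\\.txt$", recursive = TRUE,
                    full.names = TRUE)

# The folder name is the last component of each file's directory.
folders <- basename(dirname(paths))
dat <- setNames(lapply(paths, read.table, header = TRUE,
                       stringsAsFactors = FALSE), folders)
names(dat)  # "a1" "a2" "b1"
```

Using basename(dirname(path)) avoids a regular expression for the folder name, which may be easier to read for someone new to them.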

arun

2013-Feb-25 15:28 UTC

[R] reading data

HI,
>I need to
>- read data (like the code that you did)
>- select only data with FDR<0.01 for all files
>- remove the first file of each group (a1, c1, t1, ...)
>- select only the columns Seq, Mod, z, spec for all files
>- for each remaining file, merge data with the same Seq, Mod and z (grouping the spec)
>- table the frequencies of spec like:
>    seq    c2  c3  c4  t1  ....
>    aaaaA   0   2   5   6
>this table is how many spec I have (in total)

Try this:


files <- paste("MSMS_", 23, "PepInfo.txt", sep="")
read.data <- function(x) {
  names(x) <- gsub("^(.*)\\/.*", "\\1", x)
  lapply(x, function(y) read.table(y, header=TRUE, sep="\t", stringsAsFactors=FALSE, fill=TRUE))
}
lista <- do.call("c", lapply(list.files(recursive=T)[grep(files, list.files(recursive=T))], read.data))
names(lista) <- paste("group_", gsub("\\d+", "", names(lista)), sep="")
res2 <- split(lista, names(lista))
res3 <- lapply(res2, function(x) {
  names(x) <- paste(gsub(".*_", "", names(x)), 1:length(x), sep="")
  x
})
#Freq FDR<0.01
#lev<-sort(unique(do.call(c,lapply(seq_along(res3),function(i) do.call(c,lapply(res3[[i]],function(x) unique(x$z)))))))
res4 <- lapply(seq_along(res3), function(i)
  lapply(res3[[i]], function(x) x[x[["FDR"]] < 0.01, c("Seq", "Mod", "z", "spec")]))
names(res4) <- names(res2)
res5 <- lapply(res4, function(x) if (length(x) > 1) tail(x, -1) else NULL)
library(plyr)
library(data.table)
res6 <- lapply(res5, function(x) lapply(x, function(x1) {
  x1 <- data.table(x1)
  x1[, spec := paste(spec, collapse=","), by=c("Seq", "Mod", "z")]
}))
res7 <- lapply(res6, function(x) lapply(x, function(x1) {
  x1$counts <- sapply(x1$spec, function(x2) length(gsub("\\s", "", unlist(strsplit(x2, ",")))))
  x3 <- as.data.frame(x1)
  x3[, -4]
}))
res8 <- lapply(res7, function(x) Reduce(function(...) merge(..., by=c("Seq", "Mod", "z"), all=TRUE), x))
res9 <- res8[lapply(res8, length) != 0]
res10 <- Reduce(function(...) merge(..., by=c("Seq", "Mod", "z"), all=TRUE), res9)
names(res10)[4:6] <- c("c2", "c3", "t2")
head(res10)
#                        Seq                 Mod z c2 c3 t2
#1    aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2 NA NA  1
#2     aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2 NA NA  1
#3          aAAAAAAAAAGAAGGR          1-n_acPro/ 2  1 NA  1
#4               AAAAAAALQAK                     2 NA  1  1
#5            aAAAAAGAGPEMVR          1-n_acPro/ 2  2 NA  2
#6 aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2 NA NA  1
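The final merge step can be tried in isolation on small made-up tables. The sketch below is an illustration only: the per-run tables in `counts` are hypothetical, and naming each count column after its run before merging is a variation that avoids renaming columns by position afterwards.

```r
# Hypothetical per-run count tables keyed by Seq, Mod and z.
counts <- list(
  c2 = data.frame(Seq = c("AAK", "AAR"), Mod = c("", ""), z = c(2L, 2L),
                  counts = c(2L, 1L)),
  c3 = data.frame(Seq = "AAK", Mod = "", z = 2L, counts = 1L),
  t2 = data.frame(Seq = c("AAK", "GGK"), Mod = c("", ""), z = c(2L, 3L),
                  counts = c(1L, 4L))
)

# Name each count column after its run before merging.
named <- Map(function(d, run) {
  names(d)[names(d) == "counts"] <- run
  d
}, counts, names(counts))

# Reduce() folds the list with a full outer join on the three key columns.
wide <- Reduce(function(a, b) merge(a, b, by = c("Seq", "Mod", "z"), all = TRUE),
               named)
wide
```

A peptide that is missing from a run ends up as NA in that run's column, as in the output above.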





________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Monday, February 25, 2013 8:56 AM
Subject: Re: reading data


You're correct, but my real data have about 40000 rows, and I can have
duplicated rows. I count the number of spec for rows that have the same Seq, Mod and z.

For the attached data, if I run this code (only for c and t),

c1 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c1/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
c2 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c2/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
c3 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c3/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
t1 <- read.table("C:/Users/Vera Costa/Desktop/data.new/t1/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
t2 <- read.table("C:/Users/Vera Costa/Desktop/data.new/t2/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
dc1<-c1[ifelse(c1$FDR<0.01, TRUE, FALSE),]
dc2<-c2[ifelse(c2$FDR<0.01, TRUE, FALSE),]
dc3<-c3[ifelse(c3$FDR<0.01, TRUE, FALSE),]
dt1<-t1[ifelse(t1$FDR<0.01, TRUE, FALSE),]
dt2<-t2[ifelse(t2$FDR<0.01, TRUE, FALSE),]
bc1 <- aggregate(spec ~ Seq + Mod+z, data = dc1, paste, collapse = ",")
bc2 <- aggregate(spec ~ Seq + Mod+z, data = dc2, paste, collapse = ",")
bc3 <- aggregate(spec ~ Seq + Mod+z, data = dc3, paste, collapse = ",")
bt1 <- aggregate(spec ~ Seq + Mod+z, data = dt1, paste, collapse = ",")
bt2 <- aggregate(spec ~ Seq + Mod+z, data = dt2, paste, collapse = ",")
bc1$counts <- sapply(bc1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
bc2$counts <- sapply(bc2$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
bc3$counts <- sapply(bc3$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
bt1$counts <- sapply(bt1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
bt2$counts <- sapply(bt2$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
bc1<-bc1[,-4]
bc2<-bc2[,-4]
bc3<-bc3[,-4]
bt1<-bt1[,-4]
bt2<-bt2[,-4]
a1<-merge(bc1,bc2,by=c("Seq","Mod","z"),all=TRUE)
a2<-merge(a1,bc3,by=c("Seq","Mod","z"),all=TRUE)
a3<-merge(bt1,bt2,by=c("Seq","Mod","z"),all=TRUE)
aa<-merge(a2,a3,by=c("Seq","Mod","z"),all=TRUE)
aa
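The five near-identical blocks above could be folded into one helper that is applied to every run. The following is a sketch only, on made-up data: `count_spec` and the toy `runs` list are hypothetical, and the helper counts comma-separated spec entries directly instead of pasting them together and splitting them again.

```r
# Hypothetical small inputs standing in for the files read above.
runs <- list(
  c1 = data.frame(Seq = c("AAK", "AAK", "AAR"), Mod = c("", "", ""),
                  z = c(2L, 2L, 2L), spec = c("10", "11,12", "13"),
                  FDR = c(0.001, 0.002, 0.5)),
  c2 = data.frame(Seq = c("AAK", "AAR"), Mod = c("", ""), z = c(2L, 3L),
                  spec = c("20", "21"), FDR = c(0.001, 0.003))
)

count_spec <- function(d, cutoff = 0.01) {
  d <- d[d$FDR < cutoff, c("Seq", "Mod", "z", "spec")]
  # Number of spec entries per Seq/Mod/z; a row such as "11,12" counts twice.
  out <- aggregate(spec ~ Seq + Mod + z, data = d,
                   FUN = function(s) sum(lengths(strsplit(as.character(s), ","))))
  names(out)[4] <- "counts"
  out
}

counted <- lapply(runs, count_spec)
counted$c1  # one row: AAK with counts 3; the FDR 0.5 row is dropped
```

Writing the filter once also removes the risk of a copy-and-paste slip between runs. lengths() needs R 3.2.0 or later.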





I have the output 


                                       Seq                 Mod z counts.x.x counts.y.x counts.x.y counts.y.y
1                         aAAAAAAAAAGAAGGR          1-n_acPro/ 2         NA          1          1          1
2                    aAAAAAAAGAAGGRGSGPGRR          1-n_acPro/ 2          1         NA         NA         NA
3                           aAAAAAGAGPEMVR          1-n_acPro/ 2         NA          2          1          2
4                          aAAAAATAAAAASIR          1-n_acPro/ 2          1         NA          1         NA
5                aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 2         NA          1         NA         NA
6                aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 3          1          2         NA          1
7                aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2         NA         NA         NA          1
8                aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 3         NA         NA         NA          1
9                              AAAAAPGTAEK                     2          1          1         NA         NA
10                             aAAAELSLLEK          1-n_acPro/ 1         NA         NA         NA          1
11                             aAAAELSLLEK          1-n_acPro/ 2          1         NA         NA         NA
12                            AAAAEVLGLILR                     2         NA          1         NA          1
13         aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR          1-n_acPro/ 3          1         NA         NA          1
14                           aAAAVVVPAEWIK          1-n_acPro/ 2          1         NA          1         NA
15                    aAADGDDSLYPIAVLIDELR          1-n_acPro/ 2          1         NA          1         NA
16                    aAADGDDSLYPIAVLIDELR          1-n_acPro/ 3         NA         NA          1         NA
17                          AAADLMAYCEAHAK                     2          1         NA         NA         NA
18                          AAADLMAYCEAHAK                     3          1         NA          1         NA
19         aAAEAANCIMEVSCGQAESSEKPNAEDMTSK          1-n_acPro/ 3          1         NA          1         NA
20                    AAAEIYEEFLAAFEGSDGNK                     2          1         NA          1         NA
21                  AAAIGIDLGTTYSCVGVFQHGK                     2          1         NA         NA         NA
22                  AAAIGIDLGTTYSCVGVFQHGK                     3          1         NA         NA         NA
23                       AAALATVNAWAEQTGMK                     2          1         NA         NA         NA
24                  AAAPAPEEEMDECEQALAAEPK                     2          1         NA         NA         NA
25                     AAAQLLQSQAQQSGAQQTK                     2          1         NA         NA         NA
26                           AAATPESQEPQAK                     2          1         NA         NA         NA
27                   aAAVAAAGAGEPQSPDELLPK          1-n_acPro/ 2          1         NA         NA         NA
28           AAAVVGInSETIMKPASISEEELLNLINK    8-N_Deamidation/ 3          1         NA         NA         NA
29                            aADTQVSETLKR          1-n_acPro/ 2          1         NA         NA         NA
30                          AAEDDEDDDVDTKK                     2          1         NA         NA         NA
31                             AAEEPSKVEEK                     2          1         NA         NA         NA
32                             AAEEPSKVEEK                     3          1         NA         NA         NA
33                 AAEGGLSSPEFSELCIWLGSQIK                     2          1         NA         NA         NA
34                    AAELIANSLATAGDGLIELR                     2          1         NA         NA         NA
35                          aAEPNKTEIQTLFK          1-n_acPro/ 2          1         NA         NA         NA
36                AAEQILEDMITIDVENVMEDICSK                     3          1         NA         NA         NA
37                     AAESLADPTEYENLFPGLK                     2          1         NA         NA         NA
38                     AAFDDAIAELDTLSEESYK                     2          1         NA         NA         NA
39                     AAFDDAIAELDTLSEESYK                     3          1         NA         NA         NA
40                        AAFECMYTLLDSCLDR                     2          1         NA         NA         NA
41                        AAGGGAGSSEDDAQSR                     2          1         NA         NA         NA
42                           AAGHPGDPESQQR                     2          1         NA         NA         NA
43                           AAGHPGDPESQQR                     3          2         NA         NA         NA
44                 AAGLATMISTMRPDIDNMDEYVR                     2          1         NA         NA         NA
45                           aAGTLYTYPENWR          1-n_acPro/ 2          1         NA         NA         NA
46                           aAGTSSYWEDLRK          1-n_acPro/ 2          1         NA         NA         NA
47                        aAGVEAAAEVAATEIK          1-n_acPro/ 2         11         NA         NA         NA
48                        AAGVNVEPFWPGLFAK                     2          1         NA         NA         NA
49                             AAHLCAEAALR                     2          1         NA         NA         NA
50          AALAGGTTMIIDHVVPEPGTSLLAAFDQWR                     3          1         NA         NA         NA
51                          AALCHFCIDMLNAK                     2          1         NA         NA         NA
52                      aALDSLSLFTSLGLSEQK          1-n_acPro/ 2          1         NA         NA         NA
53                           AALEALGSCLNNK                     2          1         NA         NA         NA
54                           AALEAQNALHNMK                     2          1         NA         NA         NA
55                     AALETDENLLLCAPTGAGK                     2          1         NA         NA         NA
56                      aALGVLESDLPSAVTLLK          1-n_acPro/ 2          1         NA         NA         NA
57            AALPGILSELDVDVnEGSLMELQGHIGR   15-N_Deamidation/ 3          1         NA         NA         NA
58 AALPSHVVTMLDNFPTnLHPMSQLSAAVTALNSESNFAR   17-N_Deamidation/ 4          1         NA         NA         NA
59 AALPSHVVTMLDNFPTNLHPMSQLSAAVTALNSESNFAR                     3          1         NA         NA         NA
60                        aALTAEHFAALQSLLK          1-n_acPro/ 2          1         NA         NA         NA
61                           AAMADTFLEHMCR                     3          1         NA         NA         NA
62                      AAMFTAGSNFNHVVQNEK                     3          1         NA         NA         NA
63                     aANATTNPSQLLPLELVDK          1-n_acPro/ 2          1         NA         NA         NA
64                         aAAAAVGNAVPCGAR          1-n_acPro/ 2         NA          1         NA          1
65                       AAAAAWEEPSSGNGTAR                     2         NA          1         NA          1
66             aAAAGAAAAAAAEGEAPAEMGALLLEK          1-n_acPro/ 3         NA          1         NA          1
67                aAAAVGAGHGAGGPGAASSSGGAR          1-n_acPro/ 2         NA          1          1         NA
68                aAAAVGAGHGAGGPGAASSSGGAR          1-n_acPro/ 3         NA         NA          1         NA
69                  aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2         NA         NA         NA          1
70                   aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2         NA         NA         NA          1
71                             AAAAAAALQAK                     2         NA         NA         NA          1
72                   aAAAASAPQQLSDEELFSQLR          1-n_acPro/ 2         NA         NA         NA          1
73        aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK          1-n_acPro/ 3         NA         NA         NA          1
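The requested frequency table shows 0 rather than NA for a peptide that is absent from a run. One way to get there, sketched on a made-up table of the same shape (the data frame `aa` and its column names c2 and c3 are hypothetical here):

```r
# Hypothetical merged table with the same shape as the output above.
aa <- data.frame(Seq = c("AAK", "AAR"), Mod = c("1-n_acPro/", ""),
                 z = c(2L, 3L), c2 = c(NA, 1L), c3 = c(1L, NA),
                 stringsAsFactors = FALSE)

# Every column that is not part of the key is treated as a count column.
count_cols <- setdiff(names(aa), c("Seq", "Mod", "z"))

# An NA here means the peptide was not seen in that run, so set it to zero.
aa[count_cols][is.na(aa[count_cols])] <- 0L
aa
```

This assumes that NA only ever arises from the outer merge and never from a genuinely missing measurement.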




2013/2/25 arun <smartpink111 at yahoo.com>

Hi,
What I said was: the `spec` column didn't change before and after the
aggregate() step. I think you used aggregate() to group it based on Seq, Mod, z.
In the example you provided, it was already grouped. Maybe it is not in your
original dataset. Anyway, please email me the output you are getting from your
code.
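Whether a file is already grouped can be checked before aggregating by looking for repeated Seq/Mod/z combinations. A minimal sketch on made-up data (the data frame `d` is hypothetical):

```r
# Hypothetical data: the second and third rows share Seq, Mod and z.
d <- data.frame(Seq = c("AAK", "AAR", "AAR"), Mod = c("", "", ""),
                z = c(2L, 2L, 2L), spec = c(10L, 11L, 12L))

keys <- d[c("Seq", "Mod", "z")]

# TRUE if at least one key combination occurs more than once,
# i.e. aggregate() would actually collapse some rows.
has_dups <- anyDuplicated(keys) > 0

# All rows that belong to a repeated combination.
dups <- d[duplicated(keys) | duplicated(keys, fromLast = TRUE), ]
dups
```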
>
>Arun
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Monday, February 25, 2013 5:36 AM
>
>Subject: Re: reading data
>
>
>Sorry, I don't understand what you said.
>
>I need to
>- read data (like the code that you did)
>- select only data with FDR<0.01 for all files
>- remove the first file of each group (a1, c1, t1, ...)
>- select only the columns Seq, Mod, z, spec for all files
>- for each remaining file, merge data with the same Seq, Mod and z (grouping the spec)
>- table the frequencies of spec like:
>
>    seq    c2  c3  c4  t1  ....
>    aaaaA   0   2   5   6
>this table is how many spec I have (in total)
>
>
>I think my small code isn't correct...
>
>Thank you
>
>
>
>2013/2/23 arun <smartpink111 at yahoo.com>
>
>One more thing:
>>The last column 'spec' in the output is already aggregated based on `Seq`, `Mod`, `z` in the data.new directory.
>>res5[[3]][[1]]
>>
>>                                Seq                 Mod z      spec
>>1            aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2     11833
>>2             aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2     11833
>>3                  aAAAAAAAAAGAAGGR          1-n_acPro/ 2     13103
>>4                       AAAAAAALQAK                     2      3084
>>5                    aAAAAAGAGPEMVR          1-n_acPro/ 2 9646,9821 #################check here
>>
>>6         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2     33650
>>7         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 3     33607
>>9         aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 3     33769
>>11            aAAAASAPQQLSDEELFSQLR          1-n_acPro/ 2     20602
>>12                  aAAAAVGNAVPCGAR          1-n_acPro/ 2     10018
>>13                AAAAAWEEPSSGNGTAR                     2      5576
>>14                      aAAAELSLLEK          1-n_acPro/ 1     19662
>>16                     AAAAEVLGLILR                     2     22857
>>17      aAAAGAAAAAAAEGEAPAEMGALLLEK          1-n_acPro/ 3     26060
>>18  aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR          1-n_acPro/ 3     21479
>>19 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK          1-n_acPro/ 3     21159
>>
>>aggregate() doesn't change anything here, especially in this dataset.
>>In the next line you used sapply(....., ), which gives this output:
>>sapply(res6[[3]][[1]]$spec,function(x) length(gsub("\\s","",unlist(strsplit(x,","))))) # this I believe is not correct
>>#    11833     11833     13103      3084 9646,9821     33650     33607     33769
>>#        1         1         1         1         2         1         1         1
>>#    20602     10018      5576     19662     22857     26060     21479     21159
>>#        1         1         1         1         1         1         1         1
>>#Here you have two `11833` and one `9646,9821`. Not really sure what you want here.
>>
>>
>>
>>If it is:
>>table(unlist(strsplit(res6[[3]][[1]]$spec,","))) #this makes sense
>>
>>#10018 11833 13103 19662 20602 21159 21479 22857 26060  3084 33607 33650 33769
>>#    1     2     1     1     1     1     1     1     1     1     1     1     1
>># 5576  9646  9821
>>#    1     1     1
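The two calls above answer different questions, which a four-element example makes visible. This is a sketch only; the vector `spec` reuses a few of the ids shown above.

```r
# Hypothetical spec column: the last row carries two comma-separated ids.
spec <- c("11833", "11833", "13103", "9646,9821")

# Per-row number of ids (what the sapply() call computes).
per_row <- lengths(strsplit(spec, ","))
per_row          # 1 1 1 2

# Frequency of each individual id across all rows.
freq <- table(unlist(strsplit(spec, ",")))
freq[["11833"]]  # 2
freq[["9646"]]   # 1
```

The first result has one entry per row, so a repeated id such as 11833 shows up twice with a count of 1 each; the second has one entry per distinct id.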
>>
>>Now coming to the last `merge` section:
>>do you want to merge the counts in each group by "spec" name (in this case "Var1")?
>>
>>$group_c
>>$group_c$c2
>>    Var1 Freq
>>1  10039    1
>>2  13200    1
>>3  22929    1
>>4  26117    1
>>5  33712    1
>>6  33774    1
>>7  33867    1
>>8    379    1
>>9   4102    1
>>10  5664    1
>>11  9703    1
>>12  9876    1
>>
>>$group_c$c3
>>    Var1 Freq
>>1  10325    1
>>2  21555    1
>>3  22994    1
>>4  26142    1
>>5   3341    1
>>6  33708    1
>>7  33870    1
>>8  34095    1
>>9   4397    1
>>10  4416    1
>>11  5960    1
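If the counts are to be merged by spec id, two frequency tables like the ones above can be joined on `Var1`. A sketch on made-up ids (the vectors `c2` and `c3` are hypothetical and much shorter than the real tables):

```r
# Hypothetical spectrum ids for two runs of the same group.
c2 <- c("10039", "13200", "379")
c3 <- c("10325", "13200")

f2 <- as.data.frame(table(c2), stringsAsFactors = FALSE)
f3 <- as.data.frame(table(c3), stringsAsFactors = FALSE)
names(f2) <- c("Var1", "c2")  # id column, then the count for run c2
names(f3) <- c("Var1", "c3")

# Full outer join: an id missing from one run gets NA in that run's column.
merged <- merge(f2, f3, by = "Var1", all = TRUE)
merged
```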
>>
>>
>>
>>A.K.
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Friday, February 22, 2013 8:36 PM
>>
>>Subject: Re: reading data
>>
>>
>>Oh, sorry.
>>I'm on my phone now. I will send it tomorrow.
>>Thank you
>>On 22 Feb 2013 at 22:06, "arun" <smartpink111 at yahoo.com> wrote:
>>
>>Hi,
>>>
>>>As I mentioned in my earlier post, seeing the results that you got from your code on the same dataset 'data.new' would be easier for me than figuring out how your code works.
>>>Thanks,
>>>A.K.
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Friday, February 22, 2013 1:13 PM
>>>Subject: Re: reading data
>>>
>>>
>>>Hi.
>>>
>>>I used your code and it was a great help. Thank you.
>>>
>>>I'm now doing a new study of the data, but I need to optimize my code.
>>>
>>>For the same data, I need to:
>>>
>>>- read data (like the code that you did)
>>>- select only data with FDR<0.01 for all files
>>>- remove the first file of each group (a1, c1, t1, ...)
>>>- select only the columns Seq, Mod, z, spec for all files
>>>- for each remaining file, merge data with the same Seq, Mod and z (grouping the spec)
>>>- table the frequencies of spec like:
>>>    seq    c2  c3  c4  t1  ....
>>>    aaaaA   0   2   5   6
>>>    .....
>>>this table is how many spec I have (in total)
>>>
>>>
>>>
>>>
>>>I started writing the code:
>>>
>>>
>>>spec <- function(directory,number) {
>>>  setwd(directory)
>>>  direct <- dir(directory, pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>  directT <- direct[grepl("^t", direct)]
>>>  directC <- direct[grepl("^c", direct)]
>>>
>>>  lista <- lapply(direct, function(x) read.table(x, header=TRUE, sep = "\t"))
>>>  listaC <- lapply(directC, function(x) read.table(x, header=TRUE, sep = "\t"))
>>>  listaT <- lapply(directT, function(x) read.table(x, header=TRUE, sep = "\t"))
>>>
>>>  #boxplots for each run
>>>  dcf <- c()
>>>  dtf <- c()
>>>
>>>  for (i in 1:length(lista)) {
>>>
>>>
>>>  }
>>>
>>>  for (i in 2:length(listaC)) {
>>>    dcc1 <- listaC[[i]][ifelse(listaC[[i]]$FDR<Pfdr, TRUE, FALSE),]
>>>    dcc1 <- aggregate(spec ~ Seq + Mod+z, data = dcc1, paste, collapse = ",")
>>>    dcc1$counts <- sapply(dcc1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>    dcc1 <- dcc1[,-4]
>>>    dcf <- list(dcf,dcc1)
>>>
>>>  }
>>>  print(dcf)
>>>
>>>  merg <- merge(dcf[[1]][[2]],dcf[[2]],by=c("Seq","Mod","z"),all=TRUE)
>>>  print(merg)
>>>  for (i in 2:length(listaT)) {
>>>    dct1 <- listaT[[i]][ifelse(listaT[[i]]$FDR<Pfdr, TRUE, FALSE),]
>>>    dct1 <- aggregate(spec ~ Seq + Mod+z, data = dct1, paste, collapse = ",")
>>>    dct1$counts <- sapply(dct1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>    dct1 <- dct1[,-4]
>>>    dtf <- list(dtf,dct1)
>>>  }
>>>}
>>>spec("C:/Users/Vera Costa/Desktop/data.new",23)
>>>
>>>
>>>I can write the new code. The problem is that this line takes a lot of time to run:
>>>dcc1<- aggregate(spec ~ Seq + Mod+z, data = dcc1, paste, collapse = ",")
>>>
>>>
>>>I have nearly 40000 rows.
>>>
>>>Could you help me to optimize this?
>>>
>>>Thank you.
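If only the number of spec per Seq/Mod/z combination is needed, the strings do not have to be pasted together at all. The sketch below is one possible base-R approach on simulated data, assuming one spec per row; the data frame `d` is made up, and whether this is actually faster on the real files would need to be checked with system.time().

```r
# Simulated stand-in for one file with about 40000 rows.
set.seed(1)
n <- 40000
d <- data.frame(Seq = sample(sprintf("PEP%04d", 1:5000), n, replace = TRUE),
                Mod = sample(c("", "1-n_acPro/"), n, replace = TRUE),
                z = sample(2:3, n, replace = TRUE),
                spec = seq_len(n), stringsAsFactors = FALSE)

# One string key per row; "\r" is used as a separator that cannot occur
# inside the simulated values.
key <- do.call(paste, c(d[c("Seq", "Mod", "z")], sep = "\r"))
first <- !duplicated(key)

# match() maps every row to its unique key; tabulate() counts rows per key.
counts <- cbind(d[first, c("Seq", "Mod", "z")],
                counts = tabulate(match(key, key[first])))
head(counts)
```

The result has one row per distinct Seq/Mod/z combination, in order of first appearance, with the number of rows that share it.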
>>>Vera
>>>
>>>
>>>
>>>
>>>2013/2/20 Vera Costa <veracosta.rt at gmail.com>
>>>
>>>Thank you very much.
>>>>
>>>>I will try.
>>>>
>>>>Thank you
>>>>
>>>>
>>>>
>>>>2013/2/20 arun <smartpink111 at yahoo.com>
>>>>
>>>>
>>>>>
>>>>>Hi,
>>>>>
>>>>>You can change `res4` to:
>>>>>lev <- sort(unique(do.call(c, lapply(seq_along(res3), function(i) do.call(c, lapply(res3[[i]], function(x) unique(x$z)))))))
>>>>>res4 <- lapply(seq_along(res3), function(i) do.call(rbind, lapply(res3[[i]], function(x) as.data.frame(table(factor(x$z, levels=lev))))))
>>>>>
>>>>>freqs1 <- do.call(rbind, lapply(split(freq.f1, gsub("\\d+", "", freq.f1$id)), function(x) x[-1,])) #here there is only level for a1.  So, it is removed
>>>>>average1 <- colMeans(freqs1[,-1])
>>>>>average1
>>>>>#        1         2         3
>>>>>#0.3333333 8.0000000 3.6666667
>>>>>pvalues1 <- do.call(rbind, lapply(seq_len(nrow(freqs1)), function(x) chisq.test(freqs1[x,-1], average1)))
>>>>>row.names(pvalues1) <- row.names(freqs1)
>>>>>pvalues1
>>>>>#                 [,1]
>>>>>#c.group_c.2 0.7235907
>>>>>#c.group_c.3 0.7963287
>>>>>#t           0.9079200
>>>>>
>>>>>
>>>>>A.K.
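The idea behind `lev` above is to collect every charge state seen in any sample first, and only then tabulate each sample against that shared set, so that a level missing from one sample shows up as 0 instead of dropping a column. A small self-contained sketch of that idea follows; the list `samples` and its values are invented, and a flat list is used instead of the nested `res3` structure.

```r
# Two toy tables standing in for files read from different folders.
samples <- list(
  c1 = data.frame(z = c(2, 2, 3)),
  t1 = data.frame(z = c(1, 2, 2, 2))
)

# Shared, sorted set of charge states seen in any sample.
lev <- sort(unique(unlist(lapply(samples, function(x) x$z))))

# One row per sample, one column per charge state; absent levels count as 0.
freq <- t(sapply(samples, function(x) table(factor(x$z, levels = lev))))

# Relative frequencies: each row divided by its own total.
rel <- prop.table(freq, margin = 1)
```

Because every row is built against the same `lev`, the rows can be stacked and passed to `barplot(rel, beside = TRUE)` without any further alignment.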
>>>>>
>>>>>----- Original Message -----
>>>>>
>>>>>From: arun <smartpink111 at yahoo.com>
>>>>>To: Vera Costa <veracosta.rt at gmail.com>
>>>>>Cc: R help <r-help at r-project.org>
>>>>>Sent: Tuesday, February 19, 2013 7:29 PM
>>>>>Subject: Re: reading data
>>>>>
>>>>>Hi,
>>>>>Try this:
>>>>>
>>>>>
>>>>>files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>res2<-split(lista,names(lista))
>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>#Freq whole data
>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z,levels=1:3))))))
>>>>>names(res4)<- names(res2)
>>>>>library(reshape2)
>>>>>freq.i1<-do.call(rbind,lapply(res4,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
>>>>>freq.i1
>>>>>#          id 1  2 3
>>>>>#group_a   a1 1 12 6
>>>>>#group_c.1 c1 0 10 3
>>>>>#group_c.2 c2 0 12 3
>>>>>#group_c.3 c3 0 13 4
>>>>>#group_t.1 t1 0 10 4
>>>>>#group_t.2 t2 1 12 6
>>>>>
>>>>>freq.rel.i1<- as.matrix(freq.i1[,-1]/rowSums(freq.i1[,-1]) )
>>>>> freq.rel.i1
>>>>>#                   1         2         3
>>>>>#group_a   0.05263158 0.6315789 0.3157895
>>>>>#group_c.1 0.00000000 0.7692308 0.2307692
>>>>>#group_c.2 0.00000000 0.8000000 0.2000000
>>>>>#group_c.3 0.00000000 0.7647059 0.2352941
>>>>>#group_t.1 0.00000000 0.7142857 0.2857143
>>>>>#group_t.2 0.05263158 0.6315789 0.3157895
>>>>>
>>>>>
>>>>>
>>>>>#Freq with FDR< 0.01
>>>>>res5<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z[x[["FDR"]]<0.01],levels=1:3))))))
>>>>>names(res5)<- names(res2)
>>>>>
>>>>>freq.f1<- do.call(rbind,lapply(res5,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
>>>>>
>>>>> freq.f1
>>>>>#          id 1  2 3
>>>>>#group_a   a1 1 10 5
>>>>>#group_c.1 c1 0  7 2
>>>>>#group_c.2 c2 0  8 2
>>>>>#group_c.3 c3 0  6 4
>>>>>#group_t.1 t1 0  7 4
>>>>>#group_t.2 t2 1 10 5
>>>>>
>>>>>
>>>>>freq.rel.f1<- as.matrix(freq.f1[,-1]/rowSums(freq.f1[,-1]))
>>>>>
>>>>>colour<-sample(rainbow(nrow(freq.rel.i1)))
>>>>>par(mfrow=c(1,2))
>>>>>barplot(freq.rel.i1,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i1))
>>>>>barplot(freq.rel.f1,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f1))
>>>>>#change the legend position
>>>>>
>>>>>Also, I didn't check the rest of the code from the chi-square test onward.
>>>>>A.K.
>>>>>________________________________
>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>Sent: Tuesday, February 19, 2013 4:19 PM
>>>>>Subject: Re: reading data
>>>>>
>>>>>
>>>>>Here is the code and some outputs.
>>>>>
>>>>>z.plot <- function(directory,number) {
>>>>>?#reading data
>>>>>??setwd(directory)
>>>>>?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>>>>?directT <- direct[grepl("^t", direct)]
>>>>>?directC <- direct[grepl("^c", direct)]
>>>>>
>>>>>?lista<-lapply(direct, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>?listaC<-lapply(directC, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>?listaT<-lapply(directT, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>
>>>>>?#count different z values
>>>>>?cab <- vector()
>>>>>??? for (i in 1:length(lista)) {
>>>>>???????? dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>??????? dc<-table(dc$z)
>>>>>??????? cab <- c(cab, names(dc))
>>>>>??}
>>>>>
>>>>>?#Relative freqs to construct the graph
>>>>>??? cab <- unique(cab)
>>>>>?print(cab)
>>>>>
>>>>>###[1] "2" "3" "1"
>>>>>
>>>>>
>>>>>
>>>>>??? d <- matrix(ncol=length(cab))
>>>>>?dci<- d[-1,]
>>>>>??? dcf <- d[-1,]
>>>>>?dti <- d[-1,]
>>>>>?dtf <- d[-1,]
>>>>>
>>>>>??? for (i in 1:length(listaC)) {
>>>>>
>>>>>??#Relative freq of all data
>>>>>??dcc<-listaC[[i]]
>>>>>??dcc<-table(factor(dcc$z, levels=cab))
>>>>>??dci<- rbind(dci, dcc)
>>>>>??rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE,
prefix = "c")
>>>>>
>>>>>
>>>>>??#Relative freq of data with FDR<0.01
>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE,
FALSE),]
>>>>>??dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE,
prefix = "c")
>>>>>???????? }
>>>>>
>>>>>
>>>>>?for (i in 1:length(listaT)) {
>>>>>
>>>>>??#Relative freq of all data
>>>>>??dct<-listaT[[i]]
>>>>>??dct<-table(factor(dct$z, levels=cab))
>>>>>??dti<- rbind(dti, dct)
>>>>>??rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE,
prefix = "t")
>>>>>
>>>>>
>>>>>??#Relative freq of data with FDR<0.01
>>>>>??dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE,
FALSE),]
>>>>>??dct1<-table(factor(dct1$z, levels=cab))
>>>>>??dtf<- rbind(dtf,dct1)
>>>>>??rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE,
prefix = "t")
>>>>>??????? }
>>>>>
>>>>>??freq.i<-rbind(dci,dti)
>>>>>??freq.f<-rbind(dcf,dtf)
>>>>>??freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>??freq.rel.f<-freq.f/apply(freq.f,1,sum)?
>>>>>
>>>>>?print(freq.i)
>>>>>##????? 2 3 1
>>>>>#c1 10 3 0
>>>>>#c2 12 3 0
>>>>>#c3 13 4 0
>>>>>#t1 10 4 0
>>>>>#t2 12 6 1
>>>>>
>>>>>?print(freq.f)
>>>>>??###???? 2 3 1
>>>>>#c1? 7 2 0
>>>>>#c2? 8 2 0
>>>>>#c3? 6 4 0
>>>>>#t1? 7 4 0
>>>>>#t2 10 5 1
>>>>>
>>>>>?print(freq.rel.i)
>>>>>###?????????????? 2???????? 3????????? 1
>>>>>#c1 0.7692308 0.2307692 0.00000000
>>>>>#c2 0.8000000 0.2000000 0.00000000
>>>>>#c3 0.7647059 0.2352941 0.00000000
>>>>>#t1 0.7142857 0.2857143 0.00000000
>>>>>#t2 0.6315789 0.3157895 0.05263158
>>>>>?print(freq.rel.f)
>>>>>
>>>>>###???????????????? 2???????? 3????? 1
>>>>>#c1 0.7777778 0.2222222 0.0000
>>>>>#c2 0.8000000 0.2000000 0.0000
>>>>>#c3 0.6000000 0.4000000 0.0000
>>>>>#t1 0.6363636 0.3636364 0.0000
>>>>>#t2 0.6250000 0.3125000 0.0625
>>>>>
>>>>>#Graph plot
>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>par(mfrow=c(1,2))
>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>barplot(freq.rel.f,beside=T,main=("Sample with
FDR<0.01"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>
>>>>>#average of the group (except c1&t1)
>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>average<-apply(freqs,2,mean)
>>>>>print(average)
>>>>>
>>>>>###???????????? 2???????? 3???????? 1
>>>>>#8.0000000 3.6666667 0.3333333
>>>>>
>>>>>#chisquare test function
>>>>>chisq.test<-function(x,y){
>>>>> somax<-sum(x)
>>>>> somay<-sum(y)
>>>>> nj.<-x+y
>>>>> nj<-sum(nj.)
>>>>> ejx<-(nj./nj)*somax
>>>>> ejy<-(nj./nj)*somay
>>>>> ETx<-((x-ejx)^2)/ejx
>>>>> ETy<-((y-ejy)^2)/ejy
>>>>> ETobs<-sum(ETx)+sum(ETy)
>>>>> pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>> return(pvalue)
>>>>> }
>>>>>
>>>>>#pvalues of the chisquare test between sample and average (H0: the two samples have the same distribution)
>>>>>pvalues<-c()
>>>>>for (i in 1:(nrow(freqs))){
>>>>>a<-chisq.test(freqs[i,],average)
>>>>>pvalues<-c(pvalues,a)
>>>>>}
>>>>>
>>>>>
>>>>>#data frame with final p-values
>>>>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>colnames(dataframe)<-c("sample name","pvalue")
>>>>>print(dataframe)
>>>>>
>>>>>###  sample name    pvalue
>>>>>#1          c2 0.7235907
>>>>>#2          c3 0.7963287
>>>>>#3             0.9079200
>>>>>}
>>>>>z.plot("C:/Users/Vera Costa/Desktop/dados",23)
>>>>>
>>>>>###and two barplots..
>>>>>
>>>>>
>>>>>Here, I removed the group a1.
>>>>>
>>>>>Thank you
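The hand-written test in the message above compares one sample with the group average by pooling the two count vectors and computing expected counts from the pooled proportions. A compact sketch of the same statistic follows. The function name `homogeneity_pvalue` is hypothetical and was chosen so that it does not mask `stats::chisq.test()`; the input values are sample c2 and the average taken from the printed output above (charge order 2, 3, 1).

```r
# Hypothetical helper; same statistic as the chisq.test() defined in the message,
# but under a name that leaves stats::chisq.test() usable.
homogeneity_pvalue <- function(x, y) {
  pooled <- x + y                          # per-category totals over both samples
  ex <- pooled / sum(pooled) * sum(x)      # expected counts for x
  ey <- pooled / sum(pooled) * sum(y)      # expected counts for y
  stat <- sum((x - ex)^2 / ex) + sum((y - ey)^2 / ey)
  # Upper tail directly, instead of 1 - pchisq(..., lower.tail = TRUE).
  pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
}

# Sample c2 (FDR < 0.01) and the group average, in charge order 2, 3, 1.
c2      <- c(8, 2, 0)
average <- c(8, 11 / 3, 1 / 3)
p <- homogeneity_pvalue(c2, average)
```

With these inputs `p` agrees with the 0.7235907 reported for c2. Note that a category whose pooled total is 0 gives 0/0 in the statistic, so such columns would have to be dropped before calling the function.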
>>>>>
>>>>>
>>>>>
>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>
>>>>>Hi,
>>>>>>
>>>>>>Could you send the results for the folder that you sent to me? That would make it easier for me to check.
>>>>>>
>>>>>>Arun
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>________________________________
>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>Sent: Tuesday, February 19, 2013 3:47 PM
>>>>>>
>>>>>>Subject: Re: reading data
>>>>>>
>>>>>>
>>>>>>Oh sorry, I changed the folder.
>>>>>>
>>>>>>I am sending the results for your folder.
>>>>>>
>>>>>>
>>>>>>
>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>
>>>>>>Hello,
>>>>>>>
>>>>>>>
>>>>>>>Regarding the results, are they from the same folder that you sent to me?
>>>>>>>I am getting different results when I run your steps.
>>>>>>>
>>>>>>>
>>>>>>>direct<- list.files(recursive=TRUE)
>>>>>>>  direct
>>>>>>>#[1] "a1/MSMS_23PepInfo.txt" "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt"
>>>>>>>#[4] "c3/MSMS_23PepInfo.txt" "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>>>>>>
>>>>>>> directT<- list.files(recursive=TRUE)[grepl("^t",dir())]
>>>>>>>
>>>>>>>directT
>>>>>>>#[1] "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>>>>>>
>>>>>>>
>>>>>>>directC<- list.files(recursive=TRUE)[grepl("^c",dir())]
>>>>>>>
>>>>>>>directC
>>>>>>>#[1] "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt" "c3/MSMS_23PepInfo.txt"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>lista<- lapply(direct,function(x) read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>>
>>>>>>>listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>
>>>>>>>?#count different z values
>>>>>>>?cab <- vector()
>>>>>>>??? for (i in 1:length(lista)) {
>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>??????? dc<-table(dc$z)
>>>>>>>??????? cab <- c(cab, names(dc))
>>>>>>>? }
>>>>>>>?
>>>>>>>?#Relative freqs to construct the graph
>>>>>>>??? cab <- unique(cab)
>>>>>>>?print(cab)
>>>>>>>
>>>>>>>#[1] "1" "2" "3"  #Here results are not correct
>>>>>>>
>>>>>>>
>>>>>>>d <- matrix(ncol=length(cab))
>>>>>>>?dci<- d[-1,]
>>>>>>>??? dcf <- d[-1,]
>>>>>>>?dti <- d[-1,]
>>>>>>>?dtf <- d[-1,]
>>>>>>>
>>>>>>>??? for (i in 1:length(listaC)) {
>>>>>>>
>>>>>>>??#Relative freq of all data
>>>>>>>??dcc<-listaC[[i]]
>>>>>>>??dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>??dci<- rbind(dci, dcc)
>>>>>>>??rownames(dci)<-rownames(1:(nrow(dci)), do.NULL
= FALSE, prefix = "c")
>>>>>>>
>>>>>>>
>>>>>>>??#Relative freq of data with FDR<0.01
>>>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>??dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL
= FALSE, prefix = "c")
>>>>>>>??????? }
>>>>>>>?print(dci) #here too.
>>>>>>>
>>>>>>>#?? 1? 2 3
>>>>>>>#c1 0 10 3
>>>>>>>#c2 0 12 3
>>>>>>>#c3 0 13 4
>>>>>>>
>>>>>>>
>>>>>>>It is important to clear this up before I make any changes to the script. You need to send me the output for the same data folder so that I can understand what is going on.
>>>>>>>
>>>>>>>
>>>>>>>Arun
>>>>>>>________________________________
>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>Sent: Tuesday, February 19, 2013 9:24 AM
>>>>>>>
>>>>>>>Subject: Re: reading data
>>>>>>>
>>>>>>>
>>>>>>>Ok.
>>>>>>>
>>>>>>>Here is the code and some outputs.
>>>>>>>
>>>>>>>z.plot <- function(directory,number) {
>>>>>>>?#reading data
>>>>>>>??setwd(directory)
>>>>>>>?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>>>>>>?directT <- direct[grepl("^t", direct)]
>>>>>>>?directC <- direct[grepl("^c", direct)]
>>>>>>>
>>>>>>>?lista<-lapply(direct, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>>>?listaC<-lapply(directC, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>>>?listaT<-lapply(directT, function(x)
read.table(x,header=TRUE, sep = "\t"))
>>>>>>>
>>>>>>>?#count different z values
>>>>>>>?cab <- vector()
>>>>>>>??? for (i in 1:length(lista)) {
>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>??????? dc<-table(dc$z)
>>>>>>>??????? cab <- c(cab, names(dc))
>>>>>>>??}
>>>>>>>
>>>>>>>?#Relative freqs to construct the graph
>>>>>>>??? cab <- unique(cab)
>>>>>>>?print(cab)
>>>>>>>
>>>>>>>###[1] "1" "2" "3"
"4" "5"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>??? d <- matrix(ncol=length(cab))
>>>>>>>?dci<- d[-1,]
>>>>>>>??? dcf <- d[-1,]
>>>>>>>?dti <- d[-1,]
>>>>>>>?dtf <- d[-1,]
>>>>>>>
>>>>>>>??? for (i in 1:length(listaC)) {
>>>>>>>
>>>>>>>??#Relative freq of all data
>>>>>>>??dcc<-listaC[[i]]
>>>>>>>??dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>??dci<- rbind(dci, dcc)
>>>>>>>??rownames(dci)<-rownames(1:(nrow(dci)), do.NULL
= FALSE, prefix = "c")
>>>>>>>
>>>>>>>
>>>>>>>??#Relative freq of data with FDR<0.01
>>>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>??dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL
= FALSE, prefix = "c")
>>>>>>>??????? }
>>>>>>>?print(dci)
>>>>>>>
>>>>>>>###???? 1???? 2??? 3?? 4? 5
>>>>>>>#c1? 93? 8356 3621 450 55
>>>>>>>#c2 108 13513 6859 793 73
>>>>>>>#c3? 97 13526 6724 739 82
>>>>>>>#c4 101 13417 6574 761 62
>>>>>>>
>>>>>>>?print(dcf)
>>>>>>>
>>>>>>>###??? 1??? 2??? 3?? 4? 5
>>>>>>>#c1 10 4576 2100 199 17
>>>>>>>#c2? 7 7831 4039 314 23
>>>>>>>#c3 16 7887 4087 286 22
>>>>>>>#c4 20 7824 4045 311 20
>>>>>>>
>>>>>>>?for (i in 1:length(listaT)) {
>>>>>>>
>>>>>>>??#Relative freq of all data
>>>>>>>??dct<-listaT[[i]]
>>>>>>>??dct<-table(factor(dct$z, levels=cab))
>>>>>>>??dti<- rbind(dti, dct)
>>>>>>>??rownames(dti)<-rownames(1:(nrow(dti)), do.NULL
= FALSE, prefix = "t")
>>>>>>>
>>>>>>>
>>>>>>>??#Relative freq of data with FDR<0.01
>>>>>>>??dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>??dct1<-table(factor(dct1$z, levels=cab))
>>>>>>>??dtf<- rbind(dtf,dct1)
>>>>>>>??rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL
= FALSE, prefix = "t")
>>>>>>>??????? }
>>>>>>>
>>>>>>>?print(dti)
>>>>>>>
>>>>>>>###???? 1???? 2??? 3?? 4? 5
>>>>>>>#t1? 32? 8640 4098 429 36
>>>>>>>#t2 128 13209 6723 788 75
>>>>>>>#t3? 85 13043 6691 754 82
>>>>>>>#t4 139 13750 7036 807 84
>>>>>>>
>>>>>>>?print(dtf)
>>>>>>>
>>>>>>>
>>>>>>>####??? 1??? 2??? 3?? 4? 5
>>>>>>>#t1? 5 4885 2571 196? 8
>>>>>>>#t2 12 7752 4209 360 28
>>>>>>>#t3 19 7563 4086 336 18
>>>>>>>#t4 14 8108 4218 312 26
>>>>>>>
>>>>>>>
>>>>>>>??freq.i<-rbind(dci,dti)
>>>>>>>??freq.f<-rbind(dcf,dtf)
>>>>>>>??freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>??freq.rel.f<-freq.f/apply(freq.f,1,sum)?
>>>>>>>?print(freq.i)
>>>>>>>##???? 1???? 2??? 3?? 4? 5
>>>>>>>#c1? 93? 8356 3621 450 55
>>>>>>>#c2 108 13513 6859 793 73
>>>>>>>#c3? 97 13526 6724 739 82
>>>>>>>#c4 101 13417 6574 761 62
>>>>>>>#t1? 32? 8640 4098 429 36
>>>>>>>#t2 128 13209 6723 788 75
>>>>>>>#t3? 85 13043 6691 754 82
>>>>>>>#t4 139 13750 7036 807 84
>>>>>>>
>>>>>>>?print(freq.f)
>>>>>>>??###? 1??? 2??? 3?? 4? 5
>>>>>>>#c1 10 4576 2100 199 17
>>>>>>>#c2? 7 7831 4039 314 23
>>>>>>>#c3 16 7887 4087 286 22
>>>>>>>#c4 20 7824 4045 311 20
>>>>>>>#t1? 5 4885 2571 196? 8
>>>>>>>#t2 12 7752 4209 360 28
>>>>>>>#t3 19 7563 4086 336 18
>>>>>>>#t4 14 8108 4218 312 26
>>>>>>>
>>>>>>>?print(freq.rel.i)
>>>>>>>###???????????? 1???????? 2???????? 3?????????
4?????????? 5
>>>>>>>#c1 0.007395626 0.6644930 0.2879523 0.03578529
0.004373757
>>>>>>>#c2 0.005059496 0.6330460 0.3213248 0.03714982
0.003419844
>>>>>>>#c3 0.004582389 0.6389834 0.3176493 0.03491119
0.003873772
>>>>>>>#c4 0.004829070 0.6415013 0.3143199 0.03638537
0.002964380
>>>>>>>#t1 0.002417832 0.6528145 0.3096335 0.03241405
0.002720060
>>>>>>>#t2 0.006117670 0.6313148 0.3213210 0.03766190
0.003584572
>>>>>>>#t3 0.004115226 0.6314694 0.3239409 0.03650448
0.003969983
>>>>>>>#t4 0.006371470 0.6302714 0.3225156 0.03699120
0.003850385
>>>>>>>?print(freq.rel.f)
>>>>>>>
>>>>>>>###????????????? 1???????? 2???????? 3?????????
4?????????? 5
>>>>>>>#c1 0.0014488554 0.6629962 0.3042596 0.02883222
0.002463054
>>>>>>>#c2 0.0005731128 0.6411495 0.3306861 0.02570820
0.001883085
>>>>>>>#c3 0.0013010246 0.6413238 0.3323305 0.02325581
0.001788909
>>>>>>>#c4 0.0016366612 0.6402619 0.3310147 0.02545008
0.001636661
>>>>>>>#t1 0.0006523157 0.6373125 0.3354207 0.02557078
0.001043705
>>>>>>>#t2 0.0009707952 0.6271337 0.3405064 0.02912386
0.002265189
>>>>>>>#t3 0.0015804359 0.6290967 0.3398769 0.02794876
0.001497255
>>>>>>>#t4 0.0011042751 0.6395330 0.3327023 0.02460956
0.002050797
>>>>>>>
>>>>>>>#Graph plot
>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>par(mfrow=c(1,2))
>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample with
FDR<0.01"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>
>>>>>>>#average of the group (except c1&t1)
>>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>print(average)
>>>>>>>
>>>>>>>###???????? 1????????? 2????????? 3?????????
4????????? 5
>>>>>>>?# 14.66667 7827.50000 4114.00000? 319.83333??
22.83333
>>>>>>>
>>>>>>>#chisquare test function
>>>>>>>chisq.test<-function(x,y){
>>>>>>>?somax<-sum(x)
>>>>>>>?somay<-sum(y)
>>>>>>>?nj.<-x+y
>>>>>>>?nj<-sum(nj.)
>>>>>>>?ejx<-(nj./nj)*somax
>>>>>>>?ejy<-(nj./nj)*somay
>>>>>>>?ETx<-((x-ejx)^2)/ejx
>>>>>>>?ETy<-((y-ejy)^2)/ejy
>>>>>>>?ETobs<-sum(ETx)+sum(ETy)
>>>>>>>?pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>?return(pvalue)
>>>>>>>?}
>>>>>>>
>>>>>>>#pvalues of the chisquare test between sample and
average (H0: two samples has the same distribution)
>>>>>>>pvalues<-c()
>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>>pvalues<-c(pvalues,a)
>>>>>>>}
>>>>>>>print(pvalues)
>>>>>>>##[1] 0.5307206 0.6849480 0.8332661 0.3474956
0.5546527 0.9387602
>>>>>>>
>>>>>>>#data frame with final p-values
>>>>>>>dataframe<-data.frame(c(rownames(freqs)),
c(pvalues))
>>>>>>>colnames(dataframe)<-c("sample
name","pvalue")
>>>>>>>print(dataframe)
>>>>>>>
>>>>>>>###? sample name??? pvalue
>>>>>>>#1????????? c2 0.5307206
>>>>>>>#2????????? c3 0.6849480
>>>>>>>#3????????? c4 0.8332661
>>>>>>>#4????????? t2 0.3474956
>>>>>>>#5????????? t3 0.5546527
>>>>>>>#6????????? t4 0.9387602
>>>>>>>}
>>>>>>>z.plot("C:/Users/Vera
Costa/Desktop/dados",23)
>>>>>>>
>>>>>>>###and two barplots...
>>>>>>>
>>>>>>>Thank you
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>
>>>>>>>Got it.
>>>>>>>>
>>>>>>>>So, if I run the code that you sent yesterday, will I get the correct results for the relative frequencies, etc.? It would also be great if you could send me the output generated by your code (on two groups, as you showed yesterday). That will let me check the results much faster than running your code and guessing whether the result is right (I have to adjust your code to run on Linux, especially the dir() call).
>>>>>>>>
>>>>>>>>I may only be able to run it later.
>>>>>>>>
>>>>>>>>Arun
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>________________________________
>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>Sent: Tuesday, February 19, 2013 8:53 AM
>>>>>>>>
>>>>>>>>Subject: Re: reading data
>>>>>>>>
>>>>>>>>
>>>>>>>>I sent it in the second email.
>>>>>>>>
>>>>>>>>But I am sending it again.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>Your attachment didn't come through.
>>>>>>>>>
>>>>>>>>>Arun
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>________________________________
>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>Sent: Tuesday, February 19, 2013 8:47 AM
>>>>>>>>>
>>>>>>>>>Subject: Re: reading data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Sorry for so many questions.
>>>>>>>>>
>>>>>>>>>I attach a small part of my real data (I have a lot of rows).
>>>>>>>>>
>>>>>>>>>My main objective is to construct two graphs. The first shows the relative frequencies of each group (c1, c2, c3, ...). The second shows the same frequencies, but only for rows with FDR<0.01.
>>>>>>>>>
>>>>>>>>>After that I need to compute the average within each group (but without the first one: c1, t1, a1, ...) and do the chi-square test to see whether the groups have the same distribution. Do you understand?
>>>>>>>>>
>>>>>>>>>At first I had only two groups, and I wrote the code that I sent you. But I need general code, not for two groups whose names I know, but for any number of groups (sometimes I can have 7, 8 or 9 groups).
>>>>>>>>>
>>>>>>>>>Is my explanation better now? :-)
>>>>>>>>>My English isn't very good either :-)
>>>>>>>>>
>>>>>>>>>Please do not publish this data in the forum...
>>>>>>>>>
>>>>>>>>>Thank you
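For the "general code for any number of groups" requirement described above, one possible approach is to derive the group label from each file's folder name instead of hard-coding `"^t"` and `"^c"`. The sketch below builds a throwaway folder tree under `tempdir()` so that it runs on its own; the folder names, the file contents and the rule "group = folder name with trailing digits removed" are assumptions for illustration.

```r
# Build a small throwaway folder tree so the example is self-contained.
root <- file.path(tempdir(), "pepinfo_demo")
for (d in c("a1", "c1", "c2", "t1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  writeLines("z\tFDR\n2\t0.001", file.path(root, d, "MSMS_23PepInfo.txt"))
}

# Relative paths such as "c1/MSMS_23PepInfo.txt"; no setwd() is needed.
direct <- list.files(root, pattern = "^MSMS_23PepInfo\\.txt$", recursive = TRUE)

# Group label = folder name with the run number stripped ("c1" -> "c").
group <- sub("[0-9]+$", "", dirname(direct))

# One character vector of paths per group, whatever the groups are called.
by_group <- split(direct, group)

# Reading a file works the same way for every group.
first <- read.table(file.path(root, direct[1]), header = TRUE, sep = "\t")
```

Looping over `by_group` with `lapply()` then replaces the separate `listaC` and `listaT` loops, and a new group only needs a new folder.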
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>2013/2/18 arun <smartpink111 at yahoo.com>
>>>>>>>>>
>>>>>>>>>Hi,
>>>>>>>>>>
>>>>>>>>>>I ran the code to understand what was going on.
>>>>>>>>>>
>>>>>>>>>>I didn't fully understand it, as you wrote the code for your original dataset and not for the 'data' directory you sent to me.
>>>>>>>>>>
>>>>>>>>>>A.K.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>________________________________
>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>Sent: Monday, February 18, 2013 4:02 PM
>>>>>>>>>>
>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>Thank you.
>>>>>>>>>>I don't need the same, but something equivalent. I will try your suggestions.
>>>>>>>>>>Thank you.
>>>>>>>>>>On 18 Feb 2013 at 19:41, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>
>>>>>>>>>>Hi,
>>>>>>>>>>>I am not able to open your graph. I am using Linux.
>>>>>>>>>>>
>>>>>>>>>>>Also, the code in the function is not reproducible:
>>>>>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>>>>>
>>>>>>>>>>>It takes twice as long to work out what is going on.
>>>>>>>>>>>
>>>>>>>>>>>dir()
>>>>>>>>>>>#[1] "a1" "a2"
"a3" "b1" "b2" "c1"
>>>>>>>>>>>
>>>>>>>>>>>direct<-
list.files(recursive=TRUE)[grepl("^a|^b",dir())]
>>>>>>>>>>>
>>>>>>>>>>>?direct
>>>>>>>>>>>#[1] "MSMS_23PepInfo.txt"
"MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
>>>>>>>>>>>#[4] "MSMS_23PepInfo.txt"
"MSMS_23PepInfo.txt"
>>>>>>>>>>>directA<-
list.files(recursive=TRUE)[grepl("^a",dir())]
>>>>>>>>>>>directB<-
list.files(recursive=TRUE)[grepl("^b",dir())]
>>>>>>>>>>>lista<- lapply(direct,function(x)
read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>>>>>>
>>>>>>>>>>>listaA<-lapply(directA,
function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>listaB<-lapply(directB,
function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>
>>>>>>>>>>>#here I am changing the names listaT, z, etc..
>>>>>>>>>>>
>>>>>>>>>>>#count different mm values
>>>>>>>>>>>?cab <- vector()
>>>>>>>>>>>??? for (i in 1:length(lista)) {
>>>>>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$b<0.01, TRUE, FALSE),]
>>>>>>>>>>>??????? dc<-table(dc$mm)
>>>>>>>>>>>??????? cab <- c(cab, names(dc))
>>>>>>>>>>>? }
>>>>>>>>>>>
>>>>>>>>>>>?#Relative freqs to construct the
graph
>>>>>>>>>>>??? cab <- unique(cab)
>>>>>>>>>>>??? d <- matrix(ncol=length(cab))
>>>>>>>>>>>?dci<- d[-1,]
>>>>>>>>>>>??? dcf <- d[-1,]
>>>>>>>>>>>?dti <- d[-1,]
>>>>>>>>>>>?dtf <- d[-1,]
>>>>>>>>>>>
>>>>>>>>>>>???
########################################
>>>>>>>>>>>?for (i in 1:length(listaA)) {
>>>>>>>>>>>
>>>>>>>>>>>? #Relative freq of all data
>>>>>>>>>>>? dcc<-listaA[[i]]
>>>>>>>>>>>? dcc<-table(factor(dcc$mm,
levels=cab))
>>>>>>>>>>>? dci<- rbind(dci, dcc)
>>>>>>>>>>>?
rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix =
"a")
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>? #Relative freq of data with
FDR<0.01
>>>>>>>>>>>?
dcc1<-listaA[[i]][ifelse(listaA[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>? dcc1<-table(factor(dcc1$mm,
levels=cab))
>>>>>>>>>>>? dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>?
rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix =
"a")
>>>>>>>>>>>??????? }
>>>>>>>>>>>
>>>>>>>>>>>?for (i in 1:length(listaB)) {
>>>>>>>>>>>
>>>>>>>>>>>? #Relative freq of all data
>>>>>>>>>>>? dct<-listaB[[i]]
>>>>>>>>>>>? dct<-table(factor(dct$mm,
levels=cab))
>>>>>>>>>>>? dti<- rbind(dti, dct)
>>>>>>>>>>>?
rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix =
"b")
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>? #Relative freq of data with
FDR<0.01
>>>>>>>>>>>?
dct1<-listaB[[i]][ifelse(listaB[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>? dct1<-table(factor(dct1$mm,
levels=cab))
>>>>>>>>>>>? dtf<- rbind(dtf,dct1)
>>>>>>>>>>>?
rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix =
"b")
>>>>>>>>>>>??????? }
>>>>>>>>>>>? freq.i<-rbind(dci,dti)
>>>>>>>>>>>? freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>?
freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>?
freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>?freq.i
>>>>>>>>>>>#?? 2 3
>>>>>>>>>>>#a1 4 1
>>>>>>>>>>>#a2 4 1
>>>>>>>>>>>#a3 4 1
>>>>>>>>>>>#b1 4 1
>>>>>>>>>>>#b2 4 1
>>>>>>>>>>>#b3 4 1
>>>>>>>>>>>#b4 4 1
>>>>>>>>>>>#result from my code.??
>>>>>>>>>>>?files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>>>>>>>read.data<-function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y) read.table(y,header=TRUE,sep =
"\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>>>>>
>>>>>>>>>>>res2<-split(lista,names(lista))
>>>>>>>>>>>res3<- lapply(res2,function(x)
{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>res4<-lapply(seq_along(res3),function(i)
do.call(rbind,lapply(res3[[i]], function(x)
table(x$mm[x[["b"]]<0.01]))))
>>>>>>>>>>>?names(res4)<- names(res2)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>res4
>>>>>>>>>>>$group_a
>>>>>>>>>>>#?? 2 3
>>>>>>>>>>>#a1 3 1
>>>>>>>>>>>#a2 3 1
>>>>>>>>>>>#a3 3 1
>>>>>>>>>>>
>>>>>>>>>>>#$group_b
>>>>>>>>>>>?#? 2 3
>>>>>>>>>>>#b1 3 1
>>>>>>>>>>>#b2 3 1
>>>>>>>>>>>
>>>>>>>>>>>#$group_c
>>>>>>>>>>>?#? 2 3
>>>>>>>>>>>#c1 3 1
>>>>>>>>>>>
>>>>>>>>>>>There is a difference in output between freq.i and res4. There were only two files under 'group_b', yet freq.i shows four b rows. So, check your code.
>>>>>>>>>>>A.K.
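One possible source of a mismatch like the one noted above is logical-index recycling: `list.files(recursive=TRUE)[grepl("^b", dir())]` indexes a vector of files with a logical vector built from the folder names, and when a folder holds more than one file the two vectors have different lengths. This is an assumption about the cause, not something confirmed in the thread. The toy vectors below only illustrate the mechanism.

```r
# Made-up listing: three folders with two files each.
files <- c("a1/x.txt", "a1/y.txt", "b1/x.txt", "b1/y.txt", "c1/x.txt", "c1/y.txt")
dirs  <- c("a1", "b1", "c1")

# Index built from the folder names: length 3, silently recycled to length 6,
# so it selects positions 1 and 4.
recycled <- files[grepl("^a", dirs)]

# Index built from the paths themselves: always the same length as files.
matched <- files[grepl("^a", files)]
```

Here `recycled` picks up a file from folder b1, while `matched` returns only the two files under a1. Matching on the paths themselves avoids the problem regardless of how many files each folder contains.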
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>________________________________
>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>Sent: Monday, February 18, 2013 10:27 AM
>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Hi!!!
>>>>>>>>>>>
>>>>>>>>>>>I have a new question.
>>>>>>>>>>>
>>>>>>>>>>>I want a function to do my statistics. I started with what you had sent me:
>>>>>>>>>>>
>>>>>>>>>>>z.plot <-
function(directory,number) {
>>>>>>>>>>>??setwd(directory)
>>>>>>>>>>>?indx<-gsub("[./]","",list.dirs())
>>>>>>>>>>>?indx1<- indx[indx!=""]
>>>>>>>>>>>?print(indx1)
>>>>>>>>>>>?files<-paste("MSMS_",number,"PepInfo.txt",sep="")
>>>>>>>>>>>?read.data<-function(x)
{names(x)<-gsub("^(.*)\\/.*","\\1",x);
lapply(x,function(y) read.table(y,header=TRUE,sep =
"\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>?lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>?print(lista)
>>>>>>>>>>> #names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>>>>>?}
>>>>>>>>>>>z.plot("C:/Users/Vera
Costa/Desktop/dados.lixo",23)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>In my lista I can't merge rows to form the groups, because the idea is to count, for each file, the frequencies of mm when b<0.01. After that I want a graph like the one attached.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>When I had 2 groups and knew the names of the groups, I wrote this code (but now I have more groups and may not know their names):
>>>>>>>>>>>
>>>>>>>>>>>z.plot <-
function(directory,number) {
>>>>>>>>>>>?#reading data
>>>>>>>>>>>??setwd(directory)
>>>>>>>>>>>?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>>>>>>>>>>?directT <-
direct[grepl("^t", direct)]
>>>>>>>>>>>?directC <-
direct[grepl("^c", direct)]
>>>>>>>>>>>
>>>>>>>>>>>?lista<-lapply(direct,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>?listaC<-lapply(directC,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>?listaT<-lapply(directT,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>
>>>>>>>>>>>?#count different z values
>>>>>>>>>>>?cab <- vector()
>>>>>>>>>>>??? for (i in 1:length(lista)) {
>>>>>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>??????? dc<-table(dc$z)
>>>>>>>>>>>??????? cab <- c(cab, names(dc))
>>>>>>>>>>>??}
>>>>>>>>>>>
>>>>>>>>>>>?#Relative freqs to construct the
graph
>>>>>>>>>>>??? cab <- unique(cab)
>>>>>>>>>>>??? d <- matrix(ncol=length(cab))
>>>>>>>>>>>?dci<- d[-1,]
>>>>>>>>>>>??? dcf <- d[-1,]
>>>>>>>>>>>?dti <- d[-1,]
>>>>>>>>>>>?dtf <- d[-1,]
>>>>>>>>>>>
>>>>>>>>>>>??? for (i in 1:length(listaC)) {
>>>>>>>>>>>
>>>>>>>>>>>??#Relative freq of all data
>>>>>>>>>>>??dcc<-listaC[[i]]
>>>>>>>>>>>??dcc<-table(factor(dcc$z,
levels=cab))
>>>>>>>>>>>    dci<- rbind(dci, dcc)
>>>>>>>>>>>    rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>
>>>>>>>>>>>    #Relative freq of data with FDR<0.01
>>>>>>>>>>>    dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>    dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>>>>>    dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>    rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>  }
>>>>>>>>>>>
>>>>>>>>>>>  for (i in 1:length(listaT)) {
>>>>>>>>>>>
>>>>>>>>>>>    #Relative freq of all data
>>>>>>>>>>>    dct<-listaT[[i]]
>>>>>>>>>>>    dct<-table(factor(dct$z, levels=cab))
>>>>>>>>>>>    dti<- rbind(dti, dct)
>>>>>>>>>>>    rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>
>>>>>>>>>>>    #Relative freq of data with FDR<0.01
>>>>>>>>>>>    dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>    dct1<-table(factor(dct1$z, levels=cab))
>>>>>>>>>>>    dtf<- rbind(dtf,dct1)
>>>>>>>>>>>    rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>  }
>>>>>>>>>>>  freq.i<-rbind(dci,dti)
>>>>>>>>>>>  freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>  freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>  freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>>>>
>>>>>>>>>>>#Graph plot
>>>>>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>>>>>par(mfrow=c(1,2))
>>>>>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>>>>>#average of the group (except c1&t1)
>>>>>>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>>>>>
>>>>>>>>>>>#chisquare test function
>>>>>>>>>>>chisq.test<-function(x,y){
>>>>>>>>>>>  somax<-sum(x)
>>>>>>>>>>>  somay<-sum(y)
>>>>>>>>>>>  nj.<-x+y
>>>>>>>>>>>  nj<-sum(nj.)
>>>>>>>>>>>  ejx<-(nj./nj)*somax
>>>>>>>>>>>  ejy<-(nj./nj)*somay
>>>>>>>>>>>  ETx<-((x-ejx)^2)/ejx
>>>>>>>>>>>  ETy<-((y-ejy)^2)/ejy
>>>>>>>>>>>  ETobs<-sum(ETx)+sum(ETy)
>>>>>>>>>>>  pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>>>>>  return(pvalue)
>>>>>>>>>>>}
>>>>>>>>>>>
>>>>>>>>>>>#pvalues of the chisquare test between sample and average (H0: two samples has the same distribution)
>>>>>>>>>>>pvalues<-c()
>>>>>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>>>>>  a<-chisq.test(freqs[i,],average)
>>>>>>>>>>>  pvalues<-c(pvalues,a)
>>>>>>>>>>>}
>>>>>>>>>>>#data frame with final p-values
>>>>>>>>>>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>>>>>>>colnames(dataframe)<-c("sample name","pvalue")
>>>>>>>>>>>print(dataframe)
>>>>>>>>>>>}
>>>>>>>>>>>z.plot("C:/Users/Vera/Desktop/data",23)
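
A note on the chi-square helper in the quoted code: it shares its name with stats::chisq.test, so it shadows the built-in wherever the definition is in scope. Below is a minimal sketch of the same statistic under a different name, checked against the built-in on made-up counts. The name homogeneity_pvalue and the example vectors are assumptions for illustration and do not come from the thread.

```r
# Hypothetical helper (name not from the thread): p-value of a chi-square
# test of homogeneity between two count vectors of equal length.
homogeneity_pvalue <- function(x, y) {
  stopifnot(length(x) == length(y))
  col_tot <- x + y                      # totals per category
  n <- sum(col_tot)                     # grand total
  ex <- col_tot / n * sum(x)            # expected counts for x
  ey <- col_tot / n * sum(y)            # expected counts for y
  stat <- sum((x - ex)^2 / ex) + sum((y - ey)^2 / ey)
  # Upper tail directly; equivalent to 1 - pchisq(stat, df)
  pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
}

# Made-up counts, only to compare against the built-in test
x <- c(10, 20, 30)
y <- c(15, 15, 30)
p_manual  <- homogeneity_pvalue(x, y)
p_builtin <- stats::chisq.test(rbind(x, y), correct = FALSE)$p.value
print(all.equal(p_manual, p_builtin))
```

A category where both vectors are zero gives an expected count of 0 and an NaN statistic, so such columns would need to be dropped first.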
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Thank you again
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>2013/2/17 arun <smartpink111 at
yahoo.com>
>>>>>>>>>>>
>>>>>>>>>>>HI Vera,
>>>>>>>>>>>>
>>>>>>>>>>>>No problem. I am cc'ing r-help.
>>>>>>>>>>>>
>>>>>>>>>>>>A.K.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>________________________________
>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>To: arun <smartpink111 at
yahoo.com>
>>>>>>>>>>>>Sent: Sunday, February 17, 2013
5:44 AM
>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>Hi. Thank you. It works now:-)
>>>>>>>>>>>>And yes, I use windows.
>>>>>>>>>>>>Thank you very much.
>>>>>>>>>>>>On 17 Feb 2013 00:44, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>Hi Vera,
>>>>>>>>>>>>>
>>>>>>>>>>>>>Have you tried the
suggestion?
>>>>>>>>>>>>>
>>>>>>>>>>>>>Are you using Windows?
>>>>>>>>>>>>>Thanks,
>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>To: arun <smartpink111 at
yahoo.com>
>>>>>>>>>>>>>Sent: Saturday, February 16,
2013 7:10 PM
>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>>On my system, I get the error " 'what' must be a character string or a function".
>>>>>>>>>>>>>I need to do the equivalent on my system.
>>>>>>>>>>>>>Thank you, and sorry once more.
>>>>>>>>>>>>>On 16 Feb 2013 23:53, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>You didn't mention what the error message was, or whether you are reading file names other than "mmmmm11kk.txt".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>It is working on my system when I run it again.
>>>>>>>>>>>>>>c() combines values into a vector or list.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>sessionInfo()
>>>>>>>>>>>>>>R version 2.15.1 (2012-06-22)
>>>>>>>>>>>>>>Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>locale:
>>>>>>>>>>>>>> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>>>>>>>>>>>>> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>>>>>>>>>>>>>> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>>>>>>>>>>>>>> [7] LC_PAPER=C                 LC_NAME=C
>>>>>>>>>>>>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>>>>>>>>>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>attached base packages:
>>>>>>>>>>>>>>[1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>other attached packages:
>>>>>>>>>>>>>>[1] stringr_0.6.2  reshape2_1.2.2
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>loaded via a namespace (and not attached):
>>>>>>>>>>>>>>[1] plyr_1.8
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>#code
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {
>>>>>>>>>>>>>>  names(x)<-gsub("^(.*)\\/.*","\\1",x)
>>>>>>>>>>>>>>  lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>res3<-lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>#result
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>res3
>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>#$group_a$a1
>>>>>>>>>>>>>>???? Id? M mm???
x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>1?? aAA? 1? 2? 739
0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>2 aAAAA? 1? 2 2263
0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>3??? aA? 2? 1??? 1
0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>4?? aAA? 1? 2 1965
0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>5? aAAA? 1? 3 3660
0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>6??? AA na? 2 1972
0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>$group_a$a2
>>>>>>>>>>>>>>???? Id? M mm???
x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>1?? aAA? 1? 2? 739
0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>2 aAAAA? 1? 2 2263
0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>3??? aA? 2? 1??? 1
0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>4?? aAA? 1? 2 1965
0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>5? aAAA? 1? 3 3660
0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>6??? AA na? 2 1972
0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>$group_a$a3
>>>>>>>>>>>>>>???? Id? M mm???
x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>1?? aAA? 1? 2? 739
0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>2 aAAAA? 1? 2 2263
0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>3??? aA? 2? 1??? 1
0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>4?? aAA? 1? 2 1965
0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>5? aAAA? 1? 3 3660
0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>6??? AA na? 2 1972
0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>$group_b
>>>>>>>>>>>>>>$group_b$b1
>>>>>>>>>>>>>>???? Id? M mm???
x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>1?? aAA? 1? 2? 739
0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>2 aAAAA? 1? 2 2263
0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>3??? aA? 2? 1??? 1
0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>4?? aAA? 1? 2 1965
0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>5? aAAA? 1? 3 3660
0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>6??? AA na? 2 1972
0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>$group_b$b2
>>>>>>>>>>>>>>???? Id? M mm???
x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>1?? aAA? 1? 2? 739
0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>2 aAAAA? 1? 2 2263
0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>3??? aA? 2? 1??? 1
0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>4?? aAA? 1? 2 1965
0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>5? aAAA? 1? 3 3660
0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>6??? AA na? 2 1972
0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>$group_c
>>>>>>>>>>>>>>$group_c$c1
>>>>>>>>>>>>>>???? Id? M mm???
x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>1?? aAA? 1? 2? 739
0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>2 aAAAA? 1? 2 2263
0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>3??? aA? 2? 1??? 1
0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>4?? aAA? 1? 2 1965
0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>5? aAAA? 1? 3 3660
0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>6??? AA na? 2 1972
0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>Sent: Saturday, February
16, 2013 6:32 PM
>>>>>>>>>>>>>>Subject: Re: reading
data
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Sorry again... In:
>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>>>>>>>>>>>>>What is this c in do.call(c, ...)? When I run this line in R, I get an error.
>>>>>>>>>>>>>>Thank you
>>>>>>>>>>>>>>On 15 Feb 2013 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>No problem.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>BTW, these questions
are not stupid..
>>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>Sent: Friday,
February 15, 2013 1:08 PM
>>>>>>>>>>>>>>>Subject: Re: reading
data
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Thank you very much.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>I will try to apply it and then tell you if it is OK :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Thank you, and sorry about these questions (sometimes stupid questions).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>2013/2/15 arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>HI,
>>>>>>>>>>>>>>>>No problem.
>>>>>>>>>>>>>>>>c() concatenates its arguments into a vector or list.
>>>>>>>>>>>>>>>>If I use do.call(cbind,...) or do.call(rbind,...) instead:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {
>>>>>>>>>>>>>>>>  names(x)<-gsub("^(.*)\\/.*","\\1",x)
>>>>>>>>>>>>>>>>  lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
>>>>>>>>>>>>>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {
>>>>>>>>>>>>>>>>  names(x)<-gsub("^(.*)\\/.*","\\1",x)
>>>>>>>>>>>>>>>>  lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>#     a1
>>>>>>>>>>>>>>>>#[1,] List,11
>>>>>>>>>>>>>>>>#[2,] List,11
>>>>>>>>>>>>>>>>#[3,] List,11
>>>>>>>>>>>>>>>>#[4,] List,11
>>>>>>>>>>>>>>>>#[5,] List,11
>>>>>>>>>>>>>>>>#[6,] List,11
>>>>>>>>>>>>>>>>i.e., without do.call(c, ...) the result is a list within a list:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {
>>>>>>>>>>>>>>>>  names(x)<-gsub("^(.*)\\/.*","\\1",x)
>>>>>>>>>>>>>>>>  lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
>>>>>>>>>>>>>>>>str(restrial)
>>>>>>>>>>>>>>>>#List of 6
>>>>>>>>>>>>>>>># $ :List of 1
>>>>>>>>>>>>>>>>#  ..$ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>>>>>>#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>>>>>>#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>>>>>>#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>>>>>>#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>>>>>>#-----------------------------------------------------------------
>>>>>>>>>>>>>>>>str(res)
>>>>>>>>>>>>>>>>#List of 6
>>>>>>>>>>>>>>>># $ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>>>>>>#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>>>>>>#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>>>>>>#  ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>>>>>>#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>>>>>>#-----------------------------------------------------------------
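
On the question of what c does in do.call(c, ...): do.call calls the function c with the list elements as its arguments, which flattens one level of nesting. Here is a small sketch on made-up data frames (the names nested, flat and stacked are illustrative only). One possible cause of the "'what' must be a character string or a function" error, not confirmed anywhere in this thread, is a workspace object named c that is not a function; passing the name as a string, do.call("c", ...), looks up the function by name instead.

```r
# Made-up stand-in for the result of the outer lapply():
# a list of one-element named lists, each holding a data frame.
nested <- list(
  list(a1 = data.frame(x = 1:2)),
  list(a2 = data.frame(x = 3:4)),
  list(b1 = data.frame(x = 5:6))
)

# Same as c(nested[[1]], nested[[2]], nested[[3]]):
# one flat list of data frames that keeps the inner names.
flat <- do.call(c, nested)
print(names(flat))

# Passing the function name as a string gives the same result.
flat2 <- do.call("c", nested)
print(identical(flat, flat2))

# rbind on the flattened list stacks the data frames into one.
stacked <- do.call(rbind, flat)
print(nrow(stacked))
```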
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>You mentioned naming these "group_a", "group_b", etc.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>res3<-lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>>>res3$group_a
>>>>>>>>>>>>>>>>$a1
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>#???? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>#1?? aAA? 1? 2?
739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>#2 aAAAA? 1? 2
2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>#3??? aA? 2?
1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>#4?? aAA? 1? 2
1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>#5? aAAA? 1? 3
3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>#6??? AA na? 2
1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>#$a2
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>#???? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>#1?? aAA? 1? 2?
739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>#2 aAAAA? 1? 2
2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>#3??? aA? 2?
1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>#4?? aAA? 1? 2
1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>#5? aAAA? 1? 3
3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>#6??? AA na? 2
1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>#$a3
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>?# ?? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>#1?? aAA? 1? 2?
739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>#2 aAAAA? 1? 2
2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>#3??? aA? 2?
1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>#4?? aAA? 1? 2
1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>#5? aAAA? 1? 3
3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>#6??? AA na? 2
1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>A.K.
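
The renaming step above rebuilds the element names as letter plus 1:length(x), which assumes the folders in each group are numbered consecutively from 1. Below is a sketch of an alternative that splits on a separate grouping vector so the original folder names survive. The toy list is made up, and the variable name group is an assumption for illustration.

```r
# Toy stand-in for the list of data frames read from folders a1, a2, ...
res <- list(a1 = data.frame(x = 1), a2 = data.frame(x = 2),
            a3 = data.frame(x = 3), b1 = data.frame(x = 4),
            b2 = data.frame(x = 5), c1 = data.frame(x = 6))

# Group label from the folder name with the digits removed
group <- paste0("group_", gsub("\\d+", "", names(res)))

# Splitting on the separate vector leaves names(res) untouched,
# so each group keeps its original folder names.
res3 <- split(res, group)
print(names(res3))
print(names(res3$group_a))
```

With this layout a missing folder (say, no a1) would leave the remaining elements labelled a2 and a3 rather than renumbering them.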
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>Sent: Friday,
February 15, 2013 12:39 PM
>>>>>>>>>>>>>>>>Subject: Re:
reading data
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Thank you very much, and sorry for my questions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>But this code isn't grouping by letter, is it? I mean, a1, a2, a3 are the same group (the first letter gives me the name of the group).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Another question: in do.call, you wrote do.call(c,.....). What is c?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Sorry
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>2013/2/15 arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>HI,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Just to add:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {
>>>>>>>>>>>>>>>>>  names(x)<-gsub("^(.*)\\/.*","\\1",x)
>>>>>>>>>>>>>>>>>  lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>>res[grep("group_b",names(res))]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>I am not sure how you want the grouped data to look. If you want something like this:
>>>>>>>>>>>>>>>>>res1<-do.call(rbind,res)
>>>>>>>>>>>>>>>>>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x) {row.names(x)<-1:nrow(x);x})
>>>>>>>>>>>>>>>>>res2
>>>>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>?# ??? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>>#1??? aAA?
1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#2? aAAAA?
1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#3???? aA?
2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#4??? aAA?
1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#5?? aAAA?
1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#6???? AA
na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>#7??? aAA?
1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#8? aAAAA?
1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#9???? aA?
2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#10?? aAA?
1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#11? aAAA?
1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#12??? AA
na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>#13?? aAA?
1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#14 aAAAA?
1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#15??? aA?
2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#16?? aAA?
1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#17? aAAA?
1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#18??? AA
na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>?# ??? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>>#1??? aAA?
1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#2? aAAAA?
1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#3???? aA?
2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#4??? aAA?
1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#5?? aAAA?
1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#6???? AA
na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>#7??? aAA?
1? 2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#8? aAAAA?
1? 2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#9???? aA?
2? 1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#10?? aAA?
1? 2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#11? aAAA?
1? 3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#12??? AA
na? 2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>?# ?? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>>#1?? aAA? 1?
2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#2 aAAAA? 1?
2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#3??? aA? 2?
1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#4?? aAA? 1?
2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#5? aAAA? 1?
3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#6??? AA na?
2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>>>#or if you
want it like this:
>>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>res2[["group_b"]]
>>>>>>>>>>>>>>>>>
>
>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>#???? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>>#1?? aAA? 1?
2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#2 aAAAA? 1?
2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#3??? aA? 2?
1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#4?? aAA? 1?
2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#5? aAAA? 1?
3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#6??? AA na?
2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>?# ?? Id? M
mm??? x???????? b? u? k? j??? y??????? p??? v
>>>>>>>>>>>>>>>>>#1?? aAA? 1?
2? 739 0.1257000? 2? 2 AA??? 2???? 8867 8926
>>>>>>>>>>>>>>>>>#2 aAAAA? 1?
2 2263 0.0004000? 2? 2 AR??? 4???? 7640 8926
>>>>>>>>>>>>>>>>>#3??? aA? 2?
1??? 1 0.0845435? 2 AA? 2 6790 734,1092?? NA
>>>>>>>>>>>>>>>>>#4?? aAA? 1?
2 1965 0.0007000? 4? 3 AR??? 2??? 11616 8926
>>>>>>>>>>>>>>>>>#5? aAAA? 1?
3 3660 0.0008600 18? 3 AA??? 2??? 20392? 496
>>>>>>>>>>>>>>>>>#6??? AA na?
2 1972 0.0007000 11? 3 AR?? 25????? 509? 734
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Hope this
helps.
>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>-----
Original Message -----
>
>>>>>>>>>>>>>>>>>From:
"veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>To:
smartpink111 at yahoo.com
>>>>>>>>>>>>>>>>>Cc:
>>>>>>>>>>>>>>>>>Sent:
Friday, February 15, 2013 9:15 AM
>>>>>>>>>>>>>>>>>Subject:
reading data
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>I posted yesterday and you helped me. I have a small problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>First, I have never worked with regular expressions...
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>The code that you gave me is OK, but my files are inside the folders a1, a2, a3. I will try to explain better.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>I have one folder named "data". Inside this folder I have some other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>>>>>>>>>>>>>>>>>The name of the folder gives me the name of the group, but I need to read the file inside. Afterwards I want "group_a", "group_b", ... because I need to work with this data grouped (and know the name of the group).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Thank you.
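
The folder layout described in this message can be read along the following lines. This is only a sketch under stated assumptions: it builds a throwaway copy of the layout under tempdir(), the folder name data_demo and the file contents are made up, and "mmmmmm.txt" is the placeholder file name from the message. list.files() applies its pattern to the file name only, so anchoring the pattern avoids picking up other files that merely contain the string.

```r
# Build a small demo tree: <root>/a1/mmmmmm.txt, <root>/a2/..., <root>/b1/...
root <- file.path(tempdir(), "data_demo")   # hypothetical folder
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(Id = c("aAA", "aA"), x = c(739, 1)),
              file.path(root, d, "mmmmmm.txt"),
              sep = "\t", row.names = FALSE, quote = FALSE)
  # An unrelated file that should be ignored
  writeLines("ignored", file.path(root, d, paste0(d, ".txt")))
}

# Find the wanted file in every subfolder
paths <- list.files(root, pattern = "^mmmmmm\\.txt$",
                    recursive = TRUE, full.names = TRUE)

# Read each file and name the result after its parent folder
tables <- lapply(paths, read.table, header = TRUE, sep = "\t",
                 stringsAsFactors = FALSE)
names(tables) <- basename(dirname(paths))
print(names(tables))
```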

arun

2013-Feb-26 14:23 UTC

[R] reading data

Hi,
Try this:

files<-paste("MSMS_",23,"PepInfo.txt",sep="")
read.data<-function(x) {
  names(x)<-gsub("^(.*)\\/.*","\\1",x)
  lapply(x,function(y) read.table(y,header=TRUE,sep="\t",stringsAsFactors=FALSE,fill=TRUE))
}
lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
res2<-split(lista,names(lista))
res3<-lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
#Freq FDR<0.01
res4<-lapply(seq_along(res3),function(i) lapply(res3[[i]],function(x) x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
names(res4)<-names(res2)
res4New<-lapply(res4,function(x) lapply(names(x),function(i) do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)))))
res5<-lapply(res4New,function(x) if(length(x)>1) tail(x,-1) else NULL)
library(plyr)
library(data.table)
res6<-lapply(res5,function(x) lapply(x,function(x1) {x1<-data.table(x1); x1[,spec:=paste(spec,collapse=","),by=c("Seq","Mod","z")]}))
res7<-lapply(res6,function(x) lapply(x,function(x1) {
  x1$counts<-sapply(x1$spec,function(x2) length(gsub("\\s","",unlist(strsplit(x2,",")))))
  x3<-as.data.frame(x1)
  names(x3)[6]<-as.character(unique(x3$folder_name))
  x3[,-c(1,5)]
}))
res8<-lapply(res7,function(x) Reduce(function(...) merge(...,by=c("Seq","Mod","z"),all=TRUE),x))
res9<-res8[lapply(res8,length)!=0]
res10<-Reduce(function(...) merge(...,by=c("Seq","Mod","z"),all=TRUE),res9)
head(res10,3)
#                     Seq        Mod z c2 c3 t2
#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2 NA NA  1
#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2 NA NA  1
#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  1 NA  1
A.K.
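
The last two steps of the code above fold a list of per-folder tables into one wide table with Reduce() and merge(..., all=TRUE). Here is a minimal sketch of that pattern on made-up data. The peptide strings and counts are invented, and only the key columns Seq, Mod and z follow the thread. Giving each count column its folder name before merging is what avoids suffixed names such as counts.x.x.

```r
# Made-up per-folder count tables; the count column is named after the folder
counts <- list(
  c2 = data.frame(Seq = c("AAK", "AGR"), Mod = c("", "1-n_acPro/"),
                  z = c(2, 2), c2 = c(1, 3), stringsAsFactors = FALSE),
  c3 = data.frame(Seq = "AGR", Mod = "1-n_acPro/",
                  z = 2, c3 = 2, stringsAsFactors = FALSE),
  t2 = data.frame(Seq = c("AAK", "LLR"), Mod = c("", ""),
                  z = c(2, 3), t2 = c(1, 1), stringsAsFactors = FALSE)
)

# Full outer join of all tables on the key columns;
# a peptide missing from a folder gets NA in that folder's column.
merged <- Reduce(function(a, b) merge(a, b, by = c("Seq", "Mod", "z"), all = TRUE),
                 counts)
print(merged)
```

Because the folder names come from the list itself, the same call works for any number of folders.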
________________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Tuesday, February 26, 2013 5:15 AM
Subject: Re: reading data


Sorry, I have only now seen your last email.

At the moment I have 8 folders, but I may have more. I need it to work in general.

Thank you



2013/2/25 arun <smartpink111 at yahoo.com>

I sent the solution. But I need to know how many folders you have for the
analysis, because I manually inserted the names at the end. It works if there
are not many folders. Otherwise, the naming needs to be added to the program.
>
>
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Monday, February 25, 2013 10:01 AM
>Subject: Re: reading data
>
>
>Hi.
>
>It is from the attached dataset, but without a1.
>
>
>
>
>2013/2/25 arun <smartpink111 at yahoo.com>
>
>
>>
>>Hi,
>>Are you sure that the output is from the attached dataset?
>>
>>I am getting the result for aa with 111 rows:
>>aa
>>???????????????????????????????????????
Seq??????????????????????????????? Mod
>>1??????????????????? aAAAAAAAAAAAAAATATAGPR????????????????????????
1-n_acPro/
>>2???????????????????? aAAAAAAAAAAASSPVGVGQR????????????????????????
1-n_acPro/
>>3????????????????????????? aAAAAAAAAAGAAGGR????????????????????????
1-n_acPro/
>>4???????????????????? aAAAAAAAGAAGGRGSGPGRR????????????????????????
1-n_acPro/
>>5?????????????????????????????? AAAAAAAkAAK???????????????????????????
8-K_ac/
>>6??????????????????????????????
AAAAAAALQAK??????????????????????????????????
>>7??????????????????????????? aAAAAAGAGPEMVR????????????????????????
1-n_acPro/
>>8?????????????????????????? aAAAAATAAAAASIR????????????????????????
1-n_acPro/
>>9???????????????? AAAAAEQQQFyLLLGNLLSPDNVVR??????????????????????????
11-Y_ph/
>>10??????????????? aAAAAEQQQFYLLLGNLLSPDNVVR???????????????
1-<_Carbamoylation/
>>11??????????????? aAAAAEQQQFYLLLGNLLSPDNVVR???????????????
1-<_Carbamoylation/
>>12??????????????? aAAAAEQQQFYLLLGNLLSPDNVVR????????????????????????
1-n_acPro/
>>13??????????????? aAAAAEQQQFYLLLGNLLSPDNVVR????????????????????????
1-n_acPro/
>>14?????????????????????????????
AAAAAPGTAEK??????????????????????????????????
>>15?????????????????????????? aAAAAQGGGGGEPR????????????????????????
1-n_acPro/
>>16??????????????????? aAAAASAPQQLSDEELFSQLR????????????????????????
1-n_acPro/
>>17????????????????????????? aAAAAVGNAVPCGAR????????????????????????
1-n_acPro/
>>18???????????????????????
AAAAAWEEPSSGNGTAR??????????????????????????????????
>>19????????????????????????????? aAAAELSLLEK????????????????????????
1-n_acPro/
>>20????????????????????????????? aAAAELSLLEK????????????????????????
1-n_acPro/
>>21????????????????????????????
AAAAEVLGLILR??????????????????????????????????
>>22????????????? aAAAGAAAAAAAEGEAPAEMGALLLEK????????????????????????
1-n_acPro/
>>23????????? aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR???????????????
1-<_Carbamoylation/
>>24????????? aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR????????????????????????
1-n_acPro/
>>25???????????????????? aAAAKPNNLSLVVHGPGDLR????????????????????????
1-n_acPro/
>>26???????? aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK????????????????????????
1-n_acPro/
>>27???????????????? aAAAVGAGHGAGGPGAASSSGGAR????????????????????????
1-n_acPro/
>>28???????????????? aAAAVGAGHGAGGPGAASSSGGAR????????????????????????
1-n_acPro/
>>29??????????????????????????????? aAAAVQGGR????????????????????????
1-n_acPro/
>>30?????????????????????????????? aAAAVVEFQR???????????????
1-<_Carbamoylation/
>>31?????????????????????????????? aAAAVVEFQR????????????????????????
1-n_acPro/
>>32??????????????????????????? aAAAVVVPAEWIK????????????????????????
1-n_acPro/
>>33???????????????????? aAADGDDSLYPIAVLIDELR????????????????????????
1-n_acPro/
>>34???????????????????? aAADGDDSLYPIAVLIDELR????????????????????????
1-n_acPro/
>>35??????????????????????????
AAADLMAYCEAHAK??????????????????????????????????
>>36??????????????????????????
AAADLMAYCEAHAK??????????????????????????????????
>>37????????? aAAEAANCIMEVSCGQAESSEKPNAEDMTSK????????????????????????
1-n_acPro/
>>38????????????????????
AAAEIYEEFLAAFEGSDGNK??????????????????????????????????
>>39????????????????????????????
AAAEVAGQFVIK??????????????????????????????????
>>40??????????????????
AAAIGIDLGTTYSCVGVFQHGK??????????????????????????????????
>>41??????????????????
AAAIGIDLGTTYSCVGVFQHGK??????????????????????????????????
>>42???????????????????????
AAALATVNAWAEQTGMK??????????????????????????????????
>>43??????????????????????????????
AAAMANNLQK??????????????????????????????????
>>44??????????????????
AAAPAPEEEMDECEQALAAEPK??????????????????????????????????
>>45?????????????????????
AAAQLLQSQAQQSGAQQTK??????????????????????????????????
>>46???????????????????????????
AAATPESQEPQAK??????????????????????????????????
>>47??????????????????? aAAVAAAGAGEPQSPDELLPK????????????????????????
1-n_acPro/
>>48????????? aAAVLSGPSAGSAAGVPGGTGGLSAVSSGPR????????????????????????
1-n_acPro/
>>49??????????? AAAVVGInSETIMKPASISEEELLNLINK??????????????????
8-N_Deamidation/
>>50??????????
AAAYNLVQHGITNLCVIGGDGSLTGANIFR??????????????????????????????????
>>51???????????????????????????? aADTQVSETLKR????????????????????????
1-n_acPro/
>>52?????????????????????? aAEAADLGLGAAVPVELR????????????????????????
1-n_acPro/
>>53??????????????????????????
AAEDDEDDDVDTKK??????????????????????????????????
>>54?????????????????????????????
AAEEPSKVEEK??????????????????????????????????
>>55?????????????????????????????
AAEEPSKVEEK??????????????????????????????????
>>56?????????????????
AAEGGLSSPEFSELCIWLGSQIK??????????????????????????????????
>>57????????????????????
AAELIANSLATAGDGLIELR??????????????????????????????????
>>58????????????????????
AAELIANSLATAGDGLIELR??????????????????????????????????
>>59??????????????????????????????
AAELLMSCFR??????????????????????????????????
>>60?????????????????????????? aAEPNKTEIQTLFK????????????????????????
1-n_acPro/
>>61????????????????
AAEQILEDMITIDVENVMEDICSK??????????????????????????????????
>>62???????????????????????? AAEsETPGKSPEKKPK???????????????????????????
4-S_ph/
>>63???????????????????????? AAEsETPGKSPEKKPK???????????????????????????
4-S_ph/
>>64?????????????????????
AAESLADPTEYENLFPGLK??????????????????????????????????
>>65?????????????????????
AAFDDAIAELDTLSEESYK??????????????????????????????????
>>66?????????????????????
AAFDDAIAELDTLSEESYK??????????????????????????????????
>>67????????????????????????
AAFECMYTLLDSCLDR??????????????????????????????????
>>68??????????
AAGAGLPESVIWAVNAGGEAHVDVHGIHFR??????????????????????????????????
>>69???????????????????????? aAGGDGAEAPAKKDVK????????????????????????
1-n_acPro/
>>70                          AAGGGAGSSEDDAQSR
>>71                             AAGHPGDPESQQR
>>72                             AAGHPGDPESQQR
>>73                                    AAGkFK 4-K_me/
>>74                   AAGLATMISTMRPDIDNMDEYVR
>>75                          aAGTAAALAFLSQESR 1-n_acPro/
>>76                             aAGTLYTYPENWR 1-n_acPro/
>>77                             aAGTSSYWEDLRK 1-n_acPro/
>>78                           AAGTVFTTVEDLGSK
>>79                          aAGVEAAAEVAATEIK 1-n_acPro/
>>80                             AAGVGDMVMATVK
>>81                          AAGVNVEPFWPGLFAK
>>82                                AAGVVLEMIR
>>83                      AAHIFFTDTCPEPLFSELGR
>>84                               AAHLCAEAALR
>>85                                 AAHnKDVLR 4-N_Deamidation/
>>86                               AAHVEYSTAAR
>>87                               AAHVEYSTAAR
>>88                        AAIAQALAGEVSVVPPSR
>>89                               AAIISAEGDSK
>>90         AALAAEVKkPAAAAAPGTAEkLSPkATTASQAk 9-K_me2/21-K_me2/25-K_me2/33-K_me/
>>91                              AALAFGFLDLLK
>>92            AALAGGTTMIIDHVVPEPGTSLLAAFDQWR
>>93                        AALAHSEEVTASQVAATK
>>94                            AALCHFCIDMLNAK
>>95                        aALDSLSLFTSLGLSEQK 1-n_acPro/
>>96                             AALEALGSCLNNK
>>97                             AALEAQNALHNMK
>>98                       AALETDENLLLCAPTGAGK
>>99                         AALGPLVTGLYDVQAFK
>>100                       aALGVLESDLPSAVTLLK 1-n_acPro/
>>101                            AALLETLSLLLAK
>>102             AALPGILSELDVDVnEGSLMELQGHIGR 15-N_Deamidation/
>>103  AALPSHVVTMLDNFPTNLHPMSQLSAAVTALNSESNFAR
>>104  AALPSHVVTMLDNFPTnLHPMSQLSAAVTALNSESNFAR 17-N_Deamidation/
>>105                              AALSALESFLK
>>106                              AALSEEELEKK
>>107                         aALTAEHFAALQSLLK 1-n_acPro/
>>108                            AAMADTFLEHMCR
>>109                             AAMEALVVEVTK
>>110                       AAMFTAGSNFNHVVQNEK
>>111                      aANATTNPSQLLPLELVDK 1-n_acPro/
>>    z counts.x.x counts.y.x counts counts.x.y counts.y.y
>>1   2         NA         NA     NA         NA          1
>>2   2         NA         NA     NA         NA          1
>>3   2          1          1      1          1          1
>>4   2          1          1     NA         NA         NA
>>5   2         NA          1     NA         NA         NA
>>6   2         NA         NA      1         NA          1
>>7   2          1          2      1          1          2
>>8   2          1          1     NA          1         NA
>>9   3         NA         NA     NA         NA          1
>>10  2         NA         NA     NA         NA          1
>>11  3         NA         NA     NA         NA          1
>>12  2         NA          1      1         NA         NA
>>13  3          1          2      2         NA          1
>>14  2          1          1     NA         NA          1
>>15  2         NA         NA     NA          1         NA
>>16  2         NA         NA     NA         NA          1
>>17  2         NA          1      1         NA          1
>>18  2         NA          1      1         NA          1
>>19  1         NA         NA     NA         NA          1
>>20  2          1          1      1          1          1
>>21  2          1          1      1          1          1
>>22  3         NA          1      1         NA          1
>>23  3         NA         NA      1         NA         NA
>>24  3          1         NA     NA         NA          1
>>25  3         NA          1     NA         NA         NA
>>26  3         NA         NA     NA         NA          1
>>27  2         NA          1      1          1         NA
>>28  3         NA         NA      1          1         NA
>>29  2         NA         NA      1         NA         NA
>>30  2         NA         NA      1         NA         NA
>>31  2         NA         NA      1         NA         NA
>>32  2          1         NA     NA          1         NA
>>33  2          1         NA      1          1         NA
>>34  3          1         NA     NA          1         NA
>>35  2          1         NA     NA         NA         NA
>>36  3          1         NA     NA          1         NA
>>37  3          1         NA     NA          1         NA
>>38  2          1         NA     NA          1         NA
>>39  2          1         NA     NA         NA         NA
>>40  2          1         NA     NA         NA         NA
>>41  3          1         NA     NA         NA         NA
>>42  2          1         NA     NA         NA         NA
>>43  2          1         NA     NA         NA         NA
>>44  2          1         NA     NA         NA         NA
>>45  2          1         NA     NA         NA         NA
>>46  2          1         NA     NA         NA         NA
>>47  2          1         NA     NA         NA         NA
>>48  3          1         NA     NA         NA         NA
>>49  3          1         NA     NA         NA         NA
>>50  3          1         NA     NA         NA         NA
>>51  2          1         NA     NA         NA         NA
>>52  2          1         NA     NA         NA         NA
>>53  2          1         NA     NA         NA         NA
>>54  2          1         NA     NA         NA         NA
>>55  3          1         NA     NA         NA         NA
>>56  2          1         NA     NA         NA         NA
>>57  2          1         NA     NA         NA         NA
>>58  3          1         NA     NA         NA         NA
>>59  2          1         NA     NA         NA         NA
>>60  2          1         NA     NA         NA         NA
>>61  3          1         NA     NA         NA         NA
>>62  3          1         NA     NA         NA         NA
>>63  4          1         NA     NA         NA         NA
>>64  2          1         NA     NA         NA         NA
>>65  2          1         NA     NA         NA         NA
>>66  3          1         NA     NA         NA         NA
>>67  2          1         NA     NA         NA         NA
>>68  5          1         NA     NA         NA         NA
>>69  3          1         NA     NA         NA         NA
>>70  2          1         NA     NA         NA         NA
>>71  2          1         NA     NA         NA         NA
>>72  3          2         NA     NA         NA         NA
>>73  1          1         NA     NA         NA         NA
>>74  2          1         NA     NA         NA         NA
>>75  2          1         NA     NA         NA         NA
>>76  2          1         NA     NA         NA         NA
>>77  2          1         NA     NA         NA         NA
>>78  2          1         NA     NA         NA         NA
>>79  2         11         NA     NA         NA         NA
>>80  2          1         NA     NA         NA         NA
>>81  2          1         NA     NA         NA         NA
>>82  2          1         NA     NA         NA         NA
>>83  2          1         NA     NA         NA         NA
>>84  2          1         NA     NA         NA         NA
>>85  2          1         NA     NA         NA         NA
>>86  2          1         NA     NA         NA         NA
>>87  3          1         NA     NA         NA         NA
>>88  2          1         NA     NA         NA         NA
>>89  2          1         NA     NA         NA         NA
>>90  5          1         NA     NA         NA         NA
>>91  2          1         NA     NA         NA         NA
>>92  3          1         NA     NA         NA         NA
>>93  3          1         NA     NA         NA         NA
>>94  2          1         NA     NA         NA         NA
>>95  2          1         NA     NA         NA         NA
>>96  2          1         NA     NA         NA         NA
>>97  2          1         NA     NA         NA         NA
>>98  2          1         NA     NA         NA         NA
>>99  2          1         NA     NA         NA         NA
>>100 2          1         NA     NA         NA         NA
>>101 2          1         NA     NA         NA         NA
>>102 3          1         NA     NA         NA         NA
>>103 3          1         NA     NA         NA         NA
>>104 4          1         NA     NA         NA         NA
>>105 2          1         NA     NA         NA         NA
>>106 2          1         NA     NA         NA         NA
>>107 2          1         NA     NA         NA         NA
>>108 3          1         NA     NA         NA         NA
>>109 2          1         NA     NA         NA         NA
>>110 3          1         NA     NA         NA         NA
>>111 2          1         NA     NA         NA         NA
>>
>>________________________________
>>From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Monday, February 25, 2013 8:56 AM
>>Subject: Re: reading data
>>
>>
>>You're correct, but my real data have about 40000 rows, and there can be duplicated rows. I count the spec values for rows that share the same Seq, Mod and z.
>>
>>For the attached data, if I run this code (only for c and t),
>>
>>c1 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c1/MSMS_23PepInfo.txt", header=TRUE, sep = "\t", na.strings="NA", dec=".", strip.white=TRUE)
>>c2 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c2/MSMS_23PepInfo.txt", header=TRUE, sep = "\t", na.strings="NA", dec=".", strip.white=TRUE)
>>c3 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c3/MSMS_23PepInfo.txt", header=TRUE, sep = "\t", na.strings="NA", dec=".", strip.white=TRUE)
>>t1 <- read.table("C:/Users/Vera Costa/Desktop/data.new/t1/MSMS_23PepInfo.txt", header=TRUE, sep = "\t", na.strings="NA", dec=".", strip.white=TRUE)
>>t2 <- read.table("C:/Users/Vera Costa/Desktop/data.new/t2/MSMS_23PepInfo.txt", header=TRUE, sep = "\t", na.strings="NA", dec=".", strip.white=TRUE)
>>dc1<-c1[ifelse(c1$FDR<0.01, TRUE, FALSE),]
>>dc2<-c2[ifelse(c2$FDR<0.01, TRUE, FALSE),]
>>dc3<-c3[ifelse(c3$FDR<0.01, TRUE, FALSE),]
>>dt1<-t1[ifelse(t1$FDR<0.01, TRUE, FALSE),]
>>dt2<-t2[ifelse(t2$FDR<0.01, TRUE, FALSE),]
>>bc1<- aggregate(spec ~ Seq + Mod+z, data = dc1, paste, collapse = ",")
>>bc2<- aggregate(spec ~ Seq + Mod+z, data = dc2, paste, collapse = ",")
>>bc3<- aggregate(spec ~ Seq + Mod+z, data = dc3, paste, collapse = ",")
>>bt1<- aggregate(spec ~ Seq + Mod+z, data = dt1, paste, collapse = ",")
>>bt2<- aggregate(spec ~ Seq + Mod+z, data = dt2, paste, collapse = ",")
>>bc1$counts <- sapply(bc1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>bc2$counts <- sapply(bc2$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>bc3$counts <- sapply(bc3$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>bt1$counts <- sapply(bt1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>bt2$counts <- sapply(bt2$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>bc1<-bc1[,-4]
>>bc2<-bc2[,-4]
>>bc3<-bc3[,-4]
>>bt1<-bt1[,-4]
>>bt2<-bt2[,-4]
>>a1<-merge(bc1,bc2,by=c("Seq","Mod","z"),all=TRUE)
>>a2<-merge(a1,bc3,by=c("Seq","Mod","z"),all=TRUE)
>>a3<-merge(bt1,bt2,by=c("Seq","Mod","z"),all=TRUE)
>>aa<-merge(a2,a3,by=c("Seq","Mod","z"),all=TRUE)
>>aa
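The five near-identical blocks above could probably be collapsed into one helper applied over a list. What follows is only a sketch under stated assumptions: the toy data frames stand in for the files read with read.table(), and the names `runs` and `count_one` are hypothetical, not part of the original script. It counts rows per (Seq, Mod, z) with aggregate(..., FUN = length), which agrees with the paste/strsplit count only when every raw `spec` value holds a single number.

```r
# Toy stand-ins for the per-file tables (hypothetical data, not the real files).
runs <- list(
  c1 = data.frame(Seq = c("AAK", "AAK", "AAR"),
                  Mod = c("", "", "1-n_acPro/"),
                  z = c(2, 2, 3), spec = c(101, 102, 103),
                  FDR = c(0.001, 0.002, 0.5), stringsAsFactors = FALSE),
  c2 = data.frame(Seq = c("AAK", "AAR"),
                  Mod = c("", "1-n_acPro/"),
                  z = c(2, 3), spec = c(201, 202),
                  FDR = c(0.001, 0.003), stringsAsFactors = FALSE),
  t1 = data.frame(Seq = c("AAR", "AAR"),
                  Mod = c("1-n_acPro/", "1-n_acPro/"),
                  z = c(3, 3), spec = c(301, 302),
                  FDR = c(0.004, 0.005), stringsAsFactors = FALSE)
)

# Filter on FDR, then count rows per (Seq, Mod, z); the count column is
# named after the file so the merged table stays readable.
count_one <- function(d, label, fdr = 0.01) {
  d <- d[!is.na(d$FDR) & d$FDR < fdr, ]
  out <- aggregate(spec ~ Seq + Mod + z, data = d, FUN = length)
  names(out)[names(out) == "spec"] <- paste0("counts.", label)
  out
}

counted <- Map(count_one, runs, names(runs))

# Full outer join of all per-file tables on the three key columns.
aa <- Reduce(function(x, y) merge(x, y, by = c("Seq", "Mod", "z"), all = TRUE),
             counted)
print(aa)
```

Naming the count columns after the files avoids the counts.x.x / counts.y.y suffixes that repeated merge() calls generate.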
>>
>>
>>
>>
>>
>>I have the output
>>
>>
>>                                       Seq                 Mod z counts.x.x counts.y.x counts.x.y counts.y.y
>>1                         aAAAAAAAAAGAAGGR          1-n_acPro/ 2         NA          1          1          1
>>2                    aAAAAAAAGAAGGRGSGPGRR          1-n_acPro/ 2          1         NA         NA         NA
>>3                           aAAAAAGAGPEMVR          1-n_acPro/ 2         NA          2          1          2
>>4                          aAAAAATAAAAASIR          1-n_acPro/ 2          1         NA          1         NA
>>5                aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 2         NA          1         NA         NA
>>6                aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 3          1          2         NA          1
>>7                aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2         NA         NA         NA          1
>>8                aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 3         NA         NA         NA          1
>>9                              AAAAAPGTAEK                     2          1          1         NA         NA
>>10                             aAAAELSLLEK          1-n_acPro/ 1         NA         NA         NA          1
>>11                             aAAAELSLLEK          1-n_acPro/ 2          1         NA         NA         NA
>>12                            AAAAEVLGLILR                     2         NA          1         NA          1
>>13         aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR          1-n_acPro/ 3          1         NA         NA          1
>>14                           aAAAVVVPAEWIK          1-n_acPro/ 2          1         NA          1         NA
>>15                    aAADGDDSLYPIAVLIDELR          1-n_acPro/ 2          1         NA          1         NA
>>16                    aAADGDDSLYPIAVLIDELR          1-n_acPro/ 3         NA         NA          1         NA
>>17                          AAADLMAYCEAHAK                     2          1         NA         NA         NA
>>18                          AAADLMAYCEAHAK                     3          1         NA          1         NA
>>19         aAAEAANCIMEVSCGQAESSEKPNAEDMTSK          1-n_acPro/ 3          1         NA          1         NA
>>20                    AAAEIYEEFLAAFEGSDGNK                     2          1         NA          1         NA
>>21                  AAAIGIDLGTTYSCVGVFQHGK                     2          1         NA         NA         NA
>>22                  AAAIGIDLGTTYSCVGVFQHGK                     3          1         NA         NA         NA
>>23                       AAALATVNAWAEQTGMK                     2          1         NA         NA         NA
>>24                  AAAPAPEEEMDECEQALAAEPK                     2          1         NA         NA         NA
>>25                     AAAQLLQSQAQQSGAQQTK                     2          1         NA         NA         NA
>>26                           AAATPESQEPQAK                     2          1         NA         NA         NA
>>27                   aAAVAAAGAGEPQSPDELLPK          1-n_acPro/ 2          1         NA         NA         NA
>>28           AAAVVGInSETIMKPASISEEELLNLINK    8-N_Deamidation/ 3          1         NA         NA         NA
>>29                            aADTQVSETLKR          1-n_acPro/ 2          1         NA         NA         NA
>>30                          AAEDDEDDDVDTKK                     2          1         NA         NA         NA
>>31                             AAEEPSKVEEK                     2          1         NA         NA         NA
>>32                             AAEEPSKVEEK                     3          1         NA         NA         NA
>>33                 AAEGGLSSPEFSELCIWLGSQIK                     2          1         NA         NA         NA
>>34                    AAELIANSLATAGDGLIELR                     2          1         NA         NA         NA
>>35                          aAEPNKTEIQTLFK          1-n_acPro/ 2          1         NA         NA         NA
>>36                AAEQILEDMITIDVENVMEDICSK                     3          1         NA         NA         NA
>>37                     AAESLADPTEYENLFPGLK                     2          1         NA         NA         NA
>>38                     AAFDDAIAELDTLSEESYK                     2          1         NA         NA         NA
>>39                     AAFDDAIAELDTLSEESYK                     3          1         NA         NA         NA
>>40                        AAFECMYTLLDSCLDR                     2          1         NA         NA         NA
>>41                        AAGGGAGSSEDDAQSR                     2          1         NA         NA         NA
>>42                           AAGHPGDPESQQR                     2          1         NA         NA         NA
>>43                           AAGHPGDPESQQR                     3          2         NA         NA         NA
>>44                 AAGLATMISTMRPDIDNMDEYVR                     2          1         NA         NA         NA
>>45                           aAGTLYTYPENWR          1-n_acPro/ 2          1         NA         NA         NA
>>46                           aAGTSSYWEDLRK          1-n_acPro/ 2          1         NA         NA         NA
>>47                        aAGVEAAAEVAATEIK          1-n_acPro/ 2         11         NA         NA         NA
>>48                        AAGVNVEPFWPGLFAK                     2          1         NA         NA         NA
>>49                             AAHLCAEAALR                     2          1         NA         NA         NA
>>50          AALAGGTTMIIDHVVPEPGTSLLAAFDQWR                     3          1         NA         NA         NA
>>51                          AALCHFCIDMLNAK                     2          1         NA         NA         NA
>>52                      aALDSLSLFTSLGLSEQK          1-n_acPro/ 2          1         NA         NA         NA
>>53                           AALEALGSCLNNK                     2          1         NA         NA         NA
>>54                           AALEAQNALHNMK                     2          1         NA         NA         NA
>>55                     AALETDENLLLCAPTGAGK                     2          1         NA         NA         NA
>>56                      aALGVLESDLPSAVTLLK          1-n_acPro/ 2          1         NA         NA         NA
>>57            AALPGILSELDVDVnEGSLMELQGHIGR   15-N_Deamidation/ 3          1         NA         NA         NA
>>58 AALPSHVVTMLDNFPTnLHPMSQLSAAVTALNSESNFAR   17-N_Deamidation/ 4          1         NA         NA         NA
>>59 AALPSHVVTMLDNFPTNLHPMSQLSAAVTALNSESNFAR                     3          1         NA         NA         NA
>>60                        aALTAEHFAALQSLLK          1-n_acPro/ 2          1         NA         NA         NA
>>61                           AAMADTFLEHMCR                     3          1         NA         NA         NA
>>62                      AAMFTAGSNFNHVVQNEK                     3          1         NA         NA         NA
>>63                     aANATTNPSQLLPLELVDK          1-n_acPro/ 2          1         NA         NA         NA
>>64                         aAAAAVGNAVPCGAR          1-n_acPro/ 2         NA          1         NA          1
>>65                       AAAAAWEEPSSGNGTAR                     2         NA          1         NA          1
>>66             aAAAGAAAAAAAEGEAPAEMGALLLEK          1-n_acPro/ 3         NA          1         NA          1
>>67                aAAAVGAGHGAGGPGAASSSGGAR          1-n_acPro/ 2         NA          1          1         NA
>>68                aAAAVGAGHGAGGPGAASSSGGAR          1-n_acPro/ 3         NA         NA          1         NA
>>69                  aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2         NA         NA         NA          1
>>70                   aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2         NA         NA         NA          1
>>71                             AAAAAAALQAK                     2         NA         NA         NA          1
>>72                   aAAAASAPQQLSDEELFSQLR          1-n_acPro/ 2         NA         NA         NA          1
>>73        aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK          1-n_acPro/ 3         NA         NA         NA          1
>>
>>
>>
>>
>>2013/2/25 arun <smartpink111 at yahoo.com>
>>
>>Hi,
>>>What I said was: the `spec` column is the same before and after the aggregate() step. I think you used aggregate() to group it by Seq, Mod and z. In the example you provided, it was already grouped. Maybe it is not grouped in your original dataset. Anyway, please email me the output you are getting from your code.
>>>
>>>Arun
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Monday, February 25, 2013 5:36 AM
>>>
>>>Subject: Re: reading data
>>>
>>>
>>>Sorry, I don't understand what you said.
>>>
>>>I need to
>>>- read the data (like the code that you wrote)
>>>- select only the rows with FDR<0.01 for all files
>>>- remove the first file of each group (a1,c1,t1,...)
>>>- select only the columns Seq, Mod, z, spec for all files
>>>- for each remaining file, merge the rows with the same Seq, Mod and z (grouping the spec)
>>>- build a frequency table of spec like:
>>>
>>>             seq       c2           c3          c4           t1      ....
>>>           aaaaA     0            2            5             6
>>>
>>>where each entry is how many numbers I have in spec (in total).
>>>
>>>
>>>I think my small code isn't correct...
>>>
>>>Thank you
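One possible way to get the wide "seq by file" table described in the list above is to stack the filtered files into one long data frame with a `file` column and cross-tabulate it. This is only a sketch on made-up data: `runs`, `long` and `wide` are hypothetical names, and it assumes one row per spec value (if a `spec` cell can hold several comma-separated numbers, a per-row count column would have to be summed instead, e.g. with xtabs(n ~ Seq + file)).

```r
# Hypothetical filtered tables, one per file; only Seq is needed here.
runs <- list(
  c2 = data.frame(Seq = c("aaaaA", "aaaaA", "bbbbB"), stringsAsFactors = FALSE),
  c3 = data.frame(Seq = c("bbbbB"), stringsAsFactors = FALSE),
  t1 = data.frame(Seq = c("aaaaA", "ccccC"), stringsAsFactors = FALSE)
)

# Stack all files, remembering which file each row came from.
long <- do.call(rbind, Map(function(d, f) {
  data.frame(Seq = d$Seq, file = f, stringsAsFactors = FALSE)
}, runs, names(runs)))

# Rows = Seq, columns = file, cells = number of rows (0 where absent).
wide <- xtabs(~ Seq + file, data = long)
print(wide)
```

Unlike repeated merge() calls, the cross-tabulation fills missing combinations with 0 rather than NA, which matches the layout asked for.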
>>>
>>>
>>>
>>>2013/2/23 arun <smartpink111 at yahoo.com>
>>>
>>>One more thing:
>>>>The last column `spec` in the output is already aggregated based on `Seq`, `Mod`, `z` in the data.new directory.
>>>>res5[[3]][[1]]
>>>>
>>>>                                Seq                 Mod z      spec
>>>>1            aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2     11833
>>>>2             aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2     11833
>>>>3                  aAAAAAAAAAGAAGGR          1-n_acPro/ 2     13103
>>>>4                       AAAAAAALQAK                     2      3084
>>>>5                    aAAAAAGAGPEMVR          1-n_acPro/ 2 9646,9821 #################check here
>>>>
>>>>6         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2     33650
>>>>7         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 3     33607
>>>>9         aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 3     33769
>>>>11            aAAAASAPQQLSDEELFSQLR          1-n_acPro/ 2     20602
>>>>12                  aAAAAVGNAVPCGAR          1-n_acPro/ 2     10018
>>>>13                AAAAAWEEPSSGNGTAR                     2      5576
>>>>14                      aAAAELSLLEK          1-n_acPro/ 1     19662
>>>>16                     AAAAEVLGLILR                     2     22857
>>>>17      aAAAGAAAAAAAEGEAPAEMGALLLEK          1-n_acPro/ 3     26060
>>>>18  aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR          1-n_acPro/ 3     21479
>>>>19 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK          1-n_acPro/ 3     21159
>>>>
>>>>aggregate() doesn't change anything here, at least in this dataset.
>>>>In the next line you used sapply(), which gives this output:
>>>>sapply(res6[[3]][[1]]$spec,function(x) length(gsub("\\s","",unlist(strsplit(x,","))))) # this I believe is not correct
>>>>#    11833     11833     13103      3084 9646,9821     33650     33607     33769
>>>>#        1         1         1         1         2         1         1         1
>>>>#    20602     10018      5576     19662     22857     26060     21479     21159
>>>>#        1         1         1         1         1         1         1         1
>>>># Here you have two `11833` and one `9646,9821`. Not really sure what you want here.
>>>>
>>>>
>>>>
>>>>If it is:
>>>>table(unlist(strsplit(res6[[3]][[1]]$spec,","))) #this makes sense
>>>>
>>>>#10018 11833 13103 19662 20602 21159 21479 22857 26060  3084 33607 33650 33769
>>>>#    1     2     1     1     1     1     1     1     1     1     1     1     1
>>>># 5576  9646  9821
>>>>#    1     1     1
>>>>
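To make the difference between the two counts concrete, here is a small sketch on a made-up `spec` vector (the values are borrowed from the output above, but the vector itself is an assumption). Counting per row gives how many numbers sit in each cell, whereas table() after splitting gives how often each individual number occurs across the whole column.

```r
# Made-up spec column: two rows share 11833, one row holds two numbers.
spec <- c("11833", "11833", "13103", "9646,9821")

# Per-row count: how many comma-separated numbers are in each cell.
per_row <- vapply(strsplit(spec, ","), length, integer(1))
print(per_row)

# Per-value count: how often each individual number appears overall.
per_value <- table(unlist(strsplit(spec, ",")))
print(per_value)
```

Which of the two is wanted depends on whether the unit of interest is the (Seq, Mod, z) row or the individual spec number.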
>>>>Now coming to the last `merge` section:
>>>>do you want to merge the counts in each group by "spec" name (in this case "Var1")?
>>>>
>>>>$group_c
>>>>$group_c$c2
>>>>    Var1 Freq
>>>>1  10039    1
>>>>2  13200    1
>>>>3  22929    1
>>>>4  26117    1
>>>>5  33712    1
>>>>6  33774    1
>>>>7  33867    1
>>>>8    379    1
>>>>9   4102    1
>>>>10  5664    1
>>>>11  9703    1
>>>>12  9876    1
>>>>
>>>>$group_c$c3
>>>>    Var1 Freq
>>>>1  10325    1
>>>>2  21555    1
>>>>3  22994    1
>>>>4  26142    1
>>>>5   3341    1
>>>>6  33708    1
>>>>7  33870    1
>>>>8  34095    1
>>>>9   4397    1
>>>>10  4416    1
>>>>11  5960    1
>>>>
>>>>
>>>>
>>>>A.K.
>>>>
>>>>
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Friday, February 22, 2013 8:36 PM
>>>>
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>Oh, sorry.
>>>>I'm on my phone right now. I will send it tomorrow.
>>>>Thank you
>>>>On 22 Feb 2013 at 22:06, "arun" <smartpink111 at yahoo.com> wrote:
>>>>
>>>>Hi,
>>>>>
>>>>>As I mentioned in my earlier post, seeing the results you got from your code on the same dataset 'data.new' would be easier for me than working out how your code works.
>>>>>Thanks,
>>>>>A.K.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>Sent: Friday, February 22, 2013 1:13 PM
>>>>>Subject: Re: reading data
>>>>>
>>>>>
>>>>>Hi.
>>>>>
>>>>>I used your code and it was a great help. Thank you.
>>>>>
>>>>>I'm now doing a new study of the data, but I need to optimize my code.
>>>>>
>>>>>For the same data, I need to:
>>>>>
>>>>>- read the data (like the code that you wrote)
>>>>>- select only the rows with FDR<0.01 for all files
>>>>>- remove the first file of each group (a1,c1,t1,...)
>>>>>- select only the columns Seq, Mod, z, spec for all files
>>>>>- for each remaining file, merge the rows with the same Seq, Mod and z (grouping the spec)
>>>>>- build a frequency table of spec like:
>>>>>             seq       c2           c3          c4            t1      ....
>>>>>           aaaaA     0            2            5              6
>>>>>            .....
>>>>>where each entry is how many numbers I have in spec (in total).
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>I start doing the code.....
>>>>>
>>>>>
>>>>>spec <- function(directory,number) {
>>>>>  setwd(directory)
>>>>>  direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>>>  directT <- direct[grepl("^t", direct)]
>>>>>  directC <- direct[grepl("^c", direct)]
>>>>>
>>>>>  lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>  listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>  listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>
>>>>>  #boxplots for each run
>>>>>  dcf<-c()
>>>>>  dtf<-c()
>>>>>
>>>>>  for(i in 1:length(lista)){
>>>>>
>>>>>
>>>>>  }
>>>>>
>>>>>  for (i in 2:length(listaC)) {
>>>>>    dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<Pfdr, TRUE, FALSE),]
>>>>>    dcc1<- aggregate(spec ~ Seq + Mod+z, data = dcc1, paste, collapse = ",")
>>>>>    dcc1$counts <- sapply(dcc1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>    dcc1<-dcc1[,-4]
>>>>>    dcf<-list(dcf,dcc1)
>>>>>
>>>>>  }
>>>>>  print(dcf)
>>>>>
>>>>>  merg<-merge(dcf[[1]][[2]],dcf[[2]],by=c("Seq","Mod","z"),all=TRUE)
>>>>>  print(merg)
>>>>>  for (i in 2:length(listaT)) {
>>>>>    dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<Pfdr, TRUE, FALSE),]
>>>>>    dct1<- aggregate(spec ~ Seq + Mod+z, data = dct1, paste, collapse = ",")
>>>>>    dct1$counts <- sapply(dct1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>    dct1<-dct1[,-4]
>>>>>    dtf<-list(dtf,dct1)
>>>>>  }
>>>>>}
>>>>>spec("C:/Users/Vera Costa/Desktop/data.new",23)
>>>>>
>>>>>
>>>>>I can write the new code. The problem is that this line takes a lot of time to run:
>>>>>dcc1<- aggregate(spec ~ Seq + Mod+z, data = dcc1, paste, collapse = ",")
>>>>>
>>>>>
>>>>>I have nearly 40000 rows.
>>>>>
>>>>>Could you help me to optimize this?
>>>>>
>>>>>Thank you.
>>>>>Vera
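Since the pasted `spec` string is only used to derive a count and is then dropped, one option that may be cheaper is to count the numbers in each raw row first and sum those integers per group, so no long strings are built and re-split. This is a sketch on made-up data, not a benchmarked replacement: the data frame `d` and the helper column `n` are hypothetical, and whether it is actually faster on the real 40000-row files would need to be checked with system.time().

```r
# Made-up rows; one spec cell already holds two comma-separated numbers.
d <- data.frame(Seq  = c("AAK", "AAK", "AAR"),
                Mod  = c("", "", "1-n_acPro/"),
                z    = c(2, 2, 3),
                spec = c("101", "102,103", "201"),
                stringsAsFactors = FALSE)

# Number of spec values in each raw row (1 for plain numbers).
d$n <- vapply(strsplit(as.character(d$spec), ","), length, integer(1))

# Sum the per-row counts within each (Seq, Mod, z) group.
counts <- aggregate(n ~ Seq + Mod + z, data = d, FUN = sum)
names(counts)[names(counts) == "n"] <- "counts"
print(counts)
```

The result has the same shape as the original `dcc1` after its `spec` column is removed, so the later merge() calls should not need to change.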
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>2013/2/20 Vera Costa <veracosta.rt at gmail.com>
>>>>>
>>>>>Thank you very much.
>>>>>>?
>>>>>>I will try.
>>>>>>?
>>>>>>thank you
>>>>>>
>>>>>>
>>>>>>
>>>>>>2013/2/20 arun <smartpink111 at yahoo.com>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>Hi,
>>>>>>>
>>>>>>>You can change `res4` to:
>>>>>>>lev<-sort(unique(do.call(c,lapply(seq_along(res3),function(i) do.call(c,lapply(res3[[i]],function(x) unique(x$z)))))))
>>>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z,levels=lev))))))
>>>>>>>
>>>>>>>freqs1<-do.call(rbind,lapply(split(freq.f1,gsub("\\d+","",freq.f1$id)),function(x) x[-1,])) #group a contains only a1, so it drops out
>>>>>>>average1<- colMeans(freqs1[,-1])
>>>>>>>average1
>>>>>>>#        1         2         3
>>>>>>>#0.3333333 8.0000000 3.6666667
>>>>>>>pvalues1<-do.call(rbind,lapply(seq_len(nrow(freqs1)),function(x) chisq.test(freqs1[x,-1],average1)))
>>>>>>>row.names(pvalues1)<- row.names(freqs1)
>>>>>>>pvalues1
>>>>>>>#                 [,1]
>>>>>>>#c.group_c.2 0.7235907
>>>>>>>#c.group_c.3 0.7963287
>>>>>>>#t           0.9079200
>>>>>>>
>>>>>>>
>>>>>>>A.K.
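The reason for deriving `lev` from the data rather than hard-coding levels=1:3 can be seen in a small sketch with made-up charge values (the list `z_by_file` is hypothetical). With a shared set of levels, every file yields a count vector of the same length and order, so the rows can be bound together, and charges absent from a file show up as 0 instead of being dropped.

```r
# Made-up z values for two files; charge 4 appears only in the second one.
z_by_file <- list(c1 = c(2, 2, 3), t2 = c(1, 2, 4))

# Shared levels taken from all files together.
lev <- sort(unique(unlist(z_by_file)))

# One row per file, one column per charge level.
freq <- do.call(rbind, lapply(z_by_file, function(z) {
  table(factor(z, levels = lev))
}))
print(freq)
```

With levels=1:3 the charge-4 observation in `t2` would silently become NA and disappear from the table.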
>>>>>>>
>>>>>>>----- Original Message -----
>>>>>>>
>>>>>>>From: arun <smartpink111 at yahoo.com>
>>>>>>>To: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>Cc: R help <r-help at r-project.org>
>>>>>>>Sent: Tuesday, February 19, 2013 7:29 PM
>>>>>>>Subject: Re: reading data
>>>>>>>
>>>>>>>Hi,
>>>>>>>Try this:
>>>>>>>
>>>>>>>
>>>>>>>files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>res2<-split(lista,names(lista))
>>>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>#Freq whole data
>>>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z,levels=1:3))))))
>>>>>>>names(res4)<- names(res2)
>>>>>>>library(reshape2)
>>>>>>>freq.i1<-do.call(rbind,lapply(res4,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
>>>>>>>freq.i1
>>>>>>>#          id 1  2 3
>>>>>>>#group_a   a1 1 12 6
>>>>>>>#group_c.1 c1 0 10 3
>>>>>>>#group_c.2 c2 0 12 3
>>>>>>>#group_c.3 c3 0 13 4
>>>>>>>#group_t.1 t1 0 10 4
>>>>>>>#group_t.2 t2 1 12 6
>>>>>>>
>>>>>>>freq.rel.i1<- as.matrix(freq.i1[,-1]/rowSums(freq.i1[,-1]))
>>>>>>>freq.rel.i1
>>>>>>>#                   1         2         3
>>>>>>>#group_a   0.05263158 0.6315789 0.3157895
>>>>>>>#group_c.1 0.00000000 0.7692308 0.2307692
>>>>>>>#group_c.2 0.00000000 0.8000000 0.2000000
>>>>>>>#group_c.3 0.00000000 0.7647059 0.2352941
>>>>>>>#group_t.1 0.00000000 0.7142857 0.2857143
>>>>>>>#group_t.2 0.05263158 0.6315789 0.3157895
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>#Freq with FDR< 0.01
>>>>>>>res5<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z[x[["FDR"]]<0.01],levels=1:3))))))
>>>>>>>names(res5)<- names(res2)
>>>>>>>
>>>>>>>freq.f1<- do.call(rbind,lapply(res5,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
>>>>>>>
>>>>>>>freq.f1
>>>>>>>#          id 1  2 3
>>>>>>>#group_a   a1 1 10 5
>>>>>>>#group_c.1 c1 0  7 2
>>>>>>>#group_c.2 c2 0  8 2
>>>>>>>#group_c.3 c3 0  6 4
>>>>>>>#group_t.1 t1 0  7 4
>>>>>>>#group_t.2 t2 1 10 5
>>>>>>>
>>>>>>>
>>>>>>>freq.rel.f1<- as.matrix(freq.f1[,-1]/rowSums(freq.f1[,-1]))
>>>>>>>
>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i1)))
>>>>>>>par(mfrow=c(1,2))
>>>>>>>barplot(freq.rel.i1,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i1))
>>>>>>>barplot(freq.rel.f1,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f1))
>>>>>>>#change the legend position
>>>>>>>
>>>>>>>Also, I didn't check the rest of the code from the chi-square test onward.
>>>>>>>A.K.
>>>>>>>________________________________
>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>Sent: Tuesday, February 19, 2013 4:19 PM
>>>>>>>Subject: Re: reading data
>>>>>>>
>>>>>>>
>>>>>>>Here is the code and some outputs.
>>>>>>>
>>>>>>>z.plot <- function(directory,number) {
>>>>>>> #reading data
>>>>>>>  setwd(directory)
>>>>>>> direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>
>>>>>>> lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>> listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>> listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>
>>>>>>> #count different z values
>>>>>>> cab <- vector()
>>>>>>>    for (i in 1:length(lista)) {
>>>>>>>        dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>        dc<-table(dc$z)
>>>>>>>        cab <- c(cab, names(dc))
>>>>>>>  }
>>>>>>>
>>>>>>> #Relative freqs to construct the graph
>>>>>>>    cab <- unique(cab)
>>>>>>> print(cab)
>>>>>>>
>>>>>>>###[1] "2" "3" "1"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>    d <- matrix(ncol=length(cab))
>>>>>>> dci<- d[-1,]
>>>>>>>    dcf <- d[-1,]
>>>>>>> dti <- d[-1,]
>>>>>>> dtf <- d[-1,]
>>>>>>>
>>>>>>>    for (i in 1:length(listaC)) {
>>>>>>>
>>>>>>>  #Relative freq of all data
>>>>>>>  dcc<-listaC[[i]]
>>>>>>>  dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>  dci<- rbind(dci, dcc)
>>>>>>>  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
>>>>>>>
>>>>>>>
>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>  dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>  dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>  dcf<- rbind(dcf,dcc1)
>>>>>>>  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
>>>>>>>        }
>>>>>>>
>>>>>>>
>>>>>>> for (i in 1:length(listaT)) {
>>>>>>>
>>>>>>>  #Relative freq of all data
>>>>>>>  dct<-listaT[[i]]
>>>>>>>  dct<-table(factor(dct$z, levels=cab))
>>>>>>>  dti<- rbind(dti, dct)
>>>>>>>  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")
>>>>>>>
>>>>>>>
>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>  dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>  dct1<-table(factor(dct1$z, levels=cab))
>>>>>>>  dtf<- rbind(dtf,dct1)
>>>>>>>  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
>>>>>>>        }
>>>>>>>
>>>>>>>  freq.i<-rbind(dci,dti)
>>>>>>>  freq.f<-rbind(dcf,dtf)
>>>>>>>  freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>  freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>
>>>>>>> print(freq.i)
>>>>>>>##   2 3 1
>>>>>>>#c1 10 3 0
>>>>>>>#c2 12 3 0
>>>>>>>#c3 13 4 0
>>>>>>>#t1 10 4 0
>>>>>>>#t2 12 6 1
>>>>>>>
>>>>>>> print(freq.f)
>>>>>>>###  2 3 1
>>>>>>>#c1  7 2 0
>>>>>>>#c2  8 2 0
>>>>>>>#c3  6 4 0
>>>>>>>#t1  7 4 0
>>>>>>>#t2 10 5 1
>>>>>>>
>>>>>>> print(freq.rel.i)
>>>>>>>###         2         3          1
>>>>>>>#c1 0.7692308 0.2307692 0.00000000
>>>>>>>#c2 0.8000000 0.2000000 0.00000000
>>>>>>>#c3 0.7647059 0.2352941 0.00000000
>>>>>>>#t1 0.7142857 0.2857143 0.00000000
>>>>>>>#t2 0.6315789 0.3157895 0.05263158
>>>>>>> print(freq.rel.f)
>>>>>>>
>>>>>>>###         2         3      1
>>>>>>>#c1 0.7777778 0.2222222 0.0000
>>>>>>>#c2 0.8000000 0.2000000 0.0000
>>>>>>>#c3 0.6000000 0.4000000 0.0000
>>>>>>>#t1 0.6363636 0.3636364 0.0000
>>>>>>>#t2 0.6250000 0.3125000 0.0625
>>>>>>>
>>>>>>>#Graph plot
>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>par(mfrow=c(1,2))
>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>
>>>>>>>#average of the group (except c1&t1)
>>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>print(average)
>>>>>>>
>>>>>>>###      2         3         1
>>>>>>>#8.0000000 3.6666667 0.3333333
>>>>>>>
>>>>>>>#chisquare test function
>>>>>>>chisq.test<-function(x,y){
>>>>>>> somax<-sum(x)
>>>>>>> somay<-sum(y)
>>>>>>> nj.<-x+y
>>>>>>> nj<-sum(nj.)
>>>>>>> ejx<-(nj./nj)*somax
>>>>>>> ejy<-(nj./nj)*somay
>>>>>>> ETx<-((x-ejx)^2)/ejx
>>>>>>> ETy<-((y-ejy)^2)/ejy
>>>>>>> ETobs<-sum(ETx)+sum(ETy)
>>>>>>> pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>> return(pvalue)
>>>>>>> }
>>>>>>>
>>>>>>>#pvalues of the chisquare test between sample and average (H0: the two samples have the same distribution)
>>>>>>>pvalues<-c()
>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>>pvalues<-c(pvalues,a)
>>>>>>>}
>>>>>>>
>>>>>>>
>>>>>>>#data frame with final p-values
>>>>>>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>>>colnames(dataframe)<-c("sample name","pvalue")
>>>>>>>print(dataframe)
>>>>>>>
>>>>>>>###  sample name    pvalue
>>>>>>>#1          c2 0.7235907
>>>>>>>#2          c3 0.7963287
>>>>>>>#3             0.9079200
>>>>>>>}
>>>>>>>z.plot("C:/Users/Vera Costa/Desktop/dados",23)
>>>>>>>
>>>>>>>###and two barplots..
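
The third row of the p-value table above has no sample name. A likely cause, offered here as an assumption since the thread does not discuss it, is that `dtf[-1,]` leaves a single row and R simplifies it to a plain vector, so the row name `t2` is lost inside `rbind()`. A minimal sketch with the counts copied from the freq.f output:

```r
# FDR-filtered counts copied from the freq.f output above.
dcf <- rbind(c1 = c(7, 2, 0), c2 = c(8, 2, 0), c3 = c(6, 4, 0))
dtf <- rbind(t1 = c(7, 4, 0), t2 = c(10, 5, 1))
colnames(dcf) <- colnames(dtf) <- c("2", "3", "1")

# dtf[-1, ] has one row left, so it is dropped to a vector and the
# row name "t2" disappears when the pieces are bound together.
freqs.lost <- rbind(dcf[-1, ], dtf[-1, ])
rownames(freqs.lost)

# drop = FALSE keeps the one-row result as a matrix with its row name.
freqs <- rbind(dcf[-1, , drop = FALSE], dtf[-1, , drop = FALSE])
rownames(freqs)
```

With `drop = FALSE` the same code behaves identically whether a group has two samples or ten.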
>>>>>>>
>>>>>>>
>>>>>>>Here, I removed the group a1.
>>>>>>>
>>>>>>>Thank you
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>
>>>>>>>Hi,
>>>>>>>>
>>>>>>>>Could you send the results for the folder that you sent to me? That will make it easier for me.
>>>>>>>>
>>>>>>>>Arun
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>________________________________
>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>Sent: Tuesday, February 19, 2013 3:47 PM
>>>>>>>>
>>>>>>>>Subject: Re: reading data
>>>>>>>>
>>>>>>>>
>>>>>>>>Oh sorry, I changed the folder.
>>>>>>>>
>>>>>>>>I am sending the results for your folder.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>>
>>>>>>>>Hello,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Regarding the results, are they from the same folder that you sent to me?
>>>>>>>>>I am getting different results by running your steps.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>direct<- list.files(recursive=TRUE)
>>>>>>>>> direct
>>>>>>>>>#[1] "a1/MSMS_23PepInfo.txt" "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt"
>>>>>>>>>#[4] "c3/MSMS_23PepInfo.txt" "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>>>>>>>>
>>>>>>>>> directT<- list.files(recursive=TRUE)[grepl("^t",dir())]
>>>>>>>>>
>>>>>>>>>directT
>>>>>>>>>#[1] "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>directC<- list.files(recursive=TRUE)[grepl("^c",dir())]
>>>>>>>>>
>>>>>>>>>directC
>>>>>>>>>#[1] "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt" "c3/MSMS_23PepInfo.txt"
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>lista<- lapply(direct,function(x) read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>>>>
>>>>>>>>>listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>
>>>>>>>>> #count different z values
>>>>>>>>> cab <- vector()
>>>>>>>>>    for (i in 1:length(lista)) {
>>>>>>>>>        dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>        dc<-table(dc$z)
>>>>>>>>>        cab <- c(cab, names(dc))
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>> #Relative freqs to construct the graph
>>>>>>>>>    cab <- unique(cab)
>>>>>>>>> print(cab)
>>>>>>>>>
>>>>>>>>>#[1] "1" "2" "3"  #Here the results do not match yours
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>d <- matrix(ncol=length(cab))
>>>>>>>>> dci<- d[-1,]
>>>>>>>>>    dcf <- d[-1,]
>>>>>>>>> dti <- d[-1,]
>>>>>>>>> dtf <- d[-1,]
>>>>>>>>>
>>>>>>>>>    for (i in 1:length(listaC)) {
>>>>>>>>>
>>>>>>>>>  #Relative freq of all data
>>>>>>>>>  dcc<-listaC[[i]]
>>>>>>>>>  dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>>>  dci<- rbind(dci, dcc)
>>>>>>>>>  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>  dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>  dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>>>  dcf<- rbind(dcf,dcc1)
>>>>>>>>>  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>        }
>>>>>>>>> print(dci) #here too.
>>>>>>>>>
>>>>>>>>>#?? 1? 2 3
>>>>>>>>>#c1 0 10 3
>>>>>>>>>#c2 0 12 3
>>>>>>>>>#c3 0 13 4
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>It is important to clear this up before I make any changes to the script. Please send me the output for the same data folder so that I can understand what is going on.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Arun
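
One thing worth noting about `list.files(recursive=TRUE)[grepl("^t",dir())]` above: it indexes the file list with a logical vector built from the folder list, so the two only line up when every folder holds exactly one file. A minimal sketch of an alternative that tests the folder part of each path instead; the temporary directory tree and the extra file `old_MSMS_23PepInfo.txt` are invented for illustration.

```r
# Build a throwaway directory tree; folder and file names are invented
# for illustration, and "t2" deliberately holds two matching files.
root <- tempfile("dados_")
for (d in c("a1", "c1", "c2", "t1", "t2")) {
  dir.create(file.path(root, d), recursive = TRUE)
  file.create(file.path(root, d, "MSMS_23PepInfo.txt"))
}
file.create(file.path(root, "t2", "old_MSMS_23PepInfo.txt"))

files <- list.files(root, pattern = "MSMS_23PepInfo\\.txt$", recursive = TRUE)

# Test the folder part of each path, so the logical index always has the
# same length as 'files', however many files a folder contains.
directT <- files[grepl("^t", dirname(files))]
directC <- files[grepl("^c", dirname(files))]
directT
directC

unlink(root, recursive = TRUE)
```

Because `dirname()` is applied to the same vector that is being subset, a stray extra file cannot shift the selection onto the wrong folder.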
>>>>>>>>>________________________________
>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>Sent: Tuesday, February 19, 2013 9:24 AM
>>>>>>>>>
>>>>>>>>>Subject: Re: reading data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Ok.
>>>>>>>>>
>>>>>>>>>Here is the code and some outputs.
>>>>>>>>>
>>>>>>>>>z.plot <- function(directory,number) {
>>>>>>>>> #reading data
>>>>>>>>>  setwd(directory)
>>>>>>>>> direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>>>
>>>>>>>>> lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>> listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>> listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>
>>>>>>>>> #count different z values
>>>>>>>>> cab <- vector()
>>>>>>>>>    for (i in 1:length(lista)) {
>>>>>>>>>        dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>        dc<-table(dc$z)
>>>>>>>>>        cab <- c(cab, names(dc))
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>> #Relative freqs to construct the graph
>>>>>>>>>    cab <- unique(cab)
>>>>>>>>> print(cab)
>>>>>>>>>
>>>>>>>>>###[1] "1" "2" "3" "4" "5"
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    d <- matrix(ncol=length(cab))
>>>>>>>>> dci<- d[-1,]
>>>>>>>>>    dcf <- d[-1,]
>>>>>>>>> dti <- d[-1,]
>>>>>>>>> dtf <- d[-1,]
>>>>>>>>>
>>>>>>>>>    for (i in 1:length(listaC)) {
>>>>>>>>>
>>>>>>>>>  #Relative freq of all data
>>>>>>>>>  dcc<-listaC[[i]]
>>>>>>>>>  dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>>>  dci<- rbind(dci, dcc)
>>>>>>>>>  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>  dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>  dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>>>  dcf<- rbind(dcf,dcc1)
>>>>>>>>>  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>        }
>>>>>>>>> print(dci)
>>>>>>>>>
>>>>>>>>>###   1     2    3   4  5
>>>>>>>>>#c1  93  8356 3621 450 55
>>>>>>>>>#c2 108 13513 6859 793 73
>>>>>>>>>#c3  97 13526 6724 739 82
>>>>>>>>>#c4 101 13417 6574 761 62
>>>>>>>>>
>>>>>>>>> print(dcf)
>>>>>>>>>
>>>>>>>>>###  1    2    3   4  5
>>>>>>>>>#c1 10 4576 2100 199 17
>>>>>>>>>#c2  7 7831 4039 314 23
>>>>>>>>>#c3 16 7887 4087 286 22
>>>>>>>>>#c4 20 7824 4045 311 20
>>>>>>>>>
>>>>>>>>> for (i in 1:length(listaT)) {
>>>>>>>>>
>>>>>>>>>  #Relative freq of all data
>>>>>>>>>  dct<-listaT[[i]]
>>>>>>>>>  dct<-table(factor(dct$z, levels=cab))
>>>>>>>>>  dti<- rbind(dti, dct)
>>>>>>>>>  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>  dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>  dct1<-table(factor(dct1$z, levels=cab))
>>>>>>>>>  dtf<- rbind(dtf,dct1)
>>>>>>>>>  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>> print(dti)
>>>>>>>>>
>>>>>>>>>###   1     2    3   4  5
>>>>>>>>>#t1  32  8640 4098 429 36
>>>>>>>>>#t2 128 13209 6723 788 75
>>>>>>>>>#t3  85 13043 6691 754 82
>>>>>>>>>#t4 139 13750 7036 807 84
>>>>>>>>>
>>>>>>>>> print(dtf)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>#### 1    2    3   4  5
>>>>>>>>>#t1  5 4885 2571 196  8
>>>>>>>>>#t2 12 7752 4209 360 28
>>>>>>>>>#t3 19 7563 4086 336 18
>>>>>>>>>#t4 14 8108 4218 312 26
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  freq.i<-rbind(dci,dti)
>>>>>>>>>  freq.f<-rbind(dcf,dtf)
>>>>>>>>>  freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>  freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>> print(freq.i)
>>>>>>>>>##    1     2    3   4  5
>>>>>>>>>#c1  93  8356 3621 450 55
>>>>>>>>>#c2 108 13513 6859 793 73
>>>>>>>>>#c3  97 13526 6724 739 82
>>>>>>>>>#c4 101 13417 6574 761 62
>>>>>>>>>#t1  32  8640 4098 429 36
>>>>>>>>>#t2 128 13209 6723 788 75
>>>>>>>>>#t3  85 13043 6691 754 82
>>>>>>>>>#t4 139 13750 7036 807 84
>>>>>>>>>
>>>>>>>>> print(freq.f)
>>>>>>>>>###  1    2    3   4  5
>>>>>>>>>#c1 10 4576 2100 199 17
>>>>>>>>>#c2  7 7831 4039 314 23
>>>>>>>>>#c3 16 7887 4087 286 22
>>>>>>>>>#c4 20 7824 4045 311 20
>>>>>>>>>#t1  5 4885 2571 196  8
>>>>>>>>>#t2 12 7752 4209 360 28
>>>>>>>>>#t3 19 7563 4086 336 18
>>>>>>>>>#t4 14 8108 4218 312 26
>>>>>>>>>
>>>>>>>>> print(freq.rel.i)
>>>>>>>>>###           1         2         3          4           5
>>>>>>>>>#c1 0.007395626 0.6644930 0.2879523 0.03578529 0.004373757
>>>>>>>>>#c2 0.005059496 0.6330460 0.3213248 0.03714982 0.003419844
>>>>>>>>>#c3 0.004582389 0.6389834 0.3176493 0.03491119 0.003873772
>>>>>>>>>#c4 0.004829070 0.6415013 0.3143199 0.03638537 0.002964380
>>>>>>>>>#t1 0.002417832 0.6528145 0.3096335 0.03241405 0.002720060
>>>>>>>>>#t2 0.006117670 0.6313148 0.3213210 0.03766190 0.003584572
>>>>>>>>>#t3 0.004115226 0.6314694 0.3239409 0.03650448 0.003969983
>>>>>>>>>#t4 0.006371470 0.6302714 0.3225156 0.03699120 0.003850385
>>>>>>>>> print(freq.rel.f)
>>>>>>>>>
>>>>>>>>>###            1         2         3          4           5
>>>>>>>>>#c1 0.0014488554 0.6629962 0.3042596 0.02883222 0.002463054
>>>>>>>>>#c2 0.0005731128 0.6411495 0.3306861 0.02570820 0.001883085
>>>>>>>>>#c3 0.0013010246 0.6413238 0.3323305 0.02325581 0.001788909
>>>>>>>>>#c4 0.0016366612 0.6402619 0.3310147 0.02545008 0.001636661
>>>>>>>>>#t1 0.0006523157 0.6373125 0.3354207 0.02557078 0.001043705
>>>>>>>>>#t2 0.0009707952 0.6271337 0.3405064 0.02912386 0.002265189
>>>>>>>>>#t3 0.0015804359 0.6290967 0.3398769 0.02794876 0.001497255
>>>>>>>>>#t4 0.0011042751 0.6395330 0.3327023 0.02460956 0.002050797
>>>>>>>>>
>>>>>>>>>#Graph plot
>>>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>>>par(mfrow=c(1,2))
>>>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>>>
>>>>>>>>>#average of the group (except c1&t1)
>>>>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>>>print(average)
>>>>>>>>>
>>>>>>>>>###        1          2          3          4          5
>>>>>>>>>#   14.66667 7827.50000 4114.00000  319.83333   22.83333
>>>>>>>>>
>>>>>>>>>#chisquare test function
>>>>>>>>>chisq.test<-function(x,y){
>>>>>>>>> somax<-sum(x)
>>>>>>>>> somay<-sum(y)
>>>>>>>>> nj.<-x+y
>>>>>>>>> nj<-sum(nj.)
>>>>>>>>> ejx<-(nj./nj)*somax
>>>>>>>>> ejy<-(nj./nj)*somay
>>>>>>>>> ETx<-((x-ejx)^2)/ejx
>>>>>>>>> ETy<-((y-ejy)^2)/ejy
>>>>>>>>> ETobs<-sum(ETx)+sum(ETy)
>>>>>>>>> pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>>> return(pvalue)
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>#pvalues of the chisquare test between sample and average (H0: the two samples have the same distribution)
>>>>>>>>>pvalues<-c()
>>>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>>>>pvalues<-c(pvalues,a)
>>>>>>>>>}
>>>>>>>>>print(pvalues)
>>>>>>>>>##[1] 0.5307206 0.6849480 0.8332661 0.3474956 0.5546527 0.9387602
>>>>>>>>>
>>>>>>>>>#data frame with final p-values
>>>>>>>>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>>>>>colnames(dataframe)<-c("sample name","pvalue")
>>>>>>>>>print(dataframe)
>>>>>>>>>
>>>>>>>>>###  sample name    pvalue
>>>>>>>>>#1          c2 0.5307206
>>>>>>>>>#2          c3 0.6849480
>>>>>>>>>#3          c4 0.8332661
>>>>>>>>>#4          t2 0.3474956
>>>>>>>>>#5          t3 0.5546527
>>>>>>>>>#6          t4 0.9387602
>>>>>>>>>}
>>>>>>>>>z.plot("C:/Users/Vera Costa/Desktop/dados",23)
>>>>>>>>>
>>>>>>>>>###and two barplots...
>>>>>>>>>
>>>>>>>>>Thank you
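
The script above defines its own function called `chisq.test`, which hides `stats::chisq.test` for the rest of the session. A minimal sketch of the same statistic under a different name (`homog.pvalue` is a hypothetical name, not from the thread), with the upper tail requested directly from `pchisq()`. The input is the small example that appears elsewhere in this thread: sample c2 (8, 2, 0) against the group average (8, 3.667, 0.333).

```r
# Same statistic as the chisq.test() defined in the script, under a
# different (hypothetical) name so that stats::chisq.test stays usable.
homog.pvalue <- function(x, y) {
  col.tot  <- x + y                          # pooled count per category
  expect.x <- col.tot / sum(col.tot) * sum(x)
  expect.y <- col.tot / sum(col.tot) * sum(y)
  stat <- sum((x - expect.x)^2 / expect.x) +
          sum((y - expect.y)^2 / expect.y)
  # Upper tail directly, instead of 1 - pchisq(..., lower.tail = TRUE)
  pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
}

# Sample c2 against the group average (columns are charge 2, 3, 1).
c2      <- c(8, 2, 0)
average <- c(8, 11/3, 1/3)
p <- homog.pvalue(c2, average)
p
```

This reproduces the 0.7235907 reported for c2 with the small data folder. Whether a chi-square approximation is appropriate with expected counts this small, or with an average standing in for a second sample, is a separate question that the sketch does not address.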
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>>>
>>>>>>>>>Got it.
>>>>>>>>>>
>>>>>>>>>>So, if I run the code that you sent yesterday, will I get the correct results for relative frequency etc.? It would also be great if you could send me the output generated by your code (on two groups, as you showed yesterday). It will help me check the results much faster than running your code and seeing whether that is the result (because I have to make some adjustments to your code to run it on Linux, especially the dir() call).
>>>>>>>>>>
>>>>>>>>>>I may be able to run it only later.
>>>>>>>>>>
>>>>>>>>>>Arun
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>________________________________
>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>Sent: Tuesday, February 19, 2013 8:53 AM
>>>>>>>>>>
>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>I sent it in the second email.
>>>>>>>>>>
>>>>>>>>>>But I am sending it again.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Your attachment didn't come through.
>>>>>>>>>>>
>>>>>>>>>>>Arun
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>________________________________
>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>Sent: Tuesday, February 19, 2013 8:47 AM
>>>>>>>>>>>
>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Sorry for all the questions.
>>>>>>>>>>>
>>>>>>>>>>>I attach a small part of my real data (I have a lot of rows).
>>>>>>>>>>>
>>>>>>>>>>>My main objective is to construct two graphs. The first shows the relative frequencies of each group (c1, c2, c3, ...). The second shows the same frequencies, but only for rows with FDR<0.01.
>>>>>>>>>>>
>>>>>>>>>>>After that I need to compute the average in each group (but without the first sample of the group: c1, t1, a1, ...) and run the chi-square test to see whether the groups have the same distribution. Do you understand?
>>>>>>>>>>>
>>>>>>>>>>>At first I had only two groups, and I wrote the code that I sent you. But I need general code, not code for two groups whose names I know, but for all groups (sometimes I can have 7, 8 or 9 groups).
>>>>>>>>>>>
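
A minimal sketch of how the grouping could be derived from the folder names, so that the number and names of the groups need not be known in advance. It assumes, as in the examples in this thread, that each folder name is a group letter followed by a sample number; the paths are written out by hand rather than read from disk.

```r
# Relative paths as list.files(recursive = TRUE) would return them;
# the folder names are illustrative.
files <- c("a1/MSMS_23PepInfo.txt", "c1/MSMS_23PepInfo.txt",
           "c2/MSMS_23PepInfo.txt", "c3/MSMS_23PepInfo.txt",
           "t1/MSMS_23PepInfo.txt", "t2/MSMS_23PepInfo.txt")

sample.id <- dirname(files)                 # "a1", "c1", ...
group     <- sub("[0-9]+$", "", sample.id)  # "a", "c", "t"

# One element per group, whatever the groups happen to be called.
by.group <- split(sample.id, group)

# Drop the first sample of every group before averaging.
kept <- lapply(by.group, function(s) s[-1])
kept
```

A group with a single sample, such as `a` here, ends up empty after the first sample is dropped, so any averaging step would need to handle that case.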
>>>>>>>>>>>Is my explanation better now? :-)
>>>>>>>>>>>My English isn't very good either :-)
>>>>>>>>>>>
>>>>>>>>>>>Please do not publish this data on the forum...
>>>>>>>>>>>
>>>>>>>>>>>Thank you
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>2013/2/18 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>
>>>>>>>>>>>Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>I ran the code to understand what was going on.
>>>>>>>>>>>>
>>>>>>>>>>>>I didn't fully understand it, as you wrote the code for your original dataset and not for the 'data' directory you sent to me.
>>>>>>>>>>>>
>>>>>>>>>>>>A.K.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>________________________________
>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>Sent: Monday, February 18, 2013 4:02 PM
>>>>>>>>>>>>
>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>I don't need exactly the same, just something equivalent. I will try your suggestions.
>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>On 18 Feb 2013 at 19:41, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>I am not able to open your graph. I am using Linux.
>>>>>>>>>>>>>
>>>>>>>>>>>>>Also, the code in the function is not reproducible:
>>>>>>>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>>>>>>>
>>>>>>>>>>>>>It takes double the time to work out what is going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>>dir()
>>>>>>>>>>>>>#[1] "a1" "a2" "a3" "b1" "b2" "c1"
>>>>>>>>>>>>>
>>>>>>>>>>>>>direct<- list.files(recursive=TRUE)[grepl("^a|^b",dir())]
>>>>>>>>>>>>>
>>>>>>>>>>>>> direct
>>>>>>>>>>>>>#[1] "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
>>>>>>>>>>>>>#[4] "MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
>>>>>>>>>>>>>directA<- list.files(recursive=TRUE)[grepl("^a",dir())]
>>>>>>>>>>>>>directB<- list.files(recursive=TRUE)[grepl("^b",dir())]
>>>>>>>>>>>>>lista<- lapply(direct,function(x) read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>>>>>>>>
>>>>>>>>>>>>>listaA<-lapply(directA, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>>>listaB<-lapply(directB, function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>>>
>>>>>>>>>>>>>#here I am changing the names listaT, z, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>>#count different mm values
>>>>>>>>>>>>> cab <- vector()
>>>>>>>>>>>>>    for (i in 1:length(lista)) {
>>>>>>>>>>>>>        dc<-lista[[i]][ifelse(lista[[i]]$b<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>        dc<-table(dc$mm)
>>>>>>>>>>>>>        cab <- c(cab, names(dc))
>>>>>>>>>>>>>  }
>>>>>>>>>>>>>
>>>>>>>>>>>>> #Relative freqs to construct the graph
>>>>>>>>>>>>>    cab <- unique(cab)
>>>>>>>>>>>>>    d <- matrix(ncol=length(cab))
>>>>>>>>>>>>> dci<- d[-1,]
>>>>>>>>>>>>>    dcf <- d[-1,]
>>>>>>>>>>>>> dti <- d[-1,]
>>>>>>>>>>>>> dtf <- d[-1,]
>>>>>>>>>>>>>
>>>>>>>>>>>>>    ########################################
>>>>>>>>>>>>> for (i in 1:length(listaA)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of all data
>>>>>>>>>>>>>  dcc<-listaA[[i]]
>>>>>>>>>>>>>  dcc<-table(factor(dcc$mm, levels=cab))
>>>>>>>>>>>>>  dci<- rbind(dci, dcc)
>>>>>>>>>>>>>  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "a")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of data with b<0.01 (column FDR in the original data)
>>>>>>>>>>>>>  dcc1<-listaA[[i]][ifelse(listaA[[i]]$b<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>  dcc1<-table(factor(dcc1$mm, levels=cab))
>>>>>>>>>>>>>  dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>>>  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "a")
>>>>>>>>>>>>>        }
>>>>>>>>>>>>>
>>>>>>>>>>>>> for (i in 1:length(listaB)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of all data
>>>>>>>>>>>>>  dct<-listaB[[i]]
>>>>>>>>>>>>>  dct<-table(factor(dct$mm, levels=cab))
>>>>>>>>>>>>>  dti<- rbind(dti, dct)
>>>>>>>>>>>>>  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "b")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of data with b<0.01 (column FDR in the original data)
>>>>>>>>>>>>>  dct1<-listaB[[i]][ifelse(listaB[[i]]$b<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>  dct1<-table(factor(dct1$mm, levels=cab))
>>>>>>>>>>>>>  dtf<- rbind(dtf,dct1)
>>>>>>>>>>>>>  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "b")
>>>>>>>>>>>>>        }
>>>>>>>>>>>>>  freq.i<-rbind(dci,dti)
>>>>>>>>>>>>>  freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>>>  freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>>>  freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> freq.i
>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>#a1 4 1
>>>>>>>>>>>>>#a2 4 1
>>>>>>>>>>>>>#a3 4 1
>>>>>>>>>>>>>#b1 4 1
>>>>>>>>>>>>>#b2 4 1
>>>>>>>>>>>>>#b3 4 1
>>>>>>>>>>>>>#b4 4 1
>>>>>>>>>>>>>#result from my code.
>>>>>>>>>>>>> files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>>>>>>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>>>>>>>
>>>>>>>>>>>>>res2<-split(lista,names(lista))
>>>>>>>>>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]], function(x) table(x$mm[x[["b"]]<0.01]))))
>>>>>>>>>>>>> names(res4)<- names(res2)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>res4
>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>#a1 3 1
>>>>>>>>>>>>>#a2 3 1
>>>>>>>>>>>>>#a3 3 1
>>>>>>>>>>>>>
>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>#b1 3 1
>>>>>>>>>>>>>#b2 3 1
>>>>>>>>>>>>>
>>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>#c1 3 1
>>>>>>>>>>>>>
>>>>>>>>>>>>>There is a difference between the output of freq.i and res4. There were only two files under 'group_b'. So, check your code.
>>>>>>>>>>>>>A.K.
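
The res4 step above row-binds one table() per file, which only lines up when every file contains the same set of mm values. A minimal sketch of the same count with a shared set of factor levels, so that files with missing categories still produce rows of equal length; the two data frames are invented toy data using the thread's column names mm and b.

```r
# Two toy files' worth of data; column names follow the thread (mm, b),
# the values are invented.
f1 <- data.frame(mm = c(2, 2, 3, 2), b = c(0.001, 0.002, 0.003, 0.5))
f2 <- data.frame(mm = c(2, 2, 1),    b = c(0.001, 0.004, 0.002))
files <- list(a1 = f1, a2 = f2)

# All mm values seen in any file, used as a common set of levels.
lev <- sort(unique(unlist(lapply(files, function(x) x$mm))))

# One row of counts per file, restricted to rows with b < 0.01.
counts <- do.call(rbind, lapply(files, function(x)
  table(factor(x$mm[x$b < 0.01], levels = lev))))
counts
```

Categories that a file never contains show up as explicit zeros instead of shifting the remaining counts into the wrong column.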
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>Sent: Monday, February 18, 2013 10:27 AM
>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Hi!!!
>>>>>>>>>>>>>
>>>>>>>>>>>>>I have a new question.
>>>>>>>>>>>>>
>>>>>>>>>>>>>I want a function to do my statistics. I started with what you had sent me:
>>>>>>>>>>>>>
>>>>>>>>>>>>>z.plot <- function(directory,number) {
>>>>>>>>>>>>>  setwd(directory)
>>>>>>>>>>>>> indx<-gsub("[./]","",list.dirs())
>>>>>>>>>>>>> indx1<- indx[indx!=""]
>>>>>>>>>>>>> print(indx1)
>>>>>>>>>>>>> files<-paste("MSMS_",number,"PepInfo.txt",sep="")
>>>>>>>>>>>>> read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>>> lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>>> print(lista)
>>>>>>>>>>>>> #names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>>>>>>> }
>>>>>>>>>>>>>z.plot("C:/Users/Vera Costa/Desktop/dados.lixo",23)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>In my lista I can't merge rows to get the group, because the idea is to count, for each file, the frequencies of mm when b<0.01. After that I want a graph like the attached one.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>When I had 2 groups and knew the names of the groups, I wrote the code below (but now I have more groups and, maybe, I don't know the names of the groups):
>>>>>>>>>>>>>
>>>>>>>>>>>>>z.plot <-
function(directory,number) {
>>>>>>>>>>>>> #reading data
>>>>>>>>>>>>>  setwd(directory)
>>>>>>>>>>>>> direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>>>>>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>>>>>>>
>>>>>>>>>>>>> lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>> listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>> listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>
>>>>>>>>>>>>> #count different z values
>>>>>>>>>>>>> cab <- vector()
>>>>>>>>>>>>>    for (i in 1:length(lista)) {
>>>>>>>>>>>>>        dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>        dc<-table(dc$z)
>>>>>>>>>>>>>        cab <- c(cab, names(dc))
>>>>>>>>>>>>>  }
>>>>>>>>>>>>>
>>>>>>>>>>>>> #Relative freqs to construct the graph
>>>>>>>>>>>>>    cab <- unique(cab)
>>>>>>>>>>>>>    d <- matrix(ncol=length(cab))
>>>>>>>>>>>>> dci<- d[-1,]
>>>>>>>>>>>>>    dcf <- d[-1,]
>>>>>>>>>>>>> dti <- d[-1,]
>>>>>>>>>>>>> dtf <- d[-1,]
>>>>>>>>>>>>>
>>>>>>>>>>>>>    for (i in 1:length(listaC)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of all data
>>>>>>>>>>>>>  dcc<-listaC[[i]]
>>>>>>>>>>>>>  dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>>>>>>>  dci<- rbind(dci, dcc)
>>>>>>>>>>>>>  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>>>>>  dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>  dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>>>>>>>  dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>>>  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>        }
>>>>>>>>>>>>>
>>>>>>>>>>>>> for (i in 1:length(listaT)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of all data
>>>>>>>>>>>>>  dct<-listaT[[i]]
>>>>>>>>>>>>>  dct<-table(factor(dct$z, levels=cab))
>>>>>>>>>>>>>  dti<- rbind(dti, dct)
>>>>>>>>>>>>>  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>>>>>  dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>  dct1<-table(factor(dct1$z, levels=cab))
>>>>>>>>>>>>>  dtf<- rbind(dtf,dct1)
>>>>>>>>>>>>>  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>>>        }
>>>>>>>>>>>>>  freq.i<-rbind(dci,dti)
>>>>>>>>>>>>>  freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>>>  freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>>>  freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>>>>>>
>>>>>>>>>>>>>#Graph plot
>>>>>>>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>>>>>>>par(mfrow=c(1,2))
>>>>>>>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>>>>>>>#average of the group (except c1&t1)
>>>>>>>>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>>>>>>>
>>>>>>>>>>>>>#chisquare test function
>>>>>>>>>>>>>chisq.test<-function(x,y){
>>>>>>>>>>>>> somax<-sum(x)
>>>>>>>>>>>>> somay<-sum(y)
>>>>>>>>>>>>> nj.<-x+y
>>>>>>>>>>>>> nj<-sum(nj.)
>>>>>>>>>>>>> ejx<-(nj./nj)*somax
>>>>>>>>>>>>> ejy<-(nj./nj)*somay
>>>>>>>>>>>>> ETx<-((x-ejx)^2)/ejx
>>>>>>>>>>>>> ETy<-((y-ejy)^2)/ejy
>>>>>>>>>>>>> ETobs<-sum(ETx)+sum(ETy)
>>>>>>>>>>>>> pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>>>>>>> return(pvalue)
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>#pvalues of the chisquare test between sample and average (H0: the two samples have the same distribution)
>>>>>>>>>>>>>pvalues<-c()
>>>>>>>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>>>>>>>>pvalues<-c(pvalues,a)
>>>>>>>>>>>>>}
>>>>>>>>>>>>>#data frame with final p-values
>>>>>>>>>>>>>dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>>>>>>>>>colnames(dataframe)<-c("sample name","pvalue")
>>>>>>>>>>>>>print(dataframe)
>>>>>>>>>>>>>}
>>>>>>>>>>>>>z.plot("C:/Users/Vera/Desktop/data",23)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Thank you again
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>2013/2/17 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Hi Vera,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>No problem. I am cc:ing r-help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>Sent: Sunday, February 17, 2013 5:44 AM
>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Hi. Thank you. It works now :-)
>>>>>>>>>>>>>>And yes, I use Windows.
>>>>>>>>>>>>>>Thank you very much.
>>>>>>>>>>>>>>On 17 Feb 2013 at 00:44, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Hi Vera,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Have you tried the suggestion?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Are you using Windows?
>>>>>>>>>>>>>>>Thanks,
>>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>Sent: Saturday, February 16, 2013 7:10 PM
>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>>>>On my system I get the error "'what' must be a character string or a function".
>>>>>>>>>>>>>>>I need to do the equivalent on my system.
>>>>>>>>>>>>>>>Thank you, and sorry once more.
>>>>>>>>>>>>>>>On 16 Feb 2013 at 23:53, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>You didn't mention what the error message was, or whether you are reading files whose names are not "mmmmm11kk.txt".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>It is working on my system when I run it again.
>>>>>>>>>>>>>>>>c() combines values into a vector or list.
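The thread never states what caused the "'what' must be a character string or a function" error. One possible cause, assumed here purely for illustration, is a non-function object named c in the workspace: do.call(c, ...) then receives that object instead of the function, while do.call("c", ...) looks the function up by name. A minimal sketch (combine_parts is a hypothetical helper, not code from the thread):

```r
# Sketch under an assumption: a non-function object named `c` is in scope.
# The thread does not confirm that this was the actual cause of the error.
combine_parts <- function(parts) {
  c <- 42  # local variable that masks base::c inside this function

  # Passing the object itself fails, because 42 is neither a function
  # nor a character string naming one.
  by_value <- tryCatch(do.call(c, parts), error = function(e) "error")

  # Passing the name as a string makes R look up a *function* called "c",
  # which skips the numeric variable and finds base::c.
  by_name <- do.call("c", parts)

  list(by_value = by_value, by_name = by_name)
}

out <- combine_parts(list(list(a1 = 1), list(a2 = 2)))
```

Here out$by_name is the flattened list, while out$by_value records that the first call failed. Removing the masking object with rm(c) would be another way out under the same assumption.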
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>sessionInfo()
>>>>>>>>>>>>>>>>R version 2.15.1 (2012-06-22)
>>>>>>>>>>>>>>>>Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>locale:
>>>>>>>>>>>>>>>> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>>>>>>>>>>>>>>> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>>>>>>>>>>>>>>>> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>>>>>>>>>>>>>>>> [7] LC_PAPER=C                 LC_NAME=C
>>>>>>>>>>>>>>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>>>>>>>>>>>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>attached base packages:
>>>>>>>>>>>>>>>>[1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>other attached packages:
>>>>>>>>>>>>>>>>[1] stringr_0.6.2  reshape2_1.2.2
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>loaded via a namespace (and not attached):
>>>>>>>>>>>>>>>>[1] plyr_1.8
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>#code
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>>>>>>>>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>>>>>>>>>>>lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>res3<-lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>>>#result
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>res3
>>>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>>>#$group_a$a1
>>>>>>>>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>$group_a$a2
>>>>>>>>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>$group_a$a3
>>>>>>>>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>$group_b
>>>>>>>>>>>>>>>>$group_b$b1
>>>>>>>>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>$group_b$b2
>>>>>>>>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>$group_c
>>>>>>>>>>>>>>>>$group_c$c1
>>>>>>>>>>>>>>>>     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>A.K.
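The grouping step in the message above can be tried on toy data. This is only a sketch: the list contents are made up, and instead of overwriting names(res) it keeps the grouping labels in a separate vector (group), so the original folder names survive the split.

```r
# Toy stand-in for the list of data frames read from folders a1, a2, b1.
res <- list(a1 = 1:2, a2 = 3:4, b1 = 5:6)

# Group label = "group_" + folder name with the digits removed.
group <- paste0("group_", gsub("\\d+", "", names(res)))

# split() on a list returns one sub-list per group and keeps element names.
grouped <- split(res, group)

names(grouped)          # the group labels
names(grouped$group_a)  # the folders that belong to group a
```

Keeping the labels separate avoids the later renaming step with paste(..., 1:length(x)), at the cost of one extra variable.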
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>Sent: Saturday,
February 16, 2013 6:32 PM
>>>>>>>>>>>>>>>>Subject: Re:
reading data
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Sorry again... In:
>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>>>>>>>>>>>>>>>what is this c in do.call(c, ...)? When I run this line in R, I get an error.
>>>>>>>>>>>>>>>>Thank you
>>>>>>>>>>>>>>>>On 15 Feb 2013 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>No problem.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>BTW, these
questions are not stupid..
>>>>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>From: Vera
Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>Sent:
Friday, February 15, 2013 1:08 PM
>>>>>>>>>>>>>>>>>Subject: Re:
reading data
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Thank you very much.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>I will try to apply it and then tell you whether it works :-)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Thank you, and sorry about these questions (sometimes stupid questions).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>2013/2/15
arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>HI,
>>>>>>>>>>>>>>>>>>No problem.
>>>>>>>>>>>>>>>>>>c() concatenates its arguments into a vector or list.
>>>>>>>>>>>>>>>>>>If I use do.call(cbind,...) or do.call(rbind,...) instead:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>>>>>>>>>>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>>>>>>>>>>>>>lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>>>#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
>>>>>>>>>>>>>>>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>>>>>>>>>>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>>>>>>>>>>>>>lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>>>#     a1
>>>>>>>>>>>>>>>>>>#[1,] List,11
>>>>>>>>>>>>>>>>>>#[2,] List,11
>>>>>>>>>>>>>>>>>>#[3,] List,11
>>>>>>>>>>>>>>>>>>#[4,] List,11
>>>>>>>>>>>>>>>>>>#[5,] List,11
>>>>>>>>>>>>>>>>>>#[6,] List,11
>>>>>>>>>>>>>>>>>>i.e. a list within a list:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>>>>>>>>>>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>>>>>>>>>>>>>lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
>>>>>>>>>>>>>>>>>>str(restrial)
>>>>>>>>>>>>>>>>>>#List of 6
>>>>>>>>>>>>>>>>>># $ :List of 1
>>>>>>>>>>>>>>>>>>#  ..$ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>>>>>>>>#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>>>>>>>>#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>>>>>>>>#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>>>>>>>>#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>>>>>>>>-----------------------------------------------------------------
>>>>>>>>>>>>>>>>>>str(res)
>>>>>>>>>>>>>>>>>>#List of 6
>>>>>>>>>>>>>>>>>># $ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>>>>>>>>#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>>>>>>>>#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>>>>>>>>#  ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>>>>>>>>#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>>>>>>>>-----------------------------------------------------------------
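The difference shown by the two str() outputs above can be reproduced without any files. A small sketch with made-up data (nested, flat and bound are illustrative names): each lapply() result is a one-element list, so do.call(c, ...) removes one level of nesting, while do.call(rbind, ...) builds a matrix whose cells are lists.

```r
# Two one-element lists, like the per-file results of the inner lapply().
nested <- list(list(a1 = data.frame(x = 1:2)),
               list(a2 = data.frame(x = 3:4)))

# c() concatenates the inner lists: one flat list of data frames.
flat <- do.call(c, nested)

# rbind() stacks the inner lists as rows of a list-matrix instead.
bound <- do.call(rbind, nested)

names(flat)  # names of the flat list
dim(bound)   # one row per inner list, one column
```

The flat form is what the later split() and names() steps in the thread rely on.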
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>You mentioned naming these "group_a", "group_b", etc.:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>res3<-lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>>>>>res3$group_a
>>>>>>>>>>>>>>>>>>#$a1
>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>#$a2
>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>#$a3
>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>>From:
Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>Sent:
Friday, February 15, 2013 12:39 PM
>>>>>>>>>>>>>>>>>>Subject:
Re: reading data
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Thank you very much, and sorry for my questions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>But this code isn't grouping by letter, is it? I mean, a1, a2 and a3 are the same group (the first letter gives me the name of the group).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Another question: in do.call, you wrote do.call(c, ...). What is c?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Sorry
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>2013/2/15
arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>HI,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Just to add:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x)
>>>>>>>>>>>>>>>>>>>{names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>>>>>>>>>>>>>>>>lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>>>>#it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>>>>res[grep("group_b",names(res))]
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>I am not sure how you want the grouped data to look. If you want something like this:
>>>>>>>>>>>>>>>>>>>res1<-do.call(rbind,res)
>>>>>>>>>>>>>>>>>>>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x) {row.names(x)<-1:nrow(x);x})
>>>>>>>>>>>>>>>>>>>res2
>>>>>>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>#or if you want it like this:
>>>>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>res2[["group_b"]]
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Hope this helps.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>>>>>>>>>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>>To: smartpink111 at yahoo.com
>>>>>>>>>>>>>>>>>>>Cc:
>>>>>>>>>>>>>>>>>>>Sent: Friday, February 15, 2013 9:15 AM
>>>>>>>>>>>>>>>>>>>Subject: reading data
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>>>I posted yesterday and you helped me. I have a small problem.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>First of all, I have never worked with regular expressions...
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>The code that you gave me is fine, but my files are inside the folders a1, a2, a3. I will try to explain better.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>I have one folder named "data". Inside this folder I have some other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>>>>>>>>>>>>>>>>>>>The name of the folder gives me the name of the group, but I need to read the file inside. Afterwards I want "group_a", "group_b", ... because I need to work with this data grouped (and know the name of the group).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Thank you.
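The folder layout described in the question can be mocked up in a temporary directory to try the reading step. This is a sketch, not the code used in the thread: the directory name data_sketch, the two-column table and the variable names are assumptions made for illustration. It uses the pattern argument of list.files() rather than grep(), and takes the folder name from dirname().

```r
# Build a throwaway folder tree: <tempdir>/data_sketch/{a1,a2,b1}/...
root <- file.path(tempdir(), "data_sketch")
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  # The file we want, with a made-up two-column table.
  write.table(data.frame(Id = c("aAA", "aA"), x = c(739, 1)),
              file.path(root, d, "mmmmm11kk.txt"),
              row.names = FALSE, quote = FALSE)
  # A second file that must be ignored.
  writeLines("ignored", file.path(root, d, paste0(d, ".txt")))
}

# pattern is matched against file names only, so anchor it to the whole name.
paths <- list.files(root, pattern = "^mmmmm11kk\\.txt$",
                    recursive = TRUE, full.names = TRUE)

# One data frame per folder, named after the folder that contains the file.
tables <- lapply(paths, read.table, header = TRUE, stringsAsFactors = FALSE)
names(tables) <- basename(dirname(paths))
```

Because full.names = TRUE returns complete paths, the sketch does not depend on the current working directory.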

arun

2013-Feb-26 18:51 UTC


[R] reading data

Hi,
res10[is.na(res10)] <- 0
apply(res10[,c(4,6)],1,t.test)
#Error in t.test.default(newX[, i], ...) : data are essentially constant

To overcome this, something like:

t.test.p.value <- function(...) {
    obj<-try(t.test(...), silent=TRUE)
    if (is(obj, "try-error")) return(NA) else return(obj$p.value)
}
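A quick check of what such a wrapper returns, as a sketch: safe_t_pvalue is a hypothetical name, and it is written with tryCatch() rather than try(), which is a swapped-in variant of the same idea rather than the function above.

```r
# Return the t.test() p-value, or NA when t.test() signals an error
# (for example "data are essentially constant").
safe_t_pvalue <- function(...) {
  tryCatch(t.test(...)$p.value, error = function(e) NA_real_)
}

p_constant <- safe_t_pvalue(c(2, 2, 2))  # identical values: t.test() stops
p_varied   <- safe_t_pvalue(c(0, 3))     # two distinct values: test runs
```

With only two values per row the test has a single degree of freedom, so the p-values carry very little information; the point here is only that a failing row yields NA instead of aborting the whole apply() call.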

In this particular case, you have only two groups, 'c' and 't'
#Making up one more group
res11<- res10
set.seed(45)
res11$d1<- sample(0:4, 22,replace=TRUE)

resNew<-do.call(cbind,lapply(split(names(res11)[4:7],gsub("[0-9]","",names(res11)[4:7])),
function(i) {x<-if(ncol(res11[i])>1) rowSums(res11[i]) else res11[i];
colnames(x)<-NULL;x}))
indx<-combn(names(resNew),2)
resPval<-do.call(cbind,lapply(seq_len(ncol(indx)),function(i)
{x<-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value));
colnames(x)<-paste("Pvalue",paste(indx[,i],collapse=""),sep="_");x}))
resF<-cbind(res11,resPval)
head(resF,3)
#                     Seq        Mod z c2 c3 t2 d1 Pvalue_cd Pvalue_ct Pvalue_dt
#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1  3       0.5       0.5 0.2951672
#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1  1       0.5       0.5        NA
#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  1  0  1  1        NA        NA        NA
A.K.
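The steps in this message (sum the count columns per group letter, then test every pair of groups row by row) can be followed on a small made-up table. This is a sketch under assumptions: the counts data frame and the names group, sums, safe_p, pairs and pvals are hypothetical, and sapply() is used in place of the do.call(cbind, ...) construction.

```r
# Hypothetical toy counts; column names follow the c2/c3/t2/d1 pattern.
counts <- data.frame(c2 = c(0, 1, 2), c3 = c(0, 0, 1),
                     t2 = c(1, 1, 4), d1 = c(3, 1, 0))

# Group label = column name with the digits removed ("c", "c", "t", "d").
group <- gsub("[0-9]", "", names(counts))

# One column of row sums per group; columns come out in the order c, d, t.
sums <- sapply(split(names(counts), group),
               function(cols) rowSums(counts[cols]))

# p-value of t.test() on the values given, NA if t.test() fails.
safe_p <- function(x) tryCatch(t.test(x)$p.value, error = function(e) NA_real_)

# Every pair of groups, one p-value column per pair.
pairs <- combn(colnames(sums), 2)
pvals <- sapply(seq_len(ncol(pairs)),
                function(j) apply(sums[, pairs[, j]], 1, safe_p))
colnames(pvals) <- paste("Pvalue", apply(pairs, 2, paste, collapse = ""), sep = "_")
```

As in the message, each row contributes just two numbers per comparison, so rows where both sums are equal give NA. Whether a t-test on two sums is the right test for this data is a separate question that the sketch does not settle.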

___________________________
From: Vera Costa <veracosta.rt at gmail.com>
To: arun <smartpink111 at yahoo.com> 
Sent: Tuesday, February 26, 2013 12:05 PM
Subject: Re: reading data


I think I didn't understand your question...

For each row, I need to "compare" groups. The groups are c, t, a, ...

I'm thinking... but we can sum by group and apply a t-test to compare
means...



2013/2/26 arun <smartpink111 at yahoo.com>

Just one question:
>If you want to compare by rows,
>did you mean to add the numbers in the rows of c2 and c3 and compare the sum with t2 (and, if t has one more group, add t2 and t3)?
>
>A.K.
>
>
>
>
>
>________________________________
>From: Vera Costa <veracosta.rt at gmail.com>
>To: arun <smartpink111 at yahoo.com>
>Sent: Tuesday, February 26, 2013 10:52 AM
>Subject: Re: reading data
>
>
>May I ask about a new problem (a continuation of this one), if you could help a little bit more?
>
>I inserted the line
>
>res10[is.na(res10)] <- 0
>
>in your code. After that I need to apply a statistical test (for example chisq.test) to all rows to compare groups.
>I will try to explain correctly.
>For example, for the first row, I need a new column with the p-values that compare the groups (in this case c and t, but we can have more):
>                     Seq        Mod z c2 c3 t2 p-value
>1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1 """"""
>
>
>
>and afterwards remove all non-significant rows (keeping those with p-values < 0.05).
>
>Thank you again (and I think this problem finishes here, I think :-)).
>
>Vera
>
>
>
>2013/2/26 arun <smartpink111 at yahoo.com>
>
>No problem.
>>Arun
>>
>>
>>
>>
>>________________________________
>> From: Vera Costa <veracosta.rt at gmail.com>
>>To: arun <smartpink111 at yahoo.com>
>>Sent: Tuesday, February 26, 2013 9:47 AM
>>Subject: Re: reading data
>>
>>
>>Ah ok, no problem :-)
>>
>>I'm looking at the code.
>>Thank you very much for all your help.
>>
>>
>>
>>2013/2/26 arun <smartpink111 at yahoo.com>
>>
>>I used head(res10,3)
>>>res10
>>>                                Seq                 Mod z c2 c3 t2
>>>1            aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2 NA NA  1
>>>2             aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2 NA NA  1
>>>3                  aAAAAAAAAAGAAGGR          1-n_acPro/ 2  1 NA  1
>>>4                       AAAAAAALQAK                     2 NA  1  1
>>>5                    aAAAAAGAGPEMVR          1-n_acPro/ 2  2 NA  2
>>>6         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2 NA NA  1
>>>7         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 3 NA NA  1
>>>8         aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 2  1 NA NA
>>>9         aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 3  2  2  1
>>>10                      AAAAAPGTAEK                     2  1 NA NA
>>>11            aAAAASAPQQLSDEELFSQLR          1-n_acPro/ 2 NA NA  1
>>>12                  aAAAAVGNAVPCGAR          1-n_acPro/ 2  1  1  1
>>>13                AAAAAWEEPSSGNGTAR                     2  1  1  1
>>>14                      aAAAELSLLEK          1-n_acPro/ 1 NA NA  1
>>>15                     AAAAEVLGLILR                     2  1  1  1
>>>16      aAAAGAAAAAAAEGEAPAEMGALLLEK          1-n_acPro/ 3  1  1  1
>>>17  aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR 1-<_Carbamoylation/ 3 NA  1 NA
>>>18  aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR          1-n_acPro/ 3 NA NA  1
>>>19 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK          1-n_acPro/ 3 NA NA  1
>>>20         aAAAVGAGHGAGGPGAASSSGGAR          1-n_acPro/ 2  1  1 NA
>>>21         aAAAVGAGHGAGGPGAASSSGGAR          1-n_acPro/ 3 NA  1 NA
>>>22             aAADGDDSLYPIAVLIDELR          1-n_acPro/ 2 NA  1 NA
>>>
>>>In the folder you gave to me, there was only a1, which was deleted
according to what you said.
>>>
>>>res4[[1]]
>>>$a1
>>>                                Seq                 Mod z      spec
>>>1            aAAAAAAAAAAAAAATATAGPR          1-n_acPro/ 2     11833
>>>2             aAAAAAAAAAAASSPVGVGQR          1-n_acPro/ 2     11833
>>>3                  aAAAAAAAAAGAAGGR          1-n_acPro/ 2     13103
>>>4                       AAAAAAALQAK                     2      3084
>>>5                    aAAAAAGAGPEMVR          1-n_acPro/ 2 9646,9821
>>>6         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 2     33650
>>>7         aAAAAEQQQFYLLLGNLLSPDNVVR 1-<_Carbamoylation/ 3     33607
>>>9         aAAAAEQQQFYLLLGNLLSPDNVVR          1-n_acPro/ 3     33769
>>>11            aAAAASAPQQLSDEELFSQLR          1-n_acPro/ 2     20602
>>>12                  aAAAAVGNAVPCGAR          1-n_acPro/ 2     10018
>>>13                AAAAAWEEPSSGNGTAR                     2      5576
>>>14                      aAAAELSLLEK          1-n_acPro/ 1     19662
>>>16                     AAAAEVLGLILR                     2     22857
>>>17      aAAAGAAAAAAAEGEAPAEMGALLLEK          1-n_acPro/ 3     26060
>>>18  aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR          1-n_acPro/ 3     21479
>>>19 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK          1-n_acPro/ 3     21159
>>>
>>>
>>>
>>>A.K.
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>To: arun <smartpink111 at yahoo.com>
>>>Sent: Tuesday, February 26, 2013 9:30 AM
>>>Subject: Re: reading data
>>>
>>>
>>>Hi, thank you
>>>
>>>But there is a small thing that I didn't understand... Why do we have only this output with 3 rows? a2, for example, has a lot of rows... Didn't you use the last attachment?
>>>
>>>But if you used the first one, I think we would have more...
>>>
>>>Vera
>>>
>>>
>>>
>>>2013/2/26 arun <smartpink111 at yahoo.com>
>>>
>>>
>>>>
>>>>Hi,
>>>>Try this:
>>>>
>>>>files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x);
>>>>  lapply(x,function(y) read.table(y,header=TRUE,sep="\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>res2<-split(lista,names(lista))
>>>>res3<-lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>#Freq FDR<0.01
>>>>res4<-lapply(seq_along(res3),function(i) lapply(res3[[i]],function(x) x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>names(res4)<-names(res2)
>>>>res4New<-lapply(res4,function(x) lapply(names(x),function(i) do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x)))))
>>>>
>>>>res5<-lapply(res4New,function(x) if(length(x)>1) tail(x,-1) else NULL)
>>>>library(plyr)
>>>>library(data.table)
>>>>res6<-lapply(res5,function(x) lapply(x,function(x1) {x1<-data.table(x1);
>>>>  x1[,spec:=paste(spec,collapse=","),by=c("Seq","Mod","z")]}))
>>>>res7<-lapply(res6,function(x) lapply(x,function(x1) {
>>>>  x1$counts<-sapply(x1$spec,function(x2) length(gsub("\\s","",unlist(strsplit(x2,",")))));
>>>>  x3<-as.data.frame(x1); names(x3)[6]<-as.character(unique(x3$folder_name)); x3[,-c(1,5)]}))
>>>>res8<-lapply(res7,function(x) Reduce(function(...) merge(...,by=c("Seq","Mod","z"),all=TRUE),x))
>>>>res9<-res8[lapply(res8,length)!=0]
>>>>res10<-Reduce(function(...) merge(...,by=c("Seq","Mod","z"),all=TRUE),res9)
>>>>head(res10,3)
>>>>#                     Seq        Mod z c2 c3 t2
>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2 NA NA  1
>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2 NA NA  1
>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  1 NA  1
>>>>A.K.
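The last step of the pipeline above, Reduce() with merge(..., all=TRUE), is what produces the NA cells in res10: a peptide missing from one folder has no count there. A sketch on made-up per-folder tables (df_c2, df_c3, df_t2, merged and filled are illustrative names):

```r
# Hypothetical per-folder count tables sharing the key columns Seq, Mod, z.
df_c2 <- data.frame(Seq = c("AAK", "AAR"), Mod = c("", ""), z = c(2, 2),
                    c2 = c(1, 2), stringsAsFactors = FALSE)
df_c3 <- data.frame(Seq = c("AAK", "GGK"), Mod = c("", ""), z = c(2, 2),
                    c3 = c(3, 1), stringsAsFactors = FALSE)
df_t2 <- data.frame(Seq = "AAR", Mod = "", z = 2, t2 = 5,
                    stringsAsFactors = FALSE)

# Full outer join of all tables, two at a time, on the shared keys.
merged <- Reduce(function(x, y) merge(x, y, by = c("Seq", "Mod", "z"), all = TRUE),
                 list(df_c2, df_c3, df_t2))

# Treat "peptide not seen in this folder" as a count of zero.
filled <- merged
filled[is.na(filled)] <- 0
```

Whether a missing peptide should really count as zero depends on the analysis; replacing NA by 0 is the choice made later in the thread, not a general rule.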
>>>>________________________________
>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>To: arun <smartpink111 at yahoo.com>
>>>>Sent: Tuesday, February 26, 2013 5:15 AM
>>>>Subject: Re: reading data
>>>>
>>>>
>>>>Sorry, I only see now your last email.
>>>>
>>>>I have at the moment 8 folder, but I can have more. I need to
work in general.
>>>>
>>>>Thank you
>>>>
>>>>
>>>>
>>>>2013/2/25 arun <smartpink111 at yahoo.com>
>>>>
>>>>I sent the solution. But I need to know how many folders you have for the analysis, because I manually inserted the names at the end. It works if there are not many folders. Otherwise, the names need to be added in the program.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>Sent: Monday, February 25, 2013 10:01 AM
>>>>>Subject: Re: reading data
>>>>>
>>>>>
>>>>>Hi.
>>>>>
>>>>>It is from the attached dataset, but without a1.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>2013/2/25 arun <smartpink111 at yahoo.com>
>>>>>
>>>>>
>>>>>>
>>>>>>Hi,
>>>>>>Are you sure that the output is from the attached dataset?
>>>>>>
>>>>>>I am getting this result for aa, with 111 rows:
>>>>>>aa
>>>>>>                                        Seq                                Mod
>>>>>>1                    aAAAAAAAAAAAAAATATAGPR                         1-n_acPro/
>>>>>>2                     aAAAAAAAAAAASSPVGVGQR                         1-n_acPro/
>>>>>>3                          aAAAAAAAAAGAAGGR                         1-n_acPro/
>>>>>>4                     aAAAAAAAGAAGGRGSGPGRR                         1-n_acPro/
>>>>>>5                               AAAAAAAkAAK                            8-K_ac/
>>>>>>6                               AAAAAAALQAK
>>>>>>7                            aAAAAAGAGPEMVR                         1-n_acPro/
>>>>>>8                           aAAAAATAAAAASIR                         1-n_acPro/
>>>>>>9                 AAAAAEQQQFyLLLGNLLSPDNVVR                           11-Y_ph/
>>>>>>10                aAAAAEQQQFYLLLGNLLSPDNVVR                1-<_Carbamoylation/
>>>>>>11                aAAAAEQQQFYLLLGNLLSPDNVVR                1-<_Carbamoylation/
>>>>>>12                aAAAAEQQQFYLLLGNLLSPDNVVR                         1-n_acPro/
>>>>>>13                aAAAAEQQQFYLLLGNLLSPDNVVR                         1-n_acPro/
>>>>>>14                              AAAAAPGTAEK
>>>>>>15                           aAAAAQGGGGGEPR                         1-n_acPro/
>>>>>>16                    aAAAASAPQQLSDEELFSQLR                         1-n_acPro/
>>>>>>17                          aAAAAVGNAVPCGAR                         1-n_acPro/
>>>>>>18                        AAAAAWEEPSSGNGTAR
>>>>>>19                              aAAAELSLLEK                         1-n_acPro/
>>>>>>20                              aAAAELSLLEK                         1-n_acPro/
>>>>>>21                             AAAAEVLGLILR
>>>>>>22              aAAAGAAAAAAAEGEAPAEMGALLLEK                         1-n_acPro/
>>>>>>23          aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR                1-<_Carbamoylation/
>>>>>>24          aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR                         1-n_acPro/
>>>>>>25                     aAAAKPNNLSLVVHGPGDLR                         1-n_acPro/
>>>>>>26         aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK                         1-n_acPro/
>>>>>>27                 aAAAVGAGHGAGGPGAASSSGGAR                         1-n_acPro/
>>>>>>28                 aAAAVGAGHGAGGPGAASSSGGAR                         1-n_acPro/
>>>>>>29                                aAAAVQGGR                         1-n_acPro/
>>>>>>30                               aAAAVVEFQR                1-<_Carbamoylation/
>>>>>>31                               aAAAVVEFQR                         1-n_acPro/
>>>>>>32                            aAAAVVVPAEWIK                         1-n_acPro/
>>>>>>33                     aAADGDDSLYPIAVLIDELR                         1-n_acPro/
>>>>>>34                     aAADGDDSLYPIAVLIDELR                         1-n_acPro/
>>>>>>35                           AAADLMAYCEAHAK
>>>>>>36                           AAADLMAYCEAHAK
>>>>>>37          aAAEAANCIMEVSCGQAESSEKPNAEDMTSK                         1-n_acPro/
>>>>>>38                     AAAEIYEEFLAAFEGSDGNK
>>>>>>39                             AAAEVAGQFVIK
>>>>>>40                   AAAIGIDLGTTYSCVGVFQHGK
>>>>>>41                   AAAIGIDLGTTYSCVGVFQHGK
>>>>>>42                        AAALATVNAWAEQTGMK
>>>>>>43                               AAAMANNLQK
>>>>>>44                   AAAPAPEEEMDECEQALAAEPK
>>>>>>45                      AAAQLLQSQAQQSGAQQTK
>>>>>>46                            AAATPESQEPQAK
>>>>>>47                    aAAVAAAGAGEPQSPDELLPK                         1-n_acPro/
>>>>>>48          aAAVLSGPSAGSAAGVPGGTGGLSAVSSGPR                         1-n_acPro/
>>>>>>49            AAAVVGInSETIMKPASISEEELLNLINK                   8-N_Deamidation/
>>>>>>50           AAAYNLVQHGITNLCVIGGDGSLTGANIFR
>>>>>>51????????????????????????????
aADTQVSETLKR???????????????????????? 1-n_acPro/
>>>>>>52??????????????????????
aAEAADLGLGAAVPVELR???????????????????????? 1-n_acPro/
>>>>>>53??????????????????????????
AAEDDEDDDVDTKK??????????????????????????????????
>>>>>>54?????????????????????????????
AAEEPSKVEEK??????????????????????????????????
>>>>>>55?????????????????????????????
AAEEPSKVEEK??????????????????????????????????
>>>>>>56?????????????????
AAEGGLSSPEFSELCIWLGSQIK??????????????????????????????????
>>>>>>57????????????????????
AAELIANSLATAGDGLIELR??????????????????????????????????
>>>>>>58????????????????????
AAELIANSLATAGDGLIELR??????????????????????????????????
>>>>>>59??????????????????????????????
AAELLMSCFR??????????????????????????????????
>>>>>>60??????????????????????????
aAEPNKTEIQTLFK???????????????????????? 1-n_acPro/
>>>>>>61????????????????
AAEQILEDMITIDVENVMEDICSK??????????????????????????????????
>>>>>>62????????????????????????
AAEsETPGKSPEKKPK??????????????????????????? 4-S_ph/
>>>>>>63????????????????????????
AAEsETPGKSPEKKPK??????????????????????????? 4-S_ph/
>>>>>>64?????????????????????
AAESLADPTEYENLFPGLK??????????????????????????????????
>>>>>>65?????????????????????
AAFDDAIAELDTLSEESYK??????????????????????????????????
>>>>>>66?????????????????????
AAFDDAIAELDTLSEESYK??????????????????????????????????
>>>>>>67????????????????????????
AAFECMYTLLDSCLDR??????????????????????????????????
>>>>>>68??????????
AAGAGLPESVIWAVNAGGEAHVDVHGIHFR??????????????????????????????????
>>>>>>69????????????????????????
aAGGDGAEAPAKKDVK???????????????????????? 1-n_acPro/
>>>>>>70????????????????????????
AAGGGAGSSEDDAQSR??????????????????????????????????
>>>>>>71???????????????????????????
AAGHPGDPESQQR??????????????????????????????????
>>>>>>72???????????????????????????
AAGHPGDPESQQR??????????????????????????????????
>>>>>>73??????????????????????????????????
AAGkFK??????????????????????????? 4-K_me/
>>>>>>74?????????????????
AAGLATMISTMRPDIDNMDEYVR??????????????????????????????????
>>>>>>75????????????????????????
aAGTAAALAFLSQESR???????????????????????? 1-n_acPro/
>>>>>>76???????????????????????????
aAGTLYTYPENWR???????????????????????? 1-n_acPro/
>>>>>>77???????????????????????????
aAGTSSYWEDLRK???????????????????????? 1-n_acPro/
>>>>>>78?????????????????????????
AAGTVFTTVEDLGSK??????????????????????????????????
>>>>>>79????????????????????????
aAGVEAAAEVAATEIK???????????????????????? 1-n_acPro/
>>>>>>80???????????????????????????
AAGVGDMVMATVK??????????????????????????????????
>>>>>>81????????????????????????
AAGVNVEPFWPGLFAK??????????????????????????????????
>>>>>>82??????????????????????????????
AAGVVLEMIR??????????????????????????????????
>>>>>>83????????????????????
AAHIFFTDTCPEPLFSELGR??????????????????????????????????
>>>>>>84?????????????????????????????
AAHLCAEAALR??????????????????????????????????
>>>>>>85???????????????????????????????
AAHnKDVLR?????????????????? 4-N_Deamidation/
>>>>>>86?????????????????????????????
AAHVEYSTAAR??????????????????????????????????
>>>>>>87?????????????????????????????
AAHVEYSTAAR??????????????????????????????????
>>>>>>88??????????????????????
AAIAQALAGEVSVVPPSR??????????????????????????????????
>>>>>>89?????????????????????????????
AAIISAEGDSK??????????????????????????????????
>>>>>>90??????? AALAAEVKkPAAAAAPGTAEkLSPkATTASQAk
9-K_me2/21-K_me2/25-K_me2/33-K_me/
>>>>>>91????????????????????????????
AALAFGFLDLLK??????????????????????????????????
>>>>>>92??????????
AALAGGTTMIIDHVVPEPGTSLLAAFDQWR??????????????????????????????????
>>>>>>93??????????????????????
AALAHSEEVTASQVAATK??????????????????????????????????
>>>>>>94??????????????????????????
AALCHFCIDMLNAK??????????????????????????????????
>>>>>>95??????????????????????
aALDSLSLFTSLGLSEQK???????????????????????? 1-n_acPro/
>>>>>>96???????????????????????????
AALEALGSCLNNK??????????????????????????????????
>>>>>>97???????????????????????????
AALEAQNALHNMK??????????????????????????????????
>>>>>>98?????????????????????
AALETDENLLLCAPTGAGK??????????????????????????????????
>>>>>>99???????????????????????
AALGPLVTGLYDVQAFK??????????????????????????????????
>>>>>>100?????????????????????
aALGVLESDLPSAVTLLK???????????????????????? 1-n_acPro/
>>>>>>101??????????????????????????
AALLETLSLLLAK??????????????????????????????????
>>>>>>102???????????
AALPGILSELDVDVnEGSLMELQGHIGR????????????????? 15-N_Deamidation/
>>>>>>103
AALPSHVVTMLDNFPTNLHPMSQLSAAVTALNSESNFAR??????????????????????????????????
>>>>>>104
AALPSHVVTMLDNFPTnLHPMSQLSAAVTALNSESNFAR????????????????? 17-N_Deamidation/
>>>>>>105????????????????????????????
AALSALESFLK??????????????????????????????????
>>>>>>106????????????????????????????
AALSEEELEKK??????????????????????????????????
>>>>>>107???????????????????????
aALTAEHFAALQSLLK???????????????????????? 1-n_acPro/
>>>>>>108??????????????????????????
AAMADTFLEHMCR??????????????????????????????????
>>>>>>109???????????????????????????
AAMEALVVEVTK??????????????????????????????????
>>>>>>110?????????????????????
AAMFTAGSNFNHVVQNEK??????????????????????????????????
>>>>>>111????????????????????
aANATTNPSQLLPLELVDK???????????????????????? 1-n_acPro/
>>>>>>    z counts.x.x counts.y.x counts counts.x.y counts.y.y
>>>>>>1   2         NA         NA     NA         NA          1
>>>>>>2   2         NA         NA     NA         NA          1
>>>>>>3   2          1          1      1          1          1
>>>>>>4   2          1          1     NA         NA         NA
>>>>>>5   2         NA          1     NA         NA         NA
>>>>>>6   2         NA         NA      1         NA          1
>>>>>>7   2          1          2      1          1          2
>>>>>>8   2          1          1     NA          1         NA
>>>>>>9   3         NA         NA     NA         NA          1
>>>>>>10  2         NA         NA     NA         NA          1
>>>>>>11  3         NA         NA     NA         NA          1
>>>>>>12  2         NA          1      1         NA         NA
>>>>>>13  3          1          2      2         NA          1
>>>>>>14  2          1          1     NA         NA          1
>>>>>>15  2         NA         NA     NA          1         NA
>>>>>>16  2         NA         NA     NA         NA          1
>>>>>>17  2         NA          1      1         NA          1
>>>>>>18  2         NA          1      1         NA          1
>>>>>>19  1         NA         NA     NA         NA          1
>>>>>>20  2          1          1      1          1          1
>>>>>>21  2          1          1      1          1          1
>>>>>>22  3         NA          1      1         NA          1
>>>>>>23  3         NA         NA      1         NA         NA
>>>>>>24  3          1         NA     NA         NA          1
>>>>>>25  3         NA          1     NA         NA         NA
>>>>>>26  3         NA         NA     NA         NA          1
>>>>>>27  2         NA          1      1          1         NA
>>>>>>28  3         NA         NA      1          1         NA
>>>>>>29  2         NA         NA      1         NA         NA
>>>>>>30  2         NA         NA      1         NA         NA
>>>>>>31  2         NA         NA      1         NA         NA
>>>>>>32  2          1         NA     NA          1         NA
>>>>>>33  2          1         NA      1          1         NA
>>>>>>34  3          1         NA     NA          1         NA
>>>>>>35  2          1         NA     NA         NA         NA
>>>>>>36  3          1         NA     NA          1         NA
>>>>>>37  3          1         NA     NA          1         NA
>>>>>>38  2          1         NA     NA          1         NA
>>>>>>39  2          1         NA     NA         NA         NA
>>>>>>40  2          1         NA     NA         NA         NA
>>>>>>41  3          1         NA     NA         NA         NA
>>>>>>42  2          1         NA     NA         NA         NA
>>>>>>43  2          1         NA     NA         NA         NA
>>>>>>44  2          1         NA     NA         NA         NA
>>>>>>45  2          1         NA     NA         NA         NA
>>>>>>46  2          1         NA     NA         NA         NA
>>>>>>47  2          1         NA     NA         NA         NA
>>>>>>48  3          1         NA     NA         NA         NA
>>>>>>49  3          1         NA     NA         NA         NA
>>>>>>50  3          1         NA     NA         NA         NA
>>>>>>51  2          1         NA     NA         NA         NA
>>>>>>52  2          1         NA     NA         NA         NA
>>>>>>53  2          1         NA     NA         NA         NA
>>>>>>54  2          1         NA     NA         NA         NA
>>>>>>55  3          1         NA     NA         NA         NA
>>>>>>56  2          1         NA     NA         NA         NA
>>>>>>57  2          1         NA     NA         NA         NA
>>>>>>58  3          1         NA     NA         NA         NA
>>>>>>59  2          1         NA     NA         NA         NA
>>>>>>60  2          1         NA     NA         NA         NA
>>>>>>61  3          1         NA     NA         NA         NA
>>>>>>62  3          1         NA     NA         NA         NA
>>>>>>63  4          1         NA     NA         NA         NA
>>>>>>64  2          1         NA     NA         NA         NA
>>>>>>65  2          1         NA     NA         NA         NA
>>>>>>66  3          1         NA     NA         NA         NA
>>>>>>67  2          1         NA     NA         NA         NA
>>>>>>68  5          1         NA     NA         NA         NA
>>>>>>69  3          1         NA     NA         NA         NA
>>>>>>70  2          1         NA     NA         NA         NA
>>>>>>71  2          1         NA     NA         NA         NA
>>>>>>72  3          2         NA     NA         NA         NA
>>>>>>73  1          1         NA     NA         NA         NA
>>>>>>74  2          1         NA     NA         NA         NA
>>>>>>75  2          1         NA     NA         NA         NA
>>>>>>76  2          1         NA     NA         NA         NA
>>>>>>77  2          1         NA     NA         NA         NA
>>>>>>78  2          1         NA     NA         NA         NA
>>>>>>79  2         11         NA     NA         NA         NA
>>>>>>80  2          1         NA     NA         NA         NA
>>>>>>81  2          1         NA     NA         NA         NA
>>>>>>82  2          1         NA     NA         NA         NA
>>>>>>83  2          1         NA     NA         NA         NA
>>>>>>84  2          1         NA     NA         NA         NA
>>>>>>85  2          1         NA     NA         NA         NA
>>>>>>86  2          1         NA     NA         NA         NA
>>>>>>87  3          1         NA     NA         NA         NA
>>>>>>88  2          1         NA     NA         NA         NA
>>>>>>89  2          1         NA     NA         NA         NA
>>>>>>90  5          1         NA     NA         NA         NA
>>>>>>91  2          1         NA     NA         NA         NA
>>>>>>92  3          1         NA     NA         NA         NA
>>>>>>93  3          1         NA     NA         NA         NA
>>>>>>94  2          1         NA     NA         NA         NA
>>>>>>95  2          1         NA     NA         NA         NA
>>>>>>96  2          1         NA     NA         NA         NA
>>>>>>97  2          1         NA     NA         NA         NA
>>>>>>98  2          1         NA     NA         NA         NA
>>>>>>99  2          1         NA     NA         NA         NA
>>>>>>100 2          1         NA     NA         NA         NA
>>>>>>101 2          1         NA     NA         NA         NA
>>>>>>102 3          1         NA     NA         NA         NA
>>>>>>103 3          1         NA     NA         NA         NA
>>>>>>104 4          1         NA     NA         NA         NA
>>>>>>105 2          1         NA     NA         NA         NA
>>>>>>106 2          1         NA     NA         NA         NA
>>>>>>107 2          1         NA     NA         NA         NA
>>>>>>108 3          1         NA     NA         NA         NA
>>>>>>109 2          1         NA     NA         NA         NA
>>>>>>110 3          1         NA     NA         NA         NA
>>>>>>111 2          1         NA     NA         NA         NA
>>>>>>
>>>>>>________________________________
>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>Sent: Monday, February 25, 2013 8:56 AM
>>>>>>Subject: Re: reading data
>>>>>>
>>>>>>
>>>>>>You're correct, but my real data have about 40,000 rows, and they can contain duplicated rows. I count the number of spec values for rows that share the same Seq, Mod and z.
>>>>>>
>>>>>>For the attached data, if I run this code (only for groups c and t),
>>>>>>
>>>>>>c1 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c1/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
>>>>>>c2 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c2/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
>>>>>>c3 <- read.table("C:/Users/Vera Costa/Desktop/data.new/c3/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
>>>>>>t1 <- read.table("C:/Users/Vera Costa/Desktop/data.new/t1/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
>>>>>>t2 <- read.table("C:/Users/Vera Costa/Desktop/data.new/t2/MSMS_23PepInfo.txt", header=TRUE, sep="\t", na.strings="NA", dec=".", strip.white=TRUE)
>>>>>>dc1<-c1[ifelse(c1$FDR<0.01, TRUE, FALSE),]
>>>>>>dc2<-c2[ifelse(c2$FDR<0.01, TRUE, FALSE),]
>>>>>>dc3<-c3[ifelse(c3$FDR<0.01, TRUE, FALSE),]
>>>>>>dt1<-t1[ifelse(t1$FDR<0.01, TRUE, FALSE),]
>>>>>>dt2<-t2[ifelse(t2$FDR<0.01, TRUE, FALSE),]
>>>>>>bc1<- aggregate(spec ~ Seq + Mod+z, data = dc1, paste, collapse = ",")
>>>>>>bc2<- aggregate(spec ~ Seq + Mod+z, data = dc2, paste, collapse = ",")
>>>>>>bc3<- aggregate(spec ~ Seq + Mod+z, data = dc3, paste, collapse = ",")
>>>>>>bt1<- aggregate(spec ~ Seq + Mod+z, data = dt1, paste, collapse = ",")
>>>>>>bt2<- aggregate(spec ~ Seq + Mod+z, data = dt2, paste, collapse = ",")
>>>>>>bc1$counts <- sapply(bc1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>bc2$counts <- sapply(bc2$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>bc3$counts <- sapply(bc3$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>bt1$counts <- sapply(bt1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>bt2$counts <- sapply(bt2$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>bc1<-bc1[,-4]
>>>>>>bc2<-bc2[,-4]
>>>>>>bc3<-bc3[,-4]
>>>>>>bt1<-bt1[,-4]
>>>>>>bt2<-bt2[,-4]
>>>>>>a1<-merge(bc1,bc2,by=c("Seq","Mod","z"),all=TRUE)
>>>>>>a2<-merge(a1,bc3,by=c("Seq","Mod","z"),all=TRUE)
>>>>>>a3<-merge(bt1,bt2,by=c("Seq","Mod","z"),all=TRUE)
>>>>>>aa<-merge(a2,a3,by=c("Seq","Mod","z"),all=TRUE)
>>>>>>aa
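The per-file steps in the script above (filter on FDR, group by Seq, Mod and z, count, merge) can be written once and applied to a list of data frames. The following is only a minimal sketch, not the code used in this thread: the toy data frames and the helper name `count_spec` are made up for illustration, and it counts rows per group directly with `length` instead of pasting and re-splitting `spec`, which gives the same number only as long as no single `spec` value contains a comma.

```r
# Hypothetical toy data standing in for two MSMS_23PepInfo.txt files
c1 <- data.frame(Seq = c("AAK", "AAK", "AAR"),
                 Mod = c("", "", "1-n_acPro/"),
                 z = c(2, 2, 3),
                 spec = c(101, 102, 103),
                 FDR = c(0.001, 0.002, 0.5),
                 stringsAsFactors = FALSE)
c2 <- data.frame(Seq = c("AAK", "AAR"),
                 Mod = c("", "1-n_acPro/"),
                 z = c(2, 3),
                 spec = c(201, 202),
                 FDR = c(0.001, 0.003),
                 stringsAsFactors = FALSE)

# Keep rows below the FDR cutoff, then count rows per (Seq, Mod, z) group
count_spec <- function(d, fdr = 0.01) {
  d <- d[!is.na(d$FDR) & d$FDR < fdr, ]
  aggregate(spec ~ Seq + Mod + z, data = d, FUN = length)
}

runs <- list(c1 = c1, c2 = c2)
counts <- lapply(names(runs), function(nm) {
  out <- count_spec(runs[[nm]])
  names(out)[names(out) == "spec"] <- nm  # name the count column after the run
  out
})

# Full outer join of all runs on the grouping keys
aa <- Reduce(function(x, y) merge(x, y, by = c("Seq", "Mod", "z"), all = TRUE),
             counts)
aa[is.na(aa)] <- 0  # a group absent from a run gets a count of 0
print(aa)
```

Naming each count column after its run avoids the `counts.x.x`, `counts.y.x` suffixes that repeated `merge()` calls generate.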
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>I get this output:
>>>>>>
>>>>>>
>>>>>>   Seq                                     Mod                 z counts.x.x counts.y.x counts.x.y counts.y.y
>>>>>>1  aAAAAAAAAAGAAGGR                        1-n_acPro/          2         NA          1          1          1
>>>>>>2  aAAAAAAAGAAGGRGSGPGRR                   1-n_acPro/          2          1         NA         NA         NA
>>>>>>3  aAAAAAGAGPEMVR                          1-n_acPro/          2         NA          2          1          2
>>>>>>4  aAAAAATAAAAASIR                         1-n_acPro/          2          1         NA          1         NA
>>>>>>5  aAAAAEQQQFYLLLGNLLSPDNVVR               1-n_acPro/          2         NA          1         NA         NA
>>>>>>6  aAAAAEQQQFYLLLGNLLSPDNVVR               1-n_acPro/          3          1          2         NA          1
>>>>>>7  aAAAAEQQQFYLLLGNLLSPDNVVR               1-<_Carbamoylation/ 2         NA         NA         NA          1
>>>>>>8  aAAAAEQQQFYLLLGNLLSPDNVVR               1-<_Carbamoylation/ 3         NA         NA         NA          1
>>>>>>9  AAAAAPGTAEK                                                 2          1          1         NA         NA
>>>>>>10 aAAAELSLLEK                             1-n_acPro/          1         NA         NA         NA          1
>>>>>>11 aAAAELSLLEK                             1-n_acPro/          2          1         NA         NA         NA
>>>>>>12 AAAAEVLGLILR                                                2         NA          1         NA          1
>>>>>>13 aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR         1-n_acPro/          3          1         NA         NA          1
>>>>>>14 aAAAVVVPAEWIK                           1-n_acPro/          2          1         NA          1         NA
>>>>>>15 aAADGDDSLYPIAVLIDELR                    1-n_acPro/          2          1         NA          1         NA
>>>>>>16 aAADGDDSLYPIAVLIDELR                    1-n_acPro/          3         NA         NA          1         NA
>>>>>>17 AAADLMAYCEAHAK                                              2          1         NA         NA         NA
>>>>>>18 AAADLMAYCEAHAK                                              3          1         NA          1         NA
>>>>>>19 aAAEAANCIMEVSCGQAESSEKPNAEDMTSK         1-n_acPro/          3          1         NA          1         NA
>>>>>>20 AAAEIYEEFLAAFEGSDGNK                                        2          1         NA          1         NA
>>>>>>21 AAAIGIDLGTTYSCVGVFQHGK                                      2          1         NA         NA         NA
>>>>>>22 AAAIGIDLGTTYSCVGVFQHGK                                      3          1         NA         NA         NA
>>>>>>23 AAALATVNAWAEQTGMK                                           2          1         NA         NA         NA
>>>>>>24 AAAPAPEEEMDECEQALAAEPK                                      2          1         NA         NA         NA
>>>>>>25 AAAQLLQSQAQQSGAQQTK                                         2          1         NA         NA         NA
>>>>>>26 AAATPESQEPQAK                                               2          1         NA         NA         NA
>>>>>>27 aAAVAAAGAGEPQSPDELLPK                   1-n_acPro/          2          1         NA         NA         NA
>>>>>>28 AAAVVGInSETIMKPASISEEELLNLINK           8-N_Deamidation/    3          1         NA         NA         NA
>>>>>>29 aADTQVSETLKR                            1-n_acPro/          2          1         NA         NA         NA
>>>>>>30 AAEDDEDDDVDTKK                                              2          1         NA         NA         NA
>>>>>>31 AAEEPSKVEEK                                                 2          1         NA         NA         NA
>>>>>>32 AAEEPSKVEEK                                                 3          1         NA         NA         NA
>>>>>>33 AAEGGLSSPEFSELCIWLGSQIK                                     2          1         NA         NA         NA
>>>>>>34 AAELIANSLATAGDGLIELR                                        2          1         NA         NA         NA
>>>>>>35 aAEPNKTEIQTLFK                          1-n_acPro/          2          1         NA         NA         NA
>>>>>>36 AAEQILEDMITIDVENVMEDICSK                                    3          1         NA         NA         NA
>>>>>>37 AAESLADPTEYENLFPGLK                                         2          1         NA         NA         NA
>>>>>>38 AAFDDAIAELDTLSEESYK                                         2          1         NA         NA         NA
>>>>>>39 AAFDDAIAELDTLSEESYK                                         3          1         NA         NA         NA
>>>>>>40 AAFECMYTLLDSCLDR                                            2          1         NA         NA         NA
>>>>>>41 AAGGGAGSSEDDAQSR                                            2          1         NA         NA         NA
>>>>>>42 AAGHPGDPESQQR                                               2          1         NA         NA         NA
>>>>>>43 AAGHPGDPESQQR                                               3          2         NA         NA         NA
>>>>>>44 AAGLATMISTMRPDIDNMDEYVR                                     2          1         NA         NA         NA
>>>>>>45 aAGTLYTYPENWR                           1-n_acPro/          2          1         NA         NA         NA
>>>>>>46 aAGTSSYWEDLRK                           1-n_acPro/          2          1         NA         NA         NA
>>>>>>47 aAGVEAAAEVAATEIK                        1-n_acPro/          2         11         NA         NA         NA
>>>>>>48 AAGVNVEPFWPGLFAK                                            2          1         NA         NA         NA
>>>>>>49 AAHLCAEAALR                                                 2          1         NA         NA         NA
>>>>>>50 AALAGGTTMIIDHVVPEPGTSLLAAFDQWR                              3          1         NA         NA         NA
>>>>>>51 AALCHFCIDMLNAK                                              2          1         NA         NA         NA
>>>>>>52 aALDSLSLFTSLGLSEQK                      1-n_acPro/          2          1         NA         NA         NA
>>>>>>53 AALEALGSCLNNK                                               2          1         NA         NA         NA
>>>>>>54 AALEAQNALHNMK                                               2          1         NA         NA         NA
>>>>>>55 AALETDENLLLCAPTGAGK                                         2          1         NA         NA         NA
>>>>>>56 aALGVLESDLPSAVTLLK                      1-n_acPro/          2          1         NA         NA         NA
>>>>>>57 AALPGILSELDVDVnEGSLMELQGHIGR            15-N_Deamidation/   3          1         NA         NA         NA
>>>>>>58 AALPSHVVTMLDNFPTnLHPMSQLSAAVTALNSESNFAR 17-N_Deamidation/   4          1         NA         NA         NA
>>>>>>59 AALPSHVVTMLDNFPTNLHPMSQLSAAVTALNSESNFAR                     3          1         NA         NA         NA
>>>>>>60 aALTAEHFAALQSLLK                        1-n_acPro/          2          1         NA         NA         NA
>>>>>>61 AAMADTFLEHMCR                                               3          1         NA         NA         NA
>>>>>>62 AAMFTAGSNFNHVVQNEK                                          3          1         NA         NA         NA
>>>>>>63 aANATTNPSQLLPLELVDK                     1-n_acPro/          2          1         NA         NA         NA
>>>>>>64 aAAAAVGNAVPCGAR                         1-n_acPro/          2         NA          1         NA          1
>>>>>>65 AAAAAWEEPSSGNGTAR                                           2         NA          1         NA          1
>>>>>>66 aAAAGAAAAAAAEGEAPAEMGALLLEK             1-n_acPro/          3         NA          1         NA          1
>>>>>>67 aAAAVGAGHGAGGPGAASSSGGAR                1-n_acPro/          2         NA          1          1         NA
>>>>>>68 aAAAVGAGHGAGGPGAASSSGGAR                1-n_acPro/          3         NA         NA          1         NA
>>>>>>69 aAAAAAAAAAAAAAATATAGPR                  1-n_acPro/          2         NA         NA         NA          1
>>>>>>70 aAAAAAAAAAAASSPVGVGQR                   1-n_acPro/          2         NA         NA         NA          1
>>>>>>71 AAAAAAALQAK                                                 2         NA         NA         NA          1
>>>>>>72 aAAAASAPQQLSDEELFSQLR                   1-n_acPro/          2         NA         NA         NA          1
>>>>>>73 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK        1-n_acPro/          3         NA         NA         NA          1
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>2013/2/25 arun <smartpink111 at yahoo.com>
>>>>>>
>>>>>>Hi,
>>>>>>>What I said was: the `spec` column didn't change before and after the aggregate() step. I think you used aggregate() to group it by Seq, Mod and z. In the example you provided, it was already grouped; maybe it is not in your original dataset. Anyway, please email me the output you are getting from your code.
>>>>>>>
>>>>>>>Arun
>>>>>>>
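To illustrate the point about aggregate(): pasting `spec` within groups only changes the data when a (Seq, Mod, z) combination occurs in more than one row. A small sketch with made-up values (not taken from the attached files):

```r
# Made-up rows: the first two share the same Seq, Mod and z
d <- data.frame(Seq  = c("AAK", "AAK", "AAR"),
                Mod  = c("1-n_acPro/", "1-n_acPro/", "1-n_acPro/"),
                z    = c(2, 2, 2),
                spec = c("101", "102", "103"),
                stringsAsFactors = FALSE)

# Duplicated keys are collapsed into one comma-separated spec string
grouped <- aggregate(spec ~ Seq + Mod + z, data = d,
                     FUN = paste, collapse = ",")
print(grouped)

# Running the same step on data whose keys are already unique
# returns the spec column unchanged
already_grouped <- aggregate(spec ~ Seq + Mod + z, data = grouped,
                             FUN = paste, collapse = ",")
print(already_grouped)
```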
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>________________________________
>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>Sent: Monday, February 25, 2013 5:36 AM
>>>>>>>
>>>>>>>Subject: Re: reading data
>>>>>>>
>>>>>>>
>>>>>>>Sorry, I don't understand what you said.
>>>>>>>
>>>>>>>I need to
>>>>>>>- read the data (like the code that you wrote)
>>>>>>>- select only rows with FDR<0.01 in all files
>>>>>>>- remove the first file of each group (a1, c1, t1, ...)
>>>>>>>- select only the columns Seq, Mod, z, spec in all files
>>>>>>>- for each remaining file, merge the rows with the same Seq, Mod and z (grouping the spec)
>>>>>>>- tabulate the frequencies of spec, like:
>>>>>>>
>>>>>>>    seq      c2    c3    c4    t1    ....
>>>>>>>    aaaaA     0     2     5     6
>>>>>>>
>>>>>>>(this table shows how many spec values I have in total)
>>>>>>>
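A frequency table of this shape can be produced with `table()` once every spectrum is one row tagged with the file it came from. This is a minimal sketch under that assumption; the `long` data frame and its `run` column are hypothetical and simply reproduce the example row above (0, 2, 5, 6):

```r
# Hypothetical long-format data: one row per spectrum, tagged with its run.
# The factor levels include c2 so that a run with no spectra shows up as 0.
long <- data.frame(
  Seq = "aaaaA",
  run = factor(rep(c("c3", "c4", "t1"), times = c(2, 5, 6)),
               levels = c("c2", "c3", "c4", "t1"))
)

# Rows are sequences, columns are runs, cells are spectrum counts
freq <- table(long$Seq, long$run)
print(freq)
```

Declaring the runs as factor levels is what makes the zero for c2 appear; without it, `table()` would silently drop that column.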
>>>>>>>
>>>>>>>I think my small code isn't correct...
>>>>>>>
>>>>>>>Thank you
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>2013/2/23 arun <smartpink111 at yahoo.com>
>>>>>>>
>>>>>>>One more thing:
>>>>>>>>The last column 'spec' in the output is already aggregated based on `Seq`, `Mod`, `z` in the data.new directory.
>>>>>>>> res5[[3]][[1]]
>>>>>>>>
>>>>>>>>   Seq                              Mod                 z      spec
>>>>>>>>1  aAAAAAAAAAAAAAATATAGPR           1-n_acPro/          2     11833
>>>>>>>>2  aAAAAAAAAAAASSPVGVGQR            1-n_acPro/          2     11833
>>>>>>>>3  aAAAAAAAAAGAAGGR                 1-n_acPro/          2     13103
>>>>>>>>4  AAAAAAALQAK                                          2      3084
>>>>>>>>5  aAAAAAGAGPEMVR                   1-n_acPro/          2 9646,9821 # check here
>>>>>>>>6  aAAAAEQQQFYLLLGNLLSPDNVVR        1-<_Carbamoylation/ 2     33650
>>>>>>>>7  aAAAAEQQQFYLLLGNLLSPDNVVR        1-<_Carbamoylation/ 3     33607
>>>>>>>>9  aAAAAEQQQFYLLLGNLLSPDNVVR        1-n_acPro/          3     33769
>>>>>>>>11 aAAAASAPQQLSDEELFSQLR            1-n_acPro/          2     20602
>>>>>>>>12 aAAAAVGNAVPCGAR                  1-n_acPro/          2     10018
>>>>>>>>13 AAAAAWEEPSSGNGTAR                                    2      5576
>>>>>>>>14 aAAAELSLLEK                      1-n_acPro/          1     19662
>>>>>>>>16 AAAAEVLGLILR                                         2     22857
>>>>>>>>17 aAAAGAAAAAAAEGEAPAEMGALLLEK      1-n_acPro/          3     26060
>>>>>>>>18 aAAAGGGGPGTAVGATGSGIAAAAAGLAVYR  1-n_acPro/          3     21479
>>>>>>>>19 aAAANSGSSLPLFDCPTWAGKPPPGLHLDVVK 1-n_acPro/          3     21159
>>>>>>>>
>>>>>>>>aggregate() doesn't change anything here, at least not in this dataset.
>>>>>>>>In the next line you used sapply(), which gives this output:
>>>>>>>>sapply(res6[[3]][[1]]$spec,function(x) length(gsub("\\s","",unlist(strsplit(x,","))))) # this I believe is not correct
>>>>>>>>#    11833     11833     13103      3084 9646,9821     33650     33607     33769
>>>>>>>>#        1         1         1         1         2         1         1         1
>>>>>>>>#    20602     10018      5576     19662     22857     26060     21479     21159
>>>>>>>>#        1         1         1         1         1         1         1         1
>>>>>>>>#here you have two `11833` and one `9646,9821`. Not really sure what you want here
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>If it is:
>>>>>>>> table(unlist(strsplit(res6[[3]][[1]]$spec,","))) #this makes sense
>>>>>>>>
>>>>>>>>#10018 11833 13103 19662 20602 21159 21479 22857 26060  3084 33607 33650 33769
>>>>>>>>#    1     2     1     1     1     1     1     1     1     1     1     1     1
>>>>>>>># 5576  9646  9821
>>>>>>>>#    1     1     1
>>>>>>>>
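The difference between the two calls above, shown on a few of the `spec` values: the `sapply(..., length(...))` form counts how many ids each row holds, whereas `table(unlist(strsplit(...)))` counts how often each id occurs across all rows. In this small sketch, `lengths()` is used as a shorter equivalent of the `sapply` call (it needs R 3.2.0 or later):

```r
# A few spec values copied from the output above
spec <- c("11833", "11833", "13103", "9646,9821")

# Number of comma-separated ids held by each row
per_row <- lengths(strsplit(spec, ","))
print(per_row)

# Frequency of each individual id across all rows
per_id <- table(unlist(strsplit(spec, ",")))
print(per_id)
```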
>>>>>>>>Now coming to the last `merge` section:
>>>>>>>>do you want to merge the counts in each group by `spec` value (called "Var1" in this case)?
>>>>>>>>
>>>>>>>>$group_c
>>>>>>>>$group_c$c2
>>>>>>>>    Var1 Freq
>>>>>>>>1  10039    1
>>>>>>>>2  13200    1
>>>>>>>>3  22929    1
>>>>>>>>4  26117    1
>>>>>>>>5  33712    1
>>>>>>>>6  33774    1
>>>>>>>>7  33867    1
>>>>>>>>8    379    1
>>>>>>>>9   4102    1
>>>>>>>>10  5664    1
>>>>>>>>11  9703    1
>>>>>>>>12  9876    1
>>>>>>>>
>>>>>>>>$group_c$c3
>>>>>>>>    Var1 Freq
>>>>>>>>1  10325    1
>>>>>>>>2  21555    1
>>>>>>>>3  22994    1
>>>>>>>>4  26142    1
>>>>>>>>5   3341    1
>>>>>>>>6  33708    1
>>>>>>>>7  33870    1
>>>>>>>>8  34095    1
>>>>>>>>9   4397    1
>>>>>>>>10  4416    1
>>>>>>>>11  5960    1
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>A.K.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>________________________________
>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>Sent: Friday, February 22, 2013 8:36 PM
>>>>>>>>
>>>>>>>>Subject: Re: reading data
>>>>>>>>
>>>>>>>>
>>>>>>>>Oh, sorry.
>>>>>>>>I'm on my phone right now. I will send it tomorrow.
>>>>>>>>Thank you
>>>>>>>>On 22 Feb 2013 at 22:06, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>
>>>>>>>>Hi,
>>>>>>>>>
>>>>>>>>>As I mentioned in my earlier post, it would be easier for me if you sent the results that your code produces on the same dataset ('data.new'), rather than having me figure out how your code works.
>>>>>>>>>Thanks,
>>>>>>>>>A.K.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>________________________________
>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>Sent: Friday, February 22, 2013 1:13 PM
>>>>>>>>>Subject: Re: reading data
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>Hi.
>>>>>>>>>
>>>>>>>>>I used your code and it was a great help. Thank you.
>>>>>>>>>
>>>>>>>>>I'm now doing a new study of the data, but I need to optimize my code.
>>>>>>>>>
>>>>>>>>>For the same data, I need to:
>>>>>>>>>
>>>>>>>>>- read the data (like the code that you wrote)
>>>>>>>>>- select only rows with FDR<0.01 in all files
>>>>>>>>>- remove the first file of each group (a1, c1, t1, ...)
>>>>>>>>>- select only the columns Seq, Mod, z, spec in all files
>>>>>>>>>- for each remaining file, merge the rows with the same Seq, Mod and z (grouping the spec)
>>>>>>>>>- tabulate the frequencies of spec, like:
>>>>>>>>>
>>>>>>>>>    seq      c2    c3    c4    t1    ....
>>>>>>>>>    aaaaA     0     2     5     6
>>>>>>>>>    .....
>>>>>>>>>
>>>>>>>>>(this table shows how many spec values I have in total)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>I start doing the code.....
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>spec <- function(directory,number) {
>>>>>>>>>  setwd(directory)
>>>>>>>>> direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>>>
>>>>>>>>> lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>> listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>> listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>
>>>>>>>>> #boxplots for each run
>>>>>>>>> dcf<-c()
>>>>>>>>> dtf<-c()
>>>>>>>>>
>>>>>>>>> for(i in 1:length(lista)){
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> for (i in 2:length(listaC)) {
>>>>>>>>>  dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<Pfdr, TRUE, FALSE),]
>>>>>>>>>  dcc1<- aggregate(spec ~ Seq + Mod+z, data = dcc1, paste, collapse = ",")
>>>>>>>>>  dcc1$counts <- sapply(dcc1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>>>>  dcc1<-dcc1[,-4]
>>>>>>>>>  dcf<-list(dcf,dcc1)
>>>>>>>>>
>>>>>>>>>  }
>>>>>>>>> print(dcf)
>>>>>>>>>
>>>>>>>>>merg<-merge(dcf[[1]][[2]],dcf[[2]],by=c("Seq","Mod","z"),all=TRUE)
>>>>>>>>>print(merg)
>>>>>>>>> for (i in 2:length(listaT)) {
>>>>>>>>>  dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<Pfdr, TRUE, FALSE),]
>>>>>>>>>  dct1<- aggregate(spec ~ Seq + Mod+z, data = dct1, paste, collapse = ",")
>>>>>>>>>  dct1$counts <- sapply(dct1$spec, function(x) length(gsub("\\s", "", unlist(strsplit(x, ",")))))
>>>>>>>>>  dct1<-dct1[,-4]
>>>>>>>>>  dtf<-list(dtf,dct1)
>>>>>>>>>  }
>>>>>>>>>}
>>>>>>>>>spec("C:/Users/Vera Costa/Desktop/data.new",23)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>I can write the new code. The problem is that this line takes a long time to run:
>>>>>>>>>dcc1<- aggregate(spec ~ Seq + Mod+z, data = dcc1, paste, collapse = ",")
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>I have nearly 40000 rows.
>>>>>>>>>
>>>>>>>>>Could you help me to optimize this?
>>>>>>>>>
>>>>>>>>>Thank you.
>>>>>>>>>Vera
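
The slow step above pastes the spec values of each Seq/Mod/z group into one string and then splits that string again just to count the pieces. A minimal sketch of a shortcut follows, assuming each row carries exactly one spec value with no embedded commas (if a spec entry can itself contain commas, the two counts will differ). The toy data frame `d` is hypothetical; only the column names come from the thread. Whether this is fast enough on 40000 rows is not measured here.

```r
# Hypothetical toy data; column names follow the thread (Seq, Mod, z, spec).
d <- data.frame(
  Seq  = c("aaA", "aaA", "aaA", "bbB", "bbB"),
  Mod  = c("m1", "m1", "m2", "m1", "m1"),
  z    = c(2, 2, 2, 3, 3),
  spec = c("s1", "s2", "s3", "s4", "s5"),
  stringsAsFactors = FALSE
)

# Mirrors the approach in the thread: paste the spec values, then count the pieces.
slow <- aggregate(spec ~ Seq + Mod + z, data = d, paste, collapse = ",")
slow$counts <- sapply(slow$spec, function(x) length(unlist(strsplit(x, ","))))
slow <- slow[, -4]

# Sketch of a shortcut: count the rows in each group directly.
fast <- aggregate(spec ~ Seq + Mod + z, data = d, FUN = length)
names(fast)[names(fast) == "spec"] <- "counts"
```

Both data frames end up with the columns Seq, Mod, z and counts, so `fast` can be passed to the later `merge(..., by = c("Seq", "Mod", "z"))` step in the same way.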
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>2013/2/20 Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>
>>>>>>>>>Thank you very much.
>>>>>>>>>>
>>>>>>>>>>I will try.
>>>>>>>>>>
>>>>>>>>>>thank you
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>2013/2/20 arun <smartpink111 at yahoo.com>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Hi,
>>>>>>>>>>>
>>>>>>>>>>>You can change `res4` to:
>>>>>>>>>>>lev<-sort(unique(do.call(c,lapply(seq_along(res3),function(i) do.call(c,lapply(res3[[i]],function(x) unique(x$z)))))))
>>>>>>>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z,levels=lev))))))
>>>>>>>>>>>
>>>>>>>>>>>freqs1<-do.call(rbind,lapply(split(freq.f1,gsub("\\d+","",freq.f1$id)),function(x) x[-1,])) #here there is only level for a1.  So, it is removed
>>>>>>>>>>> average1<- colMeans(freqs1[,-1])
>>>>>>>>>>> average1
>>>>>>>>>>>#        1         2         3
>>>>>>>>>>>#0.3333333 8.0000000 3.6666667
>>>>>>>>>>>pvalues1<-do.call(rbind,lapply(seq_len(nrow(freqs1)),function(x) chisq.test(freqs1[x,-1],average1)))
>>>>>>>>>>> row.names(pvalues1)<- row.names(freqs1)
>>>>>>>>>>> pvalues1
>>>>>>>>>>>#                 [,1]
>>>>>>>>>>>#c.group_c.2 0.7235907
>>>>>>>>>>>#c.group_c.3 0.7963287
>>>>>>>>>>>#t           0.9079200
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>A.K.
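
The change to `res4` above replaces the hard-coded `levels=1:3` with levels collected from the data. A small sketch of why that matters, using two hypothetical vectors in place of the `z` column of two files: tabulating against a shared level set gives every file the same columns, with explicit zeros where a value is absent.

```r
# Hypothetical charge-state vectors standing in for the z column of two files.
z_file1 <- c(2, 2, 3)
z_file2 <- c(1, 2, 2, 3, 4)

# Collect every value seen in any file, rather than hard-coding levels = 1:3.
lev <- sort(unique(c(z_file1, z_file2)))

# Tabulating against the shared levels gives each file the same columns,
# with explicit zeros for values that a file does not contain.
counts <- rbind(
  file1 = table(factor(z_file1, levels = lev)),
  file2 = table(factor(z_file2, levels = lev))
)
print(counts)
```

With `levels = 1:3` the value 4 in the second vector would have been dropped from the table without any warning.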
>>>>>>>>>>>
>>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>
>>>>>>>>>>>From: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>To: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>Cc: R help <r-help at r-project.org>
>>>>>>>>>>>Sent: Tuesday, February 19, 2013 7:29 PM
>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>
>>>>>>>>>>>Hi,
>>>>>>>>>>>Try this:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>>>>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>>>>>res2<-split(lista,names(lista))
>>>>>>>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>#Freq whole data
>>>>>>>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z,levels=1:3))))))
>>>>>>>>>>>names(res4)<- names(res2)
>>>>>>>>>>>library(reshape2)
>>>>>>>>>>>freq.i1<-do.call(rbind,lapply(res4,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
>>>>>>>>>>>freq.i1
>>>>>>>>>>>#          id 1  2 3
>>>>>>>>>>>#group_a   a1 1 12 6
>>>>>>>>>>>#group_c.1 c1 0 10 3
>>>>>>>>>>>#group_c.2 c2 0 12 3
>>>>>>>>>>>#group_c.3 c3 0 13 4
>>>>>>>>>>>#group_t.1 t1 0 10 4
>>>>>>>>>>>#group_t.2 t2 1 12 6
>>>>>>>>>>>
>>>>>>>>>>>freq.rel.i1<- as.matrix(freq.i1[,-1]/rowSums(freq.i1[,-1]) )
>>>>>>>>>>> freq.rel.i1
>>>>>>>>>>>#                   1         2         3
>>>>>>>>>>>#group_a   0.05263158 0.6315789 0.3157895
>>>>>>>>>>>#group_c.1 0.00000000 0.7692308 0.2307692
>>>>>>>>>>>#group_c.2 0.00000000 0.8000000 0.2000000
>>>>>>>>>>>#group_c.3 0.00000000 0.7647059 0.2352941
>>>>>>>>>>>#group_t.1 0.00000000 0.7142857 0.2857143
>>>>>>>>>>>#group_t.2 0.05263158 0.6315789 0.3157895
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>#Freq with FDR< 0.01
>>>>>>>>>>>res5<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]],function(x) as.data.frame(table(factor(x$z[x[["FDR"]]<0.01],levels=1:3))))))
>>>>>>>>>>>names(res5)<- names(res2)
>>>>>>>>>>>
>>>>>>>>>>>freq.f1<- do.call(rbind,lapply(res5,function(x) dcast(melt(data.frame(id=gsub("\\..*","",row.names(x)),x),id.var=c("id","Var1")),id~Var1,value.var="value")))
>>>>>>>>>>>
>>>>>>>>>>> freq.f1
>>>>>>>>>>>#          id 1  2 3
>>>>>>>>>>>#group_a   a1 1 10 5
>>>>>>>>>>>#group_c.1 c1 0  7 2
>>>>>>>>>>>#group_c.2 c2 0  8 2
>>>>>>>>>>>#group_c.3 c3 0  6 4
>>>>>>>>>>>#group_t.1 t1 0  7 4
>>>>>>>>>>>#group_t.2 t2 1 10 5
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>freq.rel.f1<- as.matrix(freq.f1[,-1]/rowSums(freq.f1[,-1]))
>>>>>>>>>>>
>>>>>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i1)))
>>>>>>>>>>>par(mfrow=c(1,2))
>>>>>>>>>>>barplot(freq.rel.i1,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i1))
>>>>>>>>>>>barplot(freq.rel.f1,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f1))
>>>>>>>>>>>#change the legend position
>>>>>>>>>>>
>>>>>>>>>>>Also, I didn't check the rest of the code, from the chi-square test onward.
>>>>>>>>>>>A.K.
>>>>>>>>>>>________________________________
>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>Sent: Tuesday, February 19, 2013 4:19 PM
>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Here is the code and some outputs.
>>>>>>>>>>>
>>>>>>>>>>>z.plot <-
function(directory,number) {
>>>>>>>>>>>?#reading data
>>>>>>>>>>>??setwd(directory)
>>>>>>>>>>>?direct<-dir(directory,pattern =
paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>>>>>>>>>>?directT <-
direct[grepl("^t", direct)]
>>>>>>>>>>>?directC <-
direct[grepl("^c", direct)]
>>>>>>>>>>>
>>>>>>>>>>>?lista<-lapply(direct,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>?listaC<-lapply(directC,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>?listaT<-lapply(directT,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>
>>>>>>>>>>>?#count different z values
>>>>>>>>>>>?cab <- vector()
>>>>>>>>>>>??? for (i in 1:length(lista)) {
>>>>>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>??????? dc<-table(dc$z)
>>>>>>>>>>>??????? cab <- c(cab, names(dc))
>>>>>>>>>>>??}
>>>>>>>>>>>
>>>>>>>>>>>?#Relative freqs to construct the
graph
>>>>>>>>>>>??? cab <- unique(cab)
>>>>>>>>>>>?print(cab)
>>>>>>>>>>>
>>>>>>>>>>>###[1] "2" "3"
"1"
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>??? d <- matrix(ncol=length(cab))
>>>>>>>>>>>?dci<- d[-1,]
>>>>>>>>>>>??? dcf <- d[-1,]
>>>>>>>>>>>?dti <- d[-1,]
>>>>>>>>>>>?dtf <- d[-1,]
>>>>>>>>>>>
>>>>>>>>>>>??? for (i in 1:length(listaC)) {
>>>>>>>>>>>
>>>>>>>>>>>??#Relative freq of all data
>>>>>>>>>>>??dcc<-listaC[[i]]
>>>>>>>>>>>??dcc<-table(factor(dcc$z,
levels=cab))
>>>>>>>>>>>??dci<- rbind(dci, dcc)
>>>>>>>>>>>??rownames(dci)<-rownames(1:(nrow(dci)),
do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>??#Relative freq of data with
FDR<0.01
>>>>>>>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>>>>>??dcc1<-table(factor(dcc1$z,
levels=cab))
>>>>>>>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)),
do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>???????? }
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>?for (i in 1:length(listaT)) {
>>>>>>>>>>>
>>>>>>>>>>>??#Relative freq of all data
>>>>>>>>>>>??dct<-listaT[[i]]
>>>>>>>>>>>??dct<-table(factor(dct$z,
levels=cab))
>>>>>>>>>>>??dti<- rbind(dti, dct)
>>>>>>>>>>>??rownames(dti)<-rownames(1:(nrow(dti)),
do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>??#Relative freq of data with
FDR<0.01
>>>>>>>>>>>??dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>>>>>??dct1<-table(factor(dct1$z,
levels=cab))
>>>>>>>>>>>??dtf<- rbind(dtf,dct1)
>>>>>>>>>>>??rownames(dtf)<-rownames(1:(nrow(dtf)),
do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>??????? }
>>>>>>>>>>>
>>>>>>>>>>>??freq.i<-rbind(dci,dti)
>>>>>>>>>>>??freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>??freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>??freq.rel.f<-freq.f/apply(freq.f,1,sum)?
>>>>>>>>>>>
>>>>>>>>>>>?print(freq.i)
>>>>>>>>>>>##????? 2 3 1
>>>>>>>>>>>#c1 10 3 0
>>>>>>>>>>>#c2 12 3 0
>>>>>>>>>>>#c3 13 4 0
>>>>>>>>>>>#t1 10 4 0
>>>>>>>>>>>#t2 12 6 1
>>>>>>>>>>>
>>>>>>>>>>>?print(freq.f)
>>>>>>>>>>>??###???? 2 3 1
>>>>>>>>>>>#c1? 7 2 0
>>>>>>>>>>>#c2? 8 2 0
>>>>>>>>>>>#c3? 6 4 0
>>>>>>>>>>>#t1? 7 4 0
>>>>>>>>>>>#t2 10 5 1
>>>>>>>>>>>
>>>>>>>>>>>?print(freq.rel.i)
>>>>>>>>>>>###?????????????? 2????????
3????????? 1
>>>>>>>>>>>#c1 0.7692308 0.2307692 0.00000000
>>>>>>>>>>>#c2 0.8000000 0.2000000 0.00000000
>>>>>>>>>>>#c3 0.7647059 0.2352941 0.00000000
>>>>>>>>>>>#t1 0.7142857 0.2857143 0.00000000
>>>>>>>>>>>#t2 0.6315789 0.3157895 0.05263158
>>>>>>>>>>>?print(freq.rel.f)
>>>>>>>>>>>
>>>>>>>>>>>###???????????????? 2???????? 3?????
1
>>>>>>>>>>>#c1 0.7777778 0.2222222 0.0000
>>>>>>>>>>>#c2 0.8000000 0.2000000 0.0000
>>>>>>>>>>>#c3 0.6000000 0.4000000 0.0000
>>>>>>>>>>>#t1 0.6363636 0.3636364 0.0000
>>>>>>>>>>>#t2 0.6250000 0.3125000 0.0625
>>>>>>>>>>>
>>>>>>>>>>>#Graph plot
>>>>>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>>>>>par(mfrow=c(1,2))
>>>>>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample
with FDR<0.01"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>>>>>
>>>>>>>>>>>#average of the group (except
c1&t1)
>>>>>>>>>>>freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>>>>>print(average)
>>>>>>>>>>>
>>>>>>>>>>>###???????????? 2???????? 3????????
1
>>>>>>>>>>>#8.0000000 3.6666667 0.3333333
>>>>>>>>>>>
>>>>>>>>>>>#chisquare test function
>>>>>>>>>>>chisq.test<-function(x,y){
>>>>>>>>>>> somax<-sum(x)
>>>>>>>>>>> somay<-sum(y)
>>>>>>>>>>> nj.<-x+y
>>>>>>>>>>> nj<-sum(nj.)
>>>>>>>>>>> ejx<-(nj./nj)*somax
>>>>>>>>>>> ejy<-(nj./nj)*somay
>>>>>>>>>>> ETx<-((x-ejx)^2)/ejx
>>>>>>>>>>> ETy<-((y-ejy)^2)/ejy
>>>>>>>>>>> ETobs<-sum(ETx)+sum(ETy)
>>>>>>>>>>> pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>>>>> return(pvalue)
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>>#pvalues of the chisquare test
between sample and average (H0: two samples has the same distribution)
>>>>>>>>>>>pvalues<-c()
>>>>>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>>>>>>pvalues<-c(pvalues,a)
>>>>>>>>>>>}
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>#data frame with final p-values
>>>>>>>>>>>dataframe<-data.frame(c(rownames(freqs)),
c(pvalues))
>>>>>>>>>>>colnames(dataframe)<-c("sample
name","pvalue")
>>>>>>>>>>>print(dataframe)
>>>>>>>>>>>
>>>>>>>>>>>###? ? sample name??? pvalue
>>>>>>>>>>>#1????????? c2 0.7235907
>>>>>>>>>>>#2????????? c3 0.7963287
>>>>>>>>>>>#3???????????? 0.9079200
>>>>>>>>>>>}
>>>>>>>>>>>z.plot("C:/Users/Vera
Costa/Desktop/dados",23)
>>>>>>>>>>>
>>>>>>>>>>>###and two barplots..
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>Here, I remove the group a1.
>>>>>>>>>>>
>>>>>>>>>>>Thank you
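
The hand-written function in the message above is named `chisq.test`, so while it is defined it masks `stats::chisq.test`. Its statistic is the Pearson chi-square for two samples over the same categories, with one degree of freedom fewer than the number of categories. The sketch below restates it under a hypothetical name, `two_sample_chisq`, and checks it against the built-in test on a two-row table without continuity correction. The counts are made up for illustration, and this is not presented as the author's intended replacement.

```r
# Homogeneity statistic as written in the thread, under a different name
# (two_sample_chisq is a hypothetical name) so stats::chisq.test stays visible.
two_sample_chisq <- function(x, y) {
  pooled <- x + y                         # category totals over both samples
  ex <- pooled / sum(pooled) * sum(x)     # expected counts for x
  ey <- pooled / sum(pooled) * sum(y)     # expected counts for y
  stat <- sum((x - ex)^2 / ex) + sum((y - ey)^2 / ey)
  pchisq(stat, df = length(x) - 1, lower.tail = FALSE)
}

# Hypothetical counts for two samples over three categories.
x <- c(30, 50, 20)
y <- c(25, 45, 30)

p_manual  <- two_sample_chisq(x, y)
p_builtin <- stats::chisq.test(rbind(x, y), correct = FALSE)$p.value
```

On these counts the two p-values agree, which suggests the built-in function could stand in for the hand-written one when both inputs are count vectors of equal length.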
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>
>>>>>>>>>>>Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>Could you send the results for the folder that was sent to me? That will be easier for me.
>>>>>>>>>>>>
>>>>>>>>>>>>Arun
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>________________________________
>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>Sent: Tuesday, February 19, 2013 3:47 PM
>>>>>>>>>>>>
>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>Oh sorry, I changed the folder.
>>>>>>>>>>>>
>>>>>>>>>>>>I am sending the results for your folder.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>2013/2/19 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>
>>>>>>>>>>>>Hello,
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Regarding the results, are they from the same folder that you sent to me?
>>>>>>>>>>>>>I am getting different results by running your steps.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>direct<- list.files(recursive=TRUE)
>>>>>>>>>>>>>  direct
>>>>>>>>>>>>>#[1] "a1/MSMS_23PepInfo.txt" "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt"
>>>>>>>>>>>>>#[4] "c3/MSMS_23PepInfo.txt" "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>>>>>>>>>>>>
>>>>>>>>>>>>> directT<- list.files(recursive=TRUE)[grepl("^t",dir())]
>>>>>>>>>>>>>
>>>>>>>>>>>>>directT
>>>>>>>>>>>>>#[1] "t1/MSMS_23PepInfo.txt" "t2/MSMS_23PepInfo.txt"
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>directC<- list.files(recursive=TRUE)[grepl("^c",dir())]
>>>>>>>>>>>>>
>>>>>>>>>>>>>directC
>>>>>>>>>>>>>#[1] "c1/MSMS_23PepInfo.txt" "c2/MSMS_23PepInfo.txt" "c3/MSMS_23PepInfo.txt"
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>lista<-
lapply(direct,function(x)
read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>>>>>>>>?
>>>>>>>>>>>>>listaT<-lapply(directT,
function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>>>listaC<-lapply(directC,
function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>>>
>>>>>>>>>>>>>?#count different z values
>>>>>>>>>>>>>?cab <- vector()
>>>>>>>>>>>>>??? for (i in
1:length(lista)) {
>>>>>>>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>??????? dc<-table(dc$z)
>>>>>>>>>>>>>??????? cab <- c(cab,
names(dc))
>>>>>>>>>>>>>? }
>>>>>>>>>>>>>?
>>>>>>>>>>>>>?#Relative freqs to
construct the graph
>>>>>>>>>>>>>??? cab <- unique(cab)
>>>>>>>>>>>>>?print(cab)
>>>>>>>>>>>>>
>>>>>>>>>>>>>#[1] "1"
"2" "3"? #Here results are not correct
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>d <-
matrix(ncol=length(cab))
>>>>>>>>>>>>>?dci<- d[-1,]
>>>>>>>>>>>>>??? dcf <- d[-1,]
>>>>>>>>>>>>>?dti <- d[-1,]
>>>>>>>>>>>>>?dtf <- d[-1,]
>>>>>>>>>>>>>
>>>>>>>>>>>>>??? for (i in
1:length(listaC)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>??#Relative freq of all data
>>>>>>>>>>>>>??dcc<-listaC[[i]]
>>>>>>>>>>>>>??dcc<-table(factor(dcc$z,
levels=cab))
>>>>>>>>>>>>>??dci<- rbind(dci, dcc)
>>>>>>>>>>>>>??rownames(dci)<-rownames(1:(nrow(dci)),
do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>??#Relative freq of data
with FDR<0.01
>>>>>>>>>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>>>>>>>??dcc1<-table(factor(dcc1$z,
levels=cab))
>>>>>>>>>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)),
do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>??????? }
>>>>>>>>>>>>>?print(dci) #here too.
>>>>>>>>>>>>>
>>>>>>>>>>>>>#?? 1? 2 3
>>>>>>>>>>>>>#c1 0 10 3
>>>>>>>>>>>>>#c2 0 12 3
>>>>>>>>>>>>>#c3 0 13 4
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>It is important to clear this up before I make any changes to the script. You need to send me the output for the same data folder so that I can understand what is going on.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Arun
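
In the message above, the recursive file listing is indexed with a logical vector computed from `dir()`, the top-level folder names. The two vectors line up only when every folder contributes exactly one file. A minimal sketch of the alternative, which is what the original `z.plot` code does, tests the prefix on the relative paths themselves. The folder layout below is hypothetical and is created under `tempdir()` purely for illustration.

```r
# Build a throwaway directory tree (hypothetical names) to demonstrate.
root <- file.path(tempdir(), "groups_demo")
for (d in c("a1", "c1", "c2", "t1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  writeLines("x", file.path(root, d, "MSMS_23PepInfo.txt"))
}
# c1 gets a second file, so folders and files no longer line up one-to-one.
writeLines("x", file.path(root, "c1", "extra.txt"))

# Relative paths such as "c1/MSMS_23PepInfo.txt".
direct <- list.files(root, pattern = "MSMS_23PepInfo\\.txt$", recursive = TRUE)

# Test the prefix on the paths themselves, not on the folder listing.
directC <- direct[grepl("^c", direct)]
directT <- direct[grepl("^t", direct)]
```

Because the `pattern` argument already restricts the listing to the wanted file name, the extra file in `c1` does not disturb the group selection.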
>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>To: arun <smartpink111 at
yahoo.com>
>>>>>>>>>>>>>Sent: Tuesday, February 19,
2013 9:24 AM
>>>>>>>>>>>>>
>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Ok.
>>>>>>>>>>>>>
>>>>>>>>>>>>>Here is the code and some
outputs.
>>>>>>>>>>>>>
>>>>>>>>>>>>>z.plot <-
function(directory,number) {
>>>>>>>>>>>>>?#reading data
>>>>>>>>>>>>>??setwd(directory)
>>>>>>>>>>>>>?direct<-dir(directory,pattern
= paste("MSMS_",number,"PepInfo.txt",sep=""),
full.names = FALSE, recursive = TRUE)
>>>>>>>>>>>>>?directT <-
direct[grepl("^t", direct)]
>>>>>>>>>>>>>?directC <-
direct[grepl("^c", direct)]
>>>>>>>>>>>>>
>>>>>>>>>>>>>?lista<-lapply(direct,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>?listaC<-lapply(directC,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>?listaT<-lapply(directT,
function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>
>>>>>>>>>>>>>?#count different z values
>>>>>>>>>>>>>?cab <- vector()
>>>>>>>>>>>>>??? for (i in
1:length(lista)) {
>>>>>>>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>??????? dc<-table(dc$z)
>>>>>>>>>>>>>??????? cab <- c(cab,
names(dc))
>>>>>>>>>>>>>??}
>>>>>>>>>>>>>
>>>>>>>>>>>>>?#Relative freqs to
construct the graph
>>>>>>>>>>>>>??? cab <- unique(cab)
>>>>>>>>>>>>>?print(cab)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###[1] "1"
"2" "3" "4" "5"
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>??? d <-
matrix(ncol=length(cab))
>>>>>>>>>>>>>?dci<- d[-1,]
>>>>>>>>>>>>>??? dcf <- d[-1,]
>>>>>>>>>>>>>?dti <- d[-1,]
>>>>>>>>>>>>>?dtf <- d[-1,]
>>>>>>>>>>>>>
>>>>>>>>>>>>>??? for (i in
1:length(listaC)) {
>>>>>>>>>>>>>
>>>>>>>>>>>>>??#Relative freq of all data
>>>>>>>>>>>>>??dcc<-listaC[[i]]
>>>>>>>>>>>>>??dcc<-table(factor(dcc$z,
levels=cab))
>>>>>>>>>>>>>??dci<- rbind(dci, dcc)
>>>>>>>>>>>>>??rownames(dci)<-rownames(1:(nrow(dci)),
do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>??#Relative freq of data
with FDR<0.01
>>>>>>>>>>>>>??dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>>>>>>>??dcc1<-table(factor(dcc1$z,
levels=cab))
>>>>>>>>>>>>>??dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>>>??rownames(dcf)<-rownames(1:(nrow(dcf)),
do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>??????? }
>>>>>>>>>>>>>?print(dci)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###???? 1???? 2??? 3?? 4? 5
>>>>>>>>>>>>>#c1? 93? 8356 3621 450 55
>>>>>>>>>>>>>#c2 108 13513 6859 793 73
>>>>>>>>>>>>>#c3? 97 13526 6724 739 82
>>>>>>>>>>>>>#c4 101 13417 6574 761 62
>>>>>>>>>>>>>
>>>>>>>>>>>>>?print(dcf)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###??? 1??? 2??? 3?? 4? 5
>>>>>>>>>>>>>#c1 10 4576 2100 199 17
>>>>>>>>>>>>>#c2? 7 7831 4039 314 23
>>>>>>>>>>>>>#c3 16 7887 4087 286 22
>>>>>>>>>>>>>#c4 20 7824 4045 311 20
>>>>>>>>>>>>>
>>>>>>>>>>>>>?for (i in 1:length(listaT))
{
>>>>>>>>>>>>>
>>>>>>>>>>>>>??#Relative freq of all data
>>>>>>>>>>>>>??dct<-listaT[[i]]
>>>>>>>>>>>>>??dct<-table(factor(dct$z,
levels=cab))
>>>>>>>>>>>>>??dti<- rbind(dti, dct)
>>>>>>>>>>>>>??rownames(dti)<-rownames(1:(nrow(dti)),
do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>??#Relative freq of data
with FDR<0.01
>>>>>>>>>>>>>??dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01,
TRUE, FALSE),]
>>>>>>>>>>>>>??dct1<-table(factor(dct1$z,
levels=cab))
>>>>>>>>>>>>>??dtf<- rbind(dtf,dct1)
>>>>>>>>>>>>>??rownames(dtf)<-rownames(1:(nrow(dtf)),
do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>>>??????? }
>>>>>>>>>>>>>
>>>>>>>>>>>>>?print(dti)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###???? 1???? 2??? 3?? 4? 5
>>>>>>>>>>>>>#t1? 32? 8640 4098 429 36
>>>>>>>>>>>>>#t2 128 13209 6723 788 75
>>>>>>>>>>>>>#t3? 85 13043 6691 754 82
>>>>>>>>>>>>>#t4 139 13750 7036 807 84
>>>>>>>>>>>>>
>>>>>>>>>>>>>?print(dtf)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>####??? 1??? 2??? 3?? 4? 5
>>>>>>>>>>>>>#t1? 5 4885 2571 196? 8
>>>>>>>>>>>>>#t2 12 7752 4209 360 28
>>>>>>>>>>>>>#t3 19 7563 4086 336 18
>>>>>>>>>>>>>#t4 14 8108 4218 312 26
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>??freq.i<-rbind(dci,dti)
>>>>>>>>>>>>>??freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>>>??freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>>>??freq.rel.f<-freq.f/apply(freq.f,1,sum)?
>>>>>>>>>>>>>?print(freq.i)
>>>>>>>>>>>>>##???? 1???? 2??? 3?? 4? 5
>>>>>>>>>>>>>#c1? 93? 8356 3621 450 55
>>>>>>>>>>>>>#c2 108 13513 6859 793 73
>>>>>>>>>>>>>#c3? 97 13526 6724 739 82
>>>>>>>>>>>>>#c4 101 13417 6574 761 62
>>>>>>>>>>>>>#t1? 32? 8640 4098 429 36
>>>>>>>>>>>>>#t2 128 13209 6723 788 75
>>>>>>>>>>>>>#t3? 85 13043 6691 754 82
>>>>>>>>>>>>>#t4 139 13750 7036 807 84
>>>>>>>>>>>>>
>>>>>>>>>>>>>?print(freq.f)
>>>>>>>>>>>>>??###? 1??? 2??? 3?? 4? 5
>>>>>>>>>>>>>#c1 10 4576 2100 199 17
>>>>>>>>>>>>>#c2? 7 7831 4039 314 23
>>>>>>>>>>>>>#c3 16 7887 4087 286 22
>>>>>>>>>>>>>#c4 20 7824 4045 311 20
>>>>>>>>>>>>>#t1? 5 4885 2571 196? 8
>>>>>>>>>>>>>#t2 12 7752 4209 360 28
>>>>>>>>>>>>>#t3 19 7563 4086 336 18
>>>>>>>>>>>>>#t4 14 8108 4218 312 26
>>>>>>>>>>>>>
>>>>>>>>>>>>>?print(freq.rel.i)
>>>>>>>>>>>>>###???????????? 1????????
2???????? 3????????? 4?????????? 5
>>>>>>>>>>>>>#c1 0.007395626 0.6644930
0.2879523 0.03578529 0.004373757
>>>>>>>>>>>>>#c2 0.005059496 0.6330460
0.3213248 0.03714982 0.003419844
>>>>>>>>>>>>>#c3 0.004582389 0.6389834
0.3176493 0.03491119 0.003873772
>>>>>>>>>>>>>#c4 0.004829070 0.6415013
0.3143199 0.03638537 0.002964380
>>>>>>>>>>>>>#t1 0.002417832 0.6528145
0.3096335 0.03241405 0.002720060
>>>>>>>>>>>>>#t2 0.006117670 0.6313148
0.3213210 0.03766190 0.003584572
>>>>>>>>>>>>>#t3 0.004115226 0.6314694
0.3239409 0.03650448 0.003969983
>>>>>>>>>>>>>#t4 0.006371470 0.6302714
0.3225156 0.03699120 0.003850385
>>>>>>>>>>>>>?print(freq.rel.f)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###????????????? 1????????
2???????? 3????????? 4?????????? 5
>>>>>>>>>>>>>#c1 0.0014488554 0.6629962
0.3042596 0.02883222 0.002463054
>>>>>>>>>>>>>#c2 0.0005731128 0.6411495
0.3306861 0.02570820 0.001883085
>>>>>>>>>>>>>#c3 0.0013010246 0.6413238
0.3323305 0.02325581 0.001788909
>>>>>>>>>>>>>#c4 0.0016366612 0.6402619
0.3310147 0.02545008 0.001636661
>>>>>>>>>>>>>#t1 0.0006523157 0.6373125
0.3354207 0.02557078 0.001043705
>>>>>>>>>>>>>#t2 0.0009707952 0.6271337
0.3405064 0.02912386 0.002265189
>>>>>>>>>>>>>#t3 0.0015804359 0.6290967
0.3398769 0.02794876 0.001497255
>>>>>>>>>>>>>#t4 0.0011042751 0.6395330
0.3327023 0.02460956 0.002050797
>>>>>>>>>>>>>
>>>>>>>>>>>>>#Graph plot
>>>>>>>>>>>>>colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>>>>>>>par(mfrow=c(1,2))
>>>>>>>>>>>>>barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>>>>>>>barplot(freq.rel.f,beside=T,main=("Sample
with FDR<0.01"),xlab="Charge",ylab="Relative
Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>>>>>>>
>>>>>>>>>>>>>#average of the group
(except c1&t1)
>>>>>>>>>>>>>freqs<-rbind(dcf[-1,],
dtf[-1,])
>>>>>>>>>>>>>average<-apply(freqs,2,mean)
>>>>>>>>>>>>>print(average)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###???????? 1?????????
2????????? 3????????? 4????????? 5
>>>>>>>>>>>>>?# 14.66667 7827.50000
4114.00000? 319.83333?? 22.83333
>>>>>>>>>>>>>
>>>>>>>>>>>>>#chisquare test function
>>>>>>>>>>>>>chisq.test<-function(x,y){
>>>>>>>>>>>>>?somax<-sum(x)
>>>>>>>>>>>>>?somay<-sum(y)
>>>>>>>>>>>>>?nj.<-x+y
>>>>>>>>>>>>>?nj<-sum(nj.)
>>>>>>>>>>>>>?ejx<-(nj./nj)*somax
>>>>>>>>>>>>>?ejy<-(nj./nj)*somay
>>>>>>>>>>>>>?ETx<-((x-ejx)^2)/ejx
>>>>>>>>>>>>>?ETy<-((y-ejy)^2)/ejy
>>>>>>>>>>>>>?ETobs<-sum(ETx)+sum(ETy)
>>>>>>>>>>>>>?pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>>>>>>>?return(pvalue)
>>>>>>>>>>>>>?}
>>>>>>>>>>>>>
>>>>>>>>>>>>>#pvalues of the chisquare
test between sample and average (H0: two samples has the same distribution)
>>>>>>>>>>>>>pvalues<-c()
>>>>>>>>>>>>>for (i in 1:(nrow(freqs))){
>>>>>>>>>>>>>a<-chisq.test(freqs[i,],average)
>>>>>>>>>>>>>pvalues<-c(pvalues,a)
>>>>>>>>>>>>>}
>>>>>>>>>>>>>print(pvalues)
>>>>>>>>>>>>>##[1] 0.5307206 0.6849480
0.8332661 0.3474956 0.5546527 0.9387602
>>>>>>>>>>>>>
>>>>>>>>>>>>>#data frame with final
p-values
>>>>>>>>>>>>>dataframe<-data.frame(c(rownames(freqs)),
c(pvalues))
>>>>>>>>>>>>>colnames(dataframe)<-c("sample
name","pvalue")
>>>>>>>>>>>>>print(dataframe)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###? sample name??? pvalue
>>>>>>>>>>>>>#1????????? c2 0.5307206
>>>>>>>>>>>>>#2????????? c3 0.6849480
>>>>>>>>>>>>>#3????????? c4 0.8332661
>>>>>>>>>>>>>#4????????? t2 0.3474956
>>>>>>>>>>>>>#5????????? t3 0.5546527
>>>>>>>>>>>>>#6????????? t4 0.9387602
>>>>>>>>>>>>>}
>>>>>>>>>>>>>z.plot("C:/Users/Vera
Costa/Desktop/dados",23)
>>>>>>>>>>>>>
>>>>>>>>>>>>>###and two barplots...
>>>>>>>>>>>>>
>>>>>>>>>>>>>Thank you
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>2013/2/19 arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>
>>>>>>>>>>>>>Got it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>So, if I run the code that you sent yesterday, will I get the correct results for relative frequency etc.? It would also be great if you could send me the output generated using your code (on two groups, as you showed yesterday). It will help me check the results much faster than running your code to see whether that is the result (because I have to make some adjustments to your code to run it on Linux, especially the dir() call).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>I may be able to run it only later.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>Sent: Tuesday, February
19, 2013 8:53 AM
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>Subject: Re: reading
data
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>I sent in second email.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>But I send again.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>2013/2/19 arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Your attachment didn't come through.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>Sent: Tuesday,
February 19, 2013 8:47 AM
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Subject: Re: reading
data
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Sorry about all the questions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>I attach a small part of my real data (I have a lot of rows).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>My main objective is to construct two graphs. The first with the relative frequencies of each group (c1,c2,c3....). The second with the same frequencies but with FDR<0.01.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>After that I need to compute the average in each group (but without the first group: c1,t1,a1....) and do the chi-square test to see if the groups have the same distribution. Do you understand?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>At first, I had only two groups, and I wrote the code that I sent you. But I need general code, not for two groups whose names I know, but for all groups (sometimes I can have 7 or 8 or 9 groups).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Is my explanation better now? :-)
>>>>>>>>>>>>>>>My English also isn't very good :-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Please do not publish this data in the forum...
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Thank you
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>2013/2/18 arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>I ran the code to understand what was going on.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>I didn't fully understand it, as you wrote the code for your original dataset and not for the 'data' directory you sent to me.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>From: Vera Costa
<veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>To: arun
<smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>Sent: Monday,
February 18, 2013 4:02 PM
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Subject: Re:
reading data
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>>>>>I don't need the same, but something equivalent. I will try your suggestions.
>>>>>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>>>>>On 18 Feb 2013 at 19:41, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>I am not able to open your graph.  I am using Linux.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Also, the code in the function is not reproducible:
>>>>>>>>>>>>>>>>> directT <- direct[grepl("^t", direct)]
>>>>>>>>>>>>>>>>> directC <- direct[grepl("^c", direct)]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>It takes double the time to work out what is going on.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>dir()
>>>>>>>>>>>>>>>>>#[1]
"a1" "a2" "a3" "b1" "b2"
"c1"
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>direct<-
list.files(recursive=TRUE)[grepl("^a|^b",dir())]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>?direct
>>>>>>>>>>>>>>>>>#[1]
"MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
"MSMS_23PepInfo.txt"
>>>>>>>>>>>>>>>>>#[4]
"MSMS_23PepInfo.txt" "MSMS_23PepInfo.txt"
>>>>>>>>>>>>>>>>>directA<-
list.files(recursive=TRUE)[grepl("^a",dir())]
>>>>>>>>>>>>>>>>>directB<-
list.files(recursive=TRUE)[grepl("^b",dir())]
>>>>>>>>>>>>>>>>>lista<-
lapply(direct,function(x)
read.table(x,header=TRUE,stringsAsFactors=FALSE,sep="\t",fill=TRUE))
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>listaA<-lapply(directA,
function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>>>>>>>listaB<-lapply(directB,
function(x) read.table(x,header=TRUE, sep = "\t",fill=TRUE))
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#here I am
changing the names listaT, z, etc..
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> #count different mm values
>>>>>>>>>>>>>>>>>?cab <-
vector()
>>>>>>>>>>>>>>>>>??? for (i
in 1:length(lista)) {
>>>>>>>>>>>>>>>>>????????
dc<-lista[[i]][ifelse(lista[[i]]$b<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>>>>>???????
dc<-table(dc$mm)
>>>>>>>>>>>>>>>>>??????? cab
<- c(cab, names(dc))
>>>>>>>>>>>>>>>>>? }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>?#Relative
freqs to construct the graph
>>>>>>>>>>>>>>>>>??? cab
<- unique(cab)
>>>>>>>>>>>>>>>>>??? d <-
matrix(ncol=length(cab))
>>>>>>>>>>>>>>>>>?dci<-
d[-1,]
>>>>>>>>>>>>>>>>>??? dcf
<- d[-1,]
>>>>>>>>>>>>>>>>>?dti <-
d[-1,]
>>>>>>>>>>>>>>>>>?dtf <-
d[-1,]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>########################################
>>>>>>>>>>>>>>>>>for (i in 1:length(listaA)) {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #Relative freq of all data
>>>>>>>>>>>>>>>>>  dcc<-listaA[[i]]
>>>>>>>>>>>>>>>>>  dcc<-table(factor(dcc$mm, levels=cab))
>>>>>>>>>>>>>>>>>  dci<- rbind(dci, dcc)
>>>>>>>>>>>>>>>>>  rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "a")
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>>>>>>>>>  dcc1<-listaA[[i]][ifelse(listaA[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>>>>>  dcc1<-table(factor(dcc1$mm, levels=cab))
>>>>>>>>>>>>>>>>>  dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>>>>>>>  rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "a")
>>>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>for (i in 1:length(listaB)) {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #Relative freq of all data
>>>>>>>>>>>>>>>>>  dct<-listaB[[i]]
>>>>>>>>>>>>>>>>>  dct<-table(factor(dct$mm, levels=cab))
>>>>>>>>>>>>>>>>>  dti<- rbind(dti, dct)
>>>>>>>>>>>>>>>>>  rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "b")
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #Relative freq of data with FDR<0.01
>>>>>>>>>>>>>>>>>  dct1<-listaB[[i]][ifelse(listaB[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>>>>>  dct1<-table(factor(dct1$mm, levels=cab))
>>>>>>>>>>>>>>>>>  dtf<- rbind(dtf,dct1)
>>>>>>>>>>>>>>>>>  rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "b")
>>>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>>>freq.i<-rbind(dci,dti)
>>>>>>>>>>>>>>>>>freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>>>>>>>freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>>>>>>>freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>freq.i
>>>>>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>>>>>#a1 4 1
>>>>>>>>>>>>>>>>>#a2 4 1
>>>>>>>>>>>>>>>>>#a3 4 1
>>>>>>>>>>>>>>>>>#b1 4 1
>>>>>>>>>>>>>>>>>#b2 4 1
>>>>>>>>>>>>>>>>>#b3 4 1
>>>>>>>>>>>>>>>>>#b4 4 1
>>>>>>>>>>>>>>>>>#result from my code.
>>>>>>>>>>>>>>>>>files<-paste("MSMS_",23,"PepInfo.txt",sep="")
>>>>>>>>>>>>>>>>>read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>>>>>>>lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>>>>>>>names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="")
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>res2<-split(lista,names(lista))
>>>>>>>>>>>>>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>>>>res4<-lapply(seq_along(res3),function(i) do.call(rbind,lapply(res3[[i]], function(x) table(x$mm[x[["b"]]<0.01]))))
>>>>>>>>>>>>>>>>>names(res4)<- names(res2)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>res4
>>>>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>>>>>#a1 3 1
>>>>>>>>>>>>>>>>>#a2 3 1
>>>>>>>>>>>>>>>>>#a3 3 1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>>>>>#b1 3 1
>>>>>>>>>>>>>>>>>#b2 3 1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>>>>>>#   2 3
>>>>>>>>>>>>>>>>>#c1 3 1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>There is a difference between the output of freq.i and that of res4. There were only two files under `group_b`, yet freq.i shows four b rows. So, check your code.
>>>>>>>>>>>>>>>>>A.K.
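
One possible source of the extra b rows (an assumption; the thread does not confirm it) is that `grepl("^b", dir())` has one element per sub-directory while `list.files(recursive=TRUE)` has one element per file, so the logical index is recycled and can pick files from other folders. A minimal sketch that selects by the folder part of each path instead; the temporary folder layout and the helper name `files_in_group` are made up for illustration:

```r
# Throw-away layout: a1, a2, b1, each holding the target file plus one extra file
root <- tempfile("data")
for (d in c("a1", "a2", "b1")) {
  dir.create(file.path(root, d), recursive = TRUE)
  writeLines("mm\tb\n2\t0.001", file.path(root, d, "MSMS_23PepInfo.txt"))
  writeLines("x", file.path(root, d, paste0(d, ".txt")))
}

# Hypothetical helper: relative paths of `target` whose parent folder starts with `prefix`
files_in_group <- function(root, target, prefix) {
  paths <- list.files(root, recursive = TRUE)
  paths <- paths[basename(paths) == target]
  paths[startsWith(dirname(paths), prefix)]
}

directA <- files_in_group(root, "MSMS_23PepInfo.txt", "a")
directB <- files_in_group(root, "MSMS_23PepInfo.txt", "b")
```

Because the filter is computed from the same vector it subsets, the index always has the right length, whatever other files sit in the folders.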
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>Sent: Monday, February 18, 2013 10:27 AM
>>>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Hi!!!
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>I'm coming to ask a new question.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>I want a function to do my statistics. I started with what you had sent me:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>z.plot <- function(directory,number) {
>>>>>>>>>>>>>>>>>  setwd(directory)
>>>>>>>>>>>>>>>>>  indx<-gsub("[./]","",list.dirs())
>>>>>>>>>>>>>>>>>  indx1<- indx[indx!=""]
>>>>>>>>>>>>>>>>>  print(indx1)
>>>>>>>>>>>>>>>>>  files<-paste("MSMS_",number,"PepInfo.txt",sep="")
>>>>>>>>>>>>>>>>>  read.data<-function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,sep = "\t",stringsAsFactors=FALSE,fill=TRUE))}
>>>>>>>>>>>>>>>>>  lista<-do.call("c",lapply(list.files(recursive=T)[grep(files,list.files(recursive=T))],read.data))
>>>>>>>>>>>>>>>>>  print(lista)
>>>>>>>>>>>>>>>>>  #names(lista)<-paste("group_",gsub("\\d+","",names(lista)),sep="") ve = TRUE)
>>>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>>>z.plot("C:/Users/Vera Costa/Desktop/dados.lixo",23)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>In my lista I can't merge rows to get the groups, because the idea is, for each file, to count the frequencies of mm when b<0.01. After that I want a graph like the one attached.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>When I had 2 groups and knew the names of the groups, I wrote the code below (but now I have more groups and, maybe, I don't know the names of the groups):
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>z.plot <- function(directory,number) {
>>>>>>>>>>>>>>>>>  #reading data
>>>>>>>>>>>>>>>>>  setwd(directory)
>>>>>>>>>>>>>>>>>  direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
>>>>>>>>>>>>>>>>>  directT <- direct[grepl("^t", direct)]
>>>>>>>>>>>>>>>>>  directC <- direct[grepl("^c", direct)]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>>>>>  listaC<-lapply(directC, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>>>>>  listaT<-lapply(directT, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #count different z values
>>>>>>>>>>>>>>>>>  cab <- vector()
>>>>>>>>>>>>>>>>>  for (i in 1:length(lista)) {
>>>>>>>>>>>>>>>>>    dc<-lista[[i]][ifelse(lista[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>>>>>    dc<-table(dc$z)
>>>>>>>>>>>>>>>>>    cab <- c(cab, names(dc))
>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #Relative freqs to construct the graph
>>>>>>>>>>>>>>>>>  cab <- unique(cab)
>>>>>>>>>>>>>>>>>  d <- matrix(ncol=length(cab))
>>>>>>>>>>>>>>>>>  dci<- d[-1,]
>>>>>>>>>>>>>>>>>  dcf <- d[-1,]
>>>>>>>>>>>>>>>>>  dti <- d[-1,]
>>>>>>>>>>>>>>>>>  dtf <- d[-1,]
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  for (i in 1:length(listaC)) {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    #Relative freq of all data
>>>>>>>>>>>>>>>>>    dcc<-listaC[[i]]
>>>>>>>>>>>>>>>>>    dcc<-table(factor(dcc$z, levels=cab))
>>>>>>>>>>>>>>>>>    dci<- rbind(dci, dcc)
>>>>>>>>>>>>>>>>>    rownames(dci)<-rownames(1:(nrow(dci)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    #Relative freq of data with FDR<0.01
>>>>>>>>>>>>>>>>>    dcc1<-listaC[[i]][ifelse(listaC[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>>>>>    dcc1<-table(factor(dcc1$z, levels=cab))
>>>>>>>>>>>>>>>>>    dcf<- rbind(dcf,dcc1)
>>>>>>>>>>>>>>>>>    rownames(dcf)<-rownames(1:(nrow(dcf)), do.NULL = FALSE, prefix = "c")
>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  for (i in 1:length(listaT)) {
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    #Relative freq of all data
>>>>>>>>>>>>>>>>>    dct<-listaT[[i]]
>>>>>>>>>>>>>>>>>    dct<-table(factor(dct$z, levels=cab))
>>>>>>>>>>>>>>>>>    dti<- rbind(dti, dct)
>>>>>>>>>>>>>>>>>    rownames(dti)<-rownames(1:(nrow(dti)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>    #Relative freq of data with FDR<0.01
>>>>>>>>>>>>>>>>>    dct1<-listaT[[i]][ifelse(listaT[[i]]$FDR<0.01, TRUE, FALSE),]
>>>>>>>>>>>>>>>>>    dct1<-table(factor(dct1$z, levels=cab))
>>>>>>>>>>>>>>>>>    dtf<- rbind(dtf,dct1)
>>>>>>>>>>>>>>>>>    rownames(dtf)<-rownames(1:(nrow(dtf)), do.NULL = FALSE, prefix = "t")
>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>  freq.i<-rbind(dci,dti)
>>>>>>>>>>>>>>>>>  freq.f<-rbind(dcf,dtf)
>>>>>>>>>>>>>>>>>  freq.rel.i<-freq.i/apply(freq.i,1,sum)
>>>>>>>>>>>>>>>>>  freq.rel.f<-freq.f/apply(freq.f,1,sum)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #Graph plot
>>>>>>>>>>>>>>>>>  colour<-sample(rainbow(nrow(freq.rel.i)))
>>>>>>>>>>>>>>>>>  par(mfrow=c(1,2))
>>>>>>>>>>>>>>>>>  barplot(freq.rel.i,beside=T,main=("Sample"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.i))
>>>>>>>>>>>>>>>>>  barplot(freq.rel.f,beside=T,main=("Sample with FDR<0.01"),xlab="Charge",ylab="Relative Frequencies",col=colour,legend.text = rownames(freq.rel.f))
>>>>>>>>>>>>>>>>>  #average of the group (except c1&t1)
>>>>>>>>>>>>>>>>>  freqs<-rbind(dcf[-1,], dtf[-1,])
>>>>>>>>>>>>>>>>>  average<-apply(freqs,2,mean)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #chisquare test function
>>>>>>>>>>>>>>>>>  chisq.test<-function(x,y){
>>>>>>>>>>>>>>>>>    somax<-sum(x)
>>>>>>>>>>>>>>>>>    somay<-sum(y)
>>>>>>>>>>>>>>>>>    nj.<-x+y
>>>>>>>>>>>>>>>>>    nj<-sum(nj.)
>>>>>>>>>>>>>>>>>    ejx<-(nj./nj)*somax
>>>>>>>>>>>>>>>>>    ejy<-(nj./nj)*somay
>>>>>>>>>>>>>>>>>    ETx<-((x-ejx)^2)/ejx
>>>>>>>>>>>>>>>>>    ETy<-((y-ejy)^2)/ejy
>>>>>>>>>>>>>>>>>    ETobs<-sum(ETx)+sum(ETy)
>>>>>>>>>>>>>>>>>    pvalue<-1-pchisq(c(ETobs),df=length(x|y)-1,lower.tail=TRUE)
>>>>>>>>>>>>>>>>>    return(pvalue)
>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>  #pvalues of the chisquare test between sample and average (H0: the two samples have the same distribution)
>>>>>>>>>>>>>>>>>  pvalues<-c()
>>>>>>>>>>>>>>>>>  for (i in 1:(nrow(freqs))){
>>>>>>>>>>>>>>>>>    a<-chisq.test(freqs[i,],average)
>>>>>>>>>>>>>>>>>    pvalues<-c(pvalues,a)
>>>>>>>>>>>>>>>>>  }
>>>>>>>>>>>>>>>>>  #data frame with final p-values
>>>>>>>>>>>>>>>>>  dataframe<-data.frame(c(rownames(freqs)), c(pvalues))
>>>>>>>>>>>>>>>>>  colnames(dataframe)<-c("sample name","pvalue")
>>>>>>>>>>>>>>>>>  print(dataframe)
>>>>>>>>>>>>>>>>>}
>>>>>>>>>>>>>>>>>z.plot("C:/Users/Vera/Desktop/data",23)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>Thank you again
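
The request above (count, per file, the frequencies of mm where b<0.01, without knowing the group letters in advance) could be sketched roughly as follows. This is only an illustration under stated assumptions: the toy `lista` stands in for the list of data frames read from the folders, and its values are invented.

```r
# Toy stand-in for `lista`: a named list of data frames, one per folder
lista <- list(
  a1 = data.frame(mm = c(2, 2, 3, 1), b = c(0.001, 0.0004, 0.0008, 0.2)),
  a2 = data.frame(mm = c(2, 3, 3, 2), b = c(0.001, 0.0004, 0.0008, 0.5)),
  b1 = data.frame(mm = c(1, 2, 2, 2), b = c(0.002, 0.0004, 0.0008, 0.0007))
)

# mm values that pass the filter in each file; NA in b is treated as "does not pass"
keep <- lapply(lista, function(d) d$mm[!is.na(d$b) & d$b < 0.01])

# Every mm value seen in at least one file, used as shared factor levels
cab <- sort(unique(unlist(keep)))

# One row of counts per file; fixed levels keep the columns aligned across files
freq.f <- do.call(rbind, lapply(keep, function(v) table(factor(v, levels = cab))))
freq.rel.f <- prop.table(freq.f, margin = 1)   # row-wise relative frequencies

# Group label = folder name with the digits removed, whatever the letters are
group <- gsub("\\d+", "", rownames(freq.f))
```

`freq.rel.f` has the same shape as the matrices passed to `barplot(..., beside=T)` in the function above, and `group` can be used to split or colour the rows by group.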
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>2013/2/17 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>HI Vera,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>No problem. I am cc:ing to r-help.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>Sent: Sunday, February 17, 2013 5:44 AM
>>>>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Hi. Thank you. It works now :-)
>>>>>>>>>>>>>>>>>>And yes, I use Windows.
>>>>>>>>>>>>>>>>>>Thank you very much.
>>>>>>>>>>>>>>>>>>On 17 Feb 2013 00:44, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>Hi Vera,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Have you tried the suggestion?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Are you using Windows?
>>>>>>>>>>>>>>>>>>>Thanks,
>>>>>>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>>Sent: Saturday, February 16, 2013 7:10 PM
>>>>>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Thank you.
>>>>>>>>>>>>>>>>>>>On mine, I get the error "'what' must be a character string or a function".
>>>>>>>>>>>>>>>>>>>I need to do the equivalent on my system.
>>>>>>>>>>>>>>>>>>>Thank you and sorry one more time.
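
One way to run into this kind of message (an assumption, since the thread never shows the workspace) is a data object named `c` shadowing `base::c` where `do.call` is evaluated. A small sketch of that situation, and of passing the function name as a string so that R searches for a function called "c"; `pieces` is a made-up stand-in for the list returned by `lapply`:

```r
pieces <- list(list(a1 = 1), list(a2 = 2))

c <- 1:3   # a data object named c now shadows base::c in this workspace

# do.call() receives the integer vector, not the function, and stops with an error
msg <- tryCatch(do.call(c, pieces), error = function(e) conditionMessage(e))

# A string is looked up as a function name, so the shadowing object is skipped
res <- do.call("c", pieces)

rm(c)      # remove the shadowing object again
```

The exact wording of the error differs between R versions, so `msg` is captured rather than quoted here.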
>>>>>>>>>>>>>>>>>>>On 16 Feb 2013 23:53, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>>>>You didn't mention what the error message was, or whether you are reading file names other than "mmmmm11kk.txt".
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>It is working on my system when I run it again.
>>>>>>>>>>>>>>>>>>>>c() combines values into a vector or list.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>sessionInfo()
>>>>>>>>>>>>>>>>>>>>R version 2.15.1 (2012-06-22)
>>>>>>>>>>>>>>>>>>>>Platform: x86_64-pc-linux-gnu (64-bit)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>locale:
>>>>>>>>>>>>>>>>>>>> [1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
>>>>>>>>>>>>>>>>>>>> [3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
>>>>>>>>>>>>>>>>>>>> [5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
>>>>>>>>>>>>>>>>>>>> [7] LC_PAPER=C                 LC_NAME=C
>>>>>>>>>>>>>>>>>>>> [9] LC_ADDRESS=C               LC_TELEPHONE=C
>>>>>>>>>>>>>>>>>>>>[11] LC_MEASUREMENT=en_CA.UTF-8 LC_IDENTIFICATION=C
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>attached base packages:
>>>>>>>>>>>>>>>>>>>>[1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>other attached packages:
>>>>>>>>>>>>>>>>>>>>[1] stringr_0.6.2  reshape2_1.2.2
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>loaded via a namespace (and not attached):
>>>>>>>>>>>>>>>>>>>>[1] plyr_1.8
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>#code
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))  #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>>>>>>>#result
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>res3
>>>>>>>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>>>>>>>#$group_a$a1
>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>#$group_a$a2
>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>#$group_a$a3
>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>>#$group_b$b1
>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>#$group_b$b2
>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>>>>>>>>>#$group_c$c1
>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>>>Sent: Saturday, February 16, 2013 6:32 PM
>>>>>>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>Sorry again... In:
>>>>>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("...
>>>>>>>>>>>>>>>>>>>>What is this c in do.call(c, ...)? When I put this line into R, I get an error.
>>>>>>>>>>>>>>>>>>>>Thank you
>>>>>>>>>>>>>>>>>>>>On 15 Feb 2013 18:11, "arun" <smartpink111 at yahoo.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>>>>>No problem.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>BTW, these questions are not stupid.
>>>>>>>>>>>>>>>>>>>>>Arun
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>>>>Sent: Friday, February 15, 2013 1:08 PM
>>>>>>>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>Thank you very much.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>I will try to apply it, and afterwards I will tell you if it is ok :-)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>Thank you, and sorry about these questions (sometimes stupid questions).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>HI,
>>>>>>>>>>>>>>>>>>>>>>No problem.
>>>>>>>>>>>>>>>>>>>>>>c() concatenates its arguments into a vector or a list.
>>>>>>>>>>>>>>>>>>>>>>If I use do.call(cbind,..) or do.call(rbind,...) instead:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>do.call(cbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>>>>>>>#   [,1]    [,2]    [,3]    [,4]    [,5]    [,6]
>>>>>>>>>>>>>>>>>>>>>>#a1 List,11 List,11 List,11 List,11 List,11 List,11
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>do.call(rbind,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))
>>>>>>>>>>>>>>>>>>>>>>#     a1
>>>>>>>>>>>>>>>>>>>>>>#[1,] List,11
>>>>>>>>>>>>>>>>>>>>>>#[2,] List,11
>>>>>>>>>>>>>>>>>>>>>>#[3,] List,11
>>>>>>>>>>>>>>>>>>>>>>#[4,] List,11
>>>>>>>>>>>>>>>>>>>>>>#[5,] List,11
>>>>>>>>>>>>>>>>>>>>>>#[6,] List,11
>>>>>>>>>>>>>>>>>>>>>>i.e., a list within a list
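
The effect of `do.call(c, ...)` on such a list of one-element lists can be seen on a small made-up example; `nested` below only mimics the shape of what the `lapply` call returns and is not the thread's data:

```r
# Each element mimics what the inner lapply() returns: a one-element named list
nested <- list(list(a1 = data.frame(mm = 1:2)),
               list(a2 = data.frame(mm = 3:4)))

# Same as c(nested[[1]], nested[[2]]): the inner lists are joined end to end
flat <- do.call(c, nested)

names(nested)   # NULL: the names sit one level down
names(flat)     # "a1" "a2"
```

After flattening, each data frame is reachable directly, e.g. `flat$a1`, which is what the later `split()` by name relies on.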
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>restrial<-lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))})
>>>>>>>>>>>>>>>>>>>>>>str(restrial)
>>>>>>>>>>>>>>>>>>>>>>#List of 6
>>>>>>>>>>>>>>>>>>>>>># $ :List of 1
>>>>>>>>>>>>>>>>>>>>>>#  ..$ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>>>>>>>>>>>>#  .. ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>>>>>>>>>>>>#  .. ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>>>>>>>>>>>>#  .. ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>>>>>>>>>>>>#  .. ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>>>>>>>>>>>>-----------------------------------------------------------------
>>>>>>>>>>>>>>>>>>>>>>str(res)
>>>>>>>>>>>>>>>>>>>>>>#List of 6
>>>>>>>>>>>>>>>>>>>>>># $ a1:'data.frame':    6 obs. of  11 variables:
>>>>>>>>>>>>>>>>>>>>>>#  ..$ Id: chr [1:6] "aAA" "aAAAA" "aA" "aAA" ...
>>>>>>>>>>>>>>>>>>>>>>#  ..$ M : chr [1:6] "1" "1" "2" "1" ...
>>>>>>>>>>>>>>>>>>>>>>#  ..$ mm: int [1:6] 2 2 1 2 3 2
>>>>>>>>>>>>>>>>>>>>>>#  ..$ x : int [1:6] 739 2263 1 1965 3660 1972
>>>>>>>>>>>>>>>>>>>>>>-----------------------------------------------------------------
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>You mentioned naming these "group_a", "group_b", etc.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>res3<- lapply(res2,function(x) {names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>>>>>>>>>>>>>>>>res3$group_a
>>>>>>>>>>>>>>>>>>>>>>#$a1
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>#$a2
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>#$a3
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>________________________________
>>>>>>>>>>>>>>>>>>>>>>From: Vera Costa <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>>>>>To: arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>>>>>Sent: Friday, February 15, 2013 12:39 PM
>>>>>>>>>>>>>>>>>>>>>>Subject: Re: reading data
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>Thank you very much, and sorry for my questions.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>But this code isn't grouping by letter, is it? I mean, a1, a2, a3 are the same group (the first letter gives me the name of the group).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>Another question: in do.call, you wrote do.call(c,.....). What is c?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>Sorry
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>2013/2/15 arun <smartpink111 at yahoo.com>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>HI,
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>Just to add:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>res<-do.call(c,lapply(list.files(recursive=T)[grep("mmmmm11kk",list.files(recursive=T))],function(x) {names(x)<-gsub("^(.*)\\/.*","\\1",x); lapply(x,function(y) read.table(y,header=TRUE,stringsAsFactors=FALSE,fill=TRUE))}))  #it seems like one of the rows of your file doesn't have 6 elements, so added fill=TRUE
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>names(res)<-paste("group_",gsub("\\d+","",names(res)),sep="")
>>>>>>>>>>>>>>>>>>>>>>>res[grep("group_b",names(res))]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>I am not sure how you want the grouped data to look. If you want something like this:
>>>>>>>>>>>>>>>>>>>>>>>res1<-do.call(rbind,res)
>>>>>>>>>>>>>>>>>>>>>>>res2<-lapply(split(res1,gsub("[.0-9]","",row.names(res1))),function(x) {row.names(x)<-1:nrow(x);x})
>>>>>>>>>>>>>>>>>>>>>>>res2
>>>>>>>>>>>>>>>>>>>>>>>#$group_a
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>#13   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#14 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#15    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#16   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#17  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#18    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>>>>>#      Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>>#1    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#2  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#3     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#4    aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#5   aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#6     AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>#7    aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#8  aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#9     aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#10   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#11  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#12    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#$group_c
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#or if you want it like this:
>>>>>>>>>>>>>>>>>>>>>>>res2<-split(res,names(res))
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>res2[["group_b"]]
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>#$group_b
>>>>>>>>>>>>>>>>>>>>>>>#     Id  M mm    x         b  u  k  j    y        p    v
>>>>>>>>>>>>>>>>>>>>>>>#1   aAA  1  2  739 0.1257000  2  2 AA    2     8867 8926
>>>>>>>>>>>>>>>>>>>>>>>#2 aAAAA  1  2 2263 0.0004000  2  2 AR    4     7640 8926
>>>>>>>>>>>>>>>>>>>>>>>#3    aA  2  1    1 0.0845435  2 AA  2 6790 734,1092   NA
>>>>>>>>>>>>>>>>>>>>>>>#4   aAA  1  2 1965 0.0007000  4  3 AR    2    11616 8926
>>>>>>>>>>>>>>>>>>>>>>>#5  aAAA  1  3 3660 0.0008600 18  3 AA    2    20392  496
>>>>>>>>>>>>>>>>>>>>>>>#6    AA na  2 1972 0.0007000 11  3 AR   25      509  734
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>Hope this helps.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>A.K.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>----- Original Message -----
>>>>>>>>>>>>>>>>>>>>>>>From: "veracosta.rt at gmail.com" <veracosta.rt at gmail.com>
>>>>>>>>>>>>>>>>>>>>>>>To: smartpink111 at yahoo.com
>>>>>>>>>>>>>>>>>>>>>>>Cc:
>>>>>>>>>>>>>>>>>>>>>>>Sent: Friday, February 15, 2013 9:15 AM
>>>>>>>>>>>>>>>>>>>>>>>Subject: reading data
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>Hi,
>>>>>>>>>>>>>>>>>>>>>>>I posted yesterday and you helped me. I have a little problem.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>First of all, I have never worked with regular expressions...
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>The code that you gave me is ok, but my files are inside the folders a1, a2, a3. I will try to explain better.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>I have one folder named "data". Inside this folder I have some other folders named "a1", "a2", "b1", "b2", ... and inside each of them I have some files. I want only the file "mmmmmm.txt" (every folder has one file with this name).
>>>>>>>>>>>>>>>>>>>>>>>The name of the folder gives me the name of the group, but I need to read the file inside. After that, I need "group_a", "group_b", ... because I need to work with this data grouped (and know the name of each group).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>Thank you.
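
The folder layout described in this first message can be handled roughly as below. This is a sketch under assumptions, not the code from the replies: the temporary folder tree and the two-column file contents are invented so that the example runs on its own, and the file name "mmmmmm.txt" is taken from the message.

```r
# Throw-away folder tree standing in for "data": a1, a2, b1, b2 each hold mmmmmm.txt
root <- tempfile("data")
for (d in c("a1", "a2", "b1", "b2")) {
  dir.create(file.path(root, d), recursive = TRUE)
  write.table(data.frame(Id = "aAA", mm = 2), file.path(root, d, "mmmmmm.txt"),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# Full paths of every mmmmmm.txt below root; the pattern matches the whole file name
paths <- list.files(root, pattern = "^mmmmmm\\.txt$", recursive = TRUE, full.names = TRUE)

# Read each file and name it after its folder (a1, a2, ...)
dat <- lapply(paths, read.table, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
names(dat) <- basename(dirname(paths))

# Group label: folder name without its digits, prefixed with "group_"
grouped <- split(dat, paste0("group_", gsub("\\d+", "", names(dat))))
```

`grouped$group_a` is then a list holding the data frames from a1 and a2, still named after their folders.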
