thr3ads.net - R help - [R] Merging and extracting data from list [Jan 2010]

Dr. Viviana Menzel

2010-Jan-21 18:46 UTC

[R] Merging and extracting data from list

Hello R-help group,

I have a question about merging lists. I have two lists:

Genes list (hSgenes)
name    chr    strand    start    end    transStart    transEnd    symbol    description    feature
ENSG00000223972    1    1    11874    14412    11874    14412        DEAD/H box polypeptide 11 like 1DEAD/H box polypeptide 11 like 3DEAD/H box polypeptide 11 like 9 ;; [Source:UniProtKB/TrEMBL;Acc:B7ZGX0]    gene
ENSG00000227232    1    -1    14363    29570    17551    29343    WASH5P    WAS protein family homolog 5 pseudogene (WASH5P), non-coding RNA [Source:RefSeq DNA;Acc:NR_024540]    gene
.....

Chers list (chersList)
name    chr    start    end    cellType    antibody    features    maxLevel    score
chr1.cher1    1    859132    859732    human    AB    ENSG00000223764 ENSG00000231958 ENSG00000187634    1.25736038968316    0.664381383074449
chr1.cher2    1    889564    890464    human    AB    ENSG00000188976    1.47884233632064    2.88839131446868
chr1.cher3    1    1106364    1106864    human    AB    ENSG00000162571    1.83795654418115    3.58404359147275
....

In the second list, I want to add a column with the gene description 
(obtained from the first list). I used the following method:

chersMergeGenes <- data.frame(chersList,
    description = hSgenes$description[match(chersList$features, hSgenes$name)],
    symbol = hSgenes$symbol[match(chersList$features, hSgenes$name)])
write.table(chersMergeGenes, row.names = FALSE, quote = FALSE, sep = "\t",
    file = "chersMergeGenes.txt")


and it works only partially. When chersList$features contains more than
one feature (e.g. ENSG00000223764 ENSG00000231958 ENSG00000187634), the
lookup fails and the result is NA.
I don't know how to split the features to obtain all the descriptions.

Can someone give me a hint to do this?
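
One possible way to handle entries with several gene IDs, sketched here on toy data: split each features string on whitespace with strsplit(), look up every ID with match(), and paste the hits back into one string per row. The two small data frames are stand-ins for hSgenes and chersList with made-up symbols and descriptions, and the helper name lookup and the "; " separator are assumptions of this sketch, not part of the original post.

```r
# Toy stand-ins for the real tables (placeholder values, for illustration only)
hSgenes <- data.frame(
  name        = c("ENSG00000223764", "ENSG00000231958", "ENSG00000188976"),
  symbol      = c("symA", "symB", "symC"),
  description = c("desc A", "desc B", "desc C"),
  stringsAsFactors = FALSE
)
chersList <- data.frame(
  name     = c("chr1.cher1", "chr1.cher2"),
  features = c("ENSG00000223764 ENSG00000231958 ENSG00000187634",
               "ENSG00000188976"),
  stringsAsFactors = FALSE
)

# Split each features string on whitespace: one vector of gene IDs per row
ids <- strsplit(as.character(chersList$features), "[[:space:]]+")

# Hypothetical helper: for every row, look up the requested column for each
# gene ID and collapse the hits into one string. IDs that are not present in
# hSgenes$name are dropped instead of producing NA.
lookup <- function(column) {
  vapply(ids, function(x) {
    hits <- hSgenes[[column]][match(x, hSgenes$name)]
    paste(hits[!is.na(hits)], collapse = "; ")
  }, character(1))
}

chersMergeGenes <- data.frame(chersList,
                              description = lookup("description"),
                              symbol      = lookup("symbol"),
                              stringsAsFactors = FALSE)
print(chersMergeGenes)
```

This keeps one row per region, so the result can still be written out with write.table() as before. Whether unmatched IDs should be dropped or kept as NA is a design choice; here they are dropped.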


Another problem:

I have the following data:

$ENSG00000000003
[1] "GO:0043123" "GO:0004871"

$ENSG00000000419
 [1] "GO:0018406" "GO:0035269" "GO:0006506" "GO:0019348" "GO:0005789"
 [6] "GO:0005624" "GO:0005783" "GO:0033185" "GO:0004582" "GO:0004169"
[11] "GO:0005515"

$ENSG00000000457
[1] "GO:0005737" "GO:0030027" "GO:0005794" "GO:0005515"

I want to extract the names ($ENSG00000?????) of all elements whose GO
terms include GO:0005515. How can I do it?
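
A minimal sketch of one way to do this, assuming the data is an ordinary named list of character vectors as the printout suggests. The object name goList is an assumption (the post does not name the object); the values are copied from the printout above.

```r
# Named list mirroring the printed structure: gene ID -> vector of GO terms
goList <- list(
  ENSG00000000003 = c("GO:0043123", "GO:0004871"),
  ENSG00000000419 = c("GO:0018406", "GO:0035269", "GO:0006506", "GO:0019348",
                      "GO:0005789", "GO:0005624", "GO:0005783", "GO:0033185",
                      "GO:0004582", "GO:0004169", "GO:0005515"),
  ENSG00000000457 = c("GO:0005737", "GO:0030027", "GO:0005794", "GO:0005515")
)

# One logical per list element: TRUE if its GO vector contains the target term
hasTerm <- vapply(goList, function(terms) "GO:0005515" %in% terms, logical(1))

# Names of the elements that carry the term
genesWithTerm <- names(goList)[hasTerm]
print(genesWithTerm)
```

For the three genes shown, this selects ENSG00000000419 and ENSG00000000457.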

Thanks in advance

Viviana

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr. Viviana Menzel
Rottweg 34
35428 Langgöns
Tel.: +49 6403 7748550
Mobil: +49 177 5126092
E-Mail: vivianamenzel at gmx.de
Web: www.dres-menzel.de

Matthew Dowle

2010-Jan-22 09:09 UTC

[R] Merging and extracting data from list

?merge
plyr
data.table
sqldf
crantastic
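
To make the ?merge pointer concrete, here is a rough sketch of how base merge() could be applied once the features column is expanded to one row per region/gene pair. This is only one reading of the hint, not necessarily what was intended. The toy data frames and the column names cher and feature are assumptions for illustration; lengths() requires R 3.2.0 or later.

```r
# Toy stand-ins for the real tables (placeholder values, for illustration only)
hSgenes <- data.frame(
  name        = c("ENSG00000223764", "ENSG00000231958", "ENSG00000188976"),
  symbol      = c("symA", "symB", "symC"),
  description = c("desc A", "desc B", "desc C"),
  stringsAsFactors = FALSE
)
chersList <- data.frame(
  name     = c("chr1.cher1", "chr1.cher2"),
  features = c("ENSG00000223764 ENSG00000231958 ENSG00000187634",
               "ENSG00000188976"),
  stringsAsFactors = FALSE
)

# Expand to long format: one row per (region, gene ID) pair
ids  <- strsplit(chersList$features, "[[:space:]]+")
long <- data.frame(
  cher    = rep(chersList$name, lengths(ids)),
  feature = unlist(ids),
  stringsAsFactors = FALSE
)

# Left join onto the gene annotation; all.x = TRUE keeps gene IDs that have
# no entry in hSgenes, with NA in the symbol and description columns
merged <- merge(long, hSgenes[, c("name", "symbol", "description")],
                by.x = "feature", by.y = "name", all.x = TRUE)
print(merged)
```

Unlike a single match() on the unsplit string, the long format yields one annotated row per gene, which may or may not be the layout wanted for the final table.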

