Hello R-help group,

I have a question about merging lists. I have two lists:

Genes list (hSgenes)

    name             chr  strand  start  end    transStart  transEnd  symbol  description  feature
    ENSG00000223972  1    1       11874  14412  11874       14412             DEAD/H box polypeptide 11 like 1DEAD/H box polypeptide 11 like 3DEAD/H box polypeptide 11 like 9 ;; [Source:UniProtKB/TrEMBL;Acc:B7ZGX0]  gene
    ENSG00000227232  1    -1      14363  29570  17551       29343     WASH5P  WAS protein family homolog 5 pseudogene (WASH5P), non-coding RNA [Source:RefSeq DNA;Acc:NR_024540]  gene
    .....

Chers list (chersList)

    name        chr  start    end      cellType  antibody  features                                         maxLevel          score
    chr1.cher1  1    859132   859732   human     AB        ENSG00000223764 ENSG00000231958 ENSG00000187634  1.25736038968316  0.664381383074449
    chr1.cher2  1    889564   890464   human     AB        ENSG00000188976                                  1.47884233632064  2.88839131446868
    chr1.cher3  1    1106364  1106864  human     AB        ENSG00000162571                                  1.83795654418115  3.58404359147275
    ....

In the second list, I want to add a column with the gene description (obtained from the first list). I used the following method:

    chersMergeGenes <- data.frame(chersList,
        description = hSgenes$description[match(chersList$features, hSgenes$name)],
        symbol = hSgenes$symbol[match(chersList$features, hSgenes$name)])
    write.table(chersMergeGenes, row.names = FALSE, quote = FALSE, sep = "\t",
        file = "chersMergeGenes.txt")

It works only partially. When chersList$features contains more than one feature (e.g. ENSG00000223764 ENSG00000231958 ENSG00000187634), the lookup fails and the result is NA. I don't know how to split the features to obtain all the descriptions. Can someone give me a hint on how to do this?

Another problem: I have the following data:

    $ENSG00000000003
    [1] "GO:0043123" "GO:0004871"

    $ENSG00000000419
     [1] "GO:0018406" "GO:0035269" "GO:0006506" "GO:0019348" "GO:0005789"
     [6] "GO:0005624" "GO:0005783" "GO:0033185" "GO:0004582" "GO:0004169"
    [11] "GO:0005515"

    $ENSG00000000457
    [1] "GO:0005737" "GO:0030027" "GO:0005794" "GO:0005515"

I want to extract the list of names ($ENSG00000?????) whose GO terms include GO:0005515. How can I do it?

Thanks in advance,
Viviana

-- 
Dr. Viviana Menzel
Rottweg 34
35428 Langgöns
Tel.: +49 6403 7748550
Mobil: +49 177 5126092
E-Mail: vivianamenzel at gmx.de
Web: www.dres-menzel.de
?merge plyr data.table sqldf crantastic

"Dr. Viviana Menzel" <vivianamenzel at gmx.de> wrote in message news:4B58A0E9.3050806 at gmx.de...

In the second list, I want to add a column with the gene description (obtained from the first list). [...]
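
To make the ?merge pointer concrete, here is a minimal sketch in base R. It is only one possible approach, not something taken from the thread. The two data frames are toy stand-ins for hSgenes and chersList: the gene IDs come from the question, while the symbols and descriptions ("symA", "desc A", ...) are made-up placeholders. It assumes the IDs in chersList$features are separated by whitespace.

```r
# Toy stand-ins for the tables in the question. Symbols and descriptions
# are hypothetical placeholders, not real annotation.
hSgenes <- data.frame(
  name        = c("ENSG00000223764", "ENSG00000231958", "ENSG00000188976"),
  symbol      = c("symA", "symB", "symC"),
  description = c("desc A", "desc B", "desc C"),
  stringsAsFactors = FALSE
)
chersList <- data.frame(
  name     = c("chr1.cher1", "chr1.cher2", "chr1.cher3"),
  features = c("ENSG00000223764 ENSG00000231958 ENSG00000187634",
               "ENSG00000188976",
               "ENSG00000162571"),
  stringsAsFactors = FALSE
)

# Split every features string into a vector of gene IDs
# (assumes whitespace-separated IDs).
ids <- strsplit(as.character(chersList$features), "[[:space:]]+")

# Option 1: keep one row per cher. Look up each ID and paste the hits
# into one "; "-separated string. IDs missing from hSgenes show up as
# the text "NA" inside that string.
lookup <- function(id_list, column) {
  vapply(id_list, function(x) {
    paste(hSgenes[[column]][match(x, hSgenes$name)], collapse = "; ")
  }, character(1))
}
chersMergeGenes <- data.frame(
  chersList,
  description = lookup(ids, "description"),
  symbol      = lookup(ids, "symbol"),
  stringsAsFactors = FALSE
)

# Option 2: one row per (cher, gene) pair, then a left join with merge().
long <- data.frame(
  cher     = rep(chersList$name, lengths(ids)),
  features = unlist(ids),
  stringsAsFactors = FALSE
)
merged <- merge(long, hSgenes, by.x = "features", by.y = "name", all.x = TRUE)
```

Option 1 keeps the layout that write.table() was already producing. Option 2 is usually easier to filter or aggregate afterwards, because every gene sits on its own row.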
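
For the second problem (gene names whose GO terms include GO:0005515), a small base R sketch follows. The question does not give the list a name, so goList is a hypothetical name; its contents are copied from the printed output in the question.

```r
# goList is a hypothetical name for the named list shown in the question.
goList <- list(
  ENSG00000000003 = c("GO:0043123", "GO:0004871"),
  ENSG00000000419 = c("GO:0018406", "GO:0035269", "GO:0006506", "GO:0019348",
                      "GO:0005789", "GO:0005624", "GO:0005783", "GO:0033185",
                      "GO:0004582", "GO:0004169", "GO:0005515"),
  ENSG00000000457 = c("GO:0005737", "GO:0030027", "GO:0005794", "GO:0005515")
)

# For every gene, test whether the term of interest is among its GO terms.
hasTerm <- vapply(goList, function(terms) "GO:0005515" %in% terms, logical(1))

# Keep the names of the genes for which the test is TRUE.
genes <- names(goList)[hasTerm]
genes
# [1] "ENSG00000000419" "ENSG00000000457"
```

vapply() is used instead of sapply() so that the result is guaranteed to be a logical vector, even if the list is empty.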