I can perhaps do this:

m = Reduce(function(x, y) merge(x, y, all = TRUE), list(s11, s22, s33, s44, s55))

but then the output for one SNP (just as an example) looks like this:

> head(m)
         rs            V1.1        V3.1     V4.1 V1.2 V3.2 V4.2            V1.3
6 rs1029829 ENSG00000154803 1.02519e-11 0.469402 <NA>   NA   NA ENSG00000141030
         V3.3     V4.3 V1.4 V3.4 V4.4 V1.5 V3.5 V4.5
6 3.06126e-28 0.726948 <NA>   NA   NA <NA>   NA   NA
...

How can I filter this output (m) to remove all rows that have NA in any of these columns: V1.1, V1.2, V1.3, V1.4, V1.5?

On Tue, Dec 3, 2019 at 1:48 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:

> The desired output would look like this (example given for just two genes;
> it should include all 5 from all 5 data frames):
>
> In this example, say only 5 rs are shared between those two genes;
> what is given after each rs# is the value from the V4 column for each gene.
>
> GENES       ENSG00000001629      ENSG00000127914
> rs1208998   -0.0337989326337439  -0.00106024397995199
> rs4729008   0.0630831868839983   0.00890783698397027
> rs11772754  0.181375539335959    0.0012636115921931
> rs10257459  0.0369962603988132   0.00509887844657462
> rs17164876  0.0307882763321834   -0.00188979524322732
>
> On Tue, Dec 3, 2019 at 1:40 PM Ana Marija <sokovic.anamarija at gmail.com>
> wrote:
>
>> Hello,
>>
>> I have 5 data frames (s11, s22, s33, s44, s55) that look like this:
>>
>> > head(s11)
>>              V1.1          rs        V3.1      V4.1
>> 1 ENSG00000154803  rs12940868 3.80175e-05 -0.519565
>> 2 ENSG00000154803   rs4383187 8.92772e-05 -0.367303
>> 3 ENSG00000154803   rs4404112 9.32402e-05 -0.366634
>> 4 ENSG00000154803   rs7214091 8.38003e-05  0.337576
>> 5 ENSG00000154803  rs35871790 9.67028e-05 -0.305755
>> 6 ENSG00000154803 rs112532541 1.08341e-04 -0.305493
>>
>> > head(s22)
>>                V1.2          rs        V3.2      V4.2
>> 602 ENSG00000264589  rs62065452 1.34475e-17 -0.695948
>> 603 ENSG00000264589 rs377004743 1.26272e-17 -0.695627
>> 630 ENSG00000264589   rs1724390 1.01129e-17 -0.693518
>> 643 ENSG00000264589 rs367637729 4.05726e-17 -0.682833
>> 653 ENSG00000264589 rs376183404 1.13177e-17 -0.697646
>> 673 ENSG00000264589 rs112327620 1.59840e-17 -0.707904
>>
>> Each one has one unique value in its respective V1 column.
>>
>> I am trying to merge all 5 data frames at once by the "rs" column.
>>
>> Can you please help with this,
>> Ana
>>
>> [[alternative HTML version deleted]]
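A note on the merge above: with all = TRUE every rs found in any data frame is kept, which is where the NA rows come from. The sketch below, on small invented data frames (the gene labels, rs IDs and values are made up, not taken from the post), compares that outer merge with the default inner merge, which keeps only the rs values present in every data frame. Whether the inner merge is what is wanted here is an assumption.

```r
# Toy stand-ins for s11, s22, s33 (invented values, for illustration only).
s11 <- data.frame(rs = c("rs1", "rs2", "rs3"), V1.1 = "geneA",
                  V4.1 = c(0.1, 0.2, 0.3), stringsAsFactors = FALSE)
s22 <- data.frame(rs = c("rs2", "rs3", "rs4"), V1.2 = "geneB",
                  V4.2 = c(-0.1, -0.2, -0.3), stringsAsFactors = FALSE)
s33 <- data.frame(rs = c("rs3", "rs4", "rs5"), V1.3 = "geneC",
                  V4.3 = c(1, 2, 3), stringsAsFactors = FALSE)
frames <- list(s11, s22, s33)

# Outer merge: every rs from any frame is kept, missing cells become NA.
m_outer <- Reduce(function(x, y) merge(x, y, by = "rs", all = TRUE), frames)

# Inner merge (merge's default): only rs present in all frames survive,
# so no NA is introduced by the merge itself.
m_inner <- Reduce(function(x, y) merge(x, y, by = "rs"), frames)

print(m_outer)
print(m_inner)
```

Naming by = "rs" explicitly guards against merge() silently joining on any other column name the data frames happen to share.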
Would this make sense for the previous question?

mt = na.omit(m, cols = c("V1.1", "V1.2", "V1.3", "V1.4", "V1.5"))
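One thing worth checking with that call: the base R na.omit method for data frames has no cols argument, so on a plain data.frame the argument is silently ignored and rows with NA in any column are dropped. (A cols argument does exist in the data.table method, which applies only if m is a data.table.) A minimal sketch on invented data, with key_cols as a hypothetical helper name:

```r
# Toy merged table (invented values): row 2 has a gene in V1.1 but NA in V3.1.
m <- data.frame(rs   = c("rs1", "rs2"),
                V1.1 = c("geneA", "geneA"),
                V3.1 = c(1e-11, NA),
                stringsAsFactors = FALSE)

# Base R ignores `cols` here, so row 2 is dropped because of V3.1.
mt_all <- na.omit(m, cols = "V1.1")
print(nrow(mt_all))

# Restricting the NA check to chosen columns has to be done explicitly.
key_cols <- "V1.1"
keep <- rowSums(is.na(m[, key_cols, drop = FALSE])) == 0
mt_key <- m[keep, ]
print(nrow(mt_key))
```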
I apologize, I need to reformulate this problem, because there will be many more unique genes to look up (381), so all genes are in one data frame:

> head(r)
               V1         V2          V3        V4
1 ENSG00000273172  rs7215271 4.33932e-17 -0.602316
2 ENSG00000273172 rs34889101 4.99518e-17 -0.596089
3 ENSG00000273172  rs4890177 4.23229e-17 -0.590085
4 ENSG00000273172  rs4890178 7.14216e-17 -0.581467
5 ENSG00000273172  rs7503363 3.16802e-17 -0.582836
6 ENSG00000273172 rs35611892 2.24399e-17 -0.583710

> tail(r)
                   V1          V2          V3        V4
18946 ENSG00000141560   rs7215271 8.53890e-17  0.572286
18947 ENSG00000141560    rs606532 9.00740e-17  0.572151
18963 ENSG00000175711 rs111566282 5.71871e-17 -0.609586
18964 ENSG00000175711  rs76319775 4.58843e-17 -0.610164
18965 ENSG00000175711  rs62074661 4.17490e-17 -0.603199
18966 ENSG00000176845  rs11433639 1.45496e-17 -0.761955

So for the above example, the merged result would contain just one row, because two genes (ENSG00000273172 and ENSG00000141560) have the same rs: rs7215271. The output would contain all columns related to those two genes that share rs7215271.

It is also possible that more than 2 genes share the same rs.

Can you please advise about this?
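For the single long data frame described above, one possible base R approach is to keep only the rs values that occur with more than one gene and then spread the V4 values into one column per gene. This is only a sketch under assumptions: it uses five of the rows shown in head(r) and tail(r), assumes each gene/rs pair occurs at most once, and the names genes_per_rs, shared_rs, shared and wide are hypothetical.

```r
# A few rows copied from the head(r)/tail(r) output in the post.
r <- data.frame(
  V1 = c("ENSG00000273172", "ENSG00000273172", "ENSG00000141560",
         "ENSG00000141560", "ENSG00000175711"),
  V2 = c("rs7215271", "rs34889101", "rs7215271", "rs606532", "rs111566282"),
  V3 = c(4.33932e-17, 4.99518e-17, 8.53890e-17, 9.00740e-17, 5.71871e-17),
  V4 = c(-0.602316, -0.596089, 0.572286, 0.572151, -0.609586),
  stringsAsFactors = FALSE
)

# Count the distinct genes seen for each rs.
genes_per_rs <- tapply(r$V1, r$V2, function(g) length(unique(g)))

# Keep only the rs values shared by more than one gene.
shared_rs <- names(genes_per_rs)[genes_per_rs > 1]
shared <- r[r$V2 %in% shared_rs, ]

# Wide layout: one row per rs, one column per gene, cells hold V4.
# Assumes at most one row per gene/rs pair; the first value is taken otherwise.
wide <- tapply(shared$V4, list(rs = shared$V2, gene = shared$V1),
               function(v) v[1])
print(wide)
```

With many genes, cells for genes that do not carry a given rs come out as NA in wide, so an rs shared by only 2 of the 381 genes gives a mostly empty row.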
On 12/3/19 12:16 PM, Ana Marija wrote:
> would this make sense for the previous:
> mt=na.omit(m, cols = c("V1.1","V1.2","V1.3","V1.4","V1.5"))
>
> On Tue, Dec 3, 2019 at 2:09 PM Ana Marija <sokovic.anamarija at gmail.com>
> wrote:
>
>> I can perhaps do this:
>>
>> m=Reduce(function(x, y) merge(x, y, all=TRUE), list(s11, s22, s33,s44,s55))
>>
>> but then in the output of this one SNP (just for example)
>>
>>> head(m)
>>          rs            V1.1        V3.1     V4.1 V1.2 V3.2 V4.2
>> V1.3
>> 6 rs1029829 ENSG00000154803 1.02519e-11 0.469402 <NA>   NA   NA
>> ENSG00000141030
>>          V3.3     V4.3 V1.4 V3.4 V4.4 V1.5 V3.5 V4.5
>> 6 3.06126e-28 0.726948 <NA>   NA   NA <NA>   NA   NA

It's a very simple matter when using gmail to adhere to the Posting Guide policy of plaintext submission to R-help. Failing to adhere to that rule is making your successive postings less and less readable.

>> ...
>>
>> but how to filter out this output (m) in order to remove all rows where I
>> have NA in any of these columns: V1.1,V1.2,V1.3,V1.4,V1.5

The complete.cases function returns a logical vector suitable for selecting a subset.

-- 
David.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
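The complete.cases suggestion can be sketched as follows. The data frame is invented toy data shaped like the merged output (only two of the five V1.x columns are included to keep it short), and key_cols is a hypothetical helper name.

```r
# Toy merged table (invented values) with NA in some gene columns.
m <- data.frame(
  rs   = c("rs1", "rs2", "rs3"),
  V1.1 = c("geneA", "geneA", NA),
  V4.1 = c(0.47, -0.52, NA),
  V1.2 = c(NA, "geneB", "geneB"),
  V4.2 = c(NA, -0.70, NA),  # row 3: NA here, but V1.2 itself is present
  stringsAsFactors = FALSE
)

# Columns that must all be non-missing for a row to be kept.
key_cols <- c("V1.1", "V1.2")

# complete.cases() returns TRUE for rows with no NA in the columns passed in.
keep <- complete.cases(m[, key_cols])

# Use the logical vector to subset the rows.
mt <- m[keep, ]
print(mt)
```

Passing only the chosen columns to complete.cases() limits the NA check to them; calling it on the whole of m would also drop rows with NA in any other column.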