Hi Jim,

I think one of the issues is that the data frames are so big:

> dim(l4)
[1] 166941635         8
> dim(asign)
[1] 107371528         5

so my example would not reproduce the error.

On Wed, Oct 23, 2019 at 6:05 PM Jim Lemon <drjimlemon at gmail.com> wrote:
>
> Hi Ana,
> When I run this example taken from your email:
>
> l4<-read.table(text="X1 X2 X3 X4 X5 variant_id pval_nominal gene_id.LCL
> chr1 13550 G A b38 1:13550:G:A 0.375614 ENSG00000227232
> chr1 14671 G C b38 1:14671:G:C 0.474708 ENSG00000227232
> chr1 14677 G A b38 1:14677:G:A 0.699887 ENSG00000227232
> chr1 16841 G T b38 1:16841:G:T 0.127895 ENSG00000227232
> chr1 16856 A G b38 1:16856:A:G 0.627822 ENSG00000227232
> chr1 17005 A G b38 1:17005:A:G 0.802803 ENSG00000227232",
> header=TRUE,stringsAsFactors=FALSE)
> asign<-read.table(text="gene chr chr_pos pos p.val.Retina
> ENSG00000227232 chr1 1:10177:A:AC 10177 0.381708
> ENSG00000227232 chr1 rs145072688:10352:T:TA 10352 0.959523
> ENSG00000227232 chr1 1:11008:C:G 11008 0.218132
> ENSG00000227232 chr1 1:11012:C:G 11012 0.218132
> ENSG00000227232 chr1 1:13110:G:A 13110 0.998262
> ENSG00000227232 chr1 rs201725126:13116:T:G 13116 0.438572",
> header=TRUE,stringsAsFactors=FALSE)
> merge(l4, asign, by.x=c("X1", "X2"), by.y=c("chr", "pos"))
>  [1] X1           X2           X3           X4           X5
>  [6] variant_id   pval_nominal gene_id.LCL  gene         chr_pos
> [11] p.val.Retina
> <0 rows> (or 0-length row.names)
>
> It works okay, but there are no matches in the join. So I can't even
> guess what the problem is.
>
> Jim
>
> On Thu, Oct 24, 2019 at 9:33 AM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> >
> > Hello,
> >
> > I have two data frames like this:
> >
> > > head(l4)
> >     X1    X2 X3 X4  X5  variant_id pval_nominal     gene_id.LCL
> > 1 chr1 13550  G  A b38 1:13550:G:A     0.375614 ENSG00000227232
> > 2 chr1 14671  G  C b38 1:14671:G:C     0.474708 ENSG00000227232
> > 3 chr1 14677  G  A b38 1:14677:G:A     0.699887 ENSG00000227232
> > 4 chr1 16841  G  T b38 1:16841:G:T     0.127895 ENSG00000227232
> > 5 chr1 16856  A  G b38 1:16856:A:G     0.627822 ENSG00000227232
> > 6 chr1 17005  A  G b38 1:17005:A:G     0.802803 ENSG00000227232
> > > head(asign)
> >               gene  chr                chr_pos   pos p.val.Retina
> > 1: ENSG00000227232 chr1           1:10177:A:AC 10177     0.381708
> > 2: ENSG00000227232 chr1 rs145072688:10352:T:TA 10352     0.959523
> > 3: ENSG00000227232 chr1            1:11008:C:G 11008     0.218132
> > 4: ENSG00000227232 chr1            1:11012:C:G 11012     0.218132
> > 5: ENSG00000227232 chr1            1:13110:G:A 13110     0.998262
> > 6: ENSG00000227232 chr1  rs201725126:13116:T:G 13116     0.438572
> > > m = merge(l4, asign, by.x=c("X1", "X2"), by.y=c("chr", "pos"))
> > Error in merge.data.frame(l4, asign, by.x = c("X1", "X2"), by.y = c("chr", :
> >   negative length vectors are not allowed
> > > sapply(l4,class)
> >           X1           X2           X3           X4           X5   variant_id
> >  "character"  "character"  "character"  "character"  "character"  "character"
> > pval_nominal  gene_id.LCL
> >    "numeric"  "character"
> > > sapply(asign,class)
> >         gene          chr      chr_pos          pos p.val.Retina
> >  "character"  "character"  "character"  "character"  "character"
> >
> > Please advise as to why I am getting this error when merging?
> >
> > Thanks
> > Ana
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
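A note on the error above: `negative length vectors are not allowed` is often reported when a join would produce more rows than `merge.data.frame` can index, for example when many rows share the same key on both sides. The thread does not confirm that this is the cause here, so treat it as an assumption. The following is a minimal base R sketch for estimating how many rows an inner join would return before running it. The two small tables are made up for illustration and only reuse the column names from the post.

```r
# Toy stand-ins for the real tables; the values are made up for illustration.
l4 <- data.frame(X1 = c("chr1", "chr1", "chr1"),
                 X2 = c("100", "100", "200"),
                 stringsAsFactors = FALSE)
asign <- data.frame(chr = c("chr1", "chr1", "chr1"),
                    pos = c("100", "100", "300"),
                    stringsAsFactors = FALSE)

# Build one composite key per row on each side.
key_x <- paste(l4$X1, l4$X2, sep = ":")
key_y <- paste(asign$chr, asign$pos, sep = ":")

# Count how often each key occurs on each side.
n_x <- table(key_x)
n_y <- table(key_y)

# An inner join yields (count in x) * (count in y) rows for every shared key.
shared <- intersect(names(n_x), names(n_y))
expected_rows <- sum(as.numeric(n_x[shared]) * as.numeric(n_y[shared]))

# Compare the estimate with the actual merge on the toy data.
m <- merge(l4, asign, by.x = c("X1", "X2"), by.y = c("chr", "pos"))
print(c(expected = expected_rows, actual = nrow(m)))
```

If the estimate on the real data comes out far above the row counts of either table, duplicated (chr, pos) pairs are multiplying the result, and deduplicating or joining on a more specific key may be worth considering. Note that `paste()` on more than 100 million rows is itself memory-hungry, so this is only a sketch of the idea.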
Yes. Have you tried the bigmemory package?

Jim

On Thu, Oct 24, 2019 at 10:08 AM Ana Marija <sokovic.anamarija at gmail.com> wrote:
>
> Hi Jim,
>
> I think one of the issues is that the data frames are so big:
> > dim(l4)
> [1] 166941635         8
> > dim(asign)
> [1] 107371528         5
>
> so my example would not reproduce the error.
No, can you please send me an example of how the command would look in my case?

On Wed, Oct 23, 2019 at 6:16 PM Jim Lemon <drjimlemon at gmail.com> wrote:
>
> Yes. Have you tried the bigmemory package?
>
> Jim
Hello,

Sometimes sqldf::sqldf tends to save memory. Maybe if you try

library(sqldf)

sqldf('select l4.*, asign.gene, asign.chr_pos, asign.`p.val.Retina`
         from l4
         inner join asign
         on X1 = asign.chr and X2 = asign.pos')

Or you can filter the rows that match first, then merge the results.
Something along the lines of

# read in only the columns needed with fread, it's fast
l4join <- data.table::fread(l4_file, select = c("X1", "X2"))
ajoin <- data.table::fread(asign_file, select = c("chr", "pos"))

# create indices with the matches on both sides
i1 <- (l4join$X1 %in% ajoin$chr) & (l4join$X2 %in% ajoin$pos)
i2 <- (ajoin$chr %in% l4join$X1) & (ajoin$pos %in% l4join$X2)

rm(l4join, ajoin)  # don't need this any more, remove them

# now the real fread's
l4 <- data.table::fread(l4_file)
asign <- data.table::fread(asign_file)

# extract the relevant rows and merge
res <- l4[i1, ]
res2 <- asign[i2, setdiff(names(asign), names(l4))]
merge(res, res2, by.x = c("X1", "X2"), by.y = c("chr", "pos"))


Hope this helps,

Rui Barradas

At 00:08 on 24/10/19, Ana Marija wrote:
> Hi Jim,
>
> I think one of the issues is that the data frames are so big:
>> dim(l4)
> [1] 166941635         8
>> dim(asign)
> [1] 107371528         5
>
> so my example would not reproduce the error.
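One point about the pre-filter in the message above: the two `%in%` tests check each column on its own, so a row is kept whenever its chromosome appears somewhere in the other table and its position appears somewhere in the other table, not necessarily on the same row. The filter is still safe as a pre-selection, but it can keep more rows than the join needs. A minimal base R sketch of a stricter filter on a composite key follows. The toy values are made up, and `key_l4` and `key_as` are hypothetical helper names.

```r
# Toy key columns; values are made up so that no (chr, pos) pair actually matches.
l4join <- data.frame(X1 = c("chr1", "chr2"), X2 = c("100", "200"),
                     stringsAsFactors = FALSE)
ajoin <- data.frame(chr = c("chr1", "chr2"), pos = c("200", "100"),
                    stringsAsFactors = FALSE)

# Column-wise test as in the message above: both rows pass, although
# neither (chr, pos) pair exists in the other table.
i1_columnwise <- (l4join$X1 %in% ajoin$chr) & (l4join$X2 %in% ajoin$pos)

# Composite-key test: a row passes only if the whole pair matches.
key_l4 <- paste(l4join$X1, l4join$X2, sep = ":")
key_as <- paste(ajoin$chr, ajoin$pos, sep = ":")
i1 <- key_l4 %in% key_as
i2 <- key_as %in% key_l4

print(data.frame(key_l4, i1_columnwise, i1))
```

On the real data the composite keys cost extra memory for the pasted strings, so whether this is worthwhile depends on how many rows the column-wise filter lets through.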
Hi Rui,

thank you so much for this. I tried sqldf but it didn't help.
Next I tried your second method, and I followed your steps up to this point:

> res2 <- asign[i2, setdiff(names(asign), names(l4))]
> m=merge(res, res2, by.x = c("chr", "pos"), by.y = c("chr", "pos"))
Error in merge.data.table(res, res2, by.x = c("chr", "pos"), by.y = c("chr", :
  Elements listed in `by.y` must be valid column names in y.
> head(res)
    chr   pos a1 a2  a3         variant_id pval_nominal           gene_id
1: chr1 54490  G  A b38 chr1_54490_G_A_b38     0.608495 ENSG00000227232.5
2: chr1 58814  G  A b38 chr1_58814_G_A_b38     0.295211 ENSG00000227232.5
3: chr1 60351  A  G b38 chr1_60351_A_G_b38     0.439788 ENSG00000227232.5
4: chr1 61920  G  A b38 chr1_61920_G_A_b38     0.319528 ENSG00000227232.5
5: chr1 63671  G  A b38 chr1_63671_G_A_b38     0.237739 ENSG00000227232.5
6: chr1 64931  G  A b38 chr1_64931_G_A_b38     0.276679 ENSG00000227232.5
> head(res2)
[1] "gene"         "chr_pos"      "p.val.Retina"
> dim(res)
[1] 111478253         8
> head(l4)
    chr   pos a1 a2  a3         variant_id pval_nominal           gene_id
1: chr1 13550  G  A b38 chr1_13550_G_A_b38     0.375614 ENSG00000227232.5
2: chr1 14671  G  C b38 chr1_14671_G_C_b38     0.474708 ENSG00000227232.5
3: chr1 14677  G  A b38 chr1_14677_G_A_b38     0.699887 ENSG00000227232.5
4: chr1 16841  G  T b38 chr1_16841_G_T_b38     0.127895 ENSG00000227232.5
5: chr1 16856  A  G b38 chr1_16856_A_G_b38     0.627822 ENSG00000227232.5
6: chr1 17005  A  G b38 chr1_17005_A_G_b38     0.802803 ENSG00000227232.5
> head(asign)
              gene  chr                chr_pos   pos p.val.Retina
1: ENSG00000227232 chr1           1:10177:A:AC 10177     0.381708
2: ENSG00000227232 chr1 rs145072688:10352:T:TA 10352     0.959523
3: ENSG00000227232 chr1            1:11008:C:G 11008     0.218132
4: ENSG00000227232 chr1            1:11012:C:G 11012     0.218132
5: ENSG00000227232 chr1            1:13110:G:A 13110     0.998262
6: ENSG00000227232 chr1  rs201725126:13116:T:G 13116     0.438572
> length(i2)
[1] 107371528

Everything is the same as I stated initially in the problem, except that,
as you can see, I renamed the columns in l4, so instead of X1 and X2 I now
have "chr" and "pos".

Do you know why this command didn't return anything?

res2 <- asign[i2, setdiff(names(asign), names(l4))]

On Thu, Oct 24, 2019 at 2:17 PM Rui Barradas <ruipbarradas at sapo.pt> wrote:
>
> Hello,
>
> Sometimes sqldf::sqldf tends to save memory. Maybe if you try
>
> library(sqldf)
>
> sqldf('select l4.*, asign.gene, asign.chr_pos, asign.`p.val.Retina`
>          from l4
>          inner join asign
>          on X1 = asign.chr and X2 = asign.pos')
>
> Or you can filter the rows that match first, then merge the results.
> Something along the lines of
>
> # read in only the columns needed with fread, it's fast
> l4join <- data.table::fread(l4_file, select = c("X1", "X2"))
> ajoin <- data.table::fread(asign_file, select = c("chr", "pos"))
>
> # create indices with the matches on both sides
> i1 <- (l4join$X1 %in% ajoin$chr) & (l4join$X2 %in% ajoin$pos)
> i2 <- (ajoin$chr %in% l4join$X1) & (ajoin$pos %in% l4join$X2)
>
> rm(l4join, ajoin)  # don't need this any more, remove them
>
> # now the real fread's
> l4 <- data.table::fread(l4_file)
> asign <- data.table::fread(asign_file)
>
> # extract the relevant rows and merge
> res <- l4[i1, ]
> res2 <- asign[i2, setdiff(names(asign), names(l4))]
> merge(res, res2, by.x = c("X1", "X2"), by.y = c("chr", "pos"))
>
>
> Hope this helps,
>
> Rui Barradas
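A possible explanation for the `res2` output above, offered as a sketch and not as a confirmed diagnosis: `fread` returns a data.table, and inside `[.data.table` the `j` argument is evaluated as an expression, so `setdiff(names(asign), names(l4))` comes back as the character vector of names instead of selecting columns. In addition, once `l4` has columns named `chr` and `pos`, the `setdiff` removes those key columns from the selection, which would explain the `by.y` error. The example below uses small made-up tables with the renamed columns from the post, and `keep` is a hypothetical helper name.

```r
library(data.table)

# Toy stand-ins for the real tables; the values are made up for illustration.
l4 <- data.table(chr = c("chr1", "chr1"),
                 pos = c("13550", "14671"),
                 pval_nominal = c(0.375614, 0.474708))
asign <- data.table(gene = "ENSG00000227232",
                    chr = "chr1",
                    chr_pos = c("1:13550:G:A", "1:10177:A:AC"),
                    pos = c("13550", "10177"),
                    p.val.Retina = c(0.5, 0.381708))

i1 <- l4$pos %in% asign$pos
i2 <- asign$pos %in% l4$pos

# j is evaluated as an expression, so this returns the names themselves.
wrong <- asign[i2, setdiff(names(asign), names(l4))]
print(wrong)

# Keep the key columns explicitly and select by name with `with = FALSE`.
keep <- union(c("chr", "pos"), setdiff(names(asign), names(l4)))
res <- l4[i1]
res2 <- asign[i2, keep, with = FALSE]

m <- merge(res, res2, by = c("chr", "pos"))
print(m)
```

With both tables sharing the key names, `by = c("chr", "pos")` is enough, and the separate `by.x` and `by.y` arguments are no longer needed.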