Hello,

I have two data frames like this:

> head(l4)
    X1    X2 X3 X4  X5  variant_id pval_nominal     gene_id.LCL
1 chr1 13550  G  A b38 1:13550:G:A     0.375614 ENSG00000227232
2 chr1 14671  G  C b38 1:14671:G:C     0.474708 ENSG00000227232
3 chr1 14677  G  A b38 1:14677:G:A     0.699887 ENSG00000227232
4 chr1 16841  G  T b38 1:16841:G:T     0.127895 ENSG00000227232
5 chr1 16856  A  G b38 1:16856:A:G     0.627822 ENSG00000227232
6 chr1 17005  A  G b38 1:17005:A:G     0.802803 ENSG00000227232

> head(asign)
              gene  chr                chr_pos   pos p.val.Retina
1: ENSG00000227232 chr1           1:10177:A:AC 10177     0.381708
2: ENSG00000227232 chr1 rs145072688:10352:T:TA 10352     0.959523
3: ENSG00000227232 chr1            1:11008:C:G 11008     0.218132
4: ENSG00000227232 chr1            1:11012:C:G 11012     0.218132
5: ENSG00000227232 chr1            1:13110:G:A 13110     0.998262
6: ENSG00000227232 chr1  rs201725126:13116:T:G 13116     0.438572

> m = merge(l4, asign, by.x=c("X1", "X2"), by.y=c("chr", "pos"))
Error in merge.data.frame(l4, asign, by.x = c("X1", "X2"), by.y = c("chr", :
  negative length vectors are not allowed

> sapply(l4,class)
          X1           X2           X3           X4           X5   variant_id
 "character"  "character"  "character"  "character"  "character"  "character"
pval_nominal  gene_id.LCL
   "numeric"  "character"

> sapply(asign,class)
        gene          chr      chr_pos          pos p.val.Retina
 "character"  "character"  "character"  "character"  "character"

Please advise as to why I am getting this error when merging?

Thanks
Ana
I also tried left_join but I got: Error: std::bad_alloc

> df3 <- left_join(l4, asign, by = c("chr","pos"))
Error: std::bad_alloc
> dim(l4)
[1] 166941635         8
> dim(asign)
[1] 107371528         5

On Wed, Oct 23, 2019 at 5:32 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> Please advise as to why I am getting this error when merging?
Hi Ana,
When I run this example taken from your email:

l4<-read.table(text="X1 X2 X3 X4 X5 variant_id pval_nominal gene_id.LCL
chr1 13550 G A b38 1:13550:G:A 0.375614 ENSG00000227232
chr1 14671 G C b38 1:14671:G:C 0.474708 ENSG00000227232
chr1 14677 G A b38 1:14677:G:A 0.699887 ENSG00000227232
chr1 16841 G T b38 1:16841:G:T 0.127895 ENSG00000227232
chr1 16856 A G b38 1:16856:A:G 0.627822 ENSG00000227232
chr1 17005 A G b38 1:17005:A:G 0.802803 ENSG00000227232",
 header=TRUE,stringsAsFactors=FALSE)
asign<-read.table(text="gene chr chr_pos pos p.val.Retina
ENSG00000227232 chr1 1:10177:A:AC 10177 0.381708
ENSG00000227232 chr1 rs145072688:10352:T:TA 10352 0.959523
ENSG00000227232 chr1 1:11008:C:G 11008 0.218132
ENSG00000227232 chr1 1:11012:C:G 11012 0.218132
ENSG00000227232 chr1 1:13110:G:A 13110 0.998262
ENSG00000227232 chr1 rs201725126:13116:T:G 13116 0.438572",
 header=TRUE,stringsAsFactors=FALSE)
merge(l4, asign, by.x=c("X1", "X2"), by.y=c("chr", "pos"))
 [1] X1           X2           X3           X4           X5
 [6] variant_id   pval_nominal gene_id.LCL  gene         chr_pos
[11] p.val.Retina
<0 rows> (or 0-length row.names)

It works okay, but there are no matches in the join. So I can't even
guess what the problem is.

Jim

On Thu, Oct 24, 2019 at 9:33 AM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> Please advise as to why I am getting this error when merging?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Ah, it looks like a memory allocation problem.

Jim

On Thu, Oct 24, 2019 at 10:05 AM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> I also tried left_join but I got: Error: std::bad_alloc
>
> > dim(l4)
> [1] 166941635 8
> > dim(asign)
> [1] 107371528 5
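Jim's memory-allocation reading can be sanity-checked before attempting the merge. The following is a minimal base R sketch, assuming the failure comes from the join producing more rows than a 32-bit index can hold because many rows share the same (chromosome, position) key. That cause is an assumption and is not confirmed anywhere in this thread. The toy frames x and y and the names expected_rows and too_big are hypothetical stand-ins for l4 and asign.

```r
# Toy stand-ins for l4 and asign (hypothetical data, same key columns).
x <- data.frame(X1 = c("chr1", "chr1", "chr1", "chr2"),
                X2 = c("100", "100", "200", "100"),
                stringsAsFactors = FALSE)
y <- data.frame(chr = c("chr1", "chr1", "chr1", "chr2"),
                pos = c("100", "100", "100", "300"),
                stringsAsFactors = FALSE)

# Count how many rows carry each (chromosome, position) key on each side.
kx <- table(paste(x$X1, x$X2, sep = ":"))
ky <- table(paste(y$chr, y$pos, sep = ":"))

# An inner join yields n_x * n_y rows for every key present in both tables.
shared <- intersect(names(kx), names(ky))
expected_rows <- sum(as.numeric(kx[shared]) * as.numeric(ky[shared]))

# Compare against the largest value a 32-bit integer can hold.
too_big <- expected_rows > .Machine$integer.max

# On the toy data the merge is small enough to run and confirm the estimate.
m <- merge(x, y, by.x = c("X1", "X2"), by.y = c("chr", "pos"))
```

On the real data, table() over 166 million keys needs a fair amount of memory itself, so the count may have to be done one chromosome at a time.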
Hi Jim,

I think one of the issues is that the data frames are so big,

> dim(l4)
[1] 166941635         8
> dim(asign)
[1] 107371528         5

so my example would not reproduce the error.

On Wed, Oct 23, 2019 at 6:05 PM Jim Lemon <drjimlemon at gmail.com> wrote:
> It works okay, but there are no matches in the join. So I can't even
> guess what the problem is.
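Since the size of the inputs is the suspected issue, one way to keep each step small is to merge one chromosome at a time and stack the results. This is only a sketch under assumptions: it uses plain data frames with the column names shown in the original post, the helper merge_by_chr is hypothetical, and nobody in the thread has tested this approach on the real data.

```r
# Hypothetical helper: merge chromosome by chromosome, then bind the pieces.
merge_by_chr <- function(x, y) {
  chrs <- intersect(unique(x$X1), unique(y$chr))
  pieces <- lapply(chrs, function(ch) {
    merge(x[x$X1 == ch, ], y[y$chr == ch, ],
          by.x = c("X1", "X2"), by.y = c("chr", "pos"))
  })
  do.call(rbind, pieces)
}

# Toy stand-ins for l4 and asign (hypothetical values).
x <- data.frame(X1 = c("chr1", "chr1", "chr2"),
                X2 = c("100", "200", "100"),
                pval_nominal = c(0.1, 0.2, 0.3),
                stringsAsFactors = FALSE)
y <- data.frame(chr = c("chr1", "chr1", "chr2"),
                pos = c("100", "300", "100"),
                p.val.Retina = c(0.4, 0.5, 0.6),
                stringsAsFactors = FALSE)

res  <- merge_by_chr(x, y)
full <- merge(x, y, by.x = c("X1", "X2"), by.y = c("chr", "pos"))
```

Each per-chromosome result can also be written to disk instead of being kept in a list, which lowers peak memory further at the cost of an extra read step.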
On 23/10/2019 7:04 p.m., Ana Marija wrote:
> I also tried left_join but I got: Error: std::bad_alloc
>
>> df3 <- left_join(l4, asign, by = c("chr","pos"))
> Error: std::bad_alloc

Looks like bugs in whatever package you're finding "left_join" in (and
previously "merge"). Are those from dplyr and base?

Showing us str(l4), str(asign), and sessionInfo() would be helpful.

Duncan Murdoch

>> dim(l4)
> [1] 166941635 8
>> dim(asign)
> [1] 107371528 5
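The diagnostics Duncan asks for can be collected along these lines. This is a small sketch on hypothetical one-row stand-ins for l4 and asign; the names cls, merge_home and join_home are illustrative only. The "1:" style row labels in head(asign) suggest that asign may be a data.table rather than a plain data frame, which class() would confirm on the real object.

```r
# Hypothetical one-row stand-ins for the real objects.
l4 <- data.frame(X1 = "chr1", X2 = "13550", pval_nominal = 0.375614,
                 stringsAsFactors = FALSE)
asign <- data.frame(gene = "ENSG00000227232", chr = "chr1", pos = "10177",
                    stringsAsFactors = FALSE)

# Structure of both objects: column types and object class at a glance.
str(l4)
str(asign)

# class() distinguishes a plain data.frame from a data.table or a tibble.
cls <- list(l4 = class(l4), asign = class(asign))

# Which package does each function come from?
merge_home <- environmentName(environment(merge))
join_home <- if (exists("left_join")) {
  environmentName(environment(get("left_join")))
} else {
  NA_character_  # no left_join on the search path in this session
}

# R version, platform, and attached package versions.
sessionInfo()
```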