Hi Michael,

this still doesn't work. My data frame has a few fewer columns now, but
the principle is still the same:

> head(d)
    chr    pos         gene_id       LCL      Retina       wl      wr
1: chr1 775930 ENSG00000237094 0.3559520 9.72251e-05 31.62278 21.2838
2: chr1 815963 ENSG00000237094 0.2648080 3.85837e-06 31.62278 21.2838
3: chr1 816376 ENSG00000237094 0.3313120 3.85824e-06 31.62278 21.2838
4: chr1 817186 ENSG00000237094 0.0912854 3.75134e-06 31.62278 21.2838
5: chr1 817341 ENSG00000237094 0.1020520 3.75134e-06 31.62278 21.2838
6: chr1 817514 ENSG00000237094 0.0831412 3.82866e-06 31.62278 21.2838

so the solution for the first row should be:

> sumz(c(0.3559520,9.72251e-05), weights = c(31.62278,21.2838), na.action = na.fail)
sumz = 2.386896 p = 0.008495647

When I run what you proposed in the last email:

helper <- function(x) {
  p <- sumz(as.numeric(x[4:5]), weights = as.numeric(x[6:7]))$p
  p
}

d$META <- apply(d, MARGIN = 1, helper)

I am getting:

Error in sumz(as.numeric(x[4:5]), weights = as.numeric(x[6:7])) :
  Must have at least two valid p values

Please advise,
Ana

On Wed, Oct 30, 2019 at 5:02 AM Michael Dewey <lists at dewey.myzen.co.uk> wrote:
>
> Dear Ana
>
> Yes, when apply coerces q to a matrix it does so as a character matrix
> because of the values in the first column. So you need to wrap the
> references to x in helper in as.numeric(), that is to say like
> as.numeric(x[2:4]), and similarly for the other one. Sorry about that, I
> should have thought of it before.
>
> When I next update metap I will try to get it to degrade more gracefully
> when it finds an error.
>
> Michael
>
> On 28/10/2019 19:06, Ana Marija wrote:
> > Hi Michael,
> >
> > I tried what you proposed with my data frame q:
> >
> >> head(q)
> >            ID         P         G       E       wb      wg       we
> > 1:  rs1029830 0.0979931 0.0054060 0.39160 580.6436 40.6325 35.39774
> > 2:  rs1029832 0.1501820 0.0028140 0.39320 580.6436 40.6325 35.39774
> > 3: rs11078374 0.1701250 0.0009805 0.49730 580.6436 40.6325 35.39774
> > 4:  rs1124961 0.1710150 0.7252000 0.05737 580.6436 40.6325 35.39774
> > 5:  rs1135237 0.1493650 0.6851000 0.06354 580.6436 40.6325 35.39774
> > 6: rs11867934 0.0757972 0.0006140 0.00327 580.6436 40.6325 35.39774
> >
> > so the solution of the first row would be this:
> >> sumz(c(0.0979931,0.0054060,0.39160), weights = c(580.6436,40.6325,35.39774), na.action = na.fail)
> > sumz = 1.481833 p = 0.06919239
> >
> > I tried applying the function you wrote:
> > helper <- function(x) {
> >   p <- sumz(x[2:4], weights = x[5:7])$p
> >   p
> > }
> >
> > With:
> >
> > q$META <- apply(q, MARGIN = 1, helper)
> >
> > # I want to make a new column in q named META with results
> > but I got this error:
> > Error in sumz(x[2:4], weights = x[5:7]) :
> >   Must have at least two valid p values
> >
> > Please advise,
> > Ana
> >
> > On Sun, Oct 27, 2019 at 9:49 AM Michael Dewey <lists at dewey.myzen.co.uk> wrote:
> >>
> >> Dear Ana
> >>
> >> There must be several ways of doing this but see below for an idea with
> >> comments in-line.
> >>
> >> On 26/10/2019 00:31, Ana Marija wrote:
> >>> Hello,
> >>>
> >>> I would like to use this package metap
> >>> to calculate multiple p values
> >>>
> >>> I have my data frame with 3 p values
> >>>> head(tt)
> >>>           RS      G      E        B
> >>> 1: rs2089177 0.9986 0.7153 0.604716
> >>> 2: rs4360974 0.9738 0.7838 0.430228
> >>> 3: rs6502526 0.9744 0.7839 0.429160
> >>> 4: rs8069906 0.7184 0.4918 0.521452
> >>> 5: rs9905280 0.7205 0.4861 0.465758
> >>> 6: rs4313843 0.9804 0.8522 0.474313
> >>>
> >>> and data frame with corresponding weights for each of the p values
> >>> from the tt data frame
> >>>
> >>>> head(df)
> >>>        wg       we       wb        RS
> >>> 1 40.6325 35.39774 580.6436 rs2089177
> >>> 2 40.6325 35.39774 580.6436 rs4360974
> >>> 3 40.6325 35.39774 580.6436 rs6502526
> >>> 4 40.6325 35.39774 580.6436 rs8069906
> >>> 5 40.6325 35.39774 580.6436 rs9905280
> >>> 6 40.6325 35.39774 580.6436 rs4313843
> >>>
> >>> RS column is the same in df and tt
> >>>
> >>
> >> So you can create a new data-frame with merge()
> >>
> >> newdata <- merge(tt, df)
> >>
> >> which will use RS as the key to merge them on.
> >>
> >> Then write a function of one argument, a seven element vector, which
> >> picks out the p-values and the weights and feeds them to sumz().
> >> Something like
> >>
> >> helper <- function(x) {
> >>   p <- sumz(x[2:4], weights = x[5:7])$p
> >>   p
> >> }
> >> Note you need to check that 2:4 and 5:7 are actually where they are in
> >> the row of newdata.
> >>
> >> Then use apply() to apply that to the rows of newdata.
> >>
> >> I have not tested any of this but the general idea should be OK even if
> >> the details are wrong.
> >>
> >> Michael
> >>
> >>
> >>> How to use this sumz() function to create a new data frame which would
> >>> look the same as tt only it would have an additional column, say named
> >>> "META", which has calculated meta p values for each row
> >>>
> >>> This is an example of how much the p value would be in the first row:
> >>>
> >>>> sumz(c(0.9986,0.7153,0.604716), weights = c(40.6325,35.39774,580.6436), na.action = na.fail)
> >>> p = 0.6940048
> >>>
> >>> Thanks
> >>> Ana
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>
> >> --
> >> Michael
> >> http://www.dewey.myzen.co.uk/home.html
> >
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
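The coercion Michael describes can be reproduced on a toy copy of the first two rows of head(d). This is only a sketch under stated assumptions, not the thread's solution: the toy data frame is typed in by hand, selecting the columns by name instead of by position is my own suggestion, and `weighted_z_p` is a hypothetical base-R stand-in for metap's sumz() (a weighted sum of z scores), written out so the example runs without the package.

```r
# Toy copy of the first two rows of head(d); the column types follow the thread.
d <- data.frame(
  chr     = c("chr1", "chr1"),
  pos     = c("775930", "815963"),
  gene_id = c("ENSG00000237094", "ENSG00000237094"),
  LCL     = c(0.3559520, 0.2648080),
  Retina  = c(9.72251e-05, 3.85837e-06),
  wl      = c(31.62278, 31.62278),
  wr      = c(21.2838, 21.2838),
  stringsAsFactors = FALSE
)

# apply() works on as.matrix(d); one character column turns every cell into a string.
storage_mode_all <- typeof(as.matrix(d))

# Keeping only the numeric columns, selected by name, gives a numeric matrix.
m_num <- as.matrix(d[, c("LCL", "Retina", "wl", "wr")])
storage_mode_num <- typeof(m_num)

# Hypothetical stand-in for sumz(): weighted sum of z scores, upper-tail p value.
weighted_z_p <- function(p, w) {
  z <- sum(w * qnorm(p, lower.tail = FALSE)) / sqrt(sum(w^2))
  pnorm(z, lower.tail = FALSE)
}

# Row-wise combination; each row arrives as a named numeric vector.
meta <- apply(m_num, 1, function(x) {
  weighted_z_p(x[c("LCL", "Retina")], x[c("wl", "wr")])
})
```

For row 1 the stand-in should land close to the p = 0.008495647 that sumz() printed above. Selecting columns by name also removes the need to check that 4:5 and 6:7 are the right positions.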
I also tried to do it this way:

d$META <- sapply(seq_len(nrow(d)), function(rn) {
  unlist(sumz(as.matrix(d[,.(LCL,Retina)])[rn,],
              weights = as.vector(d[,.(wl,wr)])[rn,],
              na.action=na.fail)["p"])
})

but again I am getting an error:

Error in sumz(as.matrix(d[, .(LCL, Retina)])[rn, ], weights = as.vector(d[, :
  Must have at least two valid p values

For reference, these are the details about my data frame:

> head(d)
    chr    pos         gene_id       LCL      Retina       wl      wr
1: chr1 775930 ENSG00000237094 0.3559520 9.72251e-05 31.62278 21.2838
2: chr1 815963 ENSG00000237094 0.2648080 3.85837e-06 31.62278 21.2838
3: chr1 816376 ENSG00000237094 0.3313120 3.85824e-06 31.62278 21.2838
4: chr1 817186 ENSG00000237094 0.0912854 3.75134e-06 31.62278 21.2838
5: chr1 817341 ENSG00000237094 0.1020520 3.75134e-06 31.62278 21.2838
6: chr1 817514 ENSG00000237094 0.0831412 3.82866e-06 31.62278 21.2838

> sapply(d,class)
        chr         pos     gene_id         LCL      Retina          wl
"character" "character" "character"   "numeric"   "numeric"   "numeric"
         wr
  "numeric"

> sum(is.na(d$LCL))
[1] 0
> sum(is.na(d$Retina))
[1] 0
> sum(is.na(d$wl))
[1] 0
> sum(is.na(d$wr))
[1] 0
> dim(d)
[1] 1668837       7

On Wed, Oct 30, 2019 at 4:52 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> [...]
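One thing the checks in this message do not cover is whether every p value lies strictly between 0 and 1: the is.na() counts would still be zero if some row held an exact 0 or 1. As far as I understand the metap documentation, sumz() sets such values aside and stops when fewer than two usable p values remain, so this may be worth ruling out, though nothing in the thread confirms it is the cause here. A minimal base-R sketch, assuming a made-up third row with p = 1 and a hypothetical vectorised stand-in for sumz() (weighted sum of z scores):

```r
# Rows 1 and 2 copy head(d); row 3 is invented to show a p value of exactly 1.
d <- data.frame(
  LCL    = c(0.3559520, 0.2648080, 1),
  Retina = c(9.72251e-05, 3.85837e-06, 0.5),
  wl     = 31.62278,
  wr     = 21.2838
)

p <- as.matrix(d[, c("LCL", "Retina")])
w <- as.matrix(d[, c("wl", "wr")])

# A p value is treated as usable only if it lies strictly between 0 and 1.
n_valid  <- rowSums(p > 0 & p < 1)
bad_rows <- which(n_valid < 2)   # rows left with fewer than two usable p values

# Hypothetical stand-in for sumz(), computed for all rows at once.
z    <- rowSums(w * qnorm(p, lower.tail = FALSE)) / sqrt(rowSums(w^2))
meta <- pnorm(z, lower.tail = FALSE)
meta[bad_rows] <- NA_real_       # flag the problem rows instead of failing
```

On a table with 1668837 rows, length(bad_rows) gives a quick count of rows that would need attention, and the vectorised form avoids calling a function once per row.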
Can you please get back to me about this? I need these meta p values for a manuscript I have to submit next week.

On Wed, Oct 30, 2019 at 5:35 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> [...]