Hello R gurus,

I am a biologist doing biomarker research. I have a data set with 6 proteins and close to 3000 samples, and I have to look for differences between disease (Y) and controls (N), taking into account genetic risk, genotype, sex and the other demographic information available. However, I do not know the statistics needed to adjust for sex, age, genotype and genetic risk. I have been reading papers in which the authors talk about adjusting for these variables. The CDC website suggests adjusting for age using weights, but I am not sure whether this would apply to my data. Another website says that if the distribution is not equal, then one has to model sex, age and other demographic parameters as covariates.

I would appreciate it if someone could help me understand this more clearly and provide directions on modeling these variables in my analysis. I am attaching a sample data file with this post.

Thanks

http://www.nabble.com/file/p24534963/Sample%2Bdata.csv Sample+data.csv
Frank E Harrell Jr
2009-Jul-17 14:51 UTC
[R] How to do adjust for sex, age, genotype for a data
If the only clinical variables you are adjusting for are age and sex, this analysis will be misleading at best.

Frank

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
Department of Biostatistics   Vanderbilt University
Then what other factors need to be adjusted for, and should I adjust for them or use them as covariates? Finally, how would these analyses be done in R?

Harrell, Frank E wrote:
> If the only clinical variables you are adjusting for are age and sex
> this analysis will be misleading at best.
Charles C. Berry
2009-Jul-18 04:16 UTC
[R] How to do adjust for sex, age, genotype for a data
On Fri, 17 Jul 2009, 1Rnwb wrote:

> then what will be the other factors needed to be adjusted

It is NOT an exaggeration to say that hundreds of research papers, dozens of books, and many dissertations have been written on how to go about answering that question in one context or another.

Given the background you say you have, I doubt that any advice you will get from this list will enable you to craft a good answer. What you really need is collaboration with or mentoring from someone who is expert in these matters and willing to dig into the particulars of your research area.

> and whether I should adjust or use them as covariates.

Usually, these amount to the same thing.

> Finally how these analysis will be done in R

If you are doing this yourself you will probably need guidance from a well-crafted monograph. Quite a few are listed at

http://www.r-project.org/doc/bib/R-books.html

HTH,

Chuck

Charles C. Berry                            (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
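As a small illustration of the remark that adjusting for a variable and using it as a covariate usually amount to the same thing, here is a minimal sketch in R. It does not use the poster's data: the data are simulated, and every variable name (protein, disease, age, sex) and effect size is an assumption made up for the example. Deciding which variables belong in the model is the hard part discussed above, and this sketch does not address it.

```r
# Simulated data only; names and effect sizes are hypothetical.
set.seed(1)
n   <- 3000
age <- rnorm(n, mean = 50, sd = 10)
sex <- factor(sample(c("F", "M"), n, replace = TRUE))

# Disease is made more likely at older ages, so age is unbalanced
# between cases (Y) and controls (N).
disease <- factor(ifelse(runif(n) < plogis((age - 50) / 10), "Y", "N"),
                  levels = c("N", "Y"))

# The protein level depends on age and sex but, by construction,
# not on disease status.
protein <- 2 + 0.05 * age + 0.3 * (sex == "M") + rnorm(n)
dat <- data.frame(protein, disease, age, sex)

# Unadjusted comparison of cases and controls.
fit_unadj <- lm(protein ~ disease, data = dat)

# "Adjusted" comparison: age and sex are simply added as covariates.
fit_adj <- lm(protein ~ disease + age + sex, data = dat)

# Compare the estimated case-control difference in the two fits.
print(coef(summary(fit_unadj))["diseaseY", ])
print(coef(summary(fit_adj))["diseaseY", ])
```

In this simulation the unadjusted fit picks up the age imbalance as an apparent disease effect, while the fit with covariates does not. If disease status were treated as the outcome instead, the same idea could be written as glm(disease ~ protein + age + sex, family = binomial, data = dat).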