Hi Readers,

I have a question.

I have a large dataset and want to throw away columns that contain the same value throughout, and I want to know which columns these were.

For example:

> x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"), snp2=c("G","G","G"), snp3=c("G","G","A"))
> x
  id snp1 snp2 snp3
1  1    A    G    G
2  2    G    G    G
3  3    G    G    A

Now I want to find out that snp2 is monomorphic (the same value in every row of the column), and once I know which columns these are I want to take them out.

Thanks,

Naomi

Disclaimer: The information contained in this message may be confidential and is intended to be exclusively for the addressee. Should you receive this message unintentionally, you are expected not to use the contents herein, to notify the sender immediately and to destroy the message. No rights can be derived from this message.
Try this:

> x
  id snp1 snp2 snp3
1  1    A    G    G
2  2    G    G    G
3  3    G    G    A
> str(x)
'data.frame':   3 obs. of  4 variables:
 $ id  : num  1 2 3
 $ snp1: Factor w/ 2 levels "A","G": 1 2 2
 $ snp2: Factor w/ 1 level "G": 1 1 1
 $ snp3: Factor w/ 2 levels "A","G": 2 2 1
> # test for which columns are the same
> apply(x, 2, function(.col) all(head(.col, -1) == tail(.col, -1)))
   id  snp1  snp2  snp3
FALSE FALSE  TRUE FALSE
>

On Thu, Mar 26, 2009 at 7:15 AM, Duijvesteijn, Naomi <Naomi.Duijvesteijn at ipg.nl> wrote:
> I have a large dataset and want to throw away columns that have the same
> value in the column itself and I want to know which column this was.
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
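Naomi also asked which column is constant and how to take it out. A minimal sketch, assuming the example data frame from the question, of how the logical vector returned by the apply() call can serve both purposes. The names same, mono_names and x_kept are made up for illustration and do not appear in the thread.

```r
# Example data from the question
x <- data.frame(id   = c(1, 2, 3),
                snp1 = c("A", "G", "G"),
                snp2 = c("G", "G", "G"),
                snp3 = c("G", "G", "A"))

# TRUE for every column in which each value equals its neighbour
same <- apply(x, 2, function(.col) all(head(.col, -1) == tail(.col, -1)))

# Names of the constant (monomorphic) columns
mono_names <- names(same)[same]

# Data frame without those columns; list-style indexing keeps it a data frame
x_kept <- x[!same]

print(mono_names)
print(x_kept)
```

Note that apply() first converts the data frame to a matrix, so every column is compared as character strings here.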
On Thu, Mar 26, 2009 at 12:15 PM, Duijvesteijn, Naomi <Naomi.Duijvesteijn at ipg.nl> wrote:
> I have a large dataset and want to throw away columns that have the same
> value in the column itself and I want to know which column this was.

Another, perhaps slightly more intuitive solution than Jim's would be the following:

x <- data.frame(id=c(1,2,3), snp1=c("A","G","G"), snp2=c("G","G","G"), snp3=c("G","G","A"))

is.monovalued <- function(df) {
  sapply(df, function(x) {
    length(unique(x)) == 1
  })
}

monovaluedCols <- is.monovalued(x)
which(monovaluedCols)
x[!monovaluedCols]

/Gustaf

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
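The thread does not say whether the real data contain missing genotypes. If they do, unique() counts NA as a value of its own, so a column such as G, NA, G would not be flagged. The following is only a sketch under that assumption; is_monomorphic, ignore_na and the data frame y are hypothetical names introduced here, not part of base R or of any reply.

```r
# Hypothetical helper: a column counts as monomorphic when its
# non-missing values are all identical (assumption: NA means "no call").
# A column that is entirely NA is also reported as monomorphic.
is_monomorphic <- function(col, ignore_na = TRUE) {
  if (ignore_na) col <- col[!is.na(col)]
  length(unique(col)) <= 1
}

# Made-up example with missing values
y <- data.frame(id   = 1:3,
                snp1 = c("A", "G", NA),
                snp2 = c("G", NA, "G"),
                stringsAsFactors = FALSE)

# vapply() works column by column and always returns a logical vector
mono <- vapply(y, is_monomorphic, logical(1))

print(names(mono)[mono])  # which columns are monomorphic
print(y[!mono])           # data frame without them
```

Calling is_monomorphic with ignore_na = FALSE gives the same result as the length(unique(x)) == 1 test whenever the column has at least one row.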
this works

which.is.not.unique <- apply(x,2,function(x)ifelse(length(unique(x))==1,F,T))
x[,which.is.not.unique]

patrizio

2009/3/26 Duijvesteijn, Naomi <Naomi.Duijvesteijn at ipg.nl>:
> I have a large dataset and want to throw away columns that have the same
> value in the column itself and I want to know which column this was.
Patrizio Frederic wrote:
> this works
>
> which.is.not.unique <- apply(x,2,function(x)ifelse(length(unique(x))==1,F,T))
> x[,which.is.not.unique]

or you simplify that idea and say

x[, apply(x, 2, function(x) length(unique(x)) > 1)]

Uwe Ligges
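One caveat with the x[, keep] form used in the last two replies: when only a single column survives the filter, the default indexing returns a plain vector rather than a data frame. A small sketch, assuming a made-up two-column data frame z (not from the thread), of how drop = FALSE avoids that:

```r
# Made-up data: only snp1 varies, snp2 is constant
z <- data.frame(snp1 = c("A", "G", "G"),
                snp2 = c("G", "G", "G"))

# TRUE for columns with more than one distinct value
keep <- vapply(z, function(col) length(unique(col)) > 1, logical(1))

res_default <- z[, keep]                # collapses to a vector
res_df      <- z[, keep, drop = FALSE]  # stays a one-column data frame

print(is.data.frame(res_default))
print(is.data.frame(res_df))
```

Using vapply() on the data frame instead of apply() also skips the conversion to a matrix, so each column is tested in its own type.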