Hello,

I have a very large data matrix and a file containing the column names I need to subset by. However, I cannot manage to subset the main file. Any idea?

bg <- read.table(file.choose(), header = TRUE, row.names = 1)
bg
           Otu00022 Otu00029 Otu00039 Otu00042 Otu00101 Otu00105 Otu00125 Otu00131 Otu00137 Otu00155 Otu00158 Otu00172 Otu00181 Otu00185 Otu00190 Otu00209 Otu00218
Gi20Jun11  0.001217        0 0.001217        0 0.000000        0        0        0 0.001217        0        0        0        0        0 0.001217        0 0.001217
Gi40Jun11  0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi425Jun11 0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi45Jun11  0.000000        0 0.000000        0 0.001513        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi475Jun11 0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi50Jun11  0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
...

# second file, which has the names that I want to subset
c_bg
      [,1]
 [1,] "Otu0128"
 [2,] "Otu0218"
 [3,] "Otu0034"
 [4,] "Otu0257"
 [5,] "Otu0212"
 [6,] "Otu0279"
 [7,] "Otu0318"
 [8,] "Otu0266"
 [9,] "Otu0056"
...

# using the c_bg names file, I would like to subset the bg file
g1 <- subset(bg, colnames(bg) %in% (c_bg))
# this returns all the columns in the bg file

Thank you,
Hi,

bg <- read.table(text="
           Otu00022 Otu00029 Otu00039 Otu00042 Otu00101 Otu00105 Otu00125 Otu00131 Otu00137 Otu00155 Otu00158 Otu00172 Otu00181 Otu00185 Otu00190 Otu00209 Otu00218
Gi20Jun11  0.001217        0 0.001217        0 0.000000        0        0        0 0.001217        0        0        0        0        0 0.001217        0 0.001217
Gi40Jun11  0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi425Jun11 0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi45Jun11  0.000000        0 0.000000        0 0.001513        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi475Jun11 0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
Gi50Jun11  0.000000        0 0.000000        0 0.000000        0        0        0 0.000000        0        0        0        0        0 0.000000        0 0.000000
",sep="",header=TRUE,stringsAsFactors=FALSE)

c_bg <- read.table(text="
Otu00039
Otu0128
Otu0218
Otu0034
Otu00158
Otu0257
Otu0212
Otu00125
",sep="",header=FALSE,stringsAsFactors=FALSE)

bg[,names(bg)%in%c_bg[,1]]
#           Otu00039 Otu00125 Otu00158
#Gi20Jun11  0.001217        0        0
#Gi40Jun11  0.000000        0        0
#Gi425Jun11 0.000000        0        0
#Gi45Jun11  0.000000        0        0
#Gi475Jun11 0.000000        0        0
#Gi50Jun11  0.000000        0        0

A.K.

----- Original Message -----
From: Ozgul Inceoglu <Ozgul.Inceoglu at ulb.ac.be>
To: r-help at r-project.org
Cc:
Sent: Tuesday, February 12, 2013 9:29 AM
Subject: [R] subsetting data file by intoducing a second file
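The logical-indexing approach above can be sketched on a smaller example. The data frame `toy`, the vector `wanted`, and all values below are made up for illustration and are not from the thread. The sketch also shows `intersect()` as one possible alternative when the column order of the names list matters.

```r
# Toy stand-in for bg: 2 samples x 4 OTU columns (values are made up).
toy <- data.frame(
  Otu00022 = c(0.1, 0.0),
  Otu00039 = c(0.2, 0.3),
  Otu00125 = c(0.0, 0.0),
  Otu00158 = c(0.5, 0.0),
  row.names = c("S1", "S2")
)

# Names to keep; "Otu0128" is deliberately absent from toy.
wanted <- c("Otu00158", "Otu0128", "Otu00039")

# Logical indexing: absent names are silently ignored,
# and the result keeps the column order of toy.
by_mask <- toy[, names(toy) %in% wanted, drop = FALSE]

# intersect(): absent names are dropped as well,
# but the result follows the order of 'wanted'.
by_name <- toy[, intersect(wanted, names(toy)), drop = FALSE]

print(names(by_mask))
print(names(by_name))
```

`drop = FALSE` is used so that the result stays a data frame even when only one column matches.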
Hello,

Read the help page ?subset more carefully; it is the 'select' argument that you should be using:

subset(bg, select = c_bg)

Hope this helps,

Rui Barradas
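To illustrate the point about `select`: the second positional argument of `subset()` is the row filter, so passing a logical vector built from column names filters rows and leaves every column in place. A minimal sketch on a made-up data frame (`toy` and `wanted` are hypothetical names) follows. It also shows that `select` raises an error when a requested name is not a column, which may matter here because the names list in the question contains names that do not appear in the printed part of `bg`.

```r
# Small made-up data frame: 3 rows, 3 columns.
toy <- data.frame(x = 1:3, y = 4:6, z = 7:9,
                  row.names = c("r1", "r2", "r3"))
wanted <- c("x", "z")

# Second positional argument is the row filter:
# colnames(toy) %in% wanted is TRUE FALSE TRUE, so rows r1 and r3
# are kept and all three columns remain.
rows_filtered <- subset(toy, colnames(toy) %in% wanted)

# Named 'select' argument picks columns instead.
cols_selected <- subset(toy, select = wanted)

# 'select' fails if a requested name is not a column of the data frame.
res <- tryCatch(subset(toy, select = c("x", "zz")),
                error = function(e) "error")

# One way around that: keep only the names that actually exist.
safe <- subset(toy, select = intersect(c("x", "zz"), names(toy)))

print(dim(rows_filtered))
print(names(cols_selected))
print(res)
print(names(safe))
```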
Hi Ozgul,

the interesting part is the "select" parameter in subset:

subset(bg,select=c_bg[,1])

cheers

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
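In the question, `c_bg` prints as a one-column character matrix, and `c_bg[,1]` extracts that column as a plain character vector. A rough sketch of reading such a names file directly into a vector is given below. The file, its contents, and the names `names_file`, `wanted`, and `toy` are assumptions for illustration only. Since the listing in the question shows names such as "Otu0218" while `bg` has "Otu00218", the sketch also reports requested names that have no exact match; whether that difference exists in the poster's actual files cannot be told from the thread.

```r
# Hypothetical names file: one column name per line, with stray whitespace.
names_file <- tempfile(fileext = ".txt")
writeLines(c("Otu00039", " Otu0218", "Otu00158 "), names_file)

# Read as a plain character vector and strip surrounding whitespace.
wanted <- trimws(readLines(names_file))

# Made-up stand-in for bg.
toy <- data.frame(Otu00039 = c(0.1, 0.0),
                  Otu00158 = c(0.0, 0.2),
                  Otu00218 = c(0.3, 0.0),
                  row.names = c("S1", "S2"))

# Names that were requested but do not match any column exactly.
missing_names <- setdiff(wanted, names(toy))

# Select only the names that do exist, so subset() does not error.
picked <- subset(toy, select = intersect(wanted, names(toy)))

print(missing_names)
print(names(picked))
```

Column names must match character for character, so "Otu0218" does not select "Otu00218"; inspecting `missing_names` before subsetting makes such cases visible.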