Hi,

I have a silly question regarding the usage of two functions, read.table and gregexpr.

For read.table, if I read a matrix and set header = T, I find that all the dashes ("-") become dots ("."):

    A = read.table("Matrix.txt", sep = "\t", header = F)
    A[1,1]
    # "A-B-C-D"

    A = read.table("Matrix.txt", sep = "\t", header = T)
    colnames(A)[1]
    # "A.B.C.D"

Is there a way to use the header = T argument, but still keep the original format "A-B-C-D"?

For gregexpr:

    gregexpr("-","A-B-C-D")[[1]]
    # [1] 2 4 6
    # attr(,"match.length")
    # [1] 1 1 1

    gregexpr(".","A.B.C.D")[[1]]
    # [1] 1 2 3 4 5 6 7
    # attr(,"match.length")
    # [1] 1 1 1 1 1 1 1

It looks like a dot matches every character. Is there a way to extract the positions of the dots specifically?

Thanks,

-Jack
Hi Jack,

yes, there is. See ?read.table for the check.names option: setting check.names = FALSE keeps the column names exactly as they appear in the file.

As for the second task, "." is a special character in regular expressions (it matches any single character), so either mask it or don't use regular expressions at all:

    gregexpr("[.]","A.B.C.D")
    # or
    gregexpr(".","A.B.C.D",fixed=T)

Cheers.

On 17.08.2011 15:03, Jack Luo wrote:
> Is there a way to use the header = T argument, but still keep the original
> format "A-B-C-D"?
> [...]
> Looks like dots means all the characters. Is there a way that I can extract
> the position of the dots specifically?
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Eik Vettorazzi
Institut für Medizinische Biometrie und Epidemiologie
Universitätsklinikum Hamburg-Eppendorf
Martinistr. 52
20246 Hamburg
T ++49/40/7410-58243
F ++49/40/7410-57790
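
For anyone reading this thread later, the two suggestions in this reply can be tried together in a small sketch. The temporary file and its contents are made up for illustration (a stand-in for Matrix.txt, which was not posted), and the variable names are hypothetical.

```r
# Write a small tab-separated file whose header contains dashes
# (hypothetical stand-in for the poster's Matrix.txt).
path <- tempfile(fileext = ".txt")
writeLines(c("A-B-C-D\tE-F", "1\t2", "3\t4"), path)

# Default behaviour: check.names = TRUE passes the header through make.names(),
# which replaces characters that are not valid in R names with dots.
default_names <- colnames(read.table(path, sep = "\t", header = TRUE))

# With check.names = FALSE the header is kept as written in the file.
kept_names <- colnames(read.table(path, sep = "\t", header = TRUE,
                                  check.names = FALSE))

# Locate literal dots: put the dot in a character class,
# or switch off regular-expression matching with fixed = TRUE.
pos_class <- as.integer(gregexpr("[.]", "A.B.C.D")[[1]])
pos_fixed <- as.integer(gregexpr(".", "A.B.C.D", fixed = TRUE)[[1]])

print(default_names)
print(kept_names)
print(pos_class)
print(pos_fixed)
```

Note that columns with non-syntactic names such as A-B-C-D have to be accessed with backticks or quoted strings, for example A[["A-B-C-D"]], because A$A-B-C-D would be parsed as a subtraction.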
Hi!

You can try importing the file with header = F and afterwards telling R that the first row is the header. This post has some ideas:

http://stackoverflow.com/questions/2293131/reading-first-row-as-header-is-easy-what-gives-with-two-rows-being-the-header

On Wed, Aug 17, 2011 at 10:03 AM, Jack Luo <jluo.rhelp@gmail.com> wrote:
> For read.table, if I read a matrix and set header = T, I found that all the
> dash ("-") becomes dots (".")
> [...]
> Is there a way to use the header = T argument, but still keep the original
> format "A-B-C-D"?

--
Best regards,

Raphael Saldanha
saldanha.plangeo@gmail.com
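
One possible way to carry out the header = F approach from this reply is sketched below. It is not taken from the linked post; the temporary file, its contents, and the variable names are assumptions made for illustration.

```r
# Hypothetical stand-in for the poster's Matrix.txt.
path <- tempfile(fileext = ".txt")
writeLines(c("A-B-C-D\tE-F", "1\t2", "3\t4"), path)

# Read every column as text, so the header row is just another row of strings.
raw <- read.table(path, sep = "\t", header = FALSE,
                  colClasses = "character", stringsAsFactors = FALSE)

# Promote the first row to column names, then drop it from the data.
A <- raw[-1, , drop = FALSE]
names(A) <- unlist(raw[1, ], use.names = FALSE)
rownames(A) <- NULL

# Re-infer the column types now that the header row is gone.
A[] <- lapply(A, type.convert, as.is = TRUE)

print(names(A))
str(A)
```

Assigning the names directly skips the make.names() conversion, so the dashes survive. The extra type.convert() step is needed because the header row forces every column to be read as character.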