eek!

Chel Hee, anything that complicated should engender fear and trembling.

Much simpler and more efficient (if I understand correctly):

i <- seq.int(1L, length(ID1), by = 2L)
paste0(ID1[i], ID1[i+1])

That gives a vector of paired letters. If you want a single character
string, just collapse with a " " (space):

paste0(ID1[i], ID1[i+1], collapse = " ")

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll

On Wed, Jan 28, 2015 at 7:41 PM, Chel Hee Lee <chl948 at mail.usask.ca> wrote:
> I am using just the first row of your data (i.e. ID1).
>
>> ID1 <- c("A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T",
>> "A")
>> do.call(c,lapply(tapply(ID1, gl(7,2), c), paste, collapse=""))
>    1    2    3    4    5    6    7
> "AA" "TG" "CT" "GC" "GT" "CG" "TA"
>>
>
> Is this what you are looking for? I hope this helps.
>
> Chel Hee Lee
>
>
> On 01/28/2015 05:55 PM, Kate Ignatius wrote:
>>
>> I have genetic data as follows (simple example, actual data is much
>> larger):
>>
>> comb
>>
>> ID1 A A T G C T G C G T C G T A
>>
>> ID2 G C T G C C T G C T G T T T
>>
>> And I wish to get an output like this:
>>
>> ID1 AA TG CT GC GT CG TA
>>
>> ID2 GC TG CC TG CT GT TT
>>
>> That is, paste every two columns together.
>>
>> I have this code, but I get the error:
>>
>> Error in seq.default(2, nchar(x), 2) : 'to' must be of length 1
>>
>> conc <- function(x) {
>>   s <- seq(2, nchar(x), 2)
>>   paste0(x[s], x[s+1])
>> }
>>
>> combn <- as.data.frame(lapply(comb, conc), stringsAsFactors=FALSE)
>>
>> Thanks in advance!
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
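The error in the original conc() appears to come from two things: lapply() on a data frame walks over columns rather than rows, and nchar(x) on a vector returns one count per element, so seq() receives a 'to' argument that is longer than 1. Below is a minimal sketch of the index-based approach applied to every row at once. It assumes that comb is a data frame with one row per ID and one single-letter column per position; that layout, and the names m and paired, are assumptions for illustration and not something the thread confirms.

```r
# Assumed layout: one row per ID, one single-letter column per position.
comb <- data.frame(
  rbind(
    ID1 = c("A", "A", "T", "G", "C", "T", "G", "C", "G", "T", "C", "G", "T", "A"),
    ID2 = c("G", "C", "T", "G", "C", "C", "T", "G", "C", "T", "G", "T", "T", "T")
  ),
  stringsAsFactors = FALSE
)

# Work on a character matrix so that column indexing keeps the rows together.
m <- as.matrix(comb)
stopifnot(ncol(m) %% 2L == 0L)  # the pairing assumes an even number of columns

i <- seq.int(1L, ncol(m), by = 2L)  # 1, 3, 5, ...: first column of each pair

# paste0() is vectorised and returns a plain vector in column-major order,
# so the result is reshaped back to one row per ID.
paired <- matrix(paste0(m[, i], m[, i + 1L]), nrow = nrow(m),
                 dimnames = list(rownames(m), NULL))
paired
```

Because everything is done with vectorised indexing, there is no per-row loop, which should matter when the real data is much larger than the example.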
Hi Bert! Yes, you are VERY correct!!! Why am I making this simple thing so
complicated??? ;) Thank you so much for your nice lesson!

Chel Hee Lee

On 01/28/2015 09:59 PM, Bert Gunter wrote:
> Much simpler and more efficient (if I understand correctly)
>
> i <- seq.int(1L,length(ID1),by = 2L)
> paste0(ID1[i],ID1[i+1])
Kate, here's a solution that uses regular expressions, rather than vector
manipulation:

> mystr = "ID1 A A T G C T G C G T C G T A"
> gsub(" ([ACGT]) ([ACGT])", " \\1\\2", mystr)
[1] "ID1 AA TG CT GC GT CG TA"

-John

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Chel Hee Lee
> Sent: Wednesday, January 28, 2015 11:07 PM
> To: Bert Gunter
> Cc: r-help
> Subject: Re: [R] Paste every two columns together
>
> Hi Bert! yes, you are VERY correct!!! Why am I making this simple thing so
> complicated??? ;) Thank you so much for your nice lesson!
>
> Chel Hee Lee
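Since gsub() is vectorised over its input, the same pattern should carry over to several rows at once. This is only a sketch: it assumes each row is available as a single space-separated string, and the names rows and paired are made up for illustration.

```r
# Hypothetical input: one space-separated string per ID.
rows <- c("ID1 A A T G C T G C G T C G T A",
          "ID2 G C T G C C T G C T G T T T")

# Each match consumes " X Y" and writes back " XY". Matches never overlap,
# so the letters are paired from left to right within each string.
paired <- gsub(" ([ACGT]) ([ACGT])", " \\1\\2", rows)
paired
```

Two caveats with this approach: a row with an odd number of letters leaves its last letter unpaired, and any character outside A, C, G and T (a missing-data code, for example) is not matched by the pattern and may shift the pairing of the letters that follow it.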