Dear Gurus,

My first steps with R have ground to a halt! I have a vector of sample identifiers

> sampleIDs
 [1] "D1_1"   "D1_2"   "D1_3"   "D1_4"   "D1_5"   "D1_6"   "D1_7"   "D1_8"
 [9] "D1_9"   "D1_10"  "D1_11"  "D1_12"  "F1_13"  "F1_14"  "F1_15"  "F1_16"
[17] "F1_17"  "F1_18"  "F1_19"  "F1_20"  "F1_21"  "F1_22"  "F1_23"  "F1_24"
[25] "DDC_25" "DDC_26" "DDC_27" "DDC_28" "DDC_29" "DDC_30" "DDC_31" "DDC_32"
[33] "DDC_33" "DDC_34" "DDC_35" "DDC_36"

from which I've split off the prefix identifier using strsplit

> splitIDs <- strsplit( as.character(sampleIDs), "_")
> splitIDs
[[1]]
[1] "D1" "1"

[[2]]
[1] "D1" "2"

[[3]]
[1] "D1" "3"

[[4]]
[1] "D1" "4"

etc.

I am now struggling to work with the prefix identifiers (D1, F1, DDC) because the only way I have figured out to access them is with splitIDs[[i]][1], i.e. it seems like I have to use a loop to get the identifiers into a factor and counted.

Is there a vectorised solution someone can suggest? Or an alternative strategy? These are early days using R for me!

Thanks

regards

M

--
View this message in context: http://r.789695.n4.nabble.com/vectorised-recovery-of-strsplit-value-tp3161254p3161254.html
Sent from the R help mailing list archive at Nabble.com.
There are several ways to get a matrix, for example

mat = as.matrix(as.data.frame(splitIDs))

or

mat = sapply(splitIDs, I)

Either way, each column of mat holds one split ID, so the prefixes end up in the first row, mat[1, ]. True experts may suggest even more ways.

Peter

On Wed, Dec 22, 2010 at 1:02 PM, maddox wrote:
> Is there a vectorised solution someone can suggest?
> Or an alternative strategy .. these are early days using R for me!
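A minimal sketch of the matrix route, assuming the 36 identifiers shown in the original post (rebuilt here with paste0 so the snippet runs on its own). The names prefixes and counts are illustrative only and do not appear in the thread.

```r
# Rebuild the sample identifiers from the original post:
# D1_1..D1_12, F1_13..F1_24, DDC_25..DDC_36
sampleIDs <- paste0(rep(c("D1", "F1", "DDC"), each = 12), "_", 1:36)
splitIDs <- strsplit(sampleIDs, "_")

# Every list element has length 2, so sapply() simplifies the list
# to a character matrix with 2 rows and one column per sample ID.
mat <- sapply(splitIDs, I)
print(dim(mat))

# Row 1 holds the prefixes, row 2 the running numbers.
prefixes <- mat[1, ]
counts <- table(prefixes)
print(counts)
```

This only simplifies to a matrix because every identifier splits into the same number of pieces; with ragged splits sapply() would hand back a list instead.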
Try

sapply(strsplit(sampleIDs, "_"), "[", 1)

HTH,
Jorge

On Wed, Dec 22, 2010 at 4:02 PM, maddox wrote:
> Is there a vectorised solution someone can suggest?
> Or an alternative strategy .. these are early days using R for me!
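Jorge's one-liner works because "[" is an ordinary function in R, so sapply() can call it on each list element with 1 as the index. The sketch below spells that out and then gets the prefixes "into a factor and counted" as the original question asked. The variable names (first_a, first_b, first_c, prefix) are made up for illustration, and the vapply() variant is an addition not mentioned in the thread.

```r
# Rebuild the sample identifiers from the original post
sampleIDs <- paste0(rep(c("D1", "F1", "DDC"), each = 12), "_", 1:36)
splitIDs <- strsplit(sampleIDs, "_")

# "[" used as a function: take element 1 of each split
first_a <- sapply(splitIDs, "[", 1)

# The same thing written with an explicit anonymous function
first_b <- sapply(splitIDs, function(parts) parts[1])

# vapply() does the same but checks that each result is one string
first_c <- vapply(splitIDs, "[", character(1), 1)

stopifnot(identical(first_a, first_b), identical(first_a, first_c))

# Turn the prefixes into a factor and count samples per prefix
prefix <- factor(first_a)
print(levels(prefix))
print(table(prefix))
```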
Thanks Jorge, for your reply.

In the end I changed my approach and used a sub() strategy I found on this forum to recover the prefixes, as below.

IDs.prefix <- sub("([^*])(_.*)", "\\1" , sampleIDs )
IDs.split <- cbind(sampleIDs , IDs.prefix)

Regards

M
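For anyone reading the sub() pattern later: [^*] matches a single character that is not a literal asterisk, so the pattern captures the one character just before the first underscore, matches everything from the underscore onwards, and puts the captured character back. On these identifiers that gives the same result as simply deleting the underscore and everything after it. The sketch below checks this, assuming the 36 identifiers from the original post; the names orig, simpler and IDs are hypothetical.

```r
# Rebuild the sample identifiers from the original post
sampleIDs <- paste0(rep(c("D1", "F1", "DDC"), each = 12), "_", 1:36)

# Pattern as posted in the thread
orig <- sub("([^*])(_.*)", "\\1", sampleIDs)

# Shorter pattern: drop the first underscore and everything after it
simpler <- sub("_.*", "", sampleIDs)

stopifnot(identical(orig, simpler))

# A data frame keeps the prefix as a factor next to the full ID,
# whereas cbind() on two character vectors gives a character matrix.
IDs <- data.frame(sampleID = sampleIDs, prefix = factor(simpler))
print(table(IDs$prefix))
```

Both patterns are vectorised over sampleIDs, so no strsplit() step or loop is needed.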