Dear R People:

I have the following set of data

> Block[1:5]
[1] "5600-5699" "6100-6199" "9700-9799" "9400-9499" "8300-8399"

and I want to split at the -

> strsplit(Block[1:5],"-")
[[1]]
[1] "5600" "5699"

[[2]]
[1] "6100" "6199"

[[3]]
[1] "9700" "9799"

[[4]]
[1] "9400" "9499"

[[5]]
[1] "8300" "8399"

>

What is the best way to extract the pieces that are to the left of the
dash, please?

Thanks,
Erin

--
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com

Hi Erin,
this is one way:
Block <- c("5600-5699","6100-6199","9700-9799","9400-9499","8300-8399")
splBlock <- strsplit(Block,"-")
sapply(splBlock, "[", 1)
greetings,
Remko
unlist(strsplit(Block[1:5], "-.+$"))
If you are going to want the other pieces later, the most efficient
way depends on the assumptions you can make about your data. If there
are always two elements from the split:
matrix(unlist(strsplit(Block[1:5], "-")), ncol = 2, byrow = TRUE)
## or
do.call("rbind", strsplit(Block[1:5], "-"))
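
Both one-liners above rely on every element splitting into exactly two pieces. A minimal sketch of checking that assumption first, using a made-up vector `blocks` (not from the original post) whose last element has no dash:

```r
# Hypothetical input; the last element has no dash at all.
blocks <- c("5600-5699", "6100-6199", "9700")

pieces <- strsplit(blocks, "-", fixed = TRUE)

# Count the pieces per element; the matrix approach assumes exactly two.
n_pieces <- vapply(pieces, length, integer(1))
all(n_pieces == 2)

# Only bind the elements that satisfy the assumption.
ok <- n_pieces == 2
parts <- do.call(rbind, pieces[ok])
parts
```

Without such a check, matrix() recycles or warns and rbind() recycles the short rows, so a malformed element can shift values into the wrong column.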
The first option (dropping everything after the -) is marginally more
efficient, followed by the matrix technique. A series of clunkier
options (in my view) would be:
unlist(strsplit(Block[1:5], "-"))[seq(from = 1, to = 2 *
length(Block[1:5]), by = 2)]
Or, very flexible in terms of extracting the first element (regardless
of how many there are), but computationally less efficient:
sapply(strsplit(Block[1:5], "-"), `[[`, 1)
But this is only slightly less efficient, and when tested on a simple
character vector of length 10^8, it still completed in less than 1 second
on a 1.66 GHz dual core running R-devel r57214 on Windows x64.
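
For anyone who wants to repeat the comparison on their own machine, here is a rough sketch using system.time(). The test vector `x` and its size are my own choices for illustration, not the ones used for the timings quoted above, so the numbers will differ:

```r
# Hypothetical test vector: the five values from the thread, repeated.
x <- rep(c("5600-5699", "6100-6199", "9700-9799", "9400-9499", "8300-8399"),
         times = 2e4)

# Time three of the approaches, keeping each result for comparison.
t_regex  <- system.time(a <- unlist(strsplit(x, "-.+$")))
t_matrix <- system.time(b <- matrix(unlist(strsplit(x, "-")),
                                    ncol = 2, byrow = TRUE)[, 1])
t_sapply <- system.time(d <- sapply(strsplit(x, "-"), `[[`, 1))

# The timings only mean something if all approaches agree.
stopifnot(identical(a, b), identical(a, d))

# Elapsed seconds for each approach.
c(regex  = t_regex[["elapsed"]],
  matrix = t_matrix[["elapsed"]],
  sapply = t_sapply[["elapsed"]])
```

A single system.time() call is noisy, so repeating each measurement a few times gives a fairer picture.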
Cheers,
Josh
On Tue, Oct 11, 2011 at 10:20 PM, Erin Hodgess <erinm.hodgess at gmail.com> wrote:
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
sapply(strsplit(Block[1:5],"-"), function (x) {x[1]})
comes to mind...
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On Oct 12, 2011, at 1:20 AM, Erin Hodgess wrote:
> What is the best way to extract the pieces that are to the left of the
> dash, please?

> sub("\\-.*$", "", c("5600-5699", "6100-6199", "9700-9799", "9400-9499", "8300-8399") )
[1] "5600" "6100" "9700" "9400" "8300"

--
David Winsemius, MD
West Hartford, CT

On Wed, Oct 12, 2011 at 1:20 AM, Erin Hodgess <erinm.hodgess at gmail.com> wrote:
> I have the following set of data
>> Block[1:5]
> [1] "5600-5699" "6100-6199" "9700-9799" "9400-9499" "8300-8399"
>
> and I want to split at the -

Try this:

> x <- c("5600-5699", "6100-6199", "9700-9799", "9400-9499", "8300-8399")
> sub("-.*", "", x) # before dash
[1] "5600" "6100" "9700" "9400" "8300"
> sub(".*-", "", x) # after dash
[1] "5699" "6199" "9799" "9499" "8399"

and here is another approach:

> library(gsubfn)
> m <- strapply(x, "\\d+", c, simplify = TRUE)
> m
     [,1]   [,2]   [,3]   [,4]   [,5]
[1,] "5600" "6100" "9700" "9400" "8300"
[2,] "5699" "6199" "9799" "9499" "8399"

Now m[1, ] and m[2, ] are the vectors of digits before and after the dash.
Note that c in the strapply call can be replaced with as.numeric if you
want a numeric matrix instead.

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
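
If numeric bounds are the end goal and installing gsubfn is not an option, the two sub() calls can be combined with as.numeric() in base R. This is only a sketch; the names `lower`, `upper` and `ranges` are made up for illustration and assume every element has the form digits-dash-digits:

```r
x <- c("5600-5699", "6100-6199", "9700-9799", "9400-9499", "8300-8399")

# Strip everything from the dash onwards, then convert to numeric.
lower <- as.numeric(sub("-.*", "", x))

# Strip everything up to and including the dash, then convert.
upper <- as.numeric(sub(".*-", "", x))

# One row per block, with the two ends of each range as numeric columns.
ranges <- data.frame(lower = lower, upper = upper)
ranges
```

Elements that do not match the assumed pattern would come back as NA with a warning from as.numeric(), which is a cheap way to spot malformed input.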