If you only want the character strings, this seems a little simpler:
> strsplit("a bc,def, adef ,,gh", "[ ,]+", perl=T)
[[1]]
[1] "a"    "bc"   "def"  "adef" "gh"
If you need the delimiters (the commas), you could then add them back in
afterwards.
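As a rough sketch of one way to keep the commas without adding them back by hand (an alternative not taken from the thread: it matches the tokens directly with base R's regmatches() and gregexpr() instead of splitting; the variable names x and tokens are only illustrative):

```r
x <- "a bc,def, adef ,,gh"

# Match either a run of characters that are neither space nor comma,
# or a single comma, and return every match in order.
tokens <- regmatches(x, gregexpr("[^ ,]+|,", x))[[1]]
print(tokens)
# [1] "a"    "bc"   ","    "def"  ","    "adef" ","    ","    "gh"
```

Because this extracts matches rather than splitting on zero-width positions, it should not produce empty strings between tokens.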
Tim
------------------------------
Message: 2
Date: Thu, 4 May 2023 23:59:33 +0300
From: Leonard Mada <leo.mada at syonic.eu>
To: R-help Mailing List <r-help at r-project.org>
Subject: [R] Regex Split?
Message-ID: <7b1cdbe7-0086-24b4-9da6-369296eadfdc at syonic.eu>
Content-Type: text/plain; charset="utf-8"; Format="flowed"
Dear R-Users,
I tried the following 3 Regex expressions in R 4.3:
strsplit("a bc,def, adef ,,gh", " |(?=,)|(?<=,)(?![ ])", perl=T)
# "a"    "bc"   ","    "def"  ","    ""     "adef" ","    ","    "gh"
strsplit("a bc,def, adef ,,gh", " |(?<! )(?=,)|(?<=,)(?![ ])", perl=T)
# "a"    "bc"   ","    "def"  ","    ""     "adef" ","    ","    "gh"
strsplit("a bc,def, adef ,,gh", " |(?<! )(?=,)|(?<=,)(?=[^ ])", perl=T)
# "a"    "bc"   ","    "def"  ","    ""     "adef" ","    ","    "gh"
Is this correct?
I feel that:
- none should return (after "def"): ",", "";
- the first one could also return "", "," (but probably not; not fully
sure about this);
Sincerely,
Leonard
------------------------------