Dustin Fife
2014-Mar-18 17:43 UTC
[R] automatically replacing the third period with a break
I've got a dataset with really long column names (e.g.,
CYJ.OSU.OAV.UJC.BUT.RDI). What I'd like to do is replace the fourth period
with a break ("\n") so that when it plots, it will not run off the
page.
Here's what I've got so far:
#### create fake names function
fake.names = function(x){
  paste0(LETTERS[sample(1:26,3)], collapse="")
}
#### create the fake names
fake = paste0(unlist(lapply(1:6, fake.names)), collapse=".")
#### replace fourth period with \n
gsub("[[:alnum:]]\\.[[:alnum:]]+\\.[[:alnum:]]+\\.[[:alnum:]]+\\.",
"[[:alnum:]]\\.[[:alnum:]]+\\.[[:alnum:]]+\\.[[:alnum:]]+\n",fake)
which results in something like:
"TW[[:alnum:]].[[:alnum:]]+.[[:alnum:]]+.[[:alnum:]]+\nNQJ.VSI"
Any ideas on how to make it replace that?
Thomas Lumley
2014-Mar-18 19:26 UTC
[R] automatically replacing the third period with a break
On Tue, Mar 18, 2014 at 12:43 PM, Dustin Fife <fife.dustin@gmail.com> wrote:
> I've got a dataset with really long column names (e.g.,
> CYJ.OSU.OAV.UJC.BUT.RDI). What I'd like to do is replace the fourth period
> with a break ("\n") so that when it plots, it will not run off the page.
> Here's what I've got so far:
>
> #### create fake names function
> fake.names = function(x){
>   paste0(LETTERS[sample(1:26,3)], collapse="")
> }
> #### create the fake names
> fake = paste0(unlist(lapply(1:6, fake.names)), collapse=".")

Use backreferences:

cat( gsub("(([[:alnum:]]+\\.){3})([[:alnum:]]+)\\.", "\\1\\3\n", fake ) )

That is, match three word/period sequences, match a word, match a period, and output the first two things. Groups are numbered by their opening parenthesis, so the three word/period sequences are \\1 and the fourth word is \\3; \\2 is the inner repeated group and holds only the third word and its period.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
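The backreference idea can be checked on a fixed name rather than a random one. The following is a minimal sketch, not a definitive solution: it assumes the example name from the original post, and the variable names (x, short, wrapped) are made up for illustration. It uses sub() with a leading ^ anchor so that only the fourth period is replaced, and it relies on capture groups being numbered by their opening parenthesis, which makes the fourth word group 3 rather than group 2.

```r
# Fixed example name (taken from the original post) so the result is reproducible.
x <- "CYJ.OSU.OAV.UJC.BUT.RDI"

# Capture groups are numbered by their opening parenthesis:
#   \\1 = the first three "word." sequences      ("CYJ.OSU.OAV.")
#   \\2 = the inner repeated group, last match   ("OAV.")
#   \\3 = the fourth word                        ("UJC")
# The final \\. matches the fourth period, which the replacement
# swaps for a newline. sub() changes only the first match.
pattern <- "^(([[:alnum:]]+\\.){3})([[:alnum:]]+)\\."
wrapped <- sub(pattern, "\\1\\3\n", x)

cat(wrapped, "\n")

# A name with fewer than four periods does not match and is returned unchanged.
short <- "AB.CD"
unchanged <- sub(pattern, "\\1\\3\n", short)
```

Because sub() is vectorised, the same call should work on a whole vector of names, for example something like names(dat) <- sub(pattern, "\\1\\3\n", names(dat)), where dat is a hypothetical data frame.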