Dustin Fife
2014-Mar-18 17:43 UTC
[R] automatically replacing the third period with a break
I've got a dataset with really long column names (e.g., CYJ.OSU.OAV.UJC.BUT.RDI). What I'd like to do is replace the fourth period with a break ("\n") so that when it plots, it will not run off the page. Here's what I've got so far:

    #### create fake names function
    fake.names = function(x){
      paste0(LETTERS[sample(1:26,3)], collapse="")
    }
    #### create the fake names
    fake = paste0(unlist(lapply(1:6, fake.names)), collapse=".")
    #### replace fourth period with \n
    gsub("[[:alnum:]]\\.[[:alnum:]]+\\.[[:alnum:]]+\\.[[:alnum:]]+\\.",
         "[[:alnum:]]\\.[[:alnum:]]+\\.[[:alnum:]]+\\.[[:alnum:]]+\n", fake)

which results in something like:

    "TW[[:alnum:]].[[:alnum:]]+.[[:alnum:]]+.[[:alnum:]]+\nNQJ.VSI"

Any ideas on how to make it replace that?
Thomas Lumley
2014-Mar-18 19:26 UTC
[R] automatically replacing the third period with a break
On Tue, Mar 18, 2014 at 12:43 PM, Dustin Fife <fife.dustin@gmail.com> wrote:
> I've got a dataset with really long column names (e.g.,
> CYJ.OSU.OAV.UJC.BUT.RDI). What I'd like to do is replace the fourth period
> with a break ("\n") so that when it plots, it will not run off the page.
> Here's what I've got so far:
>
> #### create fake names function
> fake.names = function(x){
>   paste0(LETTERS[sample(1:26,3)], collapse="")
> }
> #### create the fake names
> fake = paste0(unlist(lapply(1:6, fake.names)), collapse=".")

Backreferences:

    cat( gsub("(([[:alnum:]]+\\.){3})([[:alnum:]]+)\\.", "\\1\\3\n", fake ) )

That is, match three word/period sequences, match a word, match a period, and output the first two things. Those are capture groups 1 and 3; group 2 is the inner repeated word/period group, which holds only its last match, so it must not appear in the replacement. The replacement string is taken literally apart from backreferences, which is why the character classes in the original attempt showed up verbatim in the output.

-thomas

--
Thomas Lumley
Professor of Biostatistics
University of Auckland
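
For anyone who wants to reuse the backreference idea on other names, here is a minimal sketch that wraps it in a function. The helper name break_at_period and its argument n are made up for illustration and are not from the thread. It assumes the names consist only of alphanumeric words separated by single periods, and it uses perl = TRUE so that a non-capturing group keeps the numbering down to a single backreference.

```r
# Hypothetical helper (not from the thread): replace the n-th period
# in each element of x with a newline.
break_at_period <- function(x, n = 4) {
  # Group 1 captures the first n words together with the (n - 1) periods
  # between them; the n-th period sits outside the group and is dropped.
  # (?:...) is a non-capturing group, so it does not shift the numbering.
  pattern <- sprintf("^((?:[[:alnum:]]+\\.){%d}[[:alnum:]]+)\\.", n - 1)
  # sub() replaces only the first match; the replacement is literal text
  # except for the \\1 backreference.
  sub(pattern, "\\1\n", x, perl = TRUE)
}

fake <- "CYJ.OSU.OAV.UJC.BUT.RDI"
wrapped <- break_at_period(fake)
cat(wrapped, "\n")
# CYJ.OSU.OAV.UJC
# BUT.RDI
```

Names with fewer than n periods do not match the pattern and come back unchanged, so the function can be applied to a whole vector of column names, e.g. names(df) <- break_at_period(names(df)), where df stands for whatever data frame is being plotted.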