Hi.

Is there a straightforward way to convert a character string containing comma-delimited numbers to a numeric vector?

In my application, I use

system(executable.string, intern=TRUE)

which returns a string like

"[0.E-38, 2.096751179214927596171268230,
3.678944959657480671183123052, 4.976528845643001020345216157,
6.072390165503099343887569007, 7.007958550337542210168866070,
7.807464185827177139302778736, 8.486139455817034846608029724,
9.053706780665060873259065771, 9.516172308326877463284426111,
9.876856047379733199590985269, 10.13695826383869052536062804,
10.29580989588667234885515374, 10.35092785255025551187463209,
10.29795676261278695909972578, 10.13052574735986793562227138,
9.839990935943625006580521345, 9.414977153151389385186358494,
8.840562526759586215404890348, 8.096830792651667245232639586,
7.156244887881612948153311800, 5.978569259122249264778017262,
4.499809670330265066808481929, 2.602689685444383764768503589,
0.E-38]"

(the output is a single line). In a big run, the string may contain 10^5 or possibly 10^6 numbers.

What's the recommended way to convert this to a numeric vector?

--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
tel 023-8059-7743
You could try strsplit(), e.g.,

strg <- "0.E-38, 2.096751179214927596171268230, 3.678944959657480671183123052"
strg <- paste(rep(strg, 5000), collapse = ", ")
##################
f.out <- factor(strsplit(strg, ", ")[[1]])
n.out <- as.numeric(levels(f.out))[as.integer(f.out)]

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Robin Hankin" <r.hankin at noc.soton.ac.uk>
To: "RHelp help" <r-help at stat.math.ethz.ch>
Sent: Monday, March 19, 2007 10:18 AM
Subject: [R] character to numeric conversion
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
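As a minimal sketch of the strsplit() route applied to the bracketed format from the original question: the brackets have to be removed before conversion, otherwise the first and last fields ("[0.E-38" and "0.E-38]") convert to NA. The helper name parse_bracketed and the shortened sample string x_short are made up for illustration and are not part of the thread.

```r
# Hypothetical helper (not from the thread): strip the enclosing brackets,
# split on commas, and convert the pieces with as.numeric().
parse_bracketed <- function(s) {
  s <- gsub("[][]", "", s)                       # drop "[" and "]"
  pieces <- strsplit(s, ",", fixed = TRUE)[[1]]  # one field per number
  as.numeric(trimws(pieces))                     # trim blanks, then convert
}

# Shortened version of the string posted in the question.
x_short <- "[0.E-38, 2.096751179214927596171268230, 10.13695826383869052536062804, 0.E-38]"
v <- parse_bracketed(x_short)
length(v)
```

Going straight from strsplit() to as.numeric() skips the detour through factor(); whether that matters for 10^5 to 10^6 numbers would have to be timed on the real output.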
Robin Hankin wrote:
> What's the recommended way to convert this to a numeric vector?
scan() on a text connection:

> x <- "[0.E-38, 2.096751179214927596171268230,
+ 3.678944959657480671183123052, 4.976528845643001020345216157,
+ 6.072390165503099343887569007, 7.007958550337542210168866070,
+ 7.807464185827177139302778736, 8.486139455817034846608029724,
+ 9.053706780665060873259065771, 9.516172308326877463284426111,
+ 9.876856047379733199590985269, 10.13695826383869052536062804,
+ 10.29580989588667234885515374, 10.35092785255025551187463209,
+ 10.29795676261278695909972578, 10.13052574735986793562227138,
+ 9.839990935943625006580521345, 9.414977153151389385186358494,
+ 8.840562526759586215404890348, 8.096830792651667245232639586,
+ 7.156244887881612948153311800, 5.978569259122249264778017262,
+ 4.499809670330265066808481929, 2.602689685444383764768503589, 0.E-38]"
> tc <- textConnection(gsub("[][ \n]","",x))
> xx <- scan(tc,sep=",")
Read 25 items
> summary(xx)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   4.977   8.097   7.049   9.840  10.350
> close(tc)

(By far, the hardest bit was getting the gsub regexp right...)

Alternatively, just get rid of the brackets and replace commas with whitespace. A problem with sep="," is that it gets confused by line endings following a comma.

> tc <- textConnection(gsub(",", " ", gsub("[][]", "", x)))
> xx <- scan(tc)
Read 25 items
> summary(xx)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  0.000   4.977   8.097   7.049   9.840  10.350
> close(tc)

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
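A possible shortcut, sketched under the assumption that the installed R is recent enough for scan() to accept a text argument: the string can then be passed directly, without the textConnection()/close() pair. The variable names x_short and cleaned are illustrative only, and the sample string is a shortened stand-in for the real output.

```r
# Sketch only: same idea as the second variant above (brackets removed,
# commas turned into blanks), but read via scan(text = ...).
x_short <- "[0.E-38, 2.096751179214927596171268230,\n3.678944959657480671183123052, 0.E-38]"

cleaned <- gsub(",", " ", gsub("[][]", "", x_short))  # brackets out, commas to blanks
xx <- scan(text = cleaned, quiet = TRUE)              # whitespace-separated doubles
length(xx)
```

Because the commas are replaced by blanks, the embedded line break is just more whitespace to scan(), which sidesteps the sep="," issue mentioned above.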
Here is one way. This matches strings which contain those characters found in a number, converting each such string to numeric.

library(gsubfn)
strapply(x, "[-0-9+.E]+", as.numeric)
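For comparison, here is a rough base-R sketch of the same match-then-convert idea, assuming one would rather not depend on gsubfn. It swaps strapply() for regmatches() and gregexpr() but reuses the pattern from the message above; x_short, tokens and nums are illustrative names, not from the thread.

```r
# Sketch only: extract every run of number-like characters, then convert.
x_short <- "[0.E-38, 2.096751179214927596171268230, 0.E-38]"

# gregexpr() locates all matches; regmatches() pulls them out as strings.
tokens <- regmatches(x_short, gregexpr("[-0-9+.E]+", x_short))[[1]]
nums <- as.numeric(tokens)
length(nums)
```

The pattern accepts any mix of these characters, so it relies on the input being well-formed; a stray "E" or "-" in the text would be extracted and then converted to NA with a warning.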