Hi all,

I have been beating my head against this problem for a bit, but I can't figure it out.

I have a series of strings of variable length, and each will have one or more numbers, of varying format. E.g., I might have:

  tmpstr = "The first number is: 32. Another one is: 32.1. Here's a number in scientific format, 0.3523e10, and another, 0.3523e-10, and a negative, -313.1"

How could I get R to just give me a list of numerics containing the numbers therein?

Thanks very much to the regexp wizards!

Cheers,
Nick

--
====================================================
Nicholas J. Matzke
Ph.D. Candidate, Graduate Student Researcher

Huelsenbeck Lab
Center for Theoretical Evolutionary Genomics
4151 VLSB (Valley Life Sciences Building)
Department of Integrative Biology
University of California, Berkeley

Graduate Student Instructor, IB200B
Principles of Phylogenetics: Ecology and Evolution
http://ib.berkeley.edu/courses/ib200b/
http://phylo.wikidot.com/

Lab websites:
http://ib.berkeley.edu/people/lab_detail.php?lab=54
http://fisher.berkeley.edu/cteg/hlab.html
Dept. personal page: http://ib.berkeley.edu/people/students/person_detail.php?person=370
Lab personal page: http://fisher.berkeley.edu/cteg/members/matzke.html
Lab phone: 510-643-6299
Dept. fax: 510-643-6264

Cell phone: 510-301-0179
Email: matzke at berkeley.edu

Mailing address:
Department of Integrative Biology
1005 Valley Life Sciences Building #3140
Berkeley, CA 94720-3140

-----------------------------------------------------
"[W]hen people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together."

Isaac Asimov (1989). "The Relativity of Wrong." The Skeptical Inquirer, 14(1), 35-44. Fall 1989.
http://chem.tufts.edu/AnswersInScience/RelativityofWrong.htm

Hi,

One way would be:

  library(stringr)
  tmpstr = "The first number is: 32. Another one is: 32.1. Here's a number in scientific format, 0.3523e10, and another, 0.3523e-10, and a negative, -313.1"
  # Longer alternatives come first, so the result does not depend on whether
  # the regex engine picks the first or the longest matching alternative.
  pattern <- "(\\d+\\.\\d+e-\\d+)|(\\d+\\.\\d+e\\d+)|(-\\d+\\.\\d+)|(\\d+\\.\\d+)|(\\d+)"
  str_extract_all(tmpstr, pattern)[[1]]
  #[1] "32"         "32.1"       "0.3523e10"  "0.3523e-10" "-313.1"
  as.numeric(str_extract_all(tmpstr, pattern)[[1]])

A.K.

----- Original Message -----
From: Nick Matzke <matzke at berkeley.edu>
To: R-help at r-project.org
Sent: Sunday, June 16, 2013 1:06 AM
Subject: [R] extract all numbers from a string

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
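
The separate alternatives in the pattern above can probably be folded into a single expression: an optional sign, digits, an optional fractional part, and an optional exponent. The following is only a sketch in base R (no extra packages), assuming every number has at least one digit before the decimal point, so something like ".5" would not be picked up whole. The name num_pattern is chosen here for illustration and is not from the thread.

```r
tmpstr <- "The first number is: 32. Another one is: 32.1. Here's a number in scientific format, 0.3523e10, and another, 0.3523e-10, and a negative, -313.1"

# Optional sign, digits, optional fraction, optional exponent.
# Assumption: a digit always precedes the decimal point.
num_pattern <- "-?[0-9]+(\\.[0-9]+)?([eE][-+]?[0-9]+)?"

# gregexpr() finds every match; regmatches() pulls out the matched text.
matches <- regmatches(tmpstr, gregexpr(num_pattern, tmpstr))[[1]]

# Convert the matched strings to numerics.
nums <- as.numeric(matches)
print(matches)
print(nums)
```

Because the fraction is only matched when a digit follows the dot, the sentence-ending period after "32" is left out of the match.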

Nick

try

  as.numeric( strsplit(gsub("[[:alpha:][:punct:][:space:]]{2,}",",",tmpstr),",")[[1]][-1] )

see ?regexpr for information

HTH

Duncan

Duncan Mackay
Department of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mackay at northnet.com.au
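
One caveat with splitting on runs of two or more letters, punctuation, or spaces: on the example string, the "e-" in 0.3523e-10 is itself such a run, and the minus sign in front of 313.1 is part of the run before it, so that number would seem to be split in two and the sign lost. Since the original question mentions a series of strings, a possible alternative is to match the numbers directly and apply that to a whole character vector. This is only a sketch; extract_numbers is a hypothetical helper named for illustration, and it assumes a digit always precedes the decimal point.

```r
# Hypothetical helper (not from the thread): returns a list with one
# numeric vector per input string.
extract_numbers <- function(x) {
  # Optional sign, digits, optional fraction, optional exponent.
  pattern <- "-?[0-9]+(\\.[0-9]+)?([eE][-+]?[0-9]+)?"
  # regmatches() returns a list of character vectors, one per element of x;
  # strings without numbers give character(0), which becomes numeric(0).
  lapply(regmatches(x, gregexpr(pattern, x)), as.numeric)
}

strs <- c("a negative, -313.1, and another, 0.3523e-10",
          "no numbers here",
          "first 32. then 32.1.")
res <- extract_numbers(strs)
print(res)
```

Returning a list keeps the numbers grouped by the string they came from; unlist(res) would flatten them into a single numeric vector if the grouping is not needed.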