Hi,

I wonder whether there is any convenient function (or package) to extract tables from an HTML page, e.g. from
http://www.google.com/finance/historical?q=SHE:002251

I know we can use readLines('URL'), then gsub('<td>...', '...', source), and so on, and eventually get the numbers. I'm writing to ask whether someone has already contributed a more general function (with the XML package or other packages). Thanks!

Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: +86-(0)10-82509086 Fax: +86-(0)10-82509086
Mobile: +86-15810805877
Homepage: http://www.yihui.name
School of Statistics, Room 1037, Mingde Main Building,
Renmin University of China, Beijing, 100872, China
Yihui Xie wrote:
> I wonder whether there is any convenient function (or package) to
> extract tables from a HTML page? e.g. from
> http://www.google.com/finance/historical?q=SHE:002251

Try a search of the R archives (I prefer the MarkMail search):
<http://r-project.markmail.org/search/?q=extract%20html>

Dieter
Gabor Grothendieck
2009-Apr-03 10:56 UTC
[R] extract tables as data.frames from HTML source
If the question is specific to getting stock data from Google Finance, then check out getSymbols.google in the quantmod package. Also note that there is an r-sig-finance list for questions pertaining to R and finance.
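For the more general case asked about in the original post (any HTML table, not just stock quotes), one possible approach is the readHTMLTable() function from the XML package, which the poster already mentions. The following is only a minimal sketch: the HTML snippet and its values are invented sample data so that the example runs without network access, and the column names Date and Close are assumptions, not the layout of the Google Finance page.

```r
library(XML)

# Invented sample page standing in for a real URL (assumption: one table,
# first row holds the column names).
html <- "<html><body><table>
  <tr><th>Date</th><th>Close</th></tr>
  <tr><td>2009-04-01</td><td>10.5</td></tr>
  <tr><td>2009-04-02</td><td>11.0</td></tr>
</table></body></html>"

# Parse the string as HTML; for a live page, a URL or file name could be
# passed to htmlParse() instead of text.
doc <- htmlParse(html, asText = TRUE)

# Extract the first table as a data.frame, treating the first row as header.
prices <- readHTMLTable(doc, which = 1, header = TRUE,
                        stringsAsFactors = FALSE)

# Cell values arrive as character strings, so convert numeric columns by hand.
prices$Close <- as.numeric(prices$Close)

str(prices)
```

Compared with readLines() plus gsub(), this relies on a real HTML parser, so it should be less sensitive to attributes on the <td> tags or to line breaks inside cells. Pages that build their tables with JavaScript would still not be covered by this sketch.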