Displaying 11 results from an estimated 11 matches for "theurl".
2010 Jul 03
1
XML and RCurl: problem with encoding (htmlTreeParse)
Hi All,
First method:-
>library(XML)
>theurl <- "http://home.sina.com"
>download.file(theurl, "tmp.html")
>txt <- readLines("tmp.html")
>txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes =
TRUE)
>g <- xpathSApply(txt, "//p", function(x) xmlValue(x))
>head(...
2009 Oct 15
1
Removing Embedded Null characters from text/html
...9; characters. These seem to indicate to R
that it should stop processing the page so I'd like to remove them.
I've been looking around and can't seem to identify exactly what the
character is and consequently how to remove it.
# THE CODE WORKS ON THIS PAGE
library(RCurl)
library(XML)
theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team"
webpage <- getURL(theurl)
# BUT DOES NOT WORK HERE DUE TO EMBEDDED NULL CHARACTERS
theurl <- "http://screen.yahoo.com/b?pr=1/&s=nm&db=stocks&vw=0&b=21"
webpage <- getURL(theurl)
Error i...
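One possible way to drop embedded nul bytes, sketched with base R only: work on the raw bytes and filter out 0x00 before converting to a string. The bytes below are made up for illustration. In the thread's setting they would presumably come from a binary download (for example RCurl's getBinaryURL), which is an assumption and not something the post shows.

```r
# Hypothetical page content as raw bytes, with one embedded nul (0x00).
page_bytes <- as.raw(c(0x48, 0x69, 0x00, 0x21))

# rawToChar(page_bytes) would fail here because of the embedded nul,
# so drop the nul bytes first and convert afterwards.
clean_bytes <- page_bytes[page_bytes != as.raw(0)]
webpage <- rawToChar(clean_bytes)
webpage
```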
2012 Aug 09
2
read htm table error
Hi, I am using R version 2.15 and I haven't been able to read an HTML table. Following are my code and the error message.
Error in htmlParse(doc) :
error in creating parser for http://en.wikipedia.org/wiki/Brazil_national_football_team
theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team"
tables <- readHTMLTable(theurl)
Regards,
Kiung
2009 Oct 14
2
puzzle using gsub (and encodings maybe)
Hello,
Below is some output that shows my issue.
I have a variable x that I read from a file (more on this below)
> x
[1] "NEW YORK NEW ENGLAND"
> gsub(" -", "-", x) # this does not work!
[1] "NEW YORK NEW ENGLAND"
> Encoding(x) # is x in a special encoding? no
[1] "unknown"
> y = "NEW YORK -NEW
2010 Oct 06
2
Converting scraped data
Dear Colleagues,
I used this code to scrape data from the URL contained within. This
code should be reproducible.
require("XML")
library(XML)
theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm"
tables <- readHTMLTable(theurl)
n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
class(tables)
test<-data.frame(tables, stringsAsFactors=FALSE)
test[16,c(2:5)]
as.numeric(test[16,c(2:5)])
quartz()
plot(c(1:4), test[15...
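If the scraped columns came back as factors (an assumption, since the post is cut off before it says what went wrong), calling as.numeric() on them directly returns the internal level codes rather than the displayed numbers. A minimal base R sketch, with a made-up one-row data frame standing in for test[16, 2:5]:

```r
# Made-up stand-in for one scraped row whose columns are factors.
scraped_row <- data.frame(
  V2 = factor("10.5"),
  V3 = factor("3"),
  V4 = factor("7.25")
)

# Convert each column via character so the printed value, not the
# factor's integer code, is what gets turned into a number.
row_values <- vapply(
  scraped_row,
  function(col) as.numeric(as.character(col)),
  numeric(1)
)
row_values
```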
2010 Oct 10
1
Create single vector after looping through multiple data frames with GREP
...chlab2.ucr.edu/rwiki/index.php/R_Code_Snippets#unfactor
>> # Transform a factor back into its factor names
>> {
>> return(levels(factors)[factors])
>> }
>>
>> Then, to get your data to where you want it, I'd do this:
>>
>> require(XML)
>> theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm"
>> tables <- readHTMLTable(theurl)
>> n.rows <- unlist(lapply(tables, function(t) dim(t)[1]))
>> class(tables)
>> test<-data.frame(tables, stringsAsFactors=FALSE)
>>
>>
>> result &...
2009 Dec 31
3
XML and RCurl: problem with encoding (htmlTreeParse)
Hi,
I'm trying to get data from a web page and modify it in R. I have a
problem with encoding: I'm not able to get the
encoding right in the htmlTreeParse command. See below
> library(RCurl)
> library(XML)
>
> site <- getURL("http://www.aarresaari.net/jobboard/jobs.html")
> txt <- readLines(tc <- textConnection(site)); close(tc)
> txt <- htmlTreeParse(txt,
2008 Dec 17
1
Extract Data from a Webpage
Hi All:
I would like to extract the provider name, address, and phone number
from multiple webpages like this:
http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490
Based on searching R-help archives, it seems like the XML package
might have something useful for this task. I can load the XML package
and supply the url as an argument to
2012 Jun 07
1
How to set cookies in RCurl
...d read its content. The website is a
restricted access website that I access through a proxy server (which
therefore requires me to enable cookies). I have problems in allowing RCurl
to receive and send cookies.
The following lines give me:
library(RCurl)
library(XML)
url <- "http://www.theurl.com"
content <- readHTMLTable(url)
content
$`NULL`
V1
1...
2011 Nov 16
1
Checking for monotonic sequence
I am scraping data from a web page using XML (excellent package BTW - that's scraping data the easy way!).
So far, I've got the code:
tables <- readHTMLTable(theurl)
rhf <- tables$tabResHistFull
div1 <- rhf[which(rhf$V1=="Div ps"),]
div1
which is giving me the result:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15
15 Div ps p 32.31 35.64 40.17 42.55 45.13 46.36+17.22 51.11 55.72 70.78 71.72 76....
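The question is cut off above, so this is only a guess at the kind of check intended: a base R sketch that tests whether a numeric vector never decreases. The function name is hypothetical and the sample values are only loosely based on the row shown. Missing values are not handled.

```r
# Hypothetical helper: TRUE when every value is >= the one before it.
is_non_decreasing <- function(x) {
  x <- as.numeric(x)
  all(diff(x) >= 0)
}

is_non_decreasing(c(32.31, 35.64, 40.17, 42.55))  # increasing values
is_non_decreasing(c(32.31, 35.64, 17.22, 51.11))  # contains a drop
```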
2006 Apr 10
6
detecting browser type?
I'm wondering how I can detect the browser type for the client. I know this
is possible, but I can't seem to find how to do this, nor any example code
for this.
I would appreciate if someone could point me to some info or just give me an
explanation.
thanks in advance.