search for: theurl

Displaying 11 results from an estimated 11 matches for "theurl".

2010 Jul 03
1
XML and RCurl: problem with encoding (htmlTreeParse)
Hi All, First method:- >library(XML) >theurl <- "http://home.sina.com" >download.file(theurl, "tmp.html") >txt <- readLines("tmp.html") >txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes = TRUE) >g <- xpathSApply(txt, "//p", function(x) xmlValue(x)) >head(...
2009 Oct 15
1
Removing Embedded Null characters from text/html
... characters. These seem to indicate to R that it should stop processing the page so I'd like to remove them. I've been looking around and can't seem to identify exactly what the character is and consequently how to remove it. # THE CODE WORKS ON THIS PAGE library(RCurl) library(XML) theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team" webpage <- getURL(theurl) # BUT DOES NOT WORK HERE DUE TO EMBEDDED NULL CHARACTERS theurl <- "http://screen.yahoo.com/b?pr=1/&s=nm&db=stocks&vw=0&b=21" webpage <- getURL(theurl) Error i...
2012 Aug 09
2
read htm table error
Hi, I am using R version 2.15 and I haven't been able to read an HTML table. Following is my code and error message. Error in htmlParse(doc) : error in creating parser for http://en.wikipedia.org/wiki/Brazil_national_football_team theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team" tables <- readHTMLTable(theurl) Regards, Kiung
2009 Oct 14
2
puzzle using gsub (and encodings maybe)
Hello, Below is some output that shows my issue. I have a variable x that I read from a file (more on this below) > x [1] "NEW YORK NEW ENGLAND" > gsub(" -", "-", x) # this does not work! [1] "NEW YORK NEW ENGLAND" > Encoding(x) # is x in a special encoding? no [1] "unknown" > y = "NEW YORK -NEW
2010 Oct 06
2
Converting scraped data
Dear Colleagues, I used this code to scrape data from the URL contained within. This code should be reproducible. require("XML") library(XML) theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm" tables <- readHTMLTable(theurl) n.rows <- unlist(lapply(tables, function(t) dim(t)[1])) class(tables) test<-data.frame(tables, stringsAsFactors=FALSE) test[16,c(2:5)] as.numeric(test[16,c(2:5)]) quartz() plot(c(1:4), test[15...
2010 Oct 10
1
Create single vector after looping through multiple data frames with GREP
...chlab2.ucr.edu/rwiki/index.php/R_Code_Snippets#unfactor >> # Transform a factor back into its factor names >> { >> return(levels(factors)[factors]) >> } >> >> Then, to get your data to where you want it, I'd do this: >> >> require(XML) >> theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm" >> tables <- readHTMLTable(theurl) >> n.rows <- unlist(lapply(tables, function(t) dim(t)[1])) >> class(tables) >> test<-data.frame(tables, stringsAsFactors=FALSE) >> >> >> result &...
2009 Dec 31
3
XML and RCurl: problem with encoding (htmlTreeParse)
Hi, I'm trying to get data from a web page and modify it in R. I have a problem with encoding: I'm not able to get the encoding right in the htmlTreeParse command. See below > library(RCurl) > library(XML) > > site <- getURL("http://www.aarresaari.net/jobboard/jobs.html") > txt <- readLines(tc <- textConnection(site)); close(tc) > txt <- htmlTreeParse(txt,
2008 Dec 17
1
Extract Data from a Webpage
Hi All: I would like to extract the provider name, address, and phone number from multiple webpages like this: http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490 Based on searching R-help archives, it seems like the XML package might have something useful for this task. I can load the XML package and supply the url as an argument to
2012 Jun 07
1
How to set cookies in RCurl
...d read its content. The website is a restricted access website that I access through a proxy server (which therefore requires me to enable cookies). I have problems getting RCurl to receive and send cookies. The following lines give me: library(RCurl) library(XML) url <- "http://www.theurl.com" content <- readHTMLTable(url) content $`NULL` V1 1...
2011 Nov 16
1
Checking for monotonic sequence
I am scraping data from a web page using XML (excellent package BTW - that's scraping data the easy way!). So far, I've got the code: tables <- readHTMLTable(theurl) rhf <- tables$tabResHistFull div1 <- rhf[which(rhf$V1=="Div ps"),] div1 which is giving me the result:        V1 V2    V3    V4    V5    V6    V7          V8    V9   V10   V11   V12   V13   V14  V15 15 Div ps  p 32.31 35.64 40.17 42.55 45.13 46.36+17.22 51.11 55.72 70.78 71.72 76....
2006 Apr 10
6
detecting browser type?
I'm wondering how I can detect the browser type for the client. I know this is possible, but I can't seem to find how to do this, nor any example code for it. I would appreciate it if someone could point me to some info or just give me an explanation. Thanks in advance.