thr3ads.net - search: "theurl"

Displaying 11 results from an estimated 11 matches for "theurl".

XML and RCurl: problem with encoding (htmlTreeParse)

2010 Jul 03

Hi All, First method:- >library(XML) >theurl <- "http://home.sina.com" >download.file(theurl, "tmp.html") >txt <- readLines("tmp.html") >txt <- htmlTreeParse(txt, error=function(...){}, useInternalNodes = TRUE) >g <- xpathSApply(txt, "//p", function(x) xmlValue(x)) >head(...

Removing Embedded Null characters from text/html

2009 Oct 15

...9; characters. These seem to indicate to R that it should stop processing the page so I'd like to remove them. I've been looking around and can't seem to identify exactly what the character is and consequently how to remove it. # THE CODE WORKS ON THIS PAGE library(RCurl) library(XML) theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team" webpage <- getURL(theurl) # BUT DOES NOT WORK HERE DUE TO EMBEDDED NULL CHARACTERS theurl <- "http://screen.yahoo.com/b?pr=1/&s=nm&db=stocks&vw=0&b=21" webpage <- getURL(theurl) Error i...

read htm table error

2012 Aug 09

Hi, I am using R version 2.15 and I haven't been able to read an HTML table. Following is my code and error message. Error in htmlParse(doc) : error in creating parser for http://en.wikipedia.org/wiki/Brazil_national_football_team theurl <- "http://en.wikipedia.org/wiki/Brazil_national_football_team" tables <- readHTMLTable(theurl) Regards, Kiung

puzzle using gsub (and encodings maybe)

2009 Oct 14

Hello, Below is some output that shows my issue. I have a variable x that I read from a file (more on this below) > x [1] "NEW YORK NEW ENGLAND" > gsub(" -", "-", x) # this does not work! [1] "NEW YORK NEW ENGLAND" > Encoding(x) # is x in a special encoding? no [1] "unknown" > y = "NEW YORK -NEW

Converting scraped data

2010 Oct 06

Dear Colleagues, I used this code to scrape data from the URL contained within. This code should be reproducible. require("XML") library(XML) theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm" tables <- readHTMLTable(theurl) n.rows <- unlist(lapply(tables, function(t) dim(t)[1])) class(tables) test<-data.frame(tables, stringsAsFactors=FALSE) test[16,c(2:5)] as.numeric(test[16,c(2:5)]) quartz() plot(c(1:4), test[15...

Create single vector after looping through multiple data frames with GREP

2010 Oct 10

...chlab2.ucr.edu/rwiki/index.php/R_Code_Snippets#unfactor >> # Transform a factor back into its factor names >> { >> return(levels(factors)[factors]) >> } >> >> Then, to get your data to where you want it, I'd do this: >> >> require(XML) >> theurl <- "http://www.queensu.ca/cora/_trends/mip_2006.htm" >> tables <- readHTMLTable(theurl) >> n.rows <- unlist(lapply(tables, function(t) dim(t)[1])) >> class(tables) >> test<-data.frame(tables, stringsAsFactors=FALSE) >> >> >> result &...

XML and RCurl: problem with encoding (htmlTreeParse)

2009 Dec 31

Hi, I'm trying to get data from web page and modify it in R. I have a problem with encoding. I'm not able to get encoding right in htmlTreeParse command. See below > library(RCurl) > library(XML) > > site <- getURL("http://www.aarresaari.net/jobboard/jobs.html") > txt <- readLines(tc <- textConnection(site)); close(tc) > txt <- htmlTreeParse(txt,

Extract Data from a Webpage

2008 Dec 17

Hi All: I would like to extract the provider name, address, and phone number from multiple webpages like this: http://oasasapps.oasas.state.ny.us/portal/pls/portal/oasasrep.providersearch.take_to_rpt?P1=3489&P2=11490 Based on searching R-help archives, it seems like the XML package might have something useful for this task. I can load the XML package and supply the url as an argument to

How to set cookies in RCurl

2012 Jun 07

...d read its content. The website is a restricted-access website that I access through a proxy server (which therefore requires me to enable cookies). I have problems getting RCurl to receive and send cookies. The following lines give me: library(RCurl) library(XML) url <- "http://www.theurl.com" content <- readHTMLTable(url) content $`NULL` V1 1...

Checking for monotonic sequence

2011 Nov 16

I am scraping data from a web page using XML (excellent package BTW - that's scraping data the easy way!). So far, I've got the code: tables <- readHTMLTable(theurl) rhf <- tables$tabResHistFull div1 <- rhf[which(rhf$V1=="Div ps"),] div1 which is giving me the result: V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 15 Div ps p 32.31 35.64 40.17 42.55 45.13 46.36+17.22 51.11 55.72 70.78 71.72 76....

detecting browser type?

2006 Apr 10

I'm wondering how I can detect the browser type for the client. I know this is possible, but I can't seem to find how to do this, nor any example code for it. I would appreciate it if someone could point me to some info or just give me an explanation. Thanks in advance.
