thr3ads.net - search: "isurl"

2005 May 02

2

"Special" characters in URI

...rjanc g[au]" R> tmp$URL <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]" R> tmp $term [1] "gorjanc g[au]" $URL [1] "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au]" R> xmlTreeParse(tmp$URL, isURL=TRUE, handlers=NULL, asTree=TRUE) Error in xmlTreeParse(tmp$URL, isURL = TRUE, handlers = NULL, asTree = TRUE) : error in creating parser for http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term=gorjanc g[au] # so I have a problem with space and [ and ] # let's reduce a prob...
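The poster attributes the parser error to the space and the square brackets in the query string. A minimal sketch of one possible workaround, assuming base R's `utils::URLencode` is acceptable here: percent-encode only the search term, then append it to the base URL. The names `base`, `term`, `encoded` and `url` are illustrative and do not come from the original post.

```r
# Percent-encode only the query value, not the whole URL.
base <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?term="
term <- "gorjanc g[au]"

# reserved = TRUE also encodes reserved characters such as "[" and "]";
# the space is encoded in either case.
encoded <- utils::URLencode(term, reserved = TRUE)
url <- paste0(base, encoded)
print(url)
```

The encoded `url` contains no literal space or bracket, so it could then be passed to `xmlTreeParse(url, isURL = TRUE)` in place of `tmp$URL`.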

Analyzing Publications from Pubmed via XML

2007 Dec 14

6

...he search provides data but obviously not as an R dataframe. I can get the search to export the data as an xml feed and the xml package seems to be able to read it. xmlTreeParse(" http://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=0_JYbpsax0ZAAPnOd7nFAX-29fXDpTk5t8M4hx9ytT- ",isURL=TRUE) But getting from there to a dataframe in which one column would be the name of the journal and another column would be the year (to keep things simple) seems to be beyond my capabilities. Has anyone ever done this and could you share your script? Are there any published examples where the e...
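One way the journal/year data frame asked for here might be built, sketched under loud assumptions: the XML package is installed, and the toy document below stands in for the real export. Its element names (`PubmedArticle`, `Journal`, `Title`, `Year`) are placeholders, and the actual feed may be structured differently. The helper `first_or_na` is hypothetical.

```r
library(XML)  # assumes the XML package is installed

# Toy stand-in for a PubMed export; real element names may differ.
txt <- '<PubmedArticleSet>
  <PubmedArticle><Journal><Title>Journal A</Title><Year>2005</Year></Journal></PubmedArticle>
  <PubmedArticle><Journal><Title>Journal B</Title><Year>2007</Year></Journal></PubmedArticle>
</PubmedArticleSet>'

doc <- xmlParse(txt, asText = TRUE)

# Hypothetical helper: first XPath match under a node, or NA when absent,
# so that rows stay aligned even if a record lacks a field.
first_or_na <- function(node, path) {
  hits <- xpathSApply(node, path, xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# One row per article node.
articles <- getNodeSet(doc, "//PubmedArticle")
pubs <- data.frame(
  journal = sapply(articles, first_or_na, path = ".//Journal/Title"),
  year    = as.integer(sapply(articles, first_or_na, path = ".//Year")),
  stringsAsFactors = FALSE
)
print(pubs)
```

Looking fields up per article, rather than across the whole document at once, keeps each journal paired with its own year.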

Minor "bug" in source()

2005 Jul 19

1

...- .Internal(parse(file, n = -1, NULL, "?"))) if (verbose) cat("--> parsed", Ne, "expressions; now eval(.)ing them:\n") if (Ne == 0) return(invisible()) if (chdir && is.character(ofile)) { <=== HERE isURL <- length(grep("^(ftp|http|file)://", ofile)) > 0 if (!isURL && (path <- dirname(ofile)) != ".") { owd <- getwd() on.exit(setwd(owd), add = TRUE) setwd(path) } } <snip>...
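The `isURL` test quoted in this snippet is a plain regular-expression check on the start of the path. A small sketch of how that pattern behaves, using base R only. The function name `is_url` and the sample paths are made up for illustration.

```r
# Same pattern as in the quoted source() code: does the path start
# with one of the listed schemes followed by "://"?
is_url <- function(path) grepl("^(ftp|http|file)://", path)

paths <- c(
  "http://example.org/script.R",  # hypothetical remote script
  "file:///tmp/script.R",         # hypothetical file URL
  "scripts/analysis.R"            # ordinary relative path
)
result <- is_url(paths)
print(result)
```

Only paths that fail this check are candidates for the `setwd(dirname(ofile))` step shown in the snippet.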

Extract just some fields from XML

2005 May 08

2

...ic fields from an XML document and I am totally puzzled. I hope someone can help me. # URL URL<-"http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=11877539,11822933,11871444&retmode=xml&rettype=citation" # download a XML file tmp <- xmlTreeParse(URL, isURL = TRUE) tmp <- xmlRoot(tmp) Now I want to extract only node 'pubdate' and its children, but I don't know how to do that unless I try to dig into the structure of the XML file. The problem is that structure can differ and then hardcoded set of list indices i.e. tmp[[i]][[j]]... doesn...
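A hedged sketch of one approach to the problem described here: an XPath query that starts with `//` matches a node by name at any depth, so no hardcoded `tmp[[i]][[j]]` indices are needed. It assumes the XML package is installed and the document is parsed into internal nodes. The inline document is a toy stand-in, and the element name `PubDate` is an assumption about what the real file contains.

```r
library(XML)  # assumes the XML package is installed

# Toy stand-in; the nesting depth is arbitrary on purpose.
txt <- '<PubmedArticleSet>
  <PubmedArticle><Article><Journal><JournalIssue>
    <PubDate><Year>2002</Year><Month>Mar</Month></PubDate>
  </JournalIssue></Journal></Article></PubmedArticle>
</PubmedArticleSet>'

doc <- xmlParse(txt, asText = TRUE)

# "//PubDate" finds every PubDate node wherever it sits in the tree.
pubdates <- getNodeSet(doc, "//PubDate")

# Pull child values directly with a longer path.
years  <- xpathSApply(doc, "//PubDate/Year", xmlValue)
months <- xpathSApply(doc, "//PubDate/Month", xmlValue)
print(years)
print(months)
```

Each element of `pubdates` is the node together with its children, so it can be inspected further with `xmlChildren` or `xmlValue`.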

htmlParse (from XML library) working sporadically in the same code

2013 Mar 20

1

...TTP resource Error is coming from the following line in htmlParse code: ans <- .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), as.logical(addAttributeNamespaces), as.logical(useInternalNodes), as.logical(isHTML), as.logical(isSchema), as.logical(fullNamespaceInfo), as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, as.integer(options), PACKAGE = "XML") By the way, readHTMLTable(h...

Stuck ...can't get sapply and xmlTreeParse working

2011 Jul 05

2

...function(x) { url.zill <-paste("http://www.zillow.com/webservice/GetDeepSearchResults.htm?zws-id=X1-ZWz1bup03e49vv_5kvb6&address=",x, sep="") ############## problem line is next ################################# zdoc <-xmlTreeParse(url.zill, useInternalNode=TRUE, isURL=TRUE) ############# problem line above ################################## f$zpid <- sapply(getNodeSet(zdoc, "//result/zpid"), xmlValue) f$zest.low <-sapply(getNodeSet(zdoc, "//valuationRange/low"), xmlValue) f$zest <- sapply(getNodeSet(zdoc, "//zestimate/a...

[Fwd: Extract just some fields from XML]

2005 May 10

0

...t and I am totally > puzzled. I hope someone can help me. > > # URL > URL<-"http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&id=11877539,11822933,11871444&retmode=xml&rettype=citation" > # download a XML file > tmp <- xmlTreeParse(URL, isURL = TRUE) > tmp <- xmlRoot(tmp) > > Now I want to extract only node 'pubdate' and its children, but I don't > know how to do that unless I try to dig into the structure of the XML > file. The problem is that structure can differ and then hardcoded set > of list indice...

RSXML - Parsing XML Documents on Internet

2004 Sep 29

2

R Users - I asked about this a few months ago and never did quite figure it out, so with more information, allow me to try again. If I use the following code: library(xml) xmlTreeParse("http://home.comcast.net/~larsenmtl/xmlTestDoc.xml", isURL = TRUE) I receive this error: Error in xmlTreeParse("http://home.comcast.net/~larsenmtl/xmlTestDoc.xml"", : error in creating parser for http://home.comcast.net/~larsenmtl/xmlTestDoc.xml" Now I know that xmlTreeParse uses the libxml facilities for downloading and pa...

Need help reading website info with XML package and XPath

2011 May 30

1

...getNodeSet needs something else to get to information contained in a bubble on a webpage. Any suggestions or ideas would be GREATLY appreciated. library(XML) url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb" doc <- htmlTreeParse(url, useInternalNode=TRUE, isURL=TRUE) f1 <- getNodeSet(doc, "//a[contains(@href,'homedetails')]") f2 <- getNodeSet(doc, "//span[contains(@class,'price')]") f3 <- getNodeSet(doc, "//LIST[@Beds]") f4 <- getNodeSet(doc, "//LIST[@Baths]") f5 <- getNodeSet(doc, "...

XML segfault on some architectures

2011 Jun 07

1

...ault *** address 0x500001c4f, cause 'memory not mapped' Traceback: 1: .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), as.logical(addAttributeNamespaces), as.logical(useInternalNodes), FALSE, as.logical(isSchema), as.logical(fullNamespaceInfo), as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, PACKAGE = "XML") 2: xmlTreeParse(fileName) Possible actions: 1:...

newbie xml parsing question

2011 May 28

1

...le to load the web page I'm interested with the following code but I'm not sure of the next step to get the information I'm interested in into R : library(XML) url <- "http://www.zillow.com/homes/511 W Lafayette St, Norristown, PA_rb" doc <-doc <- htmlTreeParse(url1, isURL=TRUE) doc I'd like to be able to pull the following information into R href home details string : /homedetails/236-Arundel-Ave-Horsham-PA-19044/9933810_zpid/#{scid=hdp-site-map-bubble-address} value for Zestimate \ Price: $239,000 Beds : 3 Baths: 1.0 Sqft :1630 I noticed all that informa...

KEGGSOAP installation error

2012 May 15

1

...ault *** address 0x500001c4f, cause 'memory not mapped' Traceback: 1: .Call("RS_XML_ParseTree", as.character(file), handlers, as.logical(ignoreBlanks), as.logical(replaceEntities), as.logical(asText), as.logical(trim), as.logical(validate), as.logical(getDTD), as.logical(isURL), as.logical(addAttributeNamespaces), as.logical(useInternalNodes), FALSE, as.logical(isSchema), as.logical(fullNamespaceInfo), as.character(encoding), as.logical(useDotNames), xinclude, error, addFinalizer, PACKAGE = "XML") 2: xmlParse(url) 3: parseSchemaDoc(fileName)...

Problems with package XML

2002 May 08

0

I'm having some difficulties with the package XML. Namely, issuing the following commands: > library(XML) > hp <- htmlTreeParse('http://www.liacc.up.pt/~ltorgo/index.html',isURL=T) I get a flood of messages like this : Save workspace image? [y/n/c]: readline: warning: rl_prep_terminal: cannot get terminal settings My system is: > version _ platform i686-pc-linux-gnu arch i686 os linux-gnu system i686, linux-gn...
