You can 'try' to read the URL first (x = try(readLines(...))), then
check inherits(x, "try-error") to see whether an error has occurred.
try() does not stop the rest of your code from being evaluated, even if
errors occur:
> for(i in 1:3){try(stop('error'))}
Error in try(stop("error")) : error
Error in try(stop("error")) : error
Error in try(stop("error")) : error
> i
[1] 3
> for(i in 1:3){(stop('error'))}
Error: error
> i
[1] 1
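To connect this with the "sleep 60 seconds and redo the same link" idea in the question, here is a minimal sketch of a retry loop built on try(). The function name read_with_retry and its arguments (reader, wait, max_tries) are made up for illustration and are not part of base R. The cap on attempts is my own assumption, so that a dead link cannot loop forever.

```r
# Hypothetical helper (not part of base R): retry a read until it succeeds
# or max_tries attempts have been made, sleeping `wait` seconds after each
# failed attempt.
read_with_retry <- function(u, reader = readLines, wait = 60, max_tries = 5) {
  for (attempt in seq_len(max_tries)) {
    # silent = TRUE suppresses the printed error message
    x <- try(reader(u), silent = TRUE)
    if (!inherits(x, "try-error")) return(x)   # success: return the result
    if (attempt < max_tries) Sys.sleep(wait)   # failure: wait, then retry
  }
  stop("giving up on ", u, " after ", max_tries, " attempts")
}

# Usage with the 'link' vector from the original post (not run here):
# ans <- vector("list", length(link))
# for (i in seq_along(link)) {
#   ans[[i]] <- read_with_retry(link[i])
#   Sys.sleep(8)
# }
```

The reader argument is only there so the function can be tried out with a fake reader instead of a real server. In normal use the default readLines should do.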
Regards,
Yihui
--
Yihui Xie <xieyihui at gmail.com>
Phone: 515-294-6609 Web: http://yihui.name
Department of Statistics, Iowa State University
3211 Snedecor Hall, Ames, IA
On Wed, May 5, 2010 at 3:34 AM, Wincent <ronggui.huang at gmail.com> wrote:
> Dear all, I want to download a large number of web pages.
> For example,
>
> ########
> link <- c("http://gzbbs.soufun.com/board/2811006802/",
> "http://gzbbs.soufun.com/board/2811328226/",
> "http://gzbbs.soufun.com/board/2811720258/",
> "http://gzbbs.soufun.com/board/2811495702/",
> "http://gzbbs.soufun.com/board/2811176022/",
> "http://gzbbs.soufun.com/board/2811866676/"
> )
> # the actual vector will be much longer.
>
> ans <- vector("list",length(link))
>
> for (i in seq_along(link)){
>   ans[[i]] <- readLines(url(link[i]))
>   Sys.sleep(8)
> }
> #######
>
> The problem is that the server will not respond if the retrievals happen
> too often, and I don't know the optimal time span between two
> retrievals.
> When the server does not respond, readLines signals an error and the
> loop stops. What I want to do is: when an error occurs, put R to sleep
> for, say, 60 seconds, and then redo the readLines on the same link.
>
> I did some searching and guess that withCallingHandlers and withRestarts
> will do the trick. Yet I didn't find many examples of their usage.
> Can you give me some suggestions? Thanks.
>
> --
> Wincent Rong-gui HUANG
> Doctoral Candidate
> Dept of Public and Social Administration
> City University of Hong Kong
> http://asrr.r-forge.r-project.org/rghuang.html
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>