Dear members,
I am running a large scraping job on a very powerful AWS EC2 instance:

DATES <- getFirmsDates()

It iterates over 500 stocks from a website. Despite the power of the machine,
the execution is very slow.

If I abort the function (with Ctrl+C) after, say, the 150th iteration, the
DATES object will still contain the data scraped up to the 150th iteration,
right? (The remaining 350 entries will be NAs, I suppose.)

Many thanks in advance.
Yours sincerely,
AKSHAY M KULKARNI
This would be easy for you to test on a small example on your local computer. But the answer is "no". Nothing is assigned if the function does not return normally, and Ctrl+C is anything but normal.

On July 13, 2022 12:19:58 PM PDT, akshay kulkarni <akshay_e4 at hotmail.com> wrote:
>If I abort the function (by ctrl + C), after, say 150th iteration, the DATES object will still contain the scraped data until the 150th iteration, right?
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.
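For anyone who wants to try the suggested local test, here is a minimal sketch. It uses stop() as a stand-in for Ctrl+C, on the assumption that both unwind the call before the assignment can run; slowLoop and partialDates are made-up names for illustration, not anything from the original code.

```r
# Hypothetical slow function that aborts partway through, standing in for
# a scraper that gets interrupted (stop() is used here instead of Ctrl+C).
slowLoop <- function(n, failAt) {
  out <- rep(NA_integer_, n)
  for (i in seq_len(n)) {
    if (i == failAt) stop("simulated abort at iteration ", i)
    out[i] <- i  # progress lives only in this local vector
  }
  out
}

# try() lets the script continue after the simulated abort.
try(partialDates <- slowLoop(10, failAt = 4), silent = TRUE)

# slowLoop() never returned, so the assignment never ran and the
# three values collected inside the function are gone.
print(exists("partialDates"))
```

The partially filled vector exists only inside the function's frame, so it is discarded when the call is abandoned.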
You could write a function that returns an environment (or a list, if you
prefer) containing the results collected before the interrupt by using
tryCatch(interrupt=...). E.g.,

    doMany <- function(names) {
        resultEnv <- new.env(parent=emptyenv())
        tryCatch(
            # replace Sys.sleep(1) by getStuffFromWeb(name)
            for(name in names) resultEnv[[name]] <- Sys.sleep(1),
            interrupt = function(e) NULL)
        resultEnv
    }
Use it as

    > system.time(e <- doMany(state.name)) # hit Esc or ^C after a few seconds
    ^C   user  system elapsed
      0.001   0.000   4.390
    > names(e)
    [1] "Alabama"  "Alaska"   "Arizona"  "Arkansas"
    > eapply(e, identity)
    $Alabama
    NULL

    $Alaska
    NULL

    $Arizona
    NULL

    $Arkansas
    NULL

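If a list is preferred over an environment, the same idea might look like the sketch below. It pre-fills the result with NA so that unfinished entries are easy to spot afterwards, which is roughly what the original question expected. The names collectUntilInterrupt, keys and fetch are hypothetical, and getStuffFromWeb is only a placeholder for the real scraping call.

```r
# Placeholder for the real scraper; replace with the actual call.
getStuffFromWeb <- function(name) nchar(name)

collectUntilInterrupt <- function(keys, fetch = getStuffFromWeb) {
  # Pre-fill with NA so entries not reached before the interrupt stay NA.
  results <- rep(list(NA), length(keys))
  names(results) <- keys
  tryCatch(
    for (key in keys) {
      # Caveat: if fetch() returns NULL, this assignment drops the element.
      results[[key]] <- fetch(key)
    },
    # On Esc or Ctrl+C, stop looping and fall through to return `results`.
    interrupt = function(e) message("Interrupted; returning partial results")
  )
  results
}

# Runs to completion here; press Esc or Ctrl+C during a slow run to get
# the partially filled list back instead of losing everything.
partial <- collectUntilInterrupt(c("Alabama", "Alaska", "Arizona"))
str(partial)
```

The loop modifies results in the function's own frame, so whatever was stored before the interrupt is still there when the function returns normally.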
-Bill