Dear R users,
I am currently developing an R package (GTrendsR) that retrieves data from
Google Trends. To do so, I’m using the RCurl library. At this point
everything works perfectly (i.e. the data obtained from R is identical to
the data obtained directly from the web site). However, after 5-10 queries
I get a “quota excess limit” message. If I log in manually on the Google
Trends web site, it still works (i.e. no quota problems).
That makes me think the problem is related to the way I connect to Google
from R. More specifically, I suspect it has something to do with how I
configure the connection with curlSetOpt, in particular the cookie options.
I know it might not be obvious, but if someone has an idea :)
Here’s my code.
library(RCurl)    # getCurlHandle, curlSetOpt, getURL, postForm
library(stringr)  # str_extract, str_replace, ignore.case

gConnect = function(usr, psw)
{
  loginURL <- "https://accounts.google.com/accounts/ServiceLogin"
  authenticateURL <- "https://accounts.google.com/accounts/ServiceLoginAuth"

  ## Configure a single curl handle with a cookie file for the session
  ch <- getCurlHandle()
  curlSetOpt(curl = ch,
             ssl.verifypeer = FALSE,
             useragent = "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13",
             followlocation = TRUE,
             cookiejar = "./cookies",
             cookiefile = "./cookies")

  ## Google Account login: fetch the login page and extract the GALX token
  loginPage <- getURL(loginURL, curl = ch)
  galx.pattern <- ignore.case('name="GALX"\\s*value="([^"]+)"')
  galx.match <- str_extract(string = loginPage, pattern = galx.pattern)
  galx <- str_replace(string = galx.match, pattern = galx.pattern,
                      replacement = "\\1")

  ## Post the credentials together with the token on the same handle
  authenticatePage <- postForm(authenticateURL,
                               .params = list(Email = usr, Passwd = psw,
                                              GALX = galx),
                               curl = ch, .opts = list(verbose = FALSE))

  ## Return the handle so that later queries reuse the same session
  return(ch)
}
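
In case it helps anyone check the GALX extraction step without logging in,
here is a minimal sketch in base R. The helper name extractGalx and the HTML
fragment are made up for illustration (they are not part of GTrendsR, and the
real login page may look different); only the regular expression is the one
used in gConnect above.

```r
# Hypothetical helper (not part of GTrendsR): pull the hidden GALX token
# out of a login page, using base R only.
extractGalx <- function(page) {
  pattern <- 'name="GALX"\\s*value="([^"]+)"'
  m <- regmatches(page, regexec(pattern, page, ignore.case = TRUE))[[1]]
  # m[1] is the full match and m[2] the captured token;
  # m is empty when the pattern does not match
  if (length(m) < 2) NA_character_ else m[2]
}

# Made-up fragment standing in for the real login page
samplePage <- '<input type="hidden" name="GALX" value="abc123">'
galx <- extractGalx(samplePage)
print(galx)

# A page without the field gives NA rather than an error
missing <- extractGalx("<html><body>no token here</body></html>")
```

If extractGalx returns NA on the real page, the POST would go out without a
valid token, which would be one thing to rule out before looking at cookies.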
With regards,
Phil
--
Philippe Massicotte, Ph. D.
Postdoctoral Research Fellow
Université du Québec à Trois-Rivières (UQTR)
Département de Chimie-Biologie
Centre de Recherche sur les Interactions Bassins Versants- Écosystèmes
aquatiques (RIVE)
Pavillon Léon-Provancher Local 3413
3351, boul. des Forges CP 500
Trois-Rivières (QC) G9A 5H7
CANADA
Tel: (819) 376-5011 #3402
Fax: (819) 376-5084
Email: philippe.massicotte@uqtr.ca
Web site: <http://anotherrblog.blogspot.ca/>