thr3ads.net - similar to: "Web scraping different levels of a website"

Displaying 20 results from an estimated 100 matches similar to: "Web scraping different levels of a website"

Web scraping different levels of a website

2018 Jan 18

I am web scraping a page at <http://catalog.ihsn.org/index.php/catalog#_r=&collection=&country=&dtype=&from=1890&page=1&ps=100&sid=&sk=&sort_by=nation&sort_order=&to=2017&topic=&view=s&vk=>. From this URL, I have built up a dataframe using the following code:
dflist <- map(.x = 1:417, .f = function(x) {
  Sys.sleep(5)
  url <-

Web scraping different levels of a website

2018 Jan 18

Hey Ilio, On the main website (the first link that you provided), if you right-click on the title of any entry and select Inspect Element from the menu, you will notice in the Developer Tools view that opens that the corresponding HTML looks like this (example for the same link that you provided): <div class="survey-row"

Scraping from different level URLs website

2018 Jan 23

I am doing research on World Bank (WB) projects in developing countries. To do so, I am scraping their website to collect the data I am interested in. The structure of the webpage I want to scrape is the following:
1. List of countries: the list of all countries in which WB has developed projects <http://projects.worldbank.org/country?lang=en&page=>
1.1. By clicking on a

Our Sympathies

2001 Sep 14

The following is a message to be sent to the President of the United States of America. Although we may not be able to do a great deal from where we are, for the people of America just knowing we care and feel their sadness will help. Please put your name on the following list and send it to all you know and who care. If you are the 100th name, and every 100th thereon, could you please also

Function to recognise convert dates between gregorian and other calendars (e.g. Persian)?

2009 Jan 07

Dear list, I will shortly have some data that contains numeric dates in the Persian / Jalali calendar format, which I would like to convert to gregorian. At the moment there doesn't seem to be a function for this in R, but it would be great if someone could come up with same - I would attempt it but the algorithm is very complex and this is also way beyond my fairly rudimentary knowledge of

rvest

2016 Dec 06

Dear all, It has been a while since I last used rvest. I ran some old code and it works without problems, but now that I am writing new code there is something I have forgotten. Basically, I select the XPath in the web browser, copy it and paste it into R, but I get the following error.
> text <- Pagina.R %>%
+   html_nodes(xpath='//*[@id="content"]/p') %>%
+   html_text()
>

Dataverse (reading files with .tab and .7z suffixes)

2018 May 13

Ilio Fornasero writes:
> Yet, I am at this point.
>
> ## 01. Finding the dataverse server and making a search
> Sys.setenv("DATAVERSE_SERVER" = "dataverse.harvard.edu")
> dataverse_search(".Hunger")
>
> ## 02. Loading the dataset (in this example, I have chosen the word ".Hunger" to get
> # one list and

Instructions for using rvest

2015 Dec 23

Hello, good morning: I am sending you a question (in a Word document, to explain it better) about using the rvest library. My problem is that, since I am not a computer scientist, I get a bit lost. I have looked at the posted examples and followed them, but the thing is that I want to access data from the INE, which is sometimes somewhat hidden behind selection menus, and I do not know how to do that with rvest in order to

Filter data

2010 Dec 02

Hello, I understand that this question is probably stupid, but ... I have data (Polity IV index):
"country","year","democ","autoc","polity","polity2"
"1","Afghanistan ",1800,1,7,-6,-6
"2","Afghanistan ",1801,1,7,-6,-6
"3","Afghanistan

Dataverse

2018 May 13

Hello. I am trying to find a way to retrieve data from the Harvard Dataverse website. I usually don't have problems web-scraping data, but the problem here is that there are a bunch of data formats, such as .tab, .7z and so on, and I just can't find a way to retrieve the data I am interested in with a single solution. Any hint?

write.csv converts Åland to <c5>land

2020 Oct 20

Hi there, I tried to export country names to a CSV file with write.csv(). In the resulting file, Åland was converted to <c5>land. Is there any way to prevent this from happening? Thanks!
> abc
[1] "Åland"
> write.table(abc, file = "")
"x"
"1" "<c5>land"
Best, Jinsong

Convert a list of $NULL into multiple dataframes

2018 May 18

I have the following list:
> tables
$`NULL`
    V1   V2   V3
1 Year 1992 1993

$`NULL`
     V1           V2   V3   V4
1   Age Average (cm)    N   SD
2 18-19        178.3 6309 6.39

I want to turn it

Changing the data format

2019 Feb 19

After "gather()" you can call "arrange()", which sorts the rows. Inside "arrange()" you specify the variable to sort by (no quotes needed)... It will sort alphabetically. Regards, Carlos Ortega www.qualityexcellence.es On Tue, 19 Feb 2019 at 13:47, Antonio Rodriguez Andres (<antoniorodriguezandres70 at gmail.com>) wrote:

write.csv converts Åland to <c5>land

2020 Oct 20

Hi there, Why is the same string displayed in different forms?
> abc[,1]
[1] "Åland"       "Afghanistan"
> abc
         name
1    <c5>land
2 Afghanistan
And more...
> dput(abc, "aa.txt")
> dget("aa.txt")
         name
1    <c5>land
2 Afghanistan
> dget("aa.txt")[,1]
[1] "<c5>land"

Merging rows in a dataframe

2011 Jun 16

Hi R Help list, I'm looking to visualise US foreign aid 1946-2009 and I have the dataset for this. The trouble is that it's a bit too complex and I need to simplify it. I want to merge all of the rows with the same country together and add up the individual totals to make one total figure per country per year. Below is an example of the kind of data. The real dataset has 2447 rows and covers 63

read.delim problem with trailing spaces

2004 Oct 06

I'm trying to read a comma delimited dataset that uses '.' for NA. I found that if the last field on a line was a missing '.' it was not read as NA, but just a '.', and the life variable was made a factor. The data looks like this,
income,imr,region,oilexprt,imr80,gnp80,life
Afghanistan,75,400.0,4,0,185.0,.,37.5
Algeria,400,86.3,2,1,20.5,1920,50.7

Changing the data format

2019 Feb 19

> gather(pobla, key = year, value = totpop, year60:year63)
      Country   year   totpop
1 Afghanistan year60  8996351
2     Albania year60  1608800
3     Algeria year60 11124888
4     Andorra year60    13411
Thanks, Carlos.
Antonio
On Tue, 19 Feb 2019 at 12:54, Carlos Ortega <cof at qualityexcellence.es> wrote:
> Yes, there are several ways.
>
> Look at the function

write.csv converts Åland to <c5>land

2020 Oct 20

It looks like an encoding problem. It works fine for me with the R encoding set to UTF-8. Here is part of my sessionInfo() output:
[1] LC_CTYPE=en_CA.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_CA.UTF-8        LC_COLLATE=en_CA.UTF-8
[5] LC_MONETARY=en_CA.UTF-8    LC_MESSAGES=en_CA.UTF-8
I would suggest issuing the command sessionInfo() and seeing what your encoding is. On Tue, 20 Oct 2020 at 08:22,

write.csv converts Åland to <c5>land

2020 Oct 20

You don't say, but I'd guess you're using Windows. In your code page, the character Å is probably not representable. At some point in the sequence of operations involved in printing the dataframe, R puts the string into the native encoding, and since that's impossible on your system, it substitutes the <c5> instead. The fact that you can sometimes display it is because

diff() for panel data

2008 Sep 21

Hello, everyone! I'd like to find out how I can do first log differences in a panel? (The Penn World Table data that's available in the PWT package) The regular diff() function ignores the country/index/"panel unit", with depressing results. A second request, how can I best "filter" the data (e.g. generate a data frame with the data for a single country or a single
