Displaying 20 results from an estimated 1300 matches similar to: "Scrapping Content from a website"
2011 Dec 05
12
Using nokogiri
Hi,
I want to grab some information about university names, and I found
this term called "web scraping".
I searched for it on Google, and there are tools for it in Ruby.
One of them is Nokogiri, but I'm a bit confused because it seems that
it only gets information that is already in an HTML or XML
I found a webpage that has a list of university names as a
<select>
2010 Jan 25
4
Does Amazon.com block scraping?
Hi there
Does anyone know if Amazon.com has any sort of server-side script that tries
to block scraping activities? I first noticed that if I didn't change the
agent alias, it would fetch a page exactly like the normal one, but without
the initial search field (maybe a silly way to prevent scraping). Then, after
that, I changed to some other alias and submitted a search. I got the result
page as
2009 Sep 17
1
Load Error Using Mechanize Gem
Hi,
I'm getting a "Could not open any of [xml2, xslt, exslt] (LoadError)" error when trying to run a simple Ruby program taken from the EXAMPLES.rdoc file of the Mechanize gem.
The error is in this line of the Nokogiri module of libxml.rb: ffi_lib 'xml2', 'xslt', 'exslt'
Not sure if there are missing gems, and if so,
2010 Jan 08
7
input form fields not in the #<WWW::Mechanize::Form array
Hi
This may be a dumb question with an obvious answer.
It would seem that an input form field identified with an 'id' attribute
and not with a 'name' attribute is not recognised by Mechanize - at
least it isn't in the form field list.
Is there any way of getting at these elements, or am I, as I suspect,
fresh out of luck? But you never know ...
2009 Mar 11
0
problem scrapping ATnT site (Matt White)
Try using Firebug to help you find these changes. I have never used the AT&T
website, but you may need to log in and find the download URL using Firebug,
as I did:
http://zenmachine.wordpress.com/2007/11/11/scraping-with-firebug-and-wwwmechanize/
regards,
gm
On Tue, Mar 10, 2009 at 4:12 PM, <mechanize-users-request at rubyforge.org>wrote:
> Send Mechanize-users mailing list submissions
2009 May 05
3
Only partially reading a page!
I am trying to get a page which includes a form, but the form is
missing from the WWW::Mechanize::Page object. I retrieve it via:
page = web_agent.submit(a_different_form)
For debugging this problem, I then immediately write the resulting
page to two different logs:
File.open('big.html', 'wb') { |f| f.write(page.body) }
2010 Jun 03
3
issue submitting a form
Hi. Recently I started rebuilding my old Mechanize script, which I
used to automatically log in to a certain site and retrieve files from
it. The old version worked great; however, after I updated, it
started complaining. Here's the log of the error:
/Users/lukastolyarov/.gem/ruby/1.8/gems/mechanize-1.0.0/lib/mechanize/form/field.rb:30:in
`<=>': undefined method
2012 Sep 12
7
multinomial MCMCglmm
Dear all,
I would like to add mixed effects in a multinomial model and I am trying
to use MCMCglmm for that.
The main problem I face: my data set consists of a trapping data set,
where the observations at each trap (1 or 0 for each species) have been
aggregated per trapline. Therefore we have a proportion of
presence/absence for each species per trapline.
ex:
ID_line mesh habitat Apsy Mygl
2009 Jul 26
3
Failed to build gem native extension
On Sat, Jul 25, 2009 at 9:14 PM, Jeffrey
Roberts<jeffrey.l.roberts at gmail.com> wrote:
> Hello all, I have looked up and down on google for a solution to this going
> on several days now, I am really hoping someone here can help me out.
>
> I have all my deps in order, and I believe the error is that it is looking
> in /usr/lib when it should be looking in /usr/lib64, I am
2009 Apr 18
1
[PATCH] When user provides an encoding then use it, otherwise autodetect it
Hi,
I've created a small patch to mechanize.
The problem:
The user should be able to set his own page encoding, for example when
<meta /> encoding information is invalid or auto detection
routines fail to guess correctly.
Actual results:
Mechanize will always use the encoding detected by it (or Nokogiri/libxml2)
when there's a <meta /> with encoding information
2007 Sep 14
1
Unable to scrap gmail.com - EOFError: End of file reached
Hi all,
I am so excited to use mechanize! It has opened a whole new world of
projects for me :)
I am trying to log in to the Gmail.com server, as described in
http://schf.uc.org/articles/2007/02/14/scraping-gmail-with-mechanize-and-hpricot
but am running into a few issues...
irb(main):010:0> page = agent.submit form
EOFError: end of file reached
from
2009 Dec 23
4
html parser / assertions in a model
I am using http.get in a model to parse html code returned from an Oracle
server.
My first try was to use assertions (assert and assert_select) to test
and parse the html code. But I have problems including the methods in
the model. Have tried both "include" and copy/paste to get assertion
methods into my model. Works as a model, but I get load errors when I am
using the model from a
2013 Jan 07
4
JSON::ParserError in controller
Hi All
I'm trying to build an application which needs to scrape information
from a webpage. On trying to perform the action, I get an error while
trying to convert the html data to JSON. Has anyone experienced this
before and, if so, can you please tell me how to solve this problem?
Please see below for code snippet and error log.
Thanks in advance
Anush
require
2011 Mar 27
2
LinkedIn still not working?
To clarify the issue I posted earlier:
I am on OS X 10.6.7. Trying to use Mechanize to log in to LinkedIn. As others have posted about in the past, when I submit the form it kicks me back to the LinkedIn landing page, and does not log me in.
I read the earlier discussion of the issue that mentioned how cookie values were being improperly dequoted when stored. But I thought that issue was fixed.
2013 Feb 08
0
R 3 scraping regular expressions
Hello everyone:
Today I read about version 3 of R and, out of curiosity, ended up at:
http://developer.r-project.org/
but I cannot comment on what version 3 improves for us; it would not be
responsible of me.
On that same site, under the title "A Regular Expressions Package For R",
among the old material, I was reminded of the recent questions to this
list about regular expressions, and within
2009 Oct 21
4
XML file using Nokogiri gem
Hello friends,
Can you guys give me some idea about how to create an XML file using
the Nokogiri gem?
--
Posted via http://www.ruby-forum.com/.
2009 Oct 13
9
Nokogiri: to_s WITHOUT html surrounding's tags?
Hi all
n = Nokogiri::HTML("<h1>H1</h1>")
n.to_s
# => <!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.0 Transitional//EN\"
\"http://www.w3.org/TR/REC-html40/loose.dtd\">\n<html><body><h1>H1</h1></body></html>
Is there a method that only outputs the stuff I've read, and not the
whole valid XHTML stuff?
2013 Feb 01
4
Scraping with R
Good afternoon everyone:
Does any of you know whether it is possible in R to search for a word on
a website (for example, search for "Alicante" on www.lasprovincias.es) and,
each time it is found, store the URLs in a data.frame?
Thanks in advance!
--
Beatriz Martínez
2013 Sep 02
2
Why the string interpolation is not working inside the Nokogiri method `#search` ?
Why is string interpolation not working inside the Nokogiri method
`#search`?
require 'nokogiri'
doc = Nokogiri::HTML::Document.parse <<-eotl
<div>
<p>foo</p>
<p>foo</p>
<p>bar</p>
</div>
eotl
doc.class # => Nokogiri::HTML::Document
class Person
attr_accessor :name
end
ram = Person.new
2011 Nov 17
2
please suggest a web page automation tool
I've been trying to get a couple of web automation tools going today
and having a hard time. Seems like everything I try is outdated and
won't run on my environment. Mechanize seemed like a good one but it
doesn't want to run:
require 'rubygems'
require 'open-uri'
require 'mechanize'
agent = Mechanize.new