Constantin Gavrilescu
2009-Mar-07 03:37 UTC
Parsing html files => putting them in fixtures for testing
I'm using the Hpricot parser to scrape web pages. I saved two of these pages for a test and, for lack of a better way, put the HTML files in the fixtures like this:

    dl_found_tickets:
      html: "<%= File.read('test/fixtures/html/search_dl_found_tickets.html').gsub('"', '\"') %>"
    [...]

Even though the crawl class works fine, the test fails, so something must be wrong with the fixture. I wrote a test that compares the HTML string loaded from the fixture with the one read directly from the file, and they're not the same. From what I can tell, the only difference is whitespace, from some end-of-line conversion, I guess.

This test fails:

    def test_html_fixtures
      assert_equal File.read('test/fixtures/html/search_plate_found_ticket.html').slice(0, 250),
                   crawls(:dl_found_tickets).html.slice(0, 250)
    end

    1) Failure:
    test_html_fixtures(CrawlTest)
        [test/unit/crawl_test.rb:16:in `test_html_fixtures'
         /usr/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/testing/setup_and_teardown.rb:60:in `__send__'
         /usr/lib/ruby/gems/1.8/gems/activesupport-2.2.2/lib/active_support/testing/setup_and_teardown.rb:60:in `run']:
    <"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0//EN\">\r\n<!--Bean tags and additional tags for use in this page.-->\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n\r\n<!--End of bean tags and additional tags for use in this page.-->\r\n<html>\r\n<head>\r\n<link rel=\"stylesheet\" type=\"text/css\" h"> expected but was
    <"<!DOCTYPE HTML PUBLIC \"-//W3C//DTD HTML 4.0//EN\"> <!--Bean tags and additional tags for use in this page.-->\n\n\n\n\n\n\n\n<!--End of bean tags and additional tags for use in this page.--> <html> <head> <link rel=\"stylesheet\" type=\"text/css\" href=\"css/style">.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group.
To post to this group, send email to rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org To unsubscribe from this group, send email to rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en -~----------~----~----~----~------~----~------~--~---
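The difference in the failure output above (single `\r\n` breaks turned into spaces, runs of nine breaks turned into eight `\n`) looks like the line folding that YAML applies to raw line breaks inside a double-quoted scalar. The following is only a minimal sketch of that effect, assuming a current Ruby whose `YAML` is Psych; the sample string and the helper name `load_html_from_naive_entry` are made up for illustration and are not from the original application.

```ruby
require 'yaml'

# Hypothetical helper: builds a fixture entry the way the ERB snippet does
# (double quotes escaped, raw line breaks left inside the quoted scalar)
# and returns the string the YAML loader hands back.
def load_html_from_naive_entry(html)
  escaped = html.gsub('"', '\"')
  YAML.load("html: \"#{escaped}\"")['html']
end

# Made-up stand-in for the saved page, with CRLF line endings.
sample = "<html>\r\n<head>\r\n\r\n</head>"

loaded = load_html_from_naive_entry(sample)
p loaded            # => "<html> <head>\n</head>"
p loaded == sample  # => false
```

A single line break becomes a space and two consecutive breaks become one `\n`, which is the same pattern as in the assertion message.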
You are right. It seems like an error that occurs after your YAML file is read; obviously some additional whitespace/carriage returns get added to it. But you could just gsub them.

Other than that, one question: why do you want to save an HTML string in a YAML file (and not just read your HTML file whenever you want to)?
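Both suggestions can be sketched roughly as follows. This is an illustration only: `normalize_whitespace` and `read_html_fixture` are hypothetical helper names, the default directory is taken from the paths in the original post, and `File.binread` assumes Ruby 1.9 or later. Collapsing whitespace also changes whitespace that may matter to the page (for example inside `<pre>`), so it is better suited to comparing strings in a test than to feeding the parser.

```ruby
# Collapse every run of whitespace (spaces, tabs, CR, LF) into one space,
# so two strings that differ only in whitespace compare equal.
def normalize_whitespace(text)
  text.gsub(/\s+/, ' ').strip
end

# Hypothetical helper: read a saved page straight from disk instead of
# routing it through the YAML fixture. binread avoids any line-ending
# translation by the platform.
def read_html_fixture(name, dir = 'test/fixtures/html')
  File.binread(File.join(dir, name))
end
```

In a test, the two strings could then be compared as `assert_equal normalize_whitespace(a), normalize_whitespace(b)`, or the HTML could be loaded with `read_html_fixture` and handed to the crawl class directly.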
Constantin Gavrilescu
2009-Mar-07 16:44 UTC
Re: Parsing html files => putting them in fixtures for testing
On 7 Mar, 07:39, MaD <mayer.domi...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
> you are right. it seems like an error that occurrs after reading your
> yml-file. obviously there are some additional whitespaces/carriage
> returns added to it. but you could just gsub them.

It may not be just the whitespace, because the parser gives different results on the YAML string and the File.read string. Isn't there a function to quote YAML strings? I searched for one and could not find it.

> but other than that, one question: why do you want to save a html
> string in a yml-file (and not just read your html file whenever you
> want to)?

I like it this way because I can just load the object from the fixture in my tests.
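On the question of quoting: one possible approach, offered as a sketch rather than a known Rails facility, is to let a JSON generator produce the quoted string, since a JSON string literal is in practice also readable as a YAML double-quoted scalar with `\r`, `\n` and `\"` written as escapes. The example assumes a current Ruby with the standard `json` library and Psych as the YAML loader; `quoted_fixture_entry` is a hypothetical name, and the behaviour on the Ruby 1.8 / Rails 2.2 setup from the thread has not been checked.

```ruby
require 'yaml'
require 'json'

# Hypothetical helper: build a one-line fixture entry whose value is a
# JSON string literal, so line breaks and quotes appear as escapes
# instead of raw characters.
def quoted_fixture_entry(html)
  "html: #{JSON.generate(html)}"
end

# Made-up sample with CRLF line endings and embedded double quotes.
sample = "<a href=\"x\">\r\n\r\n</a>"

entry = quoted_fixture_entry(sample)
p YAML.load(entry)['html'] == sample  # => true
```

In the fixture file this would correspond to something like `html: <%= JSON.generate(File.read('test/fixtures/html/search_dl_found_tickets.html')) %>`, without the surrounding double quotes and without the `gsub`.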