thr3ads.net - Rails - Help with regex needed [Oct 2008]


Kim

2008-Oct-22 04:14 UTC

Help with regex needed

Hi here is the array I am scanning:
["\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
frameset~2489041&FF=rwr+121&1,1,\">The Academic Writer: A Brief
Guide</
a>\n</td>\n<td >\n&nbsp;Ede, Lisa\n</td>\n\n<td
>\n&nbsp;Valley
Reserves -- VR 282  -- AVAILABLE\n</td>\n\n<td
>\n&nbsp;\n</td>\n\n</
tr>\n<tr>\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
frameset~1334646&FF=rwr+121&1,1,\">Cultural literacy : what
every
American needs to know / E.D. Hirsch, Jr. ; with an appendix, What li</
a>\n</td>\n<td >\n&nbsp;Hirsch, E. D. (Eric Donald),
1928-\n</td>\n
\n<td >\n&nbsp;Valley Reserves -- LC149 .H57 1987  --
AVAILABLE\n</td>
\n\n<td >\n&nbsp;\n</td>]

I am trying to pull out the essential (everything but the newlines and
such) value in between the <td></td>.

Here is the regex I am trying:
s.first.scan(/\<td \>(.*?)\<\/td\>/mi)
But I don't get the first <td> a href value.

Any help would be appreciated. Kim
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups
"Ruby on Rails: Talk" group.
To post to this group, send email to
rubyonrails-talk-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To unsubscribe from this group, send email to
rubyonrails-talk+unsubscribe-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
For more options, visit this group at
http://groups.google.com/group/rubyonrails-talk?hl=en
-~----------~----~----~----~------~----~------~--~---

Kim

2008-Oct-22 04:38 UTC

Help with regex needed

Here is the array I am scanning:
["\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
frameset~2489041&FF=rwr+121&1,1,\">The Academic Writer: A Brief
Guide</
a>\n</td>\n<td >\n&nbsp;Ede, Lisa\n</td>\n\n<td
>\n&nbsp;Valley
Reserves -- VR 282  -- AVAILABLE\n</td>\n\n<td
>\n&nbsp;\n</td>\n\n</
tr>\n<tr>\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
frameset~1334646&FF=rwr+121&1,1,\">Cultural literacy : what
every
American needs to know / E.D. Hirsch, Jr. ; with an appendix, What li</
a>\n</td>\n<td >\n&nbsp;Hirsch, E. D. (Eric Donald),
1928-\n</td>\n
\n<td >\n&nbsp;Valley Reserves -- LC149 .H57 1987  --
AVAILABLE\n</td>
\n\n<td >\n&nbsp;\n</td>]

I am trying to get the values (all but newlines and such) out from in
between the <td> </td>

Tried this :
s.first.scan(/\<td \>(.*?)\<\/td\>/mi)
But I never get the first <td> a href values.

Any help is appreciated. Thanks. Kim

Mukund

2008-Oct-22 05:10 UTC

Re: Help with regex needed

Use the Hpricot library to handle HTML parsing.

On Oct 22, 9:14 am, Kim
<Kim.Gri...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:
> Hi here is the array I am scanning:
> ["\n<td> <a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
> frameset~2489041&FF=rwr+121&1,1,\">The Academic Writer: A
Brief Guide</
> a>\n</td>\n<td >\n Ede, Lisa\n</td>\n\n<td
>\n Valley
> Reserves -- VR 282  -- AVAILABLE\n</td>\n\n<td
>\n \n</td>\n\n</
> tr>\n<tr>\n<td> <a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
> frameset~1334646&FF=rwr+121&1,1,\">Cultural literacy : what
every
> American needs to know / E.D. Hirsch, Jr. ; with an appendix, What li</
> a>\n</td>\n<td >\n Hirsch, E. D. (Eric Donald),
1928-\n</td>\n
> \n<td >\n Valley Reserves -- LC149 .H57 1987  --
AVAILABLE\n</td>
> \n\n<td >\n \n</td>]
>
> I am trying to pull out the essential (everything but the newlines and
> such) value in between the <td></td>.
>
> Here is the regex I am trying:
> s.first.scan(/\<td \>(.*?)\<\/td\>/mi)
> But I don't get the first <td> a href value.
>
> Any help would be appreciated. Kim

Rob Biedenharn

2008-Oct-22 05:19 UTC

Re: Help with regex needed

On Oct 22, 2008, at 12:38 AM, Kim wrote:
> Here is the array I am scanning:
> ["\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
> frameset~2489041&FF=rwr+121&1,1,\">The Academic Writer: A
Brief
> Guide</
> a>\n</td>\n<td >\n&nbsp;Ede, Lisa\n</td>\n\n<td
>\n&nbsp;Valley
> Reserves -- VR 282  -- AVAILABLE\n</td>\n\n<td
>\n&nbsp;\n</td>\n\n</
> tr>\n<tr>\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
> frameset~1334646&FF=rwr+121&1,1,\">Cultural literacy : what
every
> American needs to know / E.D. Hirsch, Jr. ; with an appendix, What  
> li</
> a>\n</td>\n<td >\n&nbsp;Hirsch, E. D. (Eric Donald),
1928-\n</td>\n
> \n<td >\n&nbsp;Valley Reserves -- LC149 .H57 1987  --
AVAILABLE\n</td>
> \n\n<td >\n&nbsp;\n</td>]
>
> I am trying to get the values (all but newlines and such) out from in
> between the <td> </td>
>
> Tried this :
> s.first.scan(/\<td \>(.*?)\<\/td\>/mi)
> But I never get the first <td> a href values.
>
> Any help is appreciated. Thanks. Kim
Because the first <td> is not a <td >, of course.  Your regexp looks
for a literal space before the closing >:
  /\<td \>
       ^
Did you mean something like %r{<td\b[^>]*>(.*?)</td>}mi ?

Note that a %r{} regexp literal is more convenient when the pattern
contains a slash, because the slash does not need escaping.

-Rob

Rob Biedenharn		http://agileconsultingllc.com
Rob-xa9cJyRlE0mWcWVYNo9pwxS2lgjeYSpx@public.gmane.org
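
The difference between the two patterns can be checked with a small sketch like the one below. It is only an illustration: SAMPLE_ROW is a shortened, hypothetical stand-in for the first element of the array in the question, and the constant and helper names are made up for this example.

```ruby
# Hypothetical, shortened stand-in for the first element of the array in the question.
SAMPLE_ROW = "\n<td>&nbsp;<a href=\"/search\">The Academic Writer</a>\n</td>\n" \
             "<td >\n&nbsp;Ede, Lisa\n</td>\n"

# The pattern from the question: only matches an opening tag written exactly as "<td >".
STRICT_TD = /\<td \>(.*?)\<\/td\>/mi

# The suggested pattern: \b ends the tag name and [^>]* skips any whitespace
# or attributes before the closing ">".
LOOSE_TD = %r{<td\b[^>]*>(.*?)</td>}mi

# Returns the raw inner text of each cell the pattern matches (illustrative helper).
def td_cells(html, pattern)
  html.scan(pattern).flatten
end

puts td_cells(SAMPLE_ROW, STRICT_TD).length  # the "<td>" cell is skipped
puts td_cells(SAMPLE_ROW, LOOSE_TD).length   # both cells are found
```

With this sample the strict pattern finds only the "<td >" cell, while the looser one also picks up the "<td>" cell holding the link.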




Mark Thomas

2008-Oct-22 13:26 UTC

Re: Help with regex needed

On Oct 22, 12:14 am, Kim
<Kim.Gri...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
wrote:
> Hi here is the array I am scanning:
> ["\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
> frameset~2489041&FF=rwr+121&1,1,\">The Academic Writer: A
Brief Guide</
> a>\n</td>\n<td >\n&nbsp;Ede, Lisa\n</td>\n\n<td
>\n&nbsp;Valley
> Reserves -- VR 282  -- AVAILABLE\n</td>\n\n<td
>\n&nbsp;\n</td>\n\n</
> tr>\n<tr>\n<td>&nbsp;<a
href=\"/search~S13?/rWR%20121/rwr+121/1,7,9,B/
> frameset~1334646&FF=rwr+121&1,1,\">Cultural literacy : what
every
> American needs to know / E.D. Hirsch, Jr. ; with an appendix, What li</
> a>\n</td>\n<td >\n&nbsp;Hirsch, E. D. (Eric Donald),
1928-\n</td>\n
> \n<td >\n&nbsp;Valley Reserves -- LC149 .H57 1987  --
AVAILABLE\n</td>
> \n\n<td >\n&nbsp;\n</td>]
>
> I am trying to pull out the essential (everything but the newlines and
> such) value in between the <td></td>.
I agree with Mukund. Use Hpricot:

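require 'hpricot'

# Parse the first array element as an HTML fragment.
html = Hpricot(s.first)

# Visit every <td> element and print its inner HTML.
html.search( "td" ) do |cell|
  puts cell.inner_html
end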

-- Mark.
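
The original question asked for the values without "the newlines and such", and the inner HTML of a cell still contains tags, &nbsp; entities and line breaks. A rough sketch of one way to tidy each value, using only core Ruby, might look like this. The helper name clean_cell is hypothetical, the tag-stripping regexp is deliberately naive, and only the &nbsp; entity is handled.

```ruby
# Hypothetical helper: reduces a cell's inner HTML to plain text.
def clean_cell(inner_html)
  inner_html
    .gsub(/<[^>]*>/, "")   # drop tags such as <a ...> and </a>
    .gsub("&nbsp;", " ")   # turn non-breaking-space entities into spaces
    .gsub(/\s+/, " ")      # collapse newlines and runs of whitespace
    .strip
end

puts clean_cell("&nbsp;<a href=\"/search\">The Academic Writer: A Brief Guide</a>\n")
puts clean_cell("\n&nbsp;Valley Reserves -- VR 282  -- AVAILABLE\n")
```

Under these assumptions it could be called as clean_cell(cell.inner_html) inside the loop above. Cells that hold only &nbsp; come back as empty strings, which can then be filtered out if they are not wanted.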
