Hi all,

I've been looking at the performance of my fb app, and one glaring issue seems to be the parsing speed of rexml in processing the results. Has anyone looked into porting the facebooker parser from rexml to libxml? If not, any reason I shouldn't try?

Thanks!
Yu-Shan.
On Oct 15, 2008, at 5:44 PM, Yu-Shan Fung wrote:

> I've been looking at the performance of my fb app and one glaring
> issue seems to be with the parsing speed of rexml in processing the
> results. Has anyone looked into porting the facebooker parser from
> rexml to libxml? If not, any reason I shouldn't try?

Have you actually benchmarked this? If you have, and it is truly an issue and you can make it transparent, go for it. I would be shocked if this is a bottleneck for the typical web application.

Mike

--
Mike Mangino
http://www.elevatedrails.com
Thanks Mike, that's what I did. Ours is probably not what you'd consider a typical web application. We analyze the user's facebook network extensively. A single request for user.friends! for a user with 900 friends takes almost 400s to run, almost 300 of which is spent doing this:

    REXML::Document#initialize

To make matters worse, the current parser runs through REXML::Document#initialize twice, once in Errors.process and again in the regular Parser.process.

In fact, a simple fix to Errors.process already shaved off half of the 300s (the original is commented out):

    class Errors < Parser #:nodoc:
      def self.process(data)
        data = data.body rescue data # either data or an HTTP response
        if data.include?('<error_response')
          error_code = error_msg = nil
          matches = /<error_code>(\d+)<\/error_code>/.match(data)
          if matches
            # capture group 1 is the code; index 0 is the whole match, tags included
            error_code = matches[1]
          end
          matches = /<error_msg>(.*?)<\/error_msg>/.match(data)
          if matches
            error_msg = matches[1]
          end
          exception = EXCEPTIONS[error_code.to_i] || StandardError
          raise exception.new(error_msg)
          # response_element = element('error_response', data) rescue nil
          # if response_element
          #   hash = hashinate(response_element)
          #   exception = EXCEPTIONS[Integer(hash['error_code'])] || StandardError
          #   raise exception.new(hash['error_msg'])
          # end
        else
          nil
        end
      end
    end

Not half as pretty, but a lot faster...

--
"Reality is that which, when you stop believing in it, doesn't go away."
- Philip K. Dick, American Writer
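For anyone wanting to reproduce this kind of measurement, here is a minimal sketch using Ruby's standard Benchmark module. Everything in it is an assumption for illustration: the payload is synthetic (a made-up `friends_get_response` with 900 `uid` elements, not a real Facebook response), and `build_payload`, `parse_with_rexml`, and `error_response?` are hypothetical names rather than Facebooker API. It compares one full REXML parse, two full parses (the double pass through REXML::Document#initialize described above), and the cheap substring check.

```ruby
require 'benchmark'
require 'rexml/document'

# Synthetic stand-in for an API response; NOT a real Facebook payload.
def build_payload(count)
  uids = (1..count).map { |i| "  <uid>#{i}</uid>" }.join("\n")
  "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" \
    "<friends_get_response>\n#{uids}\n</friends_get_response>\n"
end

# Full parse: builds the whole REXML tree, then collects each child's text.
def parse_with_rexml(xml)
  REXML::Document.new(xml).root.elements.to_a.map(&:text)
end

# Cheap pre-check: a plain substring scan, no tree is built.
def error_response?(xml)
  xml.include?('<error_response')
end

xml = build_payload(900)

# Timings depend on the machine and Ruby version, so none are claimed here.
Benchmark.bm(14) do |bm|
  bm.report('rexml once')  { parse_with_rexml(xml) }
  bm.report('rexml twice') { 2.times { parse_with_rexml(xml) } }
  bm.report('include?')    { error_response?(xml) }
end
```

Running it against a captured response from your own app, instead of the synthetic payload, would give numbers that are more representative of the real workload.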
REXML is notorious for its slowness. I would love to see libxml replace it in facebooker. Let me know if you would like any help.

Cheers.
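One way such a swap could stay transparent to callers is to hide the parser behind a single method and fall back to REXML when the libxml-ruby gem is not installed. The sketch below is only an illustration under stated assumptions, not Facebooker code: `XmlBackend` and `child_texts` are hypothetical names, and the sample usage assumes a document without an XML namespace (real Facebook responses declare one, which matters once XPath queries are involved).

```ruby
require 'rexml/document'

begin
  require 'libxml' # provided by the libxml-ruby gem, if it is installed
  HAVE_LIBXML = true
rescue LoadError
  HAVE_LIBXML = false
end

# Hypothetical adapter: callers never see which parser did the work.
module XmlBackend
  # Returns the text of each child element of the root, in document order.
  def self.child_texts(xml)
    HAVE_LIBXML ? child_texts_libxml(xml) : child_texts_rexml(xml)
  end

  # Fallback path using REXML.
  def self.child_texts_rexml(xml)
    REXML::Document.new(xml).root.elements.to_a.map(&:text)
  end

  # Faster path, only reached when libxml-ruby loaded successfully.
  def self.child_texts_libxml(xml)
    doc = LibXML::XML::Parser.string(xml).parse
    texts = []
    doc.root.each_element { |el| texts << el.content }
    texts
  end
end
```

A call such as `XmlBackend.child_texts('<r><uid>1</uid><uid>2</uid></r>')` goes through whichever backend is available. Keeping the REXML path around also gives a reference implementation to compare results against while porting.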
On Oct 15, 2008, at 7:16 PM, Yu-Shan Fung wrote:

> Thanks Mike, that's what I did. Ours is probably not what you'd
> consider a typical web application. We analyze the user's facebook
> network extensively. A single request for user.friends! for a user
> with 900 friends takes almost 400s to run, almost 300 of which is
> spent doing this:
> REXML::Document#initialize

Ah, okay. That sounds like something worth fixing.

> To make matters worse, the current parser runs through
> REXML::Document#initialize twice, once in Errors.process, and
> another in the regular Parser.process
> In fact, a simple fix to Errors.process already shaved half of the
> 300s (commented was the original):

Do you have this as a commit in a fork on github? If so, I can pull this in. I would like to see the error code handling pulled out into functions like extract_error_code and extract_error_message, but other than that, this looks good.

Mike
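The refactoring suggested above might look roughly like the following. This is a standalone sketch, not the Facebooker source: the `EXCEPTIONS` table and the `SessionExpired` class are hypothetical stand-ins, the class does not inherit from Parser here, and only the helper names come from the message above. It reads capture group 1 from each match, since index 0 of a MatchData is the whole matched string, tags included.

```ruby
# Hypothetical exception class, standing in for the library's own.
class SessionExpired < StandardError; end

class Errors
  # Hypothetical code-to-exception table; the real one lives in Facebooker.
  EXCEPTIONS = { 102 => SessionExpired }.freeze

  ERROR_CODE_PATTERN = %r{<error_code>(\d+)</error_code>}
  ERROR_MSG_PATTERN  = %r{<error_msg>(.*?)</error_msg>}m

  # Raises the mapped exception for an error response, otherwise returns nil.
  def self.process(data)
    body = data.respond_to?(:body) ? data.body : data # HTTP response or string
    return nil unless body.include?('<error_response')

    exception = EXCEPTIONS[extract_error_code(body)] || StandardError
    raise exception, extract_error_message(body)
  end

  # Returns the numeric error code, or nil when none is present.
  def self.extract_error_code(body)
    match = ERROR_CODE_PATTERN.match(body)
    match && match[1].to_i
  end

  # Returns the raw error message text, or nil when none is present.
  def self.extract_error_message(body)
    match = ERROR_MSG_PATTERN.match(body)
    match && match[1]
  end
end
```

One caveat of the regex approach: the message is returned as raw text, so XML entities such as `&amp;` are not decoded the way a real parser would decode them.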
On Wed, Oct 15, 2008 at 4:16 PM, Yu-Shan Fung <ambivalence at gmail.com> wrote:

> Thanks Mike, that's what I did. Ours is probably not what you'd consider a
> typical web application. We analyze the user's facebook network
> extensively. A single request for user.friends! for a user with 900 friends
> takes almost 400s to run, almost 300 of which is spent doing this:
> REXML::Document#initialize

Really 400 seconds?! Yeesh.

Joe