RubyFlow The Ruby and Rails community linklog

libxml-ruby no slower than Nokogiri after all

A week ago, libxml-ruby 1 was released, and benchmark results comparing it with Hpricot, REXML and Nokogiri went up rather quickly. Unexpectedly, they showed libxml-ruby as about 10% slower than Nokogiri. It turns out this shouldn’t be the case: Charlie Savage has worked out why and resolved the problem. Nice investigation.

Comments

It’s always the smallest, obscure settings, isn’t it? :P

BTW, just out of curiosity, if Nokogiri and libxml-ruby both use libxml2, use the same method calls, and have roughly equivalent performance, why do we need both of them? Does one provide a better alternative in some conditions?

Thanks!

Wasn’t the Hpricot test badly flawed, and particularly practical?

Insert a “not” in the appropriate place. (I realize opinions may differ on which place that is.)

@brendan - Nokogiri has a PushParser for SAX that I don’t think libxml-ruby has yet. Chuck in some SAX machines[1] and you can do some powerful stuff with massive or endless XML files/streams without having to load big old DOM trees into memory.

1) Shameless self promotion: http://github.com/shanna/xml-sax-machines/tree/master

@Shane thanks for the explanation. This means I need to learn more about SAX. :D
