Performance is Good: ActiveObjects vs ActiveRecord

14 Aug 2007

So ActiveObjects is a fairly cool ORM.  However, coolness alone does not an enterprise ORM make.  In fact, the real qualifications for an enterprise-ready framework are as follows:

  • Stability
  • Performance

I’m sure there are other questions which factor into the decision on whether or not to use a library, but those are the two which I look at most closely.  Stability is a hard metric to pin down, since it usually depends on a lot of adopters hammering the library until it breaks, is fixed and then hammered again.  However, performance numbers are almost always easy to come by, since all that is required is a few simple benchmark tests to get a ballpark number.

Since benchmarks are so fun, I’ve decided to do a few for ActiveObjects.  Or rather, I’ve decided to run a simple (read: very simple) benchmark test with ActiveObjects as well as a number of other ORMs.  At the moment, I’ve only been able to run the test with ActiveRecord (sorry guys, Hibernate’s a really complex framework), but I think the numbers are still worth looking at.

ActiveRecord claims only a 50% overhead compared to manual database access (that number is actually listed as a feature).  There has been some dispute over whether the test used to obtain that particular figure was valid or not, but that’s beside the point.  ActiveObjects should be able to do at least that well, right?

Well, as it turns out, it can.  Here are the numbers from my reasonably simple benchmark:

ActiveObjects
==============
Queries test: 55 ms
Retrieval test: 68 ms
Persistence test: 55 ms
Relations test: 154 ms

ActiveRecord
=============
Queries test: 154 ms
Retrieval test: 6 ms
Persistence test: 76 ms
Relations test: 75 ms

Surprisingly close numbers actually.  I had assumed that there would be some significant disparity, one way or another.  However, as you can see ActiveObjects is fairly comparable to ActiveRecord on a set of extremely trivial tests.  There are some jumps and obvious areas of strength/weakness in both frameworks, but on average they’re pretty similar in performance.
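To put a figure on "pretty similar", the four timings for each ORM can be summed and averaged.  The sketch below uses only the numbers reported above; the class and method names are illustrative and not part of the benchmark source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sums and averages the timings reported above. Only the numbers come from
// the benchmark output; the class and method names are illustrative.
class BenchmarkTotals {

    static Map<String, Integer> activeObjects() {
        Map<String, Integer> ms = new LinkedHashMap<>();
        ms.put("Queries", 55);
        ms.put("Retrieval", 68);
        ms.put("Persistence", 55);
        ms.put("Relations", 154);
        return ms;
    }

    static Map<String, Integer> activeRecord() {
        Map<String, Integer> ms = new LinkedHashMap<>();
        ms.put("Queries", 154);
        ms.put("Retrieval", 6);
        ms.put("Persistence", 76);
        ms.put("Relations", 75);
        return ms;
    }

    // Total time across all four tests, in milliseconds.
    static int total(Map<String, Integer> timings) {
        int sum = 0;
        for (int ms : timings.values()) {
            sum += ms;
        }
        return sum;
    }

    // Mean time per test, in milliseconds.
    static double average(Map<String, Integer> timings) {
        return (double) total(timings) / timings.size();
    }

    public static void main(String[] args) {
        System.out.println("ActiveObjects: " + total(activeObjects())
                + " ms total, " + average(activeObjects()) + " ms average");
        System.out.println("ActiveRecord:  " + total(activeRecord())
                + " ms total, " + average(activeRecord()) + " ms average");
    }
}
```

Summed this way, ActiveObjects takes 332 ms and ActiveRecord 311 ms across the four tests, a gap of roughly 7%, even though a single test (retrieval) differs by more than a factor of ten.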

As my friend Lowell Heddings pointed out, ORM benchmarks are far more useful if you actually examine the SQL generated to see how efficient it really is from a theoretical standpoint.  So, to make things easier I sed/grepped the logs and arrived at the following SQL outputs for each respective ORM.

Details

Now, I will be the first to admit that this is hardly an even test to begin with.  Obviously there are different strengths and weaknesses in every library, and though I tried to be impartial in designing the benchmarks, I probably accidentally favored one ORM over the other.  Also, there are inherent performance advantages to Java over Ruby, especially in the area of database access.  In short, ActiveObjects probably had a sizeable advantage coming right out of the gate, so take my numbers with a grain of salt.

The test itself consisted of four phases, each involving three entities: Person, Profession and Workplace.  Person has a many-to-many relation with Profession through a fourth entity, Professional.  Workplace has a one-to-many relation with Person.  These relations were exploited directly in the relations benchmark (e.g. Person#getProfessions(), Workplace#getPeople(), etc.).  Each entity had a number of fields, including one CLOB (or TEXT, as MySQL calls it) in the Person entity.  The tables for each respective schema were pre-populated with the same data, which involved several rows with different values (except for the CLOB, which was a roughly 4000-character paragraph and the same for every row).  In the ActiveObjects Person entity, I used the @Preload annotation to eagerly load firstName and lastName.
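The entity model described above might look roughly like the following.  This is a sketch, not the benchmark source: the annotations and the Entity interface are local stand-ins declared here so the block compiles without the ActiveObjects jar, and the real library's versions may differ in detail.  Only the entity names, the Person fields, Person#getProfessions() and Workplace#getPeople() are taken from the description; everything else is an assumption.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Arrays;

class BenchmarkSchema {

    // Local stand-ins (assumptions) for the library's annotations and base type.
    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.TYPE)
    @interface Preload { String[] value(); }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface OneToMany { }

    @Retention(RetentionPolicy.RUNTIME) @Target(ElementType.METHOD)
    @interface ManyToMany { Class<?> value(); }

    interface Entity { int getID(); }

    // firstName and lastName are loaded eagerly; the rest stay lazy.
    @Preload({"firstName", "lastName"})
    interface Person extends Entity {
        String getFirstName();
        void setFirstName(String firstName);
        String getLastName();
        void setLastName(String lastName);
        int getAge();
        void setAge(int age);
        boolean isAlive();
        void setAlive(boolean alive);
        String getBio();                 // the CLOB / TEXT column
        void setBio(String bio);
        Workplace getWorkplace();
        void setWorkplace(Workplace workplace);

        @ManyToMany(Professional.class)  // many-to-many through Professional
        Profession[] getProfessions();
    }

    interface Profession extends Entity {
        String getName();
        void setName(String name);
    }

    // Join entity linking a Person to a Profession.
    interface Professional extends Entity {
        Person getPerson();
        void setPerson(Person person);
        Profession getProfession();
        void setProfession(Profession profession);
    }

    interface Workplace extends Entity {
        @OneToMany
        Person[] getPeople();
    }

    // Reads the preloaded field names back via reflection.
    static String[] preloadedFields() {
        return Person.class.getAnnotation(Preload.class).value();
    }

    public static void main(String[] args) {
        System.out.println("Preloaded: " + Arrays.toString(preloadedFields()));
    }
}
```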

For the retrieval test, the benchmark iterates through every Person row and grabs firstName, lastName, age, alive, and bio.  Since ActiveObjects only preloaded firstName and lastName, the other three fields had to be loaded lazily, and it suffered here (68 ms against 6 ms for ActiveRecord).

The persistence test iterates through every Person row and changes the first and last name to one selected from a pool of names I populated with random names which came to mind.  It then goes through the same iteration again and sets the age, alive flag and the bio to our 4000-character Pulitzer-winning essay.  Each row is saved through each iteration, thus each row is saved exactly twice throughout the test.  ActiveObjects came out ahead here probably because of its use of PreparedStatements, as well as the more efficient UPDATE statement generation.

The relations test involved first finding all of the Professions associated with each individual Person and retrieving the Profession name.  Next, the Workplace for the Person is retrieved, then all of the Person(s) associated with that Workplace and their firstName and lastName values accessed.

The queries test was little more than getting all of the Person(s), all of the Workplace(s), all of the Professional mappings, along with all of the Profession(s).  ActiveObjects far outperformed ActiveRecord in this area since ActiveRecord uses SELECT * for everything and eagerly loads the row values.  This means (especially with a CLOB thrown into the mix) that ActiveRecord’s initial query time will be very long, while its field access time will be very quick.  Most ORMs function in this way, and it can be a very good thing at times (our benchmark is one of those times).
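The trade-off between the queries and retrieval results can be illustrated with a toy model that counts round trips instead of talking to a real database.  None of this is ActiveObjects or ActiveRecord code: FakeTable, EagerRow and LazyRow are hypothetical names, the row values are placeholders, and the model ignores @Preload entirely.

```java
import java.util.HashMap;
import java.util.Map;

class LoadingStrategies {

    // Stand-in for one database row; counts queries and columns sent back.
    static class FakeTable {
        private final Map<String, Object> row = new HashMap<>();
        int queries = 0;
        int columnsTransferred = 0;

        FakeTable() {
            row.put("firstName", "Jane");
            row.put("lastName", "Doe");
            row.put("age", 42);
            row.put("alive", true);
            row.put("bio", "(placeholder for the 4000-character CLOB)");
        }

        // Models SELECT *: one query, every column comes back.
        Map<String, Object> selectAll() {
            queries++;
            columnsTransferred += row.size();
            return new HashMap<>(row);
        }

        // Models SELECT <column>: one query, one column comes back.
        Object selectColumn(String name) {
            queries++;
            columnsTransferred++;
            return row.get(name);
        }
    }

    // Eager: pays for everything up front, then field access is free.
    static class EagerRow {
        private final Map<String, Object> values;
        EagerRow(FakeTable table) { values = table.selectAll(); }
        Object get(String field) { return values.get(field); }
    }

    // Lazy: pays nothing up front, then one query per field on first access.
    static class LazyRow {
        private final FakeTable table;
        private final Map<String, Object> cache = new HashMap<>();
        LazyRow(FakeTable table) { this.table = table; }
        Object get(String field) {
            if (!cache.containsKey(field)) {
                cache.put(field, table.selectColumn(field));
            }
            return cache.get(field);
        }
    }

    // Number of queries issued when the given fields are read eagerly.
    static int eagerQueries(String... fields) {
        FakeTable table = new FakeTable();
        EagerRow row = new EagerRow(table);
        for (String f : fields) row.get(f);
        return table.queries;
    }

    // Number of queries issued when the given fields are read lazily.
    static int lazyQueries(String... fields) {
        FakeTable table = new FakeTable();
        LazyRow row = new LazyRow(table);
        for (String f : fields) row.get(f);
        return table.queries;
    }

    public static void main(String[] args) {
        String[] all = {"firstName", "lastName", "age", "alive", "bio"};
        System.out.println("No fields read:  eager=" + eagerQueries() + " lazy=" + lazyQueries());
        System.out.println("All fields read: eager=" + eagerQueries(all) + " lazy=" + lazyQueries(all));
    }
}
```

When no fields are read, the lazy row issues no queries while the eager row has already pulled every column, CLOB included.  When every field is read, the eager row still needs only its single query while the lazy row needs one per field.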

Lessons Learned

  • Eager loading can be a good thing
  • ActiveObjects generates some weird SQL for relations access

Obviously I can only do so much about the eager loading issue.  I believe pretty strongly that ActiveObjects’ approach (lazy loading most things) is the right one for most use-cases.  However, the second lesson to be learned here is one which I think I need to take a bit more to heart: keep it simple, SQL.

Normally, ActiveObjects will generate a query something like the following for accessing a one-to-many relation:

SELECT DISTINCT a.outMap AS outMap FROM (
    SELECT ID AS outMap,workplaceID AS inMap FROM people 
       WHERE workplaceID = ?) a

Yuck!  For obvious reasons, this is an incredibly inefficient bit of querying.  Actually, not only is it inefficient, but needlessly so.  You and I of course know that we could replace the above query with the much simpler:

SELECT ID FROM people WHERE workplaceID = ?

So why doesn’t ActiveObjects do that?  Frankly, I was lazy in my coding of the EntityProxy#retrieveRelations method, so a lot of ugly SQL slipped through the cracks in cases where it really wasn’t necessary.  I’ve spent a bit of time on this, and I think I’ve got the issue resolved.  The problem is that ActiveObjects was assuming that any relation (one-to-many or many-to-many) can have multiple mapping fields, thus requiring a wrapping DISTINCT outer query around a subquery SELECT which is UNIONed with an arbitrary number of other SELECTs, corresponding to the other mapping fields.  Obviously, it is almost never the case that we have to deal with multiple mapping paths, so I added a short-circuit to the logic which creates far simpler queries if at all possible.  As a result, the benchmark numbers for the relations test in ActiveObjects are now between 80 and 100 ms, down from 154 ms.  Still slower than ActiveRecord, but much improved.
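The short-circuit can be sketched as a query builder that emits the plain SELECT when there is exactly one mapping field and falls back to the DISTINCT-over-UNION form otherwise.  This is an illustration of the idea only, not the actual EntityProxy#retrieveRelations code; the class, the method and the second mapping field in the usage line are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class RelationQueryBuilder {

    // Builds the SQL for a one-to-many lookup. Table and field names are
    // assumed to come from the entity model, never from user input.
    static String oneToManyQuery(String table, List<String> mappingFields) {
        if (mappingFields.isEmpty()) {
            throw new IllegalArgumentException("at least one mapping field is required");
        }
        if (mappingFields.size() == 1) {
            // Short-circuit: a single mapping path needs no subquery,
            // no UNION and no DISTINCT.
            return "SELECT ID FROM " + table
                    + " WHERE " + mappingFields.get(0) + " = ?";
        }
        // General case: one inner SELECT per mapping field, UNIONed together
        // and wrapped in a DISTINCT outer query.
        StringBuilder sql = new StringBuilder("SELECT DISTINCT a.outMap AS outMap FROM (");
        for (int i = 0; i < mappingFields.size(); i++) {
            if (i > 0) {
                sql.append(" UNION ");
            }
            String field = mappingFields.get(i);
            sql.append("SELECT ID AS outMap,").append(field).append(" AS inMap FROM ")
               .append(table).append(" WHERE ").append(field).append(" = ?");
        }
        return sql.append(") a").toString();
    }

    public static void main(String[] args) {
        System.out.println(oneToManyQuery("people", Arrays.asList("workplaceID")));
        // "formerWorkplaceID" is a made-up second mapping field for illustration.
        System.out.println(oneToManyQuery("people",
                Arrays.asList("workplaceID", "formerWorkplaceID")));
    }
}
```

Note that the general branch binds one parameter per mapping field, so a caller would need to set the same ID once for each inner SELECT.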

It’s worth noting that if we ran each benchmark twice, we would see a marked improvement in the ActiveObjects performance the second time through.  Not just because a lot of the values would be cached, but also because the prepared statements in question would have been compiled and stored.  This is a fairly major area in which ActiveRecord falls short since it doesn’t utilize prepared statements, thus having a constant runtime for its queries and remaining unable to take advantage of cached, compiled queries.
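Why placeholders matter for caching can be shown with a toy statement cache keyed by SQL text.  This is a simulation under simplifying assumptions, not JDBC or ActiveRecord code: real caching behaviour depends on the driver and the database, and the table and column names are taken from the examples in this post.

```java
import java.util.HashSet;
import java.util.Set;

class StatementCacheDemo {

    // Toy cache: a statement is "compiled" the first time its text is seen.
    static class StatementCache {
        private final Set<String> compiled = new HashSet<>();
        int compilations = 0;

        void execute(String sql) {
            if (compiled.add(sql)) {   // add() returns true only for new text
                compilations++;
            }
        }
    }

    // Placeholder style: the SQL text is identical for every row.
    static int compilationsWithPlaceholders(int rows) {
        StatementCache cache = new StatementCache();
        for (int id = 1; id <= rows; id++) {
            cache.execute("UPDATE people SET firstName = ? WHERE ID = ?");
        }
        return cache.compilations;
    }

    // Inlined style: values are part of the SQL text, so every row differs.
    static int compilationsWithInlinedValues(int rows) {
        StatementCache cache = new StatementCache();
        for (int id = 1; id <= rows; id++) {
            cache.execute("UPDATE people SET firstName = 'name" + id + "' WHERE ID = " + id);
        }
        return cache.compilations;
    }

    public static void main(String[] args) {
        System.out.println("placeholders: " + compilationsWithPlaceholders(100) + " compilation(s)");
        System.out.println("inlined:      " + compilationsWithInlinedValues(100) + " compilation(s)");
    }
}
```

With placeholders the statement text never changes, so it is compiled once and reused for every row; with inlined values each row produces new text and the cache never gets a hit.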

So in short, ActiveObjects may be really neat, but its performance numbers don’t seem all that superior to those of ActiveRecord, a Ruby ORM with numerous known shortcomings in this area.  I guess I need to work on things a bit more.  :-)   Next up, either manual JDBC code or Hibernate running the same benchmark, depending on how soon I’m able to figure out Hibernate’s crazy XML mapping schema.

Note: I forgot to mention this… You can get the source for my benchmark from the ActiveObjects SVN repository: svn co https://activeobjects.dev.java.net/svn/activeobjects/trunk/Benchmarks

Comments

  1. Performance is not bad at all. I use AO with the Click framework, and benchmarking a simple page that lists items from one table:
    persons = ao.getEntityManager().find(Person.class, Query.select().where("id > ?", 10));

    … gives me 450 req/sec using Apache Benchmark (ab -n 1000 -c 30 …), which is 6x faster than the same example on Rails on the same machine. To get real-life results we must compare whole frameworks, not just ORMs.

    Congratulations on such tremendous work. I especially like migrations and the simplicity of AO usage. On Click I use Spring to instantiate AO as a singleton that is used on pages.
    Do you have a list of features you would like to implement in the future?

    David

    David Marko Wednesday, August 15, 2007 at 2:52 am
  2. On the issue of what an enterprise ORM makes, you mention stability and performance:

    Okay, with stability, I agree. And of course, that needs to be shown over time.

    But with performance, I think the world is not that black and white. Of course, it fails if it completely sucks in its performance. But it is not performance alone that makes an enterprise framework.

    Q: What has made Rails’ ActiveRecord so popular?
    A: Ease of use.

    You mention that you are still battling hibernate config for the simple performance test. Well, that is not ease of use.

    Of course, it must not have an architecture that makes it have orders of magnitude worse performance than its competitors. Make it utterly easy to use, and make it do exactly what you would expect it to, and it will get followers. And then, maybe, later on, optimize the last 10-15%.

    Tech Per Wednesday, August 15, 2007 at 8:02 am
  3. Glad you like the framework so far. :-) We aim to please.

    As for the feature list, eh, not really. It’s sort of all in my head at the moment. I do know that we’ll be tapering down the constant blur of new features coming down to the 1.0 release, with the intention to focus more on stability and ensuring AO is well tested. I do intend to add a custom Connection delegate to allow for ORM-external code within transactions. I also intend to implement some limited (for the moment) result set paging by extending Query#limit to allow for start bounds. There are a few other things like finding a way around that pesky Oracle JDBC bug which disables migrations, as well as a set of Wicket integration classes.

    However, for the most part I think the feature set is beginning to settle down for 1.0. Beyond that, I’m not sure. :-) I’m certainly open to suggestions.

    daniel Wednesday, August 15, 2007 at 10:11 am
  4. @Tech

    Excellent point. I doubt I would use a framework in an enterprise setting that I can’t figure out how to configure. And I’m sure the same goes for most people. It’s something I’ve been trying to keep in mind while designing activeobjects.

    daniel Wednesday, August 15, 2007 at 10:34 am
  5. hi daniel,
    have you heard of PolePosition? it’s a benchmark tool for ORMs and JDBC. i used it a year ago to compare Hibernate, TopLink and JDO.
    unfortunately, it was last released (0.20) in june 2005, but it contains implementations for jdbc and hibernate 2 (which is probably not too difficult to upgrade).
    it also generates nice diagrams (in html, pdf).

    maybe you want to take a look at it:
    http://www.polepos.org/

    Gerolf Seitz Thursday, August 16, 2007 at 2:54 am
  6. Hmm, interesting. It seems that PolePosition is more geared toward profiling databases however. I suppose if it were profiling the database *through* activeobjects that would be interesting data, but since the tests were designed to test *database* performance, not database access layer performance, the results probably wouldn’t be as accurate or as informative as we would like. I honestly don’t know: are there any ORM benchmarking tools?

    daniel Thursday, August 16, 2007 at 4:11 pm
  7. poleposition *does* profile databases through whatever ORM framework it’s testing – so you can get ORM performance numbers: run your different ORMs on the same database to remove that variable, and you’ve got numbers for each of the tools.

    forrest Sunday, September 16, 2007 at 3:49 pm
  8. Unfortunately, the metrics and tests which are interesting to measure database performance are vastly different to those which are interesting in measuring ORM performance. With an ORM, most of the differentiating code is in the caching. How well does it cache relations? How effective is its value caching? How efficient are the SQL queries it generates? That sort of thing. You can’t really measure such things effectively if you’re *trying* to test database performance.

    daniel Sunday, September 16, 2007 at 5:29 pm
  9. Not sure I’d be too interested in investigating an ORM alternative produced by folk who are incapable of working out how to use the competitors – scary.

    Been using Hibernate for several months now and I’m new to ORM, and have not found the learning curve that much of an issue – and I’m no Superman.

    I’d like to have seen some comparisons not just of perf but capability too.

    JohnLon Friday, October 19, 2007 at 2:55 pm
