Are GUIDs Really the Way to Go?

7
Nov
2007

I recently read a (slightly inflammatory) posting entitled “The Gospel of the GUID”.  In it, the author attempts to put forward several arguments in favor of using GUIDs for all database primary keys (as opposed to the more pedestrian sequence-generated INTEGER).  I’ve heard similar arguments in the past, and I keep coming back to the same conclusion: it’s all bunk.

To get right to the heart of my opinion, I really don’t see a compelling reason to use GUIDs for 90% of database use-cases.  Consider for a moment some of the really large databases in this world (Wikipedia, Slashdot’s comments table, Digg, etc).  I can’t think of a single web application in that league which uses GUIDs.  If you don’t believe me, look at the code for MediaWiki, the URL structure for Digg and the idiocy involving INTEGER vs BIGINT on Slashdot.  While de facto practice may not dictate optimal design, it does point to a trend that’s worthy of notice.  More importantly, it proves that INTEGER (or rather, BIGINT) primary key fields are practicable under real-world stress.  But what about the theoretical use-case?

Well to be honest, the theoretical use-case doesn’t interest me all that much.  I mean, if you tell me that your schema can theoretically support 29 trillion rows in its users table, I’ll geek out right along with you.  But if you try to feed that to a client, you’ll get either blank stares or the equally common: “but there are only 5 billion people to choose from” (unless you’re talking to someone in upper management).  For an even better party, try telling your client that you can merge their data with one of their competitor’s in 45 minutes flat.  Somehow I doubt that argument will fly.

Now let’s look at the cons involved.  GUIDs are really, really hard to remember and type by hand.  When you’re in a hurry, pulling a little data-mine-while-you-wait for your boss, you’re not going to want to dig around and find that string you copy/pasted into Notepad.  Oh, and heaven forbid you actually type the GUID wrong!  Finding a single-character deviation in a 32-digit hexadecimal string (a GUID is a 128-bit value) is not a trivial task, let me tell you.  It’s at about this point that you start to think that maybe INTEGER keys would have been a better way to go.  At least then you can throw the userID into good ol’ short-term memory and bang it out ten seconds later in your SQL query window.  Oh, in case you’ve never seen a GUID before, here are a couple of quick examples:

  • {3F2504E0-4F89-11D3-9A0C-0305E82C3301}
  • {52335ABA-2D8B-4892-A8B7-86B817AAC607}
  • {79E9F560-FD70-4807-BEED-50A87AA911B1}

I’m not even sure how to read that, much less remember it.
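For what it’s worth, the JDK can at least take one apart for you.  The following is only a small sketch using java.util.UUID; the class name GuidAnatomy and the brace-stripping helper are my own illustration, not part of any library.  It parses one of the examples above and prints the pieces that matter: the canonical 36-character form, the version and the total size in bits.

```java
import java.util.UUID;

final class GuidAnatomy {
    /** Strips the optional surrounding braces, then parses the canonical hex-and-dash form. */
    static UUID parse(String text) {
        String trimmed = text.trim();
        if (trimmed.startsWith("{") && trimmed.endsWith("}")) {
            trimmed = trimmed.substring(1, trimmed.length() - 1);
        }
        return UUID.fromString(trimmed);
    }

    public static void main(String[] args) {
        UUID guid = parse("{52335ABA-2D8B-4892-A8B7-86B817AAC607}");
        // A UUID is two 64-bit halves: 128 bits, printed as 32 hex digits plus 4 dashes.
        System.out.println("canonical form: " + guid);
        System.out.println("version:        " + guid.version());
        System.out.println("characters:     " + guid.toString().length());
        System.out.println("bits:           " + 2 * Long.SIZE);
    }
}
```

Thirty-six characters per key is exactly the kind of thing nobody holds in short-term memory.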

Another thing worth considering is that GUIDs usually end up as VARCHARs within the database (unless your database happens to offer a dedicated GUID column type and your tools know how to use it).  This means that not all ORMs will handle them appropriately.  Hibernate will, as will iBatis and ActiveObjects.  But if you’re using something like Django or ActiveRecord, you’re out of luck (disclaimer: Django might actually support VARCHAR primary keys, I’m not sure).  On top of that, some database clustering techniques don’t seem to work nicely when applied to GUID-based key schemes.  I haven’t run into this myself, but my good friend Lowell Heddings has told me tales of strange mystics and evil tides when playing with distributed indexes and GUIDs in MS SQL Server.  Technically speaking, while any sane database may work with a VARCHAR for the table’s primary key, the architects really had INTEGERs in mind when they designed the algorithms.  This means that by using GUIDs, you’re flirting with a less-well-tested use-case.  It’s certainly not unsupported, but you should consider yourself in the minority of users if you go that route.

Oh, and one little myth I’d like to debunk: there really isn’t that much more overhead in grabbing the generated value of an INTEGER primary key than in sending an in-code generated GUID value.  Granted, some databases make this harder than others, but that’s their fault, isn’t it?  As long as you’re using a well-designed ORM, you really won’t see much of a performance increase from removing that minuscule callback.  In fact, depending on how your GUID algorithm works, you may see a significant degradation in overall performance due to extra overhead in your application.
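To make the comparison concrete, here is a rough sketch of both styles in plain JDBC.  The table names (users, users_guid), their columns and the class name KeyRoundTrip are all hypothetical, and how the key comes back from getGeneratedKeys varies somewhat between drivers, so treat this as an illustration rather than a benchmark.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.UUID;

final class KeyRoundTrip {
    // Hypothetical table: users(id INTEGER generated by the database, name VARCHAR).
    static long insertWithGeneratedKey(Connection conn, String name) throws SQLException {
        String sql = "INSERT INTO users (name) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, name);
            ps.executeUpdate();
            // The "callback": ask the driver for the key the database just generated.
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("driver returned no generated key");
                }
                return keys.getLong(1);
            }
        }
    }

    // Hypothetical table: users_guid(id VARCHAR(36), name VARCHAR).
    static String insertWithGuid(Connection conn, String name) throws SQLException {
        // The key is generated in the application, so nothing needs to be read back.
        String id = UUID.randomUUID().toString();
        String sql = "INSERT INTO users_guid (id, name) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, id);
            ps.setString(2, name);
            ps.executeUpdate();
        }
        return id;
    }
}
```

The only structural difference is the getGeneratedKeys call, and the INSERT has to be sent to the database either way.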

Conclusion

So, the obligatory thirty second overview for our attention-deprived era:

Benefits of GUIDs: perfectly unique across all your data and tables; way, way more head-room in terms of values

Problems with GUIDs: really, really un-developer friendly; weird side-effects in databases and ORMs; unconventional approach to a “solved” problem; easy way to upset your DBA

So it really seems the only upshot for GUIDs is their massive range.  In trade-off, you lose usability from a raw SQL standpoint, your result sets are that much more cluttered and you risk odd bugs in your persistence backend.  Hmm, I wonder which one I’m going to use in my next app?

Before you make the choice, ask yourself: “how much uniqueness do I really require?”  Over-designing a solution is just as much a problem as under-designing one.  Try to find the mid-point that works best for your use-case.
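If you want a number to go with that question, the arithmetic is easy to sketch.  The snippet below assumes signed 32-bit INTEGER and signed 64-bit BIGINT columns and a sustained rate of 1,000 inserts per second, which is a figure I picked purely for illustration; the class name KeyHeadroom is likewise made up.

```java
import java.math.BigInteger;

final class KeyHeadroom {
    static final BigInteger INTEGER_KEYS = BigInteger.valueOf(Integer.MAX_VALUE); // signed 32-bit
    static final BigInteger BIGINT_KEYS = BigInteger.valueOf(Long.MAX_VALUE);     // signed 64-bit
    static final BigInteger GUID_VALUES = BigInteger.ONE.shiftLeft(128);          // 2^128 bit patterns

    private static final BigInteger SECONDS_PER_DAY = BigInteger.valueOf(24L * 60 * 60);

    /** Whole days until the key space runs out at a constant insert rate. */
    static BigInteger daysUntilExhausted(BigInteger keySpace, long insertsPerSecond) {
        return keySpace.divide(BigInteger.valueOf(insertsPerSecond)).divide(SECONDS_PER_DAY);
    }

    public static void main(String[] args) {
        long rate = 1_000; // assumed sustained inserts per second
        System.out.println("INTEGER lasts " + daysUntilExhausted(INTEGER_KEYS, rate) + " days");
        System.out.println("BIGINT lasts  " + daysUntilExhausted(BIGINT_KEYS, rate) + " days");
        System.out.println("GUID values:  " + GUID_VALUES);
    }
}
```

At that rate a plain INTEGER really is too small (it runs dry in under a month), but BIGINT holds out for roughly 292 million years, which is more head-room than most of us will ever need.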

So Long WTP, Embedded Jetty for Me

5
Nov
2007

As some of you may know, I’ve been doing some work for Teachscape (think Eelco Hillenius) on their Wicket-based application.  It’s been a learning experience for me as much as a working position in that I’ve gotten to experiment with a lot of technologies and techniques I haven’t tried before.  For example, Teachscape is based on Hibernate and Spring, so I’ve been getting some really in-depth exposure to both technologies (I still dislike them both).  More interesting to me though is the way the application is run in a developer environment.

It uses Jetty, not Tomcat or Glassfish.  There’s a main class within the application which sets up the Jetty instance, configures it to use the Wicket servlet and starts the server.  This means that running the web application is as simple as “Run As…” -> “Java Application”.  It starts fast, is more responsive than a straight-up Tomcat instance, and best of all, hot code replace works properly in debug mode.  Now having used this approach at Teachscape for some time, I’m starting to really miss it when I jump back to my own, WTP-based Wicket projects.

So after much wrestling with pros and cons, I decided to switch one of my more substantial projects over to using embedded Jetty.  I did this partially for the experience, partially because Jetty makes development so much easier, and partially because it lets me do cool things like package the app up in a single JAR and easily run it from any Java-equipped computer.

Step One: Dependencies

Dependencies are always an issue.  Modern web applications have in excess of 20-30 JAR files which are either packaged into the WAR, or thrown into the app server’s lib/ directory.  Some applications (like the one I was working with) have many more.  As you can imagine, this poses a more-than-minor inconvenience.

WTP makes dependency management (comparatively) easy, since everything is just thrown indiscriminately into the WebContent/WEB-INF/lib/ directory.  Since the JARs just sit there in a centralized spot, deploying is easy, the Eclipse classpath configuration is trivial and I don’t need to work all that hard to upgrade a certain library (such as wicket-1.3-beta3 to beta4). 

Of course, I can keep this approach when I switch to Jetty, but it’s not the cleanest thing to do.  What’s best is to configure the Eclipse classpath to reference the JAR files in their unzipped directories in my system somewhere (preferably through the use of classpath variables).  After all, that’s what I would do for a standard Java project, and with Jetty, a web application is no different from any other Java app.  With this rosy-eyed decision having been made, I deleted the lib/ directory and began reconfiguring the classpath.

Eclipse wasn’t too bad, since I already had half the libraries set as variables to begin with, but the Ant build was another story.  Two hours of typing into a build.properties file and several cups of strong coffee later, I was just about at the end of my rope.  The dependencies were all set, and the project was building, but it was a nightmare.  I can only imagine what’s going to happen if I have a significant influx of libraries and dependencies…

Lesson learned: use Maven.  Actually, don’t use Maven, since it’s just more headache than it’s worth.  What I should have done (had I really thought things through), is take advantage of the now-incubating Apache project, Buildr.  It’s basically a Maven-like build system with scripts based on Ruby and Rake.  Instead of writing an epic-length POM file, you write a 10-20 line buildfile (which is actually a Ruby script with some DSL syntax) and Buildr figures out the rest.  Unfortunately, by the time I bethought myself of this option, I was almost finished with the Ant configuration, and I’m too stubborn to give up half-way through something.

Step Two: The Launcher

The next step in my evil plan was to create a main class somewhere in the project which could configure Jetty and start the server.  Unfortunately, this is a bit harder than it sounds on paper.

The problem is not that Jetty has a tricky API, or any sort of gotchas in its configuration.  The problem is there’s very little useful documentation which shows how to turn the trick.  So, for the record, here’s the source for the main class I wrote:

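// Assumes Jetty 6 (org.mortbay packages) and Wicket 1.3 on the classpath.
import org.apache.wicket.protocol.http.WicketServlet;
import org.mortbay.jetty.Server;
import org.mortbay.jetty.servlet.Context;
import org.mortbay.jetty.servlet.ServletHolder;

public class StartApplication {
    public static void main(String[] args) throws Exception {
        // Listen on port 8080 and serve one context at the root path, with sessions enabled.
        Server server = new Server(8080);
        Context context = new Context(server, "/", Context.SESSIONS);

        // Register the Wicket servlet and tell it which Application class to load.
        ServletHolder servletHolder = new ServletHolder(new WicketServlet());
        servletHolder.setInitParameter("applicationClassName",
                "com.myapp.wicket.Application");
        servletHolder.setInitOrder(1);
        context.addServlet(servletHolder, "/*");

        // Start Jetty, then block the main thread until the server stops.
        server.start();
        server.join();
    }
}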

Not really too much there once you see it all laid out for you, but it took me way longer than it should have to figure out that it needed to be done that way.  Executing this class starts a new Jetty instance serving my application at http://localhost:8080.  I can start the app in debug mode, getting proper hot code replace (something WTP never got right) and every debug feature Eclipse offers for Java apps.  To stop the server, all I need to do is hit that cryptically labeled button in the Console view which kills the running application.  Oh, and there’s no need to maintain a web.xml file; Jetty doesn’t need it.

Surprisingly, that’s all that’s different between WTP and embedded Jetty from a code standpoint.  I had expected to have to make significant changes elsewhere in my code base, but that was the extent of my fiddling.  Pretty slick!

Evaluation

Once the Jetty environment was up and running, it was time to decide whether or not it was worth it.  After all, I could always do an svn revert if I really didn’t like the results.

Feature             | WTP/Tomcat                    | Jetty
Rough startup time  | 10 sec                        | 2 sec
Page responsiveness | Decent (a bit sluggish)       | Excellent (pages generate insanely fast)
Server errors       | Cryptic and well formatted    | Cryptic and ugly
Debugging           | Passable, no hot code replace | Perfect, just as good as Java apps
Ease of setup       | Very good                     | Harder than it should be
Setup time          | 10 minutes                    | 3 hours
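
The startup times above are rough, wall-clock figures.  If you want to collect something similar for your own launcher, a tiny timing helper is enough.  This is only a sketch: StartupTimer and ThrowingTask are names I made up, and the sleep in main is a stand-in for a call like server.start(), since Jetty isn’t on the classpath here.

```java
final class StartupTimer {
    /** A task that may throw, so calls such as server.start() fit without wrapping. */
    interface ThrowingTask {
        void run() throws Exception;
    }

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(ThrowingTask task) throws Exception {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real startup call; replace with your own launcher code.
        long elapsed = timeMillis(() -> Thread.sleep(50));
        System.out.println("startup took about " + elapsed + " ms");
    }
}
```

A single run like this is noisy, so it is only good for the “10 seconds versus 2 seconds” sort of comparison.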

Overall, I’ll take Jetty any day, but I definitely need to use something like Buildr the next time I do this.  That would have pretty much dropped the setup time from 3 hours down to maybe 30 min (if that).  Oh well, live and learn.

So the moral of the story is that I’m done messing around with Eclipse WTP, trying to beat it into submission and putting up with all its oddities and extreme slowness.  I’m keeping it installed, if only for the HTML editor, but no more will I attempt to use it as a runtime controller.

Would you Pay for Java on Leopard?

1
Nov
2007

I don’t normally do this, but I think this is interesting enough to be worth a cross-post.  Jonathan Locke (originator of the Wicket framework) has proposed two polls basically asking the question: Would you pay Sun for a port of Java 6 on OS X, and if so, how much?

Personally, I think it’s an interesting question.  Probably half of the opinions I’ve seen about the state (or lack thereof) of Java on Leopard have ended with railing against Sun for not handling Java on that OS itself.  Of course, James Gosling has already officially explained why Sun isn’t handling Java on Mac (Apple wanted to do it), but that doesn’t stop people from spreading the FUD that Sun is just being lazy.

The fact remains though that Apple has really dropped the ball on this one.  Whether Jobs likes it or not, Java is a relevant force in computing today, and no mainstream OS is complete without a working JVM.  Despite this fact, Apple has given no indication that it is even interested in releasing Java 6 for Leopard, nor does it seem to care that Java 5 doesn’t work properly.  Given these facts, it only makes sense that someone else will need to step forward and carry the torch of Java on Mac.  Who better than Sun?

But the million dollar question here (well, not quite that much) is how much would you be willing to pay for Sun (or any other company for that matter) to port Java 6 to MacOS X?  $50?  $99?  Cast your vote in Jonathan’s polls.

General Purpose Editor Within Eclipse

31
Oct
2007

I’ve blogged before about the difficulties I’ve had in finding a solid, general-purpose text editor for my system.  I looked into VIM for Windows, E, SciTE and many more before finally settling on jEdit.  It’s a really good editor, if a bit rough around the edges.  A lot of people (myself included) would put it on par with TextMate in terms of features, and superior to it in some ways thanks to its cross-platform nature.

As a separate application from my IDE, jEdit performs superbly, but although this solves the problem of editing arbitrary files from file explorer, it still leaves open the problem of editing arbitrary file types within Eclipse.  What I had been doing is using jEdit as an external editor, opening it up any time I needed to open a weird file type within Eclipse.  This works fairly well, but it’s heavy on memory, not integrated with tools like Mylyn (as if there were any tools like Mylyn) and it’s just annoying, dealing with a separate app like that. 

What I really want is some sort of embedded jEdit editor canvas within a normal Eclipse editor part.  One would think this would be very possible, given SWT_AWT and its capabilities.  In fact, I was just about to crack open the jEdit source to see if I could roll something myself when an eternal axiom sprang to mind: Google is your friend.  Actually in this case, I used Eclipse-Plugins.info (which IMHO is still the best Eclipse plugins site, despite lacking an active administrator) and did a quick search for any plugins mentioning the word “jedit”.  A few minutes later, I was perusing the page for the little-known Color Editor plugin.

Color Editor is basically a simple Eclipse editor which will open any file for you.  It’s not much more sophisticated than the standard Text Editor which is the default for unknown file types.  However, what it *does* do is parse jEdit’s mode.xml files, providing semi-advanced syntax highlighting for over 150 file types.  Granted, it doesn’t have all of jEdit’s nice editing features or plugins like SuperAbbrevs, but if I need that I’ll open up jEdit itself.  For most of what I do, quick-and-dirty syntax highlighting is all I need.

The problems with this plugin are mainly caused by the fact that it’s quite old and hasn’t really been updated recently.  It still defaults to the old-style jEdit colors, which are ugly as sin.  Also, it doesn’t support as granular syntax highlighting as the current version of jEdit (only 2 comment types, 2 literals, etc).  It doesn’t support easy adding of modes (you have to repackage the plugin JAR file), nor does it allow you to simply point it to the same mode catalog used by jEdit itself (which would simplify management of editor modes).  Despite all of that, it’s still a really nice idea.

The plugin works not by embedding a jEdit editor canvas using SWT_AWT, but by just using the standard Eclipse syntax highlighting techniques coupled with the jEdit mode files.  The downside to this approach is the need to write a whole bunch of mode parser code which is effectively already done within jEdit.  Also, odd bugs can leak in around the edges, since the editor is effectively reverse-engineering the jEdit editor part.  However, the approach does have a very unexpected (and pleasant) silver lining: fonts look good.
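
To give a feel for what that mode-parsing code has to do, here is a minimal sketch that reads the keyword section of a jEdit-style mode file and maps each word to its token class.  It is not the plugin’s actual code: the class name ModeKeywords is hypothetical, and the sample XML is a heavily simplified fragment (real mode files also carry a DOCTYPE, properties, spans and much more).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ModeKeywords {
    /** Maps each word inside a KEYWORDS block to its token class (the element name). */
    static Map<String, String> load(String modeXml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(modeXml)));
        Map<String, String> tokenClassByWord = new LinkedHashMap<>();
        NodeList blocks = doc.getElementsByTagName("KEYWORDS");
        for (int i = 0; i < blocks.getLength(); i++) {
            NodeList entries = blocks.item(i).getChildNodes();
            for (int j = 0; j < entries.getLength(); j++) {
                Node entry = entries.item(j);
                // Skip whitespace text nodes; only child elements carry keywords.
                if (entry.getNodeType() == Node.ELEMENT_NODE) {
                    tokenClassByWord.put(entry.getTextContent().trim(), entry.getNodeName());
                }
            }
        }
        return tokenClassByWord;
    }

    public static void main(String[] args) throws Exception {
        // Simplified, made-up sample in the shape of a jEdit mode file.
        String sample =
              "<MODE><RULES><KEYWORDS>"
            + "<KEYWORD1>if</KEYWORD1><KEYWORD1>else</KEYWORD1>"
            + "<KEYWORD3>int</KEYWORD3><LITERAL2>null</LITERAL2>"
            + "</KEYWORDS></RULES></MODE>";
        load(sample).forEach((word, tokenClass) -> System.out.println(word + " -> " + tokenClass));
    }
}
```

An editor would then look up each scanned word in that map and pick a color per token class; the spans, delimiters and nested rule sets are where the real work (and the odd bugs) come in.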

I’ve had tons of problems with jEdit’s font rendering on Windows, mainly due to the fact that Swing’s font renderer doesn’t seem to be as sophisticated as Vista’s (or at least, less capable of dealing with monospaced fonts).  But since Color Editor uses native font rendering, the text looks 100% native:

jEdit: [screenshot “jEdit Fonts”]

Eclipse Color Editor: [screenshot “Color Editor Fonts”]

I assume you can see the difference.  :-)   So, fonts look great, but if you examine the body of the method in the file a bit more, you’ll see examples of those odd bugs I was talking about.  For some reason, Color Editor thinks that the suite variable, as well as the BaseTests, DBTests, SchemaTests and TypeTests classes are all methods, rather than local variables and classes.  This is annoying, but it’s not the end of the world.  Granted, I haven’t been using this tool for all that long, but I’m guessing that instances like this are fairly rare, and not cause for immediate alarm.  You’ll also notice some evidence of the lessened flexibility in the syntax highlighting engine (fewer types of comments in this case) in the way the javadoc and single-line comments are colorized differently.

Download Color Editor Plugin from gstaff.org  (no update site available)

All you have to do is stick the JAR in your eclipse/plugins directory and start Eclipse with the -clean option (usually unnecessary, but just to be safe).  Color Editor will automatically be registered as the default editor for unknown file types.  If you want to change the default colors (as well you should), you can find the preference under Coloring Editor -> Colors (no idea why the conjugation difference between the editor name and the preference pane).  It’s a bit clumsy to try and set all of the colors to a predefined theme (I made mine look like the current jEdit defaults), but it’s all possible.

Hopefully you’ll find this a useful tool in editing those random shell scripts and who-knows-what-else which got included in the project, but for which Eclipse doesn’t have a separate plugin.

ActiveObjects 0.6 Released

30
Oct
2007

As a minor side-bar in this (hopefully) noise-less blog, I’d like to announce the release of ActiveObjects 0.6.  If you couldn’t care less about ActiveObjects and/or random announcements about it, please feel free to completely ignore this post.

ActiveObjects 0.6 is the most stable release yet (hopefully).  With this release, we see the rise of RawEntity, a superinterface to Entity which allows for greater customization, particularly in the area of primary keys.  Most developers will never need to even be aware of this interface, but for those that have such requirements, it should be very helpful.  Likewise, this release also allows for arbitrary types to be persisted into the database, through the use of custom classes which manage the mapping between Java type and database type.  (hint: this even allows for database-specific types such as PostgreSQL’s MATRIX if you really want them)

Most importantly, 0.6 is the release where I actually buckled down and started writing some documentation.  What’s available on the project page right now is still a little sparse, but rest assured this will be rectified soon (not sure when, but soon).  The main focus for the moment has been javadocing the public API.  This is far from complete at the moment, but all the important (and lengthy) classes are done (specifically, everything in the net.java.ao package).  With this documentation, it should hopefully be somewhat easier to use ActiveObjects in a project without resorting to desperate Google searches at the wee hours of the morning.

Most of the interesting stuff in this release I’ve already covered in other posts on this blog, so I won’t bore you by repeating all of it.  Suffice it to say, if you’ve been waiting for a more stable release to start playing with ActiveObjects, this is it.  I won’t guarantee that the API won’t change at all leading up to version 1.0, but I can say that most of the earth-shattering stuff is behind us.  Documentation is in place, and we’ve got a large (and growing) number of tests which are run to ensure quality and stability in the core functionality.  Download it, try it out, break it, file bugs, you know the drill.  I welcome all suggestions, comments, questions and pro-Hibernate rants.

Download activeobjects-0.6 from java.net