
One Wild Week with Ubuntu Linux

29 Oct 2007

As I mentioned in my last post, I managed to erase my hard drive and ruin my productivity for a week all in one fell swoop.  Since I didn’t have a Vista install disk handy, I had to make do with Ubuntu Linux for over a week while I waited for Microsoft to send me a replacement.  Obviously life doesn’t just pause and wait for my computer to catch up, so I was quite literally forced to use Linux on a daily basis for exactly the same things for which I would normally use Windows.  This gave me a unique opportunity: a chance to test out an alternate OS under circumstances identical to its competitor’s.

Compiz

The first thing on my list to try was enabling Compiz.  Yes, I know it’s mostly eye-candy, but there’s also some very useful stuff in there.  3D compositing is extremely useful in a windowing system if for no other reason than the UI always feels sprightly and responsive, rather than a constant, frustrating shade of sluggishness.  For example, if you drag a window in vanilla Gnome, there’s repaint lag.  Imperceptible though it may be on most modern systems, it’s still there.  If you drag a window with Compiz enabled, the lag virtually goes away.  Also, if a window stops responding, Compiz drops its color saturation (similar to Vista’s white overlay), whereas on a non-composited desktop, the app would just freeze and stop painting.  Other Compiz treats like subtle shadows (which really help the eye differentiate windows quickly), a slicker workspace switcher and a really nice Exposé clone all combine to produce a very competitive environment for those of us coming from the likes of Vista and Leopard.

The downside is that it’s very difficult to get working.  As with all things Linux: if you want it done right, you have to do it yourself.  Actually, with Linux it’s more like: if you want it done, you have to do it yourself without documentation.  One of Ubuntu Gutsy Gibbon’s big selling points is that it’s so easy to get working out of the box.  One of the hot new features being pushed by Canonical is that Compiz is now enabled by default on installations, which is supposed to mean that you don’t have to do anything to get it working.  Unfortunately, it didn’t work out of the box for me (I have an ATI video card).  The error message I got when trying to configure the “Desktop Effects” was utterly unhelpful, and I had to figure out for myself that I needed to manually install xserver-xgl (no documentation, no “helpful hints”, nothing).  Even then, it still took a manual kick-in-the-pants to actually get Compiz running for the first time.  And even then things still weren’t a hundred percent.  Settings changes took a logout/login to apply (without any notice).  Key combos conflicted and didn’t work half the time.  All in all, it was a mess.

The second thing was to get Eclipse, jEdit and gnome-terminal into a usable state so that I could actually do real work with them.  Fortunately, gnome-terminal is quite amenable to the effort.  I daresay it’s probably the most polished app on the whole OS; everyone should be using it!  So 30 seconds after I opened up gnome-terminal for the first time, I had disabled that oh-so-annoying terminal bell, set the colors to something a bit easier on the eyes and appended ~/.bin to my PATH.  I figured the next easiest would probably be Eclipse.  After all, Linux GTK is supposed to be SWT’s second most stable platform, and Eclipse itself has been running on Linux practically since version 1.0.  Unfortunately, it wasn’t quite so simple…
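Those thirty seconds of tweaking boil down to a couple of shell lines.  The PATH addition, for instance, looks something like this (~/.bin is just my personal convention for a scripts directory):

```shell
# Make scripts in ~/.bin available to every future gnome-terminal session:
echo 'export PATH="$PATH:$HOME/.bin"' >> ~/.bashrc

# And to the current session, without restarting the terminal:
export PATH="$PATH:$HOME/.bin"
```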

Eclipse

Oh I didn’t have any real trouble getting Eclipse to start on my Linux system, but starting was about all it could do for a while.  For one thing, it took me nearly three hours to get all of the Europa plugins I use installed again.  Granted, it might have something to do with the major fall release coinciding with the day I was trying to configure my setup, but OSU’s servers are faster than that.  I’m guessing there was something weird going on within the update manager that doesn’t normally happen on Windows (it usually takes me about 10 minutes to configure a fresh Eclipse install).  Once I had gotten everything installed, I tried to open up my old workspaces.  It was about this point that I started to run into trouble.

Everyone should know better than to try to open a workspace unmodified on a completely different system, especially when crossing OSes.  That’s not to say that I haven’t successfully performed such a procedure in the past, but I didn’t want to take any chances with another three-hour wait.  I did the “right thing”: deleted the .metadata directory and opened the workspace afresh.  First, I used the handy “Import Preferences” wizard to grab all of my saved syntax highlighting, font sizes and classpath variables from a previously exported .epf file.  I sucked the preferences into my workspace, looked around, and everything seemed dandy.

At least, it did until I hit the “User Libraries” preference pane.  Here, Eclipse gave me an error (something about absolute paths and IClasspath) and refused to show the pane.  This was a situation where something was wrong with the underlying preference for the user libraries and Eclipse wasn’t even allowing me to dig in and fix it!  After messing around with the internal structure of the .metadata directory for an hour or so, trying to erase just that particular preference set, I gave up in disgust, deleted the .metadata directory and started again.

This time, I ran the .epf file through sed (one of the many advantages of once again having a Linux shell at my disposal) and got rid of all the “C:” instances throughout the file.  Eclipse seemed to like this a lot better, and actually deigned to show me the preference pane this time, without the annoying error.  I was able to change all of the paths for the libraries, and once again things compiled nicely.
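The surgery itself was a one-liner.  Here’s a sketch against a stand-in file (the preference key is illustrative, and a real .epf may escape the colon as C\:, in which case the pattern needs adjusting):

```shell
# A stand-in preference file containing a Windows-style absolute path:
printf 'org.eclipse.jdt.core.classpathVariable.LIBS=C:/java/libs\n' > prefs.epf

# Strip every "C:" drive prefix, keeping a .bak copy of the original:
sed -i.bak 's|C:||g' prefs.epf

cat prefs.epf   # the paths are now rooted at /, ready to be re-pointed
```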

Unfortunately, not every user is going to have the know-how to do what I did.  I hate to sound immodest, but I do have a bit more Eclipse experience than the average Joe trying to switch his environment from Windows to Linux (a common occurrence these days).  If a problem arising from a common operation leaves me scratching my head for over an hour, it’s probably something which should be addressed.  This sort of complete blocker of a problem, one which necessitates manual hacking of auto-generated files, is really a big no-no.  I may have gotten it working in the end, but the point is that it wasn’t easy.  All it amounted to was yet another source of stress to load onto my already-bursting polyurethane sphere.

Tools and the Environment

Ironically, configuring jEdit was a walk in the park, especially compared to Eclipse.  All I needed to do was download the .deb package from jedit.org, provide my user password, copy over my modified JAR (some things just can’t be fixed in the preferences) and away I went.  Ten minutes later I had all my favorite plugins installed and I was happily developing random bash scripts on remote servers.  It still took a little Linux-fu to get the warm start working, but compared to what I had been through with the Eclipse setup, it was nothing.

After this, there were no more “big things” to get set up.  I had more-or-less everything I needed to be productive once more, so I turned my attention to the smaller things (yes, I do count 3D compositing as a “big thing”).  My first task was to disable the track pad and enable center-button scrolling on my ThinkPad.  As it turns out, center-button scrolling is as easy as telling xserver to emulate a 3-button mouse.  Unfortunately, disabling the track pad was a bit more complex.  I discovered (to my surprise) that there is actually no way to do this in Linux itself.  I had to go into the BIOS and completely deactivate the device, just to keep it from randomly firing when I brushed it with my palm.
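For reference, the button emulation lives in the mouse InputDevice section of /etc/X11/xorg.conf.  The wheel-emulation options shown alongside it are the ones typically used for ThinkPad middle-button scrolling; the identifier and device path here are illustrative and will vary by system:

```
Section "InputDevice"
    Identifier  "Configured Mouse"
    Driver      "mouse"
    Option      "Device"             "/dev/input/mice"
    # Middle click via simultaneous left+right press:
    Option      "Emulate3Buttons"    "true"
    # Scroll by holding the middle button and moving the TrackPoint:
    Option      "EmulateWheel"       "true"
    Option      "EmulateWheelButton" "2"
EndSection
```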

Here I also started to look into things like fonts.  Unfortunately, even installing the Windows fonts package didn’t improve the situation.  Linux has an excellent renderer if all you want to display is Monospace 10pt (which it renders quite nicely), but anything else looks terrible at small sizes and “not quite relaxing” when larger.  Sadly, fonts are like the wind: sometimes pleasant, sometimes annoying as hell, always unfixable.

So I decided to set my sights on a more attainable goal and focused on the mouse acceleration curve.  Windows Vista has a very nice feature which adaptively adjusts the mouse curve based on speed, making the mouse more precise at low speeds and more responsive at high speeds.  This leads to a general feeling of greater ease in mousing.  You don’t really notice what a difference this makes until you try to wrestle the mouse from point A to point B on a system which doesn’t have such a nice curvature.

Anyway, adjusting the mouse curve seems like a pretty normal thing to want to do, right?  Linux is supposed to be an incredibly configurable operating system, so I figured I shouldn’t have a problem.  Well as it turns out, there is no way to do this in xserver.  I couldn’t find any tool, any how-to anywhere which gave instruction on how to rectify this glaring lack of control.  After wasting quite a bit of time Googling around and browsing “man xorg.conf”, I threw in the towel and decided to put up with the sore mouse-finger for a week.

Conclusion

Linux is great.  (what, not the conclusion you were expecting?)  I absolutely love Linux’s terminal, startup times (roughly 3x faster than Vista) and file systems (with ReiserFS managing /home, my perceived FS speeds were in the range of 2x better than NTFS on the same drive).  Unfortunately, Linux just doesn’t have what it takes to be a desktop operating system for the average user.  What I mean by this is it just takes too much manual tweaking and fussing to get things to work right.  As a developer, I may have the ability to fix little glitches that arise in my environment, but that doesn’t mean I have the time or inclination.  I want something that works…now; and I don’t want to lose hair over whether or not the graphical environment is even going to start on the next boot.  I guess it’s back to Vista for me!  *sigh*

Apple Blows Another Great Opportunity

27 Oct 2007

I hate to be yet another blogger taking a potshot at Apple in the wake of the Leopard release, but I just have to say it: Apple, WTF are you thinking?!  There, I said it, now we can be rational about things.

For those of you living in very cramped fox-holes for the past two years, MacOS X 10.5 (Leopard) is Apple’s latest incarnation of the cult-classic OS, MacOS X.  It’s got multiple workspaces, file system versioning, read-only ZFS support and eye-twisting shadows which make your desktop look about half a mile thick.  It’s got a totally redesigned Finder (which coincidentally looks just like iTunes) and added eye candy for both the Dock and the menu bar.  What it doesn’t have is Java 6.

Sun released Java 6 back in what, last November?  Apple’s had quite a while to get their act in gear and bring the latest major release to the table.  In fact, they’ve had even longer than a year, since Java 6 was in open development long before its release.  Apple did release a few developer previews of Java 6 to ADC members, but they discontinued the practice several months ago and haven’t made anything available since.  It’s not as bad as all that, though: the preview releases weren’t much more than a renamed Java 5 with a few new generic APIs.  Either way, Apple really has no excuse for not having Java 6 ready at least to coincide with the latest version of its OS, if not sooner.

To be totally honest, I don’t see how Apple is even justifying this decision to itself and its stockholders.  Consider how many Java developers have switched to Macintosh over the last few years.  I can count on one hand the number of developers I know and respect who still use Windows or Linux as their primary development machine.  The shift that has taken place in the market is startling, partially driven by Rails’s major push of TextMate and the waves it caused throughout the rest of the development community, but also by the fact that MacOS X really is a very slick, very stable BSD incarnation which can run smoothly as a desktop.  Well, that and the fact that the Apple hardware just looks so cool.

The thing is, all of these Java developers who’ve switched to Mac recently are going to start second-guessing that decision.  Java 6 is now a year overdue for the Mac platform, and Apple is giving no indication of rectifying the situation any time soon.  What’s worse, the version of Java 5 which does come pre-installed on Leopard seems buggy and unstable (disclaimer: I haven’t actually tested this myself, I just have it on good authority).  Without a modern, stable Java, many developers will be simply unable to use the platform as their primary system.  And guess where these developers will turn?  Either to Linux and all the headaches thereof, or back into Microsoft’s waiting (and well-patented) arms.  Is Apple really so big that it can just give the finger to such a large market segment?

Consider too, what this is going to mean for the future of the Mac platform.  In the last couple of years, we’ve seen a vast increase in the quality and quantity of applications available for Macintosh.  I don’t think it’s a coincidence that this has correlated directly with the up-surge in developers switching their primary platform to OS X.  Think about it: developers who use a certain platform are going to write software with that platform in mind.  It’s only natural.  With more and more developers focusing on Macintosh, the quality of applications for the platform increases, as well as the number of new projects focusing exclusively or primarily on the platform for final deployment.  In short, it’s exactly what Apple needs to make the platform a dominant player in the market 5 years from now.  By flipping off the developers, Apple is basically saying “Yes, we know you want to write state-of-the-art applications that run exclusively on our platform, bringing more customers to our outlet stores, but the fact is that we don’t want you writing applications for our platform.  Have you heard of Linux?”

Now I know that speaking out against Apple is like blaspheming a divinity; people have been stoned for less.  But it still needs to be said.  For the record, I like Macintosh.  I like the Apple products, and I’ve always loved the Mac OS (ever since my first computer running OS 7).  That said, I have never liked Apple as a company, and this latest fiasco is reminding me why not.  Hopefully Apple will see the error of their ways and offer Java 6 as an update sooner rather than later.  And if not, there’s always Windows!

How to Reimage Your Drive and Lose Your OS

22 Oct 2007

…in five easy steps.

So basically, I royally screwed my computer this weekend. For those of you who didn’t know, Ubuntu 7.10 (Gutsy Gibbon) came out last Thursday, and I decided that I just had to try it. I had just written an impassioned review of why Linux isn’t the superior OS for developers, but I figured that maybe this time it would be different. A lot of my headaches with Linux stem from the fact that you have to spend so much time mucking about in its uncharted guts just to get your mouse (insert peripheral here) working. In short, it’s not a very user-friendly OS. And even as a developer, I want an OS that stays out of my way and does what I ask, when I ask it.

Naturally, I remembered how much pain I experienced the last time I tried to wipe my OS and replace it with Linux. I was back on XP then, and it took a full three days to get everything back to productive levels. Even then, things were never the same. Since I try to never repeat the same mistake twice (leaving more time for new and exciting mistakes) I decided to take some precautions first.

Linux Drive Imaging 101

So for those of you who didn’t know, it’s possible to save the data on a hard drive bit-by-bit into a file on another machine or hard drive. Once this file is in place, it’s also possible to restore the data on the drive bit-by-bit (otherwise the process wouldn’t be very useful) and achieve exactly the same state as immediately prior to sucking the image off of the drive. Colleges use this technique often for lab machines, allowing them to be altered at will without affecting the next student’s work. There are other applications, but it’s a long and dull list which would really detract from the focus of this post.

Being an enterprising computer user savvy in various forms of Linux voodoo, I decided to attack this problem myself. Now, I’ve actually built software to accomplish just this task in the past, so I figured it would be a cakewalk. After about 5 seconds of deliberation, I settled on the GNU utility dd as the foundation for my scheme. dd is naturally cheaper than commercial offerings like Ghost or Altiris, and it also has the significant advantage of being installed on just about every Linux system, including live CDs such as Knoppix or the Ubuntu installer. This will work to our advantage.

dd is just about the simplest command in the GNU suite. All it does is read one file bit-by-bit and spit the output into another file. If the input file is unspecified, stdin is used; likewise, stdout stands in for an unspecified output file. Thus, dd can be useful when chained in an array of pipes. In its simplest form, dd just passes bits from stdin to stdout, functioning as a pipe:

cat file.txt | dd > copy.txt   # how useless can we get?

Of course, what we’re trying to do here is actually get data off of a hard drive and squirt it into a file. Does this actually help us? Linux 101: everything is a file. And when I say everything, I mean disks, peripherals, PCI devices, network devices, everything. This means that if we really wanted to, we could use dd to copy the entire contents of our hard drive and send it to our printer! (though I’m not entirely sure why we would want to) More usefully, this means that it’s trivial to copy data from a drive into a file since the drive is really a file to begin with. And what is dd but a fancy way to copy bits from one file to another?

dd if=/dev/sda of=my-drive.img    # image the first sata/scsi drive into "my-drive.img"

Grand! The problem here is that when the smoke clears several hours later (dd is unfortunately quite slow), we will have an image file exactly the size of the hard drive itself. (Oh, caveat time: you should not try to do this while the computer is booted from the drive in question. Such a command should be run from a live CD onto an external drive or a dual-boot OS.) The reason this file is so huge is that data is represented exactly the same both on the disk and in the image file, free space included. Every bit of the drive is copied, whether it belongs to a file or not, and the whole thing is then packed onto another, possibly completely dissimilar file system. Naturally there’s going to be quite a bit of inefficiency.

The nice thing about most file systems is when you read the data bit-by-bit from the drive, you will notice large blocks of repeated values or patterns. Anyone who knows anything about compression will tell you that this is a sure sign of high compressibility. Sure enough, if you pipe dd through bzip2, the resultant file will be almost exactly the size of the data on the disk. So even if you have a 250 GB hard drive, if you’ve only used 30 GB, the resulting .img.bz2 will only be 30 GB (more or less). This is really nice; and while not as efficient as some systems like Ghost, this image size should do nicely for our needs. The problem here is that the bzip compression algorithm is insanely slow on compression. It’s quite fast to decompress a file, but its use would extend the imaging process for a 60 GB drive from roughly 4 hours to well over 12.
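The compressibility claim is easy to demonstrate on a small scale with a zero-filled file standing in for empty disk space (file names here are arbitrary):

```shell
# 64 MB of zeros -- the best case for any run-friendly compressor --
# shrinks to a few kilobytes under bzip2:
dd if=/dev/zero of=empty-space bs=1M count=64 2>/dev/null
dd if=empty-space 2>/dev/null | bzip2 > empty-space.bz2
ls -l empty-space empty-space.bz2
```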

A good compromise in this situation is to use gzip instead. While nowhere near as efficient a compression algorithm as bzip2, gzip is much faster at compressing (and it decompresses quickly as well). The trade-off shows in the image size: 30 GB of data on a 60 GB disk will compress down to a roughly 45 GB image file using gzip. That’s 15 GB more than bzip2, but well worth it in terms of compression speed in my book.

Using the magic of bash command piping, we can accomplish the drive imaging in a single command:

dd if=/dev/sda | gzip > image-`date +%m%d%y`.img.gz

This will produce a file with a name like “image-101907.img.gz”, depending on the current date. To push the image back onto the drive, we use a little more bash piping:

zcat image-*.img.gz | dd of=/dev/sda
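Before pointing either command at /dev/sda, the whole round trip can be rehearsed safely on an ordinary file standing in for the drive (all file names here are stand-ins):

```shell
# Create a 4 MB fake "drive", image it through gzip, restore it,
# and confirm the round trip is bit-for-bit lossless:
dd if=/dev/urandom of=fake-drive bs=1M count=4 2>/dev/null
dd if=fake-drive 2>/dev/null | gzip > fake-image-`date +%m%d%y`.img.gz
zcat fake-image-*.img.gz | dd of=restored-drive 2>/dev/null
cmp fake-drive restored-drive && echo "round trip OK"
```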

All I needed to do was use these two commands, imaging onto an NFS share on my server and I could backup my full OS image at will. It’s incredibly simple, and simple plans always work…right?

Take Warning Dear Friends

Unfortunately (for me), I didn’t account for the fact that hard drives inevitably fail, usually at the worst possible moment. I imaged my Windows Vista installation, then proceeded to install Ubuntu 7.10. Everything went smoothly in the install, but when it came to the tasks I perform on a daily basis, the stability just wasn’t up to snuff. After about six hours of fiddling with settings, kernel recompilations and undocumented package requirements (hint, if Compiz doesn’t work with your ATI card out of the box, manually apt-get install the xserver-xgl package) I decided to revert back to Windows. The frustration just wasn’t worth it (more about this in a future post).

So I booted my computer back into the live CD, NFS mounted the server, imaged the Linux install (just in case) and started the reimaging process only to discover that the server’s 1 TB hard drive had corrupted the image file. I was quite literally horrified. Fortunately I had anticipated something like this when creating the image, so I had cp‘d the image file over onto a separate RAID array prior to overwriting the drive. I juggled the NFS mounts and tried imaging from that file only to discover that it was incomplete! It seems that the image file had been corrupted on the disk as it was created, meaning that it wasn’t a valid image file when I copied it to the separate drive.

Needless to say, I was quite upset. I don’t even have a DVD copy of Windows Vista (it was preinstalled on my computer), so I have to shell out $15 and wait a week for Microsoft to ship me a replacement disk. In the meantime, I can’t just do without my computer, so I fired up dd again and pushed the Linux image back onto my drive. Of course, not being the OS I really wanted, it worked perfectly…

All of my data was backed up, so I haven’t really lost anything but the time and frustration of having to do a clean reinstall of Windows on my system (a significantly more challenging task than an install of Linux). The moral of the story is: if you have data which you’re really going to need, even if it’s only a few hours down the road, make absolutely sure that it’s on a stable drive. What’s more, if you do store it redundantly (as I attempted to), compare checksums to make sure the redundant copy actually worked. If verifying the correctness and integrity of your critical backups takes you only a few minutes, little red flags should go up in your mind. In short, don’t try this at home – I wish that I hadn’t.
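That checksum comparison is cheap insurance. A minimal sketch with stand-in file names:

```shell
# A stand-in backup image and its redundant copy:
dd if=/dev/urandom of=backup.img bs=1M count=4 2>/dev/null
cp backup.img backup-copy.img

# The two sums must be identical before the copy can be trusted:
md5sum backup.img backup-copy.img
cmp backup.img backup-copy.img && echo "redundant copy verified"
```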

Custom Data Types in ActiveObjects

17 Oct 2007

ORMs really interest me, so naturally I read a lot of material regarding ORMs of all kinds, especially Hibernate and ActiveRecord.  One of the more interesting reviews I read recently complained about the rigidity of the type system in the Rails ORM.  According to the author’s examination of the code, ActiveRecord just uses a monolithic switch/case statement to determine the appropriate Ruby type from the SQL type in the result set.  This may make sense from a simplicity standpoint, but it may not be the best approach when it comes to flexibility.

The problem with this approach is that it’s impossible to easily add new types to the ORM.  Granted, the framework authors could do it by modifying the switch/case statement(s) – and the approach does usually require more than one statement – and releasing a whole new version of the framework.  This is not a significant issue as the framework authors already have access to the full library sources.  The real trial is with third-party developers who require custom data types.

An alternative approach (suggested in the article) is to implement a series of type delegates inheriting from a common superclass, or possibly using a mixin as allowed by Ruby.  These type classes would each be responsible for a single type, handling the mapping both to and from the language-native type to the database type.  This would allow for both easy addition and modification of core types by the framework authors, but also trivial support for arbitrary types as implemented by third-party developers.

Not one to shirk good advice when I hear it, I’ve decided to go with this approach to types in ActiveObjects.  Formerly, I must admit, I had gone with the multiple giant switch/case statements.  This seemed to make sense when I first implemented the framework, but as it developed, it became apparent that this was inadequate, especially if third-party types are desired.  This decision led to the refactoring of the type system and the subsequent creation of the TypeManager class.

TypeManager is basically the singleton manager for the entire type system.  It maintains the list of available DatabaseType(s) and can resolve both Java classes and SQL types to the appropriate delegate.  A number of core types (VarcharType, IntegerType, etc) are added to the singleton instance of TypeManager, ensuring that basic functionality works without any extra effort on the part of the developer.  If a type other than the core types is needed, all that is necessary is to add the type delegate instance to the TypeManager prior to the type’s usage in either migrations or data access.  Thusly:

public interface Company extends Entity {
    public String getName();
    public void setName(String name);
 
    public Class<?> getJavaType();
    public void setJavaType(Class<?> type);
}
 
public class ClassType extends DatabaseType<Class<?>> {
 
    public ClassType() {
        super(Types.VARCHAR, 255, Class.class);
    }
 
    @Override
    public Class<?> convert(EntityManager manager, ResultSet res, 
                Class<? extends Class<?>> type, String field) throws SQLException {
        try {
            return Class.forName(res.getString(field));
        } catch (Throwable t) {
            return null;
        }
    }
 
    @Override
    public void putToDatabase(int index, PreparedStatement stmt, 
                Class<?> value) throws SQLException {
        stmt.setString(index, value.getName());
    }
 
    @Override
    public Object defaultParseValue(String value) {
        try {
            return Class.forName(value);
        } catch (Throwable t) {
            return null;
        }
    }
 
    @Override
    public String valueToString(Object value) {
        if (value instanceof Class) {
            return ((Class<?>) value).getName();
        }
 
        return super.valueToString(value);
    }
 
    @Override
    public String getDefaultName() {
        return "VARCHAR";
    }
}
 
// ...
TypeManager.getInstance().addType(new ClassType());
 
Company[] stringCompanies = manager.find(Company.class, "javaType = ?", String.class);
for (Company c : stringCompanies) {
    System.out.println(c.getName() + " formerly held type " + c.getJavaType().getName());
 
    c.setJavaType(Exception.class);
    c.save();
}

The most complicated bit of the example above is the database type itself.  Yet even this delegate isn’t too horrible.  The ClassType class first specifies in its constructor which types it corresponds to, both database and Java.  Multiple Java class types can be specified, allowing for cases like IntegerType which maps to both Integer.class and int.class.

The rest of the database type is fairly self-explanatory.  There are methods to read the Java value out of a JDBC ResultSet and to put the Java value back into a JDBC PreparedStatement, as well as three methods to handle database-independent operations, such as parsing a String value into a type-specific value and vice versa.  These database-agnostic conversions are required for things like parsing the value of a @Default or an @OnUpdate annotation.  Finally, getDefaultName() allows the default DDL rendering of the type to be specified.  This can be overridden in the DatabaseProvider implementation for a particular database, but the use of getDefaultName() allows for third-party types that the database provider developers may not have foreseen.  Thus, it effectively opens the door to third-party types in migrations.

Of course, no example would be complete without another one to complement it!  Here’s how we could create a type delegate for the java.awt.Point class:

public class PointType extends DatabaseType<Point> {
    private static final Pattern PATTERN = Pattern.compile("x=(\\d+),y=(\\d+)");
 
    protected PointType() {
        super(Types.VARCHAR, 45, Point.class);
    }
 
    @Override
    public Point convert(EntityManager manager, ResultSet res, 
            Class<? extends Point> type, String field) throws SQLException {
        return (Point) defaultParseValue(res.getString(field));
    }
 
    @Override
    public void putToDatabase(int index, PreparedStatement stmt, Point value) 
                throws SQLException {
        stmt.setString(index, valueToString(value));
    }
 
    @Override
    public Object defaultParseValue(String value) {
        Point back = null;
        Matcher matcher = PATTERN.matcher(value);
 
        if (matcher.find()) {
            back = new Point();
            back.x = Integer.parseInt(matcher.group(1));
            back.y = Integer.parseInt(matcher.group(2));
        }
 
        return back;
    }
 
    @Override
    public String getDefaultName() {
        return "VARCHAR";
    }
}

One thing of note here which has changed from the previous example of ClassType is that the second parameter to the super constructor is now 45, instead of 255.  This parameter is actually the default precision of the SQL type when rendered into the database.  If the SQL type doesn’t have a precision or should just take the database default, a negative value should be specified for this parameter.  Another item of note is that we’re delegating work between methods in a way that I simply didn’t do for ClassType.  Because the rendering of the type in the database is in VARCHAR (String) form, we can rely upon our default String conversion methods to render into the database.  As an aside, the superclass implementation of valueToString(Object) uses the toString() method for that particular value.

As you can see, the type system in ActiveObjects is incredibly powerful and capable of satisfying many use-cases that were impossible in previous versions or in other ORMs.  Hopefully this brief glimpse into advanced uses of the type system will aid you in your databasing efforts.

Custom Primary Keys with ActiveObjects

15 Oct 2007

One of the main complaints I’ve heard leveled against ActiveObjects is that it’s just not suitable for mapping to legacy schemas.  More generically, concerns have been mooted that it enforces naming conventions and field conventions which aren’t suitable/preferable for some projects.  I suppose at first both of these were true.  After all, ActiveObjects’s entire premise was convention over configuration, and this requires some restrictions by default.  However, I don’t think it’s entirely accurate any longer.

Over the last few months, I’ve added several features which satisfy three primary goals:

  • Customize the table name convention
  • Customize the field name convention
  • Allow for primary key fields (and types) other than id INTEGER

The first two goals were easily met through the addition of TableNameConverter and FieldNameConverter.  These two classes are used by every feature within ActiveObjects, from migrations to simple data access, to determine the database table and field names from the class and method names respectively.  The canonical example of this is table name pluralization, which can be accomplished in the following way:

EntityManager manager = new EntityManager(
    "jdbc:mysql://localhost/test", "username", "secret");
manager.setTableNameConverter(new PluralizedNameConverter());

Not too horrible.  The second use-case is assigning a different field name convention than the default camelCase.  For example, some people really like the ActiveRecord (Rails) field naming convention (e.g. “first_name” as opposed to “firstName”).  This can easily be accomplished by specifying a field name converter:

EntityManager manager = new EntityManager(
    "jdbc:mysql://localhost/test", "username", "secret");
 
// lower_case convention
manager.setFieldNameConverter(new UnderscoreFieldNameConverter(false));

Custom table and field name converters are also possible, allowing for a great deal of flexibility in naming conventions.  Additionally, it’s always possible to specify field and table names directly in the entities, using the @Accessor and @Mutator annotations for fields and the @Table annotation for tables.
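To illustrate what writing a custom converter might look like, here is a minimal sketch.  Note that the single-method interface below is an assumed shape, not necessarily the exact TableNameConverter signature in ActiveObjects:

```java
// Assumed single-method shape; the real TableNameConverter interface in
// ActiveObjects may differ in name and signature.
interface TableNameConverter {
    String getName(Class<?> entityType);
}

// A custom convention: CamelCase class names become lower_case table names,
// e.g. PersonAddress -> person_address.
class UnderscoreTableNameConverter implements TableNameConverter {
    public String getName(Class<?> entityType) {
        String simple = entityType.getSimpleName();
        StringBuilder back = new StringBuilder();

        for (int i = 0; i < simple.length(); i++) {
            char c = simple.charAt(i);

            // insert an underscore before every uppercase letter
            // except the first character
            if (Character.isUpperCase(c) && i > 0) {
                back.append('_');
            }
            back.append(Character.toLowerCase(c));
        }

        return back.toString();
    }
}
```

Assuming the interface shape holds, this would be wired in the same way as the built-in converters, via manager.setTableNameConverter(new UnderscoreTableNameConverter()).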

Custom Primary Keys

The most challenging goal (from a library standpoint) was to allow for primary key fields other than “id”.  This is partially such a challenge because it had been hard coded literally everywhere in ActiveObjects that the “id” field is the field to use in any sort of SELECT, JOIN, INSERT, UPDATE, etc.  In short, changing this required finding all of these instances and converting the code to query a centralized source for the data.  A few days of fiddling with Eclipse’s text search accomplished this without inordinate pain, but the hard part was yet to come.

The question remained: how to specify the primary key within the entity itself?  After all, it’s been hard coded and sort of magically “worked” based on the method definition in the Entity superinterface.  There had been a syntax to specify a second PRIMARY KEY for the schema migration, but ActiveObjects didn’t treat these fields any differently, and this sort of syntax wouldn’t really cut it if we were trying to completely override the existing getID() method in the superinterface.

The solution was to refactor all of the interesting functionality in Entity up into a super-superinterface, RawEntity.  Thus the only method defined within Entity would be getID(), annotated appropriately to be recognized as a PRIMARY KEY field.  This does away with all the magic tricks under the surface which assumed the existence of the getID() method.  ActiveObjects can easily parse the class to find the PRIMARY KEY field amongst the methods, both defined and inherited.  The only compromise which must be made is that only one PRIMARY KEY is now allowed per table.  This isn’t such an issue, since 99% of the time, that’s all you need anyway.  Usually that remaining 1% can be more properly accomplished using UNIQUE and some sort of auto-generation of values.
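The annotation scan described above can be sketched with plain reflection.  The @PrimaryKey annotation and the interfaces below are simplified stand-ins for the actual ActiveObjects types, but the mechanism — walking both defined and inherited methods looking for the marker annotation — is the same idea:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

// Simplified stand-ins for the ActiveObjects annotation and interfaces.
@Retention(RetentionPolicy.RUNTIME)
@interface PrimaryKey {}

interface RawEntity<K> {}

// Mirrors the Entity superinterface: getID() carries the annotation.
interface Entity extends RawEntity<Integer> {
    @PrimaryKey
    int getID();
}

// An entity which simply inherits its primary key from Entity.
interface Person extends Entity {
    String getFirstName();
}

class KeyFinder {
    // getMethods() returns public methods both defined and inherited,
    // so the scan finds getID() even though Person never declares it.
    static Method findPrimaryKey(Class<? extends RawEntity<?>> type) {
        for (Method m : type.getMethods()) {
            if (m.isAnnotationPresent(PrimaryKey.class)) {
                return m;
            }
        }
        return null;
    }
}
```

For Person, this scan would turn up the inherited getID() method; for an entity extending RawEntity directly, it would find whichever method the developer annotated.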

Since we’ve refactored interesting functionality up into RawEntity and kept getID() within Entity, no legacy code needs to be changed.  Any entities previously written against ActiveObjects will run without modification or any behavior changes.  We are merely allowed the flexibility of specifying our own primary keys.  So, without further ado, the obligatory example:

public interface Person extends Entity {
    public String getFirstName();
    public void setFirstName(String firstName);
 
    public String getLastName();
    public void setLastName(String lastName);
 
    public Company getCompany();
    public void setCompany(Company company);
 
    public House getHome();
    public void setHome(House home);
}
 
public interface Company extends RawEntity<String> {
 
    @PrimaryKey
    @NotNull
    @Generator(UUIDValueGenerator.class)
    public String getCompanyKey();
 
    public String getName();
    public void setName(String name);
 
    @OneToMany
    public Person[] getEmployees();
}
 
public interface House extends RawEntity<Integer> {
 
    @PrimaryKey
    @NotNull
    @AutoIncrement
    public int getHouseID();
 
    // ...
 
    @OneToMany
    public Person[] getOccupants();
}
 
public class UUIDValueGenerator implements ValueGenerator<String> {
    public String generateValue(EntityManager em) {
        // generate a random UUID using the standard library
        return java.util.UUID.randomUUID().toString();
    }
}
 
// ...
Person p = manager.get(Person.class, 1);
Company c = manager.get(Company.class, "abff999dd99ddf0a225f");

Maybe a bit longer of an example than you were expecting, but it does cover the material well.  What’s happening here is the Person entity has a standard, “id” primary key.  This follows the same convention that ActiveObjects has been enforcing since the beginning of time (or at least since I started the project).  Company and House are the interesting entities here.

House defines a getHouseID() method of type int which is marked as a PRIMARY KEY as well as being auto-incremented by the database (SERIAL on PostgreSQL, AUTO_INCREMENT on MySQL, etc).  This is the same sort of declaration that you would find if you looked in the source for Entity.  The difference is that House will not contain the “id” field and its PRIMARY KEY will be “houseID”.  The really interesting entity here is Company.

Company defines a primary key that is not only a different field, but also an entirely different type.  Also, its value is generated automatically not by the database, but by the application itself.  This is a fairly common use-case in those crazy databases which use UUIDs as primary keys.  Not only does this field define “companyKey” as a different type than INTEGER, but it also ensures that the “companyID” FOREIGN KEY field in the “person” table is also of type VARCHAR.

Another item of note in this example is that the RawEntity interface is parameterized.  This is to allow the get(...) method in EntityManager to stay type-checked, ensuring that the values passed are actually valid primary key values for the entity in question.  Of course, there’s nothing that can be done to ensure that the actual method definition of the primary key is of the proper type.  However, at some point the developer must be trusted to make sure their entity model doesn’t violate the dictates of logic.
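A rough sketch of how the parameterization buys compile-time checking follows.  StubEntityManager is a hypothetical stand-in for the real EntityManager, whose get(...) signature may differ in detail, but the principle is the same: the key type K is inferred from the entity type, so a mismatched key fails to compile:

```java
interface RawEntity<K> {}

// The key type is pinned by the type parameter on RawEntity.
interface House extends RawEntity<Integer> {}
interface Company extends RawEntity<String> {}

// Hypothetical stub, not the real ActiveObjects EntityManager.
class StubEntityManager {
    // K is inferred from T's bound, tying the key argument to the entity.
    public <T extends RawEntity<K>, K> T get(Class<T> type, K key) {
        // a real implementation would issue a SELECT here
        return null;
    }
}
```

Given these definitions, get(Company.class, "abff999dd99ddf0a225f") compiles, while get(Company.class, 42) is rejected by the compiler, since Company pins K to String.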

Conclusion

With this latest addition to the ActiveObjects feature set, it should be possible to use the ORM with any schema whatsoever.  While AO may still be an implementation of the active record pattern, and thus less powerful than solutions such as Hibernate, there should be no problems applying AO to just about any sane use-case.