
JRuby Interop DSL in Scala

24
Mar
2008

JRuby is an amazing bit of programming.  It has managed to rise from its humble beginnings as a hobby project on SourceForge to the most viable third-party Ruby implementation currently available.  As far as I am aware, JRuby is the only Ruby implementation other than MRI which is capable of running an unmodified Rails application.  But JRuby’s innovation is not limited to a rock-solid Ruby interpreter; it also provides tight integration between Java and Ruby.

There’s a lot of material out there on how to replace Java with Ruby “glue code” in your application.  The so-called “polyglot programming” technique states that we should embrace multiplicity of language in our applications.  Java may be very suitable for the core business logic of the application, but for actually driving the frontend UI, we may want to use something more expressive (like Ruby).  JRuby provides some powerful constructs which allow access to Java classes from within any Ruby application.  For example:

require 'java'
 
JFrame = javax.swing.JFrame
JButton = javax.swing.JButton
JLabel = javax.swing.JLabel
 
BorderLayout = java.awt.BorderLayout
 
class MainWindow < JFrame
  def initialize
    super 'My Test Window'
 
    setSize(300, 200)
    setDefaultCloseOperation EXIT_ON_CLOSE
 
    label = JLabel.new('You pushed the button', JLabel::CENTER)
    label.visible = false
    add label
 
    button = JButton.new 'Push Me'
    button.add_action_listener do
      label.visible = true
    end
    add(button, BorderLayout::SOUTH)
  end
end
 
window = MainWindow.new
window.visible = true


Not a terribly complex example, but it illustrates some of the major advantages of JRuby.  Notice how clean and concise this code is.  It wouldn’t have been much longer had I done this using Java, but it would certainly have been less readable.  Ruby is absolutely perfect for this sort of use case (driving a UI).

As I said though, there are a myriad of examples showing this sort of thing.  As such, it’s not a very interesting topic for a posting.  What the masses have failed to cover, however, is how to accomplish the opposite: calling from Java into Ruby.

Likely the reason this topic has received less attention is that Java is the language with the veritable zoo of libraries and frameworks.  The amount of effort and research that has been put into Java simply dwarfs the comparative immaturity of the Ruby offerings.  Given the disparity, why would you even want to call into Ruby from Java?  This conclusion seems logical until one remembers that almost any application which uses Ruby for the frontend must actually pass flow control to Ruby at some point.  This means calling some sort of Ruby code.

The Java Way

There is some information available on the JRuby Wiki.  The wiki article really should include the caveat that “some experimentation may be required.”  Sufficient information is available, but it is neither intuitive nor convenient.  From Java, the syntax for executing an arbitrary Ruby statement looks like this:

ScriptEngineManager m = new ScriptEngineManager();
ScriptEngine rubyEngine = m.getEngineByName("jruby");
ScriptContext context = rubyEngine.getContext();
 
context.setAttribute("label", new Integer(4), ScriptContext.ENGINE_SCOPE);
 
try {
    rubyEngine.eval("puts 2 + $label", context);
} catch (ScriptException e) {
    e.printStackTrace();
}

It’s a typical Java API: over-bloated, over-designed and over-generic.  What would be really nice is to have a syntax for accessing Ruby objects that is as seamless as accessing Java from Ruby.  I want to be able to call Ruby methods and use Ruby classes with the same ease that I can use Java methods and classes.  In short, I want an internal DSL for Ruby.

Unfortunately, Java is a bit constrained in this regard.  Java’s syntax is extremely rigid and does not lend itself well to DSL construction.  It’s certainly possible, but the result is usually less than satisfactory.  We could certainly construct an API around the Java Scripting API (JSR-223) which provides more high-level access (such as direct method calls and object wrappers), but it would be clunky and only a marginal improvement over the original.

The good news is there’s another language tightly integrated with Java that has a far more flexible syntax.  Rather than building our JSR-223 wrapper in Java, we can avail ourselves of Scala’s power and flexibility, hopefully arriving at a DSL which approaches native “feel” in its syntax.

The Scala Way

Since we’re attempting to construct a tightly-integrated API for language calls, the most effective route would be to apply techniques already discussed in the context of DSL design.  As always, we start with the syntax and allow it to drive the implementation:

// syntax.scala
import com.codecommit.scalaruby._
 
object Main extends Application with JRuby {
  require("test")
 
  associate('Person)(new Person(""))
 
  println("Received from multiply: " + 'multiply(123, 23))
  println("Functional test: " + funcTest('test_string))
 
  val me = new Person("Daniel Spiewak")
  println("Name1: " + me.name)
  println("Name2: " + (me->'name)())
 
  me.name = "Daniel"
  println("New Name: " + me.name)
 
  println("Person#toString(): " + me)
 
  val otherPerson = 'create_person().asInstanceOf[AnyRef]
  println("create_person type: " + otherPerson.getClass())
  println("create_person value: " + otherPerson.send[String]("name")())
 
  eval("puts 'Ruby integration is amazing'")
 
  def funcTest(fun:(Any*)=>String) = fun()
}
 
class Person(name:String) extends RubyClass('Person, name) {
  def name = send[String]("name")()
  def name_=(n:String) = 'name = n
}

And the associated Ruby code:

# test.rb
class Person
  attr_reader :name
  attr_writer :name
 
  def initialize(name)
    @name = name
  end
 
  def to_s
    "Person: {name=#{name}}"
  end
end
 
def test_string
  'Daniel Spiewak'
end
 
def multiply(a, b)
  a * b
end
 
def create_person
  Person.new 'Test Person'
end

Obviously we’re going to need some heavy implicit type conversions.  The important thing to note is that we don’t see any residue of the Java Scripting API; it has all been encapsulated by our DSL.  We’ve taken an API which is oriented around single-call, low-level invocations and created a high-level wrapper framework which allows method calls, instantiation and even some form of type-checking.

Starting from the top, we see a call which should be familiar to Rubyists: the require statement.  In our framework, this method call is just a bit of syntactic sugar around a call to eval(String).  The semantics are basically the same as within Ruby directly, with the exception of how Ruby source files are resolved.  Any script file on the CLASSPATH is fair game, in addition to the normal Ruby locations.  This allows us to easily embed Ruby scripts within application JARs, libraries and other Java distributables.

Moving down a bit further, we find a somewhat mysterious call to the curried associate(Symbol)(RubyObject) method.  The purpose of this invocation will become more apparent later on.  Suffice it to say that this step is necessary to allow Scala class wrappers around existing Ruby classes.

On the next line of interest, we see for the first time how the framework allows for seamless Ruby method invocation.  Unlike Ruby, Scala doesn’t allow us to simply handle calls to non-existent methods.  Because of this limitation, we have to be a bit more clever in how we structure the syntax.  In this case, we use Scala symbols to represent the method.  There doesn’t seem to be a terribly good explanation of symbols in Scala, but there’s plenty of information regarding how they work in Ruby.  Since the concepts are virtually identical, techniques are cross-applicable.

The key to the whole “symbols as methods” idea is implicit type conversion.  The JRuby trait inherits a set of conversions which look something like this:

implicit def sym2Method[R](sym:Symbol):(Any*)=>R = send[R](sym2string(sym))
implicit def sym2MethodAssign[R](sym:Symbol) = new SpecialMethodAssign[R](sym)
 
private[scalaruby] class SpecialMethodAssign[R](sym:Symbol) {
  def intern_=(param:Any) = new RubyMethod[R](str2sym(sym2string(sym) + "="))(param)
}
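
To see the trick in isolation, here is a minimal, self-contained sketch that does not touch JRuby at all.  PendingCall and sym2Call are hypothetical names made up for this illustration (they are not the framework's classes), and the "call" merely renders the Ruby text it would run.  Symbol("multiply") is used in place of the 'multiply literal because newer Scala versions have deprecated symbol literals.

```scala
import scala.language.implicitConversions

object SymbolCallSketch {
  // Hypothetical stand-in for RubyMethod: instead of evaluating anything,
  // it only renders the Ruby call it would make.
  class PendingCall(name: String) {
    def apply(params: Any*): String = name + "(" + params.mkString(", ") + ")"
  }

  // The implicit view: when a Symbol is used like a function, the compiler
  // inserts this conversion and then calls apply on the result.
  implicit def sym2Call(sym: Symbol): PendingCall = new PendingCall(sym.name)

  // Symbol has no apply method of its own, so this line compiles only
  // because of the conversion above.
  val rendered: String = Symbol("multiply")(123, 23)

  // The converted value is an ordinary object, so it can be handed to
  // another method and invoked later.
  def invokeLater(call: PendingCall): String = call()
  val deferred: String = invokeLater(Symbol("test_string"))
}
```

The second half mirrors the funcTest call in the syntax example: the symbol is converted at the point where a callable is expected, and nothing is invoked until the receiver applies it.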

Though we haven’t looked at it yet, it is possible to infer the purpose of the send(String) method.  Its function is to prepare a call to a Ruby method without actually invoking it.  This distinction allows us to pass Ruby methods around as method parameters, just like standard Scala methods.  The method returned is actually an instance of class RubyMethod[R] (where R is the return type).  Scala allows classes to extend function types, allowing us to redefine the method invocation semantics for wrapped Ruby calls.

class RubyMethod[R](method:Symbol) extends ((Any*)=>R) {
  import JRuby.engine
 
  override def apply(params:Any*) = call(params.toArray)
 
  private[scalaruby] def call(params:Array[Any]):R = {
    val context = engine.getContext()
    val plist = new Array[String](params.length)
 
    for (i <- 0 until params.length) {
      plist(i) = "res" + i
      context.setAttribute(plist(i), JRuby.resolveValue(params(i)), ScriptContext.ENGINE_SCOPE)
      plist(i) = "$" + plist(i)
    }
 
    evaluate(() => if (plist.length > 0) {
      sym2string(method) + "(" + plist.reduceLeft[String](_ + ", " + _) + ")"
    } else {
      sym2string(method) + "()"
    })
  }
 
  protected def evaluate(invoke:()=>String):R = {
    val toRun = invoke()
    Logger.getLogger("com.codecommit.scalaruby").info(toRun)
 
    JRuby.handleExcept(JRuby.wrapValue[R](engine, engine.eval(toRun, engine.getContext())))
  }
}

The gist of this code is simply to assign every parameter value to an attribute in the Ruby runtime.  Attributes of ENGINE_SCOPE (as defined by JSR-223) are represented as global variables within Ruby.  These variables are named sequentially starting from zero.  (e.g. $res0, $res1, …)  As you can imagine, this technique tends to be a bit of a concurrency killer.  To keep things simple, I decided to completely ignore the issues associated with asynchronous execution.  It is certainly possible to adapt the framework to function in a multi-threaded environment, but I didn’t bother to do it.  (one of the perks of blogging is a license to extreme laziness)

Once these parameters are assigned, the method call is evaluated within the context of the runtime.  This is done by literally generating the corresponding Ruby code (done in the anonymous method) and then wrapping the return value in an instance of RubyObject (if necessary).  Note that the send(String) method does not actually kick-start this invocation process at all.  Rather, it creates an instance of RubyMethod[R] which corresponds to the method name.  This class extends (Any*)=>R, so it may be used in the normal “method fashion” - by appending parentheses which enclose parameters (if any).
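
The code-generation step is easy to picture on its own.  The sketch below reproduces only the string-building part, with no script engine involved; CallTextSketch, attributeName and callText are names invented for this illustration.

```scala
object CallTextSketch {
  // Names the i-th argument the way the text describes: res0, res1, ...
  def attributeName(i: Int): String = "res" + i

  // Renders the Ruby source for calling `method` with `argCount` arguments,
  // each one read back from the global variable backing its attribute.
  def callText(method: String, argCount: Int): String = {
    val globals = (0 until argCount).map(i => "$" + attributeName(i))
    method + "(" + globals.mkString(", ") + ")"
  }
}
```

Under these assumptions, callText("multiply", 2) produces the text multiply($res0, $res1), and a call with no arguments produces create_person().  Since the global names depend only on argument position, two calls in flight at once would overwrite each other's values, which is the concurrency problem mentioned above.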

Supporting Cast

At this point, it’s worth taking a moment to examine the specifics of the framework class hierarchy.  A number of classes exist to wrap around Ruby objects and methods.  We’ve already seen a few of them (RubyMethod[R] and RubyObject), but it’s worth going into more detail as to their purpose and relation to one another. 

Note that these class names often conflict with existing classes in the JRuby implementation.  This odd coincidence is precipitated by the fact that the framework seems to deal with a lot of the same concepts as the JRuby runtime (go figure).  Rather than obfuscating my class naming to avoid conflict, I just assume that you will either make use of the enhanced Scala import feature (as I have in the implementation), or just avoid using the JRuby internal classes.

[Image: framework class hierarchy]

  • RubyObject - The root of the object hierarchy.  This abstract class is designed to encapsulate the core functionality of the generic object (roughly: send, -> and eval) as well as containing all of the implicit type conversions.  Most of the syntax-defining magic happens here (more on this later).
  • JRuby - This is the primary type interface between the developer and the framework.  Classes which wish to make use of Ruby integration must inherit from this trait.  This is where the Logger (for executed statements) is initialized and deactivated.  Within the corresponding object, all of the backend resources are managed.  This is where the actual ScriptEngineManager instance lives, as well as a set of utility methods to handle wrapping and unwrapping of framework-specific objects.
  • RubyWrapperObject - This implementation of RubyObject is designed to wrap around instances which already exist within the Ruby interpreter.  For example, if a Ruby method returns an instance of ActiveRecord::Base, it will be represented in Scala by a corresponding instance of RubyWrapperObject.  Note that objects which are equivalent in the Ruby interpreter are not guaranteed to be pointer-equivalent.  However, the equals(Object) method is well-defined within RubyObject, thus comparisons between RubyObject instances will return sane results.  The == method in Scala is defined in terms of equals(Object), so existing code will behave rationally.
  • RubyClass - With the exception of the JRuby trait, this is likely the only class within the framework which the developer will have to reference explicitly.  This class allows developer-defined Scala classes to wrap around existing Ruby classes, providing type-safe method calls and even extended functionality.  More on this feature later.
  • RubyMethod - We’ve already seen how this serves as a wrapper around calls to Ruby methods.  However, its default implementation assumes that the method is defined in the global namespace.  This is impractical for many method calls (such as dispatch on an object).
  • RubyInstanceMethod - This class solves the problem of object dispatch with RubyMethod.  All of the core functionality is identical to its superclass with the exception of the generated Ruby code.  Instead of just generating a method call passing parameters, this class will generate a method call on a given Ruby object.  Thus, this class depends upon RubyWrapperObject which maintains a reference to a corresponding Ruby instance.
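
The equality behavior described for RubyWrapperObject can be sketched as follows.  This is only an illustration of the idea: the rubyId field is a hypothetical stand-in for whatever the framework actually uses to identify the underlying Ruby object.

```scala
object WrapperEqualitySketch {
  // Hypothetical wrapper: two wrappers are equal when they refer to the
  // same underlying Ruby object, even if the wrappers themselves differ.
  class Wrapper(val rubyId: Long) {
    override def equals(other: Any): Boolean = other match {
      case w: Wrapper => w.rubyId == rubyId
      case _          => false
    }
    // Kept consistent with equals so wrappers behave in hash-based collections.
    override def hashCode: Int = rubyId.hashCode
  }

  val a = new Wrapper(42L)
  val b = new Wrapper(42L)

  val sameReference: Boolean = a eq b  // reference comparison
  val sameValue: Boolean = a == b      // delegates to equals
}
```

Here sameReference is false while sameValue is true, which is the distinction between pointer equivalence and equals(Object) drawn in the list above.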

Alternative Dispatch

Not every method call is made on the enclosing scope.  Sometimes it is necessary to call a method in an object to which you have a reference.  For example, a method may return an instance of some Ruby class.  This instance will be automatically wrapped by a Scala instance of RubyWrapperObject.  Since this Scala class doesn’t actually define any methods which correspond to the Ruby class, it is necessary to once more utilize the “symbols as methods” trick in method dispatch.  There are two ways to call a method on an object like this: the send[R](String) method (where R is the return type), and the -> (arrow) operator.

Using the arrow operator is a lot like normal method calls, except with symbols instead of method names.  Just like dispatch on the enclosing scope, the call is converted into an instance of RubyMethod (actually, an instance of RubyInstanceMethod) which can then be used as a standard Scala method.  The difference between using arrow and dispatching on the enclosing scope is that the syntax must be a little more contrived.

Parentheses have the second-highest priority of all the Scala operators (the dot operator (.) has the highest).  This means that if we simply “follow our nose” where the syntax is concerned, we will arrive at an order of invocation which leads to an undesirable result.  Consider the following sample:

val obj = 'create_person()
obj->'name()

The first call is a standard dispatch on the enclosing scope.  The second call is what is interesting to us.  Reading this line naturally (at least to old C/C++ programmers) we would arrive at the following sequence of events:

  1. Get a reference to the name method from the instance contained within obj
  2. Invoke the method, passing no parameters

Unfortunately, this is not how the compiler sees things.  Because parentheses bind tighter than the arrow operator, it actually resolves the expression in the following way:

  1. Get a reference to the name method contained within the enclosing scope
  2. Invoke the method, passing no parameters
  3. Invoke the -> method on the instance within obj, passing the result of name as a parameter

This is obviously not what we wanted.  Unfortunately, there’s no way to make the arrow operator bind tighter than parentheses.  This is a good thing from a language standpoint, but it causes problems for our syntax.

The solution is to enclose any “arrow dispatch” statement within parentheses so as to force the order of evaluation:

val obj = 'create_person()
(obj->'name)()

It looks a bit weird, but it’s the only way Scala will allow this to work.  This call now evaluates properly, calling the name method on the obj instance, passing no parameters.
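
The grouping can be checked without any Ruby involved.  In the sketch below, Wrapped is a hypothetical class whose -> method returns a deferred call (a plain Scala function) rather than a result, and Symbol("name") stands in for the 'name literal.

```scala
object ArrowPrecedenceSketch {
  // Hypothetical stand-in for a wrapped Ruby object.
  class Wrapped(label: String) {
    // Returns a deferred call instead of a result, like the arrow
    // operator described in the text.
    def ->(method: Symbol): () => String = () => label + "." + method.name
  }

  val obj = new Wrapped("obj")

  // The outer parentheses make the arrow expression a unit: the method is
  // selected first and the trailing () then invokes the returned function.
  val result: String = (obj -> Symbol("name"))()

  // Without the outer parentheses, the right operand would be grouped as
  // Symbol("name")() first, which is the mis-parse described above.
}
```

With these definitions, result holds the string obj.name, showing that the selection happened on obj before the invocation.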

There’s actually another problem associated with arrow dispatch in our DSL: Scala already has an implicit meaning for the arrow operator.  The following sample should look familiar to those of you who have worked with Scala in other applications:

val numbers = Map(1 -> "one", 2 -> "two", 3 -> "three")

By default, Scala defines the arrow operator as an alternative syntax for defining 2-tuples.  This is good for most things, but bad for us.  What we want is to define a new implicit type conversion which converts Any into a corresponding instance of RubyWrapperObject.  This would allow us to satisfy the syntax given above.  However, Scala’s 2-tuple syntax already defines an implicit type conversion for the Any type which deals with the arrow operator.  Rather than examining the context to attempt to disambiguate, the Scala compiler simply gives up and prints an error stating that the implicit type conversions are ambiguous.  This poses a bit of a problem and nearly killed the arrow operator idea in design.

The solution is actually to override Scala’s built-in conversion by defining our own conversion with the same name and signature but which provides us with the option of using our own arrow operator definition.  The behavior we want is to allow normal use of the arrow operator when dealing with Any -> Any, but convert to RubyWrapperObject and dispatch when dealing with Any -> Symbol.  After a little digging through the Scala standard library, I arrived at the following solution (defined in RubyObject):

implicit def any2ArrowAssoc[A](a:A) = new SpecialArrowAssoc(a)
 
private[scalaruby] class SpecialArrowAssoc[A](a:A) extends ArrowAssoc(a) {
  def ->(sym:Symbol) = (a match {
    case obj:RubyObject => obj
    case other => new RubyWrapperObject(other)
  })->sym
}

Notice that we extend Scala’s pre-existing ArrowAssoc[A] class (which handles the special 2-tuple syntax) and then overload the -> method to work differently with symbols.  This code now does precisely what we need.  By introducing this extra layer of indirection, as well as by overriding Scala’s existing conversion, we’re able to support the arrow syntax as shown in the above examples.

Sending Messages

There is one final form of dispatch which allows typed return values: send[R](String).  This is actually the method to which all the other dispatch forms delegate (as it is the most general).  This method is very similar to the Ruby send method which allows Smalltalk-style message passing on arbitrary objects.  The really important thing about this method though is that it will automatically cast the return value from the method to whatever type you specify, allowing you to define type-safe wrappers around existing Ruby methods in Scala:

def multiply(a:Int, b:Int) = send[Int]("multiply")(a, b)
 
val result:Int = multiply(123, 23)

send is effectively defined as a curried function since it takes a method name as a parameter and returns an instance of RubyMethod as a result.  This mimics the behavior of dispatch with symbol literals in that you can use send to generate type-safe partially-applied functions for corresponding Ruby methods.
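
A rough sketch of this two-step shape, with the Ruby runtime replaced by a lookup table so that it runs standalone.  PreparedCall and fakeRuntime are hypothetical names for this illustration; the real framework evaluates generated Ruby code instead.

```scala
object TypedSendSketch {
  // Hypothetical stand-in for the Ruby runtime: method names mapped to
  // plain Scala implementations.
  private val fakeRuntime: Map[String, Seq[Any] => Any] = Map(
    "multiply" -> ((args: Seq[Any]) =>
      args(0).asInstanceOf[Int] * args(1).asInstanceOf[Int]),
    "test_string" -> ((_: Seq[Any]) => "Daniel Spiewak")
  )

  // A prepared call: nothing runs until apply is invoked, at which point
  // the untyped result is cast to the requested return type.
  class PreparedCall[R](name: String) {
    def apply(params: Any*): R = fakeRuntime(name)(params).asInstanceOf[R]
  }

  // First step: choose the method name and the expected return type.
  def send[R](name: String): PreparedCall[R] = new PreparedCall[R](name)

  // Typed wrapper in the style of the multiply example above.
  def multiply(a: Int, b: Int): Int = send[Int]("multiply")(a, b)

  // The intermediate value can be held on to and applied later.
  val greeter: PreparedCall[String] = send[String]("test_string")
}
```

Note that the cast is unchecked: asking for send[Int] on a method that actually returns a string would only fail when the value is used, so the "type safety" is a promise made by the wrapper's author rather than something the compiler verifies against Ruby.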

Note that send could just as easily have taken a symbol as a parameter, rather than a string.  However, the metaphor throughout the DSL is “symbols as methods”, thus string was used to avoid logical conflict.  Scala itself was perfectly happy passing symbol literals around in addition to treating them as methods.

Class Wrapping

The final bit of code in the example now so far above us serves as a sample of how one might wrap an existing Ruby class within Scala.  Person is actually a class defined in Ruby (as you can see from the Ruby sources).  It has a read/write attribute, name, as well as an overridden to_s method.  RubyObject already contains the logic for handling calls to toString() and proxying them to Ruby’s to_s, but the name attribute must be handled explicitly in code.

The goal is basically to provide a type-safe wrapper around the Person Ruby class.  We could just as easily dispatch on the automatically wrapped instance of RubyWrapperObject using either syntax described above, but an explicit wrapper is a bit nicer.  The compiler can check things for us, and we can even add methods to the class (at least, as far as Scala is concerned) in true Ruby “open class” style.  All that is necessary to accomplish this wrapper is to extend RubyClass and to define the delegating wrapper methods:

class Person(name:String) extends RubyClass('Person, name) {
  def name = send[String]("name")()
  def name_=(n:String) = 'name = n
}

We specify which Ruby class we are wrapping as the first parameter in the constructor for RubyClass.  The parameters which follow are passed directly to the constructor of the corresponding Ruby class.  This Ruby constructor is invoked automatically, instantiating the corresponding wrapped Ruby object in the background.  Notice that we specify the name of the Ruby class using a symbol.  This is the one place in the framework that we break with the “symbols as methods” metaphor.  The consequence is a nice, clean syntax for Ruby class wrapping.  Unfortunately, it also means that wrapping a class within a non-included namespace (e.g. ActiveRecord::Base) can be a little clunky.  The only way to do it is to explicitly invoke the Symbol(String) constructor.  (this is required because Scala symbols can only contain alpha-numerics and underscores)

Once we have our wrapped class signature, it’s easy to define the delegate methods.  Scala encourages a blurring of field and method, similar to Ruby.  As such, it supports a very Ruby-esque syntax for accessor/mutator pairs.  This makes the wrapped syntax just a bit nicer.  For the accessor, we make a call to the send method, specifying the return type necessary for the wrapper.  The mutator allows us to be a bit more creative.

We don’t really need type-safe return values for a mutator.  We would normally just set the return type as Unit and ignore the result.  Thus we can once again use the symbol dispatch syntax.  Notice that this time we’re not directly treating a symbol as a method.  We’re apparently assigning a value to the symbol using the = operator (corresponds to the operator= assignment operator in C++).  This is possible through a separate implicit type conversion which generates a one-off utility instance:

private[scalaruby] class SpecialMethodAssign[R](sym:Symbol) {
  def intern_=(param:Any) = new RubyMethod[R](str2sym(sym2string(sym) + "="))(param)
}

As you can see, all this method does is generate a new symbol which includes the ‘=’ character and return the result of dispatching on the corresponding Ruby method.  Note that mutators in Ruby are defined as methods whose names end in “=” (name= in our example), thus appending “=” to the method name is the appropriate behavior.
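
The accessor/mutator pairing on the Scala side can be sketched without the framework.  Below, FakeRubyObject is a hypothetical stand-in that mimics attr_reader/attr_writer: a message ending in "=" stores a value and the bare name reads it back.

```scala
import scala.collection.mutable

object MutatorSketch {
  // Hypothetical stand-in for a Ruby object with a read/write attribute.
  class FakeRubyObject {
    private val attributes = mutable.Map.empty[String, Any]
    def send(method: String, args: Any*): Any =
      if (method.endsWith("=")) {
        // "name=" writes the attribute called "name"
        attributes(method.dropRight(1)) = args.head
        args.head
      } else attributes(method)
  }

  class Person(initialName: String) {
    private val target = new FakeRubyObject
    target.send("name=", initialName)

    def name: String = target.send("name").asInstanceOf[String]
    // Scala rewrites `person.name = x` into a call to this method, which
    // forwards to the Ruby-style mutator name.
    def name_=(n: String): Unit = { target.send("name=", n); () }
  }

  val me = new Person("Daniel Spiewak")
  me.name = "Daniel"
  val renamed: String = me.name
}
```

After the assignment, renamed holds "Daniel".  Scala only accepts the assignment syntax when both name and name_= are defined, which is why the wrapper declares them as a pair.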

Return Value Wrapping

There’s actually a slight problem involved in allowing Scala wrappers around existing Ruby types.  Well, not so much a problem as an inconsistency.  The problem is simply this: if a Ruby method creates an instance of a Ruby class for which there is a Scala wrapper and returns this value through the framework into Scala, one would expect this value would be wrapped into an instance of the Scala wrapper.  If you look in the example far above, there is an example of this in the create_person method.  The method creates an instance of Ruby class Person and returns it as a result.

Somehow, the framework must identify that there is a corresponding Scala wrapper and then properly create an instance.  This actually poses something of a dilemma in two ways.  Number one, Scala has no equivalent to Ruby’s ObjectSpace, so there’s no way to get a comprehensive list of all classes which have been defined.  Even if we could get this list, the corresponding Ruby class is specified in the constructor parameters to RubyClass, so there’s no way to obtain the information statically from outside the class.  Number two, we have to somehow create an instance of the Scala wrapper class without creating a corresponding instance of the wrapped Ruby class (since we already have one).  This means we need some sort of override in the RubyClass constructor.

The best solution to all of these problems is to introduce the associate method.  The usage is demonstrated at the top of the example where we associate the Person Ruby class with the Person Scala wrapper class.  More specifically, we associate the Ruby class with a pass-by-name parameter which defines how to instantiate the Scala class.  This is an important distinction as it solves our second problem of instance creation.  The framework has no way of knowing what parameters must be passed to the Scala wrapper constructor, so the instantiation itself must be passed:

associate('Person)(new Person(""))

As I mentioned previously, this is a pass-by-name parameter which means that it will not be immediately evaluated, but rather on-demand somewhere in the body of associate.  The associate method actually takes this value and wraps it in an anonymous method which invokes the instantiation each time a value of Ruby type Person must be wrapped.  Just prior to invoking the constructor, an override is put in place within the RubyClass singleton object (not shown in the class hierarchy) to prevent the creation of a corresponding Ruby instance.  This is what allows the new instance of Scala class Person to correspond with an existing Ruby value.  Here again we’re sacrificing concurrency for a hacky work-around to a complex problem.  Any sort of “proper” implementation would have to solve this problem in a more elegant way.
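
The by-name mechanics can be demonstrated in a few lines.  This sketch keeps only the registry part of the idea; the factories map, the wrap method and the PersonWrapper class are hypothetical, and the constructor override inside RubyClass is left out entirely.

```scala
import scala.collection.mutable

object AssociateSketch {
  // Counts how many wrapper instances have been constructed so far.
  var constructed = 0

  class PersonWrapper { constructed += 1 }

  // Ruby class name -> factory for the corresponding Scala wrapper.
  private val factories = mutable.Map.empty[Symbol, () => AnyRef]

  // `make` is a by-name parameter: the expression is captured here,
  // not evaluated.
  def associate(rubyClass: Symbol)(make: => AnyRef): Unit =
    factories(rubyClass) = () => make

  // Called whenever a value of the given Ruby class must be wrapped;
  // each call re-evaluates the captured expression.
  def wrap(rubyClass: Symbol): AnyRef = factories(rubyClass)()

  associate(Symbol("Person"))(new PersonWrapper)
  val afterAssociate: Int = constructed

  val first = wrap(Symbol("Person"))
  val second = wrap(Symbol("Person"))
  val afterTwoWraps: Int = constructed
}
```

No wrapper exists after associate returns (afterAssociate is 0), and each wrap call builds a fresh one (afterTwoWraps is 2), which is the on-demand behavior described above.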

It Never Ends!

This post, that is.  There’s so much more I could ramble on about (I never even talked about how exceptions are handled), but this entry is already far too long.  Hopefully the material presented here only serves to whet your appetite for slicker JRuby-Scala integration and all the benefits it can bring.  I’ve packaged up the framework presented here as a downloadable archive.  The package includes the Ruby engine for the Java Scripting API as well as a jar-complete build made from the JRuby SVN.  The project may work with JRuby 1.0, but I doubt it.  Anyway, JRuby 1.1 is due shortly, so why bother.  Remember that this is extremely untested and very experimental.  (I did warn you about the concurrency issues, right?)  If this is interesting to people, I may do a proper release into an OSS project somewhere.  For right now, I just don’t have the time.  :-(

I hope this entry gives you an idea of what’s involved in Scala DSL implementation, as well as an idea of where such a technique may be useful in your own projects.  After all, what would be better than everyone being able to write their own Rails-killer and define highly fluid APIs!

Should ORMs Insulate Developers from SQL?

25
Feb
2008

This is a question which is fundamental to any ORM design.  And really from a philosophical standpoint, how should ORMs deal with SQL?  Isn’t the whole point of the ORM to sit between the developer and the database as an all-encompassing, object oriented layer?

A long time ago in an office far, far away, a very smart cookie named Gavin King got to work on what would become the seminal reference implementation for object relational mapping frameworks the world over (or so Java developers would like to think).  This project was to be bundled with JBoss, possibly the most popular enterprise application server, and would support dozens of databases out of the box.  It was to offer heady benefits such as totally object-oriented database access, transparent multi-tier caching and a flexible transaction model.  At its core though, Hibernate was designed to resolve a single problem: application developers hate SQL.

No really, it’s true!  Bread-and-butter application developers really dislike accessing data with SQL.  This has led to endless conflict (and bad jokes) between application developers and database administrators.  Often times the developer team would write a set of boilerplate lines in Java and then copy/paste these arbitrarily throughout their code, swapping in the relevant query as supplied by the DBA.  For obvious reasons, this would become very hard to maintain and just intensified the bad blood between developer and database.

If you think about it though, it’s a bit odd that this intense dislike would mutate from just hating the insanity of JDBC to hating JDBC, SQL and RDBMS in general.  SQL is a very nice, almost mathematical language which allows phenomenally powerful queries to be expressed simply and elegantly.  It abstracts developers from the headache of database-specific hashing APIs and algorithms which are almost filesystems in complexity.  The language was designed to make it as easy as possible to get data out of a relational database.  The fact that this effort backfired so utterly is a source of endless confusion to me.

Regardless, we were talking about ORMs.  When it was first introduced, Hibernate held out the promise that developers would never again have to wade knee deep through a sea of half-set SQL.  Instead, developers would pass around POJOs (Plain Old Java Object(s)), modifying their values like any other Java bean and then handing these objects off to the data mapper, which would handle the details of persistence.  Furthermore, Hibernate promised that developers would never again have to worry about which databases support which non-standard SQL extensions.  Since developers would never have to work with SQL, anything database-specific could be handled within the persistence manager deep in the bowels of Hibernate itself.

This all seems lovely and wonderful, but there’s a catch: it doesn’t work so well in practice.  Now before you stone me, I’m not talking about Hibernate specifically now, but ORMs in general.  It turns out to be completely impossible to interact with a relational database solely through an object-oriented filter.  This is easily seen with a simple example:

SELECT * FROM people WHERE age > 21 GROUP BY lastName

How in the world are you going to represent that in an object model?  Sure, maybe you can provide a little abstraction for the query details, but it starts to get complex if you try to handle things like grouping non-declaratively.  The developers working on Hibernate quickly realized this problem and came up with an innovative solution: write their own query language!  After all, SQL is too confusing, so why not invent an entirely new query language with the “feel” of SQL (to keep the DBAs happy) but without all of the database-specific wrinkles?

This query language is now called “HQL”, and as the name implies, it’s really SQL, but not quite.  Here’s how the aforementioned example would look in HQL (disclaimer: I’m not a Hibernate expert, so I may have gotten the syntax wrong):

FROM Person WHERE age > 21 GROUP BY lastName

Remarkably similar, that.  Executing this query in a Hibernate persistence manager yields an ordered list of Person entities pre-populated with data from the query.  It seems to make a lot of sense, but there are a number of problems with this approach.  First, it requires Hibernate to literally have its own compiler to translate HQL queries into database-specific SQL.  Second, it hasn’t really solved the core problem that many developers have with SQL: it’s a declarative query language.  As you can see, HQL is really just SQL in disguise, so it really doesn’t eliminate SQL from your database access, just dresses it in a funny hat.

Other ORMs have appeared over the years, taking alternative approaches to the problem of object-relational mapping, but none of them has quite eliminated the query language.  Even DSL-based ORMs like ActiveRecord fail to remove SQL entirely:

class Person < ActiveRecord::Base; end
 
Person.find(:all, :conditions => 'age > 21', :group => 'lastName')

It’s sort of SQL-free, but you can still see bits and pieces of a query language around the edges.  In fact, what ActiveRecord is actually doing here is building a proper SQL query around the SQL fragments which are passed as parameters.  It’s a system which is ripe for SQL injection, but surprisingly leads to very few problems in real-world applications.  This is the approach which is also taken by ActiveObjects for its database query API.
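To make the “building a query around fragments” idea concrete, here is a minimal sketch.  The build_select helper is entirely hypothetical (it is not ActiveRecord’s implementation, which I haven’t reproduced here); it only shows how raw strings end up spliced into the final statement, and why untrusted input in a fragment becomes part of the query itself:

```ruby
# Hypothetical helper, NOT ActiveRecord's implementation: it only illustrates
# how raw SQL fragments can be spliced into a complete SELECT statement.
def build_select(table, options = {})
  sql = "SELECT * FROM #{table}"
  sql += " WHERE #{options[:conditions]}" if options[:conditions]
  sql += " GROUP BY #{options[:group]}" if options[:group]
  sql
end

# prints: SELECT * FROM people WHERE age > 21 GROUP BY lastName
puts build_select('people', :conditions => 'age > 21', :group => 'lastName')

# Because fragments are pasted in verbatim, untrusted input rewrites the query.
user_input = "21 OR 1 = 1"
puts build_select('people', :conditions => "age > #{user_input}")
```

The second call shows the risk: the WHERE clause now matches every row, simply because the “value” was interpolated as SQL rather than bound as a parameter.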

So ORMs in and of themselves seem to have failed to entirely eliminate SQL from the picture, but what about other frameworks?  There are a few quite recent efforts which seem to have nearly succeeded in eliminating the direct use of SQL completely from application code.  Ambition is perhaps the best (and most clever) example of this, though others like scala-rel are catching up fast.  Ambition is designed from the ground up to interact naturally with ActiveRecord, so the two combined perhaps represent the first “true” ORM: one which does not require the developer at any point to deal with any SQL whatsoever.

But was it really worthwhile?  As clever as things like Ambition are, is it really that much easier than just writing queries in SQL?  As Nathan Hamblen so eloquently said (when referring to a totally different topic):

…is the end of the ORM rainbow.  You get there, throw yourself a party and realize that important things are broken.

A quote taken out of context perhaps, but I think it applies to the “cult of SQL genocide” with as much validity.  In the end, by denying yourself access to the powerful and well-understood mechanism that is SQL, you’re just crippling your own application and forcing yourself to write more code instead of less.

So what’s the “right” approach?  Is there a happy medium between ActiveRecord+Ambition and full-blown SQL on Rails?  I think so, and that is the approach I have been trying to implement with ActiveObjects.  As I’m sure you know, ActiveObjects takes a lot of its inspiration from ActiveRecord, so the syntax for querying the database is very similar:

EntityManager em = ...
em.find(Person.class, Query.select().where("age > 21").group("lastName"));
 
// ...or
em.find(Person.class, "age > 21");   // no grouping

You still have the full power of SQL available to you.  You can still write complex, nested boolean conditionals and funky subqueries, but there’s no longer any need to be burdened with the whole of SQL’s verbosity.  As with vanilla ActiveRecord, this code intends to be a bit of a hand-holder, shielding innocent application developers from the fierce world of RDBMS.

Is this the right way to go?  I’m honestly not sure.  I’ve met a lot of developers that would give their left eye to never have to look at another SQL statement again (for developers already missing a right eye, this isn’t much of a stretch).  On the other hand, there are purists like myself who revel in the freedom afforded by a powerful, declarative language.  It’s hard to say which path is better, but at the end of the day, it’s really the question itself that matters.  Giving application developers the choice to select whichever approach they feel is most appropriate, that is the solution.

Adding Type Checking to Ruby

6
Feb
2008

What’s the first thing you think of when you consider the Ruby language?  Dynamic types, right?  Ruby is famous (infamous?) for its extremely flexible type system, and as a so-called “scripting language”, the core of this mechanism is a lack of type checking.  This feature allows for some very concise expressions and a great deal of flexibility, but sometimes makes your code quite a bit harder to understand.  More importantly, it weakens the assurances that a certain method will actually work when passed a given value.

Several different solutions have been proposed to work around this limitation.  The canonical technique involves intensifying tests and increasing test coverage.  Ruby has some excellent unit test frameworks (such as RSpec) which serve to ease the pain associated with this approach, but no matter how you slice it, tests are a pain.  Having to rely on tests to take the place of type checking in the code assurance process can be extremely frustrating.

Another, less common technique is to simply perform dynamic type checks within the method itself.  Like so:

def create_name(fname, lname)
  raise "fname must be a String" unless fname.kind_of? String
  raise "lname must be a String" unless lname.kind_of? String
 
  fname + " " + lname
end

This code explicitly checks the dynamic kind of the parameter values to ensure that they are of type String or one of its subtypes.  The issues with this sample should be relatively obvious.

Primarily, it’s ugly!  This sort of repetitious, boilerplate conditional checking is exactly the sort of thing Ruby tries to avoid.  What’s more, the added bulk of all of these repetitive checks (assuming you perform one check per parameter per method) becomes far more unwieldy than just improving the RSpec test coverage.

While manual type checking may be a bad solution syntactically, it’s on the right track conceptually.  What we really want is some sort of assertion that the parameters are of a certain type, but that won’t overly bloat our existing code.  We need some sort of framework that will “weave in” (think AOP) its type assertions without getting in the way of our algorithms.

Well, it turns out that someone’s already done this.  Eivind Eklund kindly pointed me to his type checking framework in a comment on a previous post.  The basic idea is to perform the type checking assertions, but to factor the work out into an API encapsulated by an intuitive DSL.  So rather than performing all those nasty unless statements as above, we could simply do something like this:

typesig String, String
def create_name(fname, lname)
  fname + " " + lname
end

It’s really as simple as that.  Passing the type values to the typesig method just prior to a method declaration gives the cue to the Types framework to perform some extra work on each call to that method.  Now we have the runtime assurance that the following code will not work (with a very intuitive error message):

create_name("Daniel", 123)

It will produce the following output:

ArgumentError: Arg 1 is of invalid type (expected String, got Fixnum)
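For the curious, here is one way a typesig-style declaration could be wired up in plain Ruby.  To be clear, this is my own minimal sketch and not how the Types framework is actually implemented: the SimpleTypesig module and the Names class are hypothetical names, and the approach (remember the signature, then wrap the next method from the method_added hook) is just an assumption about what would work:

```ruby
# Hypothetical sketch; the real Types framework may work quite differently.
module SimpleTypesig
  # Remember the signature until the next method definition shows up.
  def typesig(*types)
    @pending_sig = types
  end

  # Ruby calls this hook on the class each time an instance method is defined.
  def method_added(name)
    super
    sig = @pending_sig
    return if sig.nil?
    @pending_sig = nil  # clear first, so the define_method below is ignored

    original = instance_method(name)
    define_method(name) do |*args|
      sig.each_with_index do |type, i|
        unless args[i].kind_of?(type)
          raise ArgumentError,
                "Arg #{i} is of invalid type (expected #{type}, got #{args[i].class})"
        end
      end
      original.bind(self).call(*args)
    end
  end
end

class Names
  extend SimpleTypesig

  typesig String, String
  def create_name(fname, lname)
    fname + " " + lname
  end
end

puts Names.new.create_name("Daniel", "Spiewak")

begin
  Names.new.create_name("Daniel", 123)
rescue ArgumentError => e
  puts e.message
end
```

The checks still run on every call, exactly like the hand-written unless statements, but they now live in one place rather than at the top of each method body.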

But the fun doesn’t stop there.  Ruby encourages the “duck typing” pattern, where algorithm developers concern themselves not with what the value is but rather what it does.  This means that the type checking really should be done based on what methods are available, not just the raw type.  It turns out that the Types framework supports this as well:

class Company
  def name
    "Blue Danube"
  end
end
 
class Person
  def name
    "Daniel Spiewak"
  end
end
 
typesig String, Type::Respond(:name)
def output(msg, value)
  puts msg + " " + value.name
end
 
c = Company.new
p = Person.new
 
output("The company name is: ", c)
output("The person is: ", p)
 
output("The programmer is: ", "a genius")    # error

Types can check not only the kind of the object but also to what methods it responds.  This is crucial to enabling its adoption into modern Ruby code bases, many of which rely heavily on this “duck typing” technique.
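The idea of checking “what it does” rather than “what it is” can be sketched with nothing more than respond_to?.  Again, this is an illustration under my own assumptions: RespondsTo, check_args and Employee are hypothetical names, not part of the Types API.

```ruby
# Hypothetical matcher, not part of the Types framework's API.
class RespondsTo
  def initialize(*names)
    @names = names
  end

  # True when the value answers every required message.
  def ===(value)
    @names.all? { |n| value.respond_to?(n) }
  end
end

# Class#=== already means "is an instance of", so class constraints and
# duck-type constraints can be checked through the same operator.
def check_args(constraints, args)
  constraints.each_with_index do |constraint, i|
    unless constraint === args[i]
      raise ArgumentError, "Arg #{i} does not satisfy its constraint"
    end
  end
end

Employee = Struct.new(:name)

# Passes: a String, then an object which responds to :name.
check_args([String, RespondsTo.new(:name)], ["The person is:", Employee.new("Daniel")])

begin
  # Fails: a plain String does not respond to :name.
  check_args([String, RespondsTo.new(:name)], ["The programmer is:", "a genius"])
rescue ArgumentError => e
  puts e.message
end
```

Leaning on === keeps the checking loop ignorant of which kind of constraint it is looking at, which is roughly the flexibility a duck-typed signature needs.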

You can think of the Types framework just like another layer in your testing architecture.  Obviously it’s not performing any sort of static type checking (since Ruby has no compile phase).  All it’s doing is providing that extra certainty that you’re never passing something weird from somewhere in your code, something that would break your algorithm.

So what’s the catch?  Well, obviously you need to have the Types framework installed.  It’s not as easy as just typing gem install types either, since the framework actually predates RubyGems.  You’ll have to download the framework and then copy around the types.rb file yourself.  But this is just deployment semantics.  The more interesting issue is the limitations of the code itself.

As far as I can tell, the only restriction on the framework is that it must be used within a proper class, not in the root scope.  This means that all of my examples above would have to be enclosed in a class, rather than just copy-pasted into a .rb file and run in place.  But other than this one limitation, the framework is incredibly flexible.  I really haven’t shown you the seriously interesting stuff in terms of the API (there are more examples at the top of the types.rb file).  In many ways, Types is actually more powerful than any static type checking mechanism could be (yes, I’m even including Scala in that evaluation).

I haven’t had a chance to use Types on any serious project myself, but I can see tremendous potential, particularly for companies with large-scale Ruby/Rails deployments or even smaller projects looking for just a bit tighter code assurance.  As far as I’m concerned, there shouldn’t be a non-trivial Ruby project attempted without this lovely library, Rails or no Rails.

Databinder Gets ActiveObjects Support

2
Feb
2008

Nathan Hamblen (otherwise known as n8han from Coderspiel) has been putting a lot of work recently into his persistence interop framework, Databinder.  Interestingly enough, some of this work has involved ActiveObjects.  Nathan has taken some of the code I did for the wicket-activeobjects module, adapted it to Databinder and enhanced it tenfold.  As of right now in the Databinder SVN, it is possible to achieve complicated interaction between your Wicket views and the ActiveObjects database model without any undue hassle. (announcement)

For those of you who don’t know, Databinder can be thought of as a compatibility layer between Wicket and Hibernate.  The idea is to smooth some of the rough spots that can lead to extensive boilerplate when using Hibernate and Wicket together for an application.  Yes, I know there’s a wicket-extras project which has the same goal, but in my opinion, Databinder does it better.  Databinder acts as a completely natural layer between Wicket and Hibernate, not between you and either of the two frameworks.  This means that you can write code naturally which uses Wicket and write vanilla-standard Hibernate code without any framework weirdness introduced by Databinder.  In short, the framework does precisely what it needs to when it needs to and stays out of the way for the rest of the time.

Anyway, there’s not much more I can say about the framework; it really speaks for itself.  All I can tell you is that if you’re using Wicket and ActiveObjects together without Databinder, you’re really missing out.  ActiveObjects integration is included in the unreleased version 1.2, currently only obtainable directly from the SVN.  So fire up your SVN client (svn://databinder.net/databinder/trunk), run off a Maven build and get hacking!

Wicket 1.3 Released…Broken Guice and All

3
Jan
2008

I usually try to avoid such negative topics, but this time I really couldn’t help myself.  Once in a while something in the “current events” section of the blogosphere will bug me enough to merit a slam post.  The “support” for Google Guice in the latest stable release of Wicket is one of those things…

To start with, the good news: Martijn Dashorst has announced the release of Apache Wicket 1.3!  This really is a great release all-around and the guys in the band deserve a round of applause.  This release fixes the number one “bug” with Wicket: its rather odd package namespace (wicket.*).  :-)  Welcome to the land of happy packages and tired fingers.

Wicket is really starting (or just proceeding at an accelerated rate) to feel like a rock solid, production-ready framework.  I’ve used it quite a bit over the last few years, and I’ll say flat-out that I don’t think any framework matches it for productivity and maintainability (that includes Rails, dynlang notwithstanding).

One of the new features in 1.3 (important enough to merit inclusion in Martijn’s 20-odd points) is support for Google Guice dependency injection.  This is a huge deal for those of us who have nominated Guice for the “cleverest framework of the decade” award.  Support for Guice in Wicket makes it possible to utilize dependency injection right in your page classes (where it’s most needed).  Wicket has had similar support for Spring for a while now, but it was only recently that Al Maw got the chance to refactor the guts out into the wicket-ioc project and thus enable support for alternative DI frameworks like Guice.

This all seems well and good, but unfortunately Wicket’s Guice support is not quite up to par with the rest of the framework.  I tried the support a while back in beta4 and ran headlong into a fairly serious problem.  The following code doesn’t work:

public class MyModule extends AbstractModule {
    // ...
 
    @Override
    public void configure() {
        EntityManager manager = new EntityManager(uri, username, password);
        bind(EntityManager.class).toInstance(manager);
    }
}

This is fairly standard Guice configuration code.  All that’s happening here is I’m binding all injected fields of type EntityManager to a given instance.  Of course the classic use of DI is to have it instantiate the injected values based on classname (or class literal in Guice’s case).  However, this panacea of IOC breaks down when working with classes which lack default constructors (like EntityManager).  This is why Guice enables developers to bind classes to instances (as I’m doing above).  The problem is this code will crash when executed using wicket-guice.

I opened an issue in the Wicket JIRA back in November when I first identified the bug (WICKET-1130).  I even included some simple example code I could use to repeat the problem!  Since then the issue has been reassigned and bumped back in fix version twice, all without any word on whether the problem is being looked at or how soon I could expect a solution.  Now I know the Wicket devs are busy and all with tons of more sweeping issues and last-minute polish for the 1.3 release, but this is pretty absurd.

Honestly, if this were a trivial edge-case that only affected me and my neighbor’s cow, I wouldn’t put up much of a fuss.  But the fact is, this is something so broad and repeatable that it will touch just about anyone who seriously uses Guice with Wicket.  Even if there wasn’t time to fix the problem before the 1.3 release, I would hope there would be some sort of prominent notice (“KNOWNBUGS” anyone?) included with the distributable.  Unfortunately the only reference to the problem I was able to find on the Wicket site was an obscure wiki article (well written though) done by Uwe Schäfer.  All this entry serves to do is further aggravate me since it means someone else has run headlong into this problem and been annoyed by it enough to write an article (still without receiving a response from the Wicket core devs).

Uwe does propose a workaround (add a protected no-arg constructor to the injected class), but that’s impractical for my use case (and if I ran into it, you can bet your boots half a dozen other people did too).  I’m certainly not going to randomly add broken constructors to the ActiveObjects API, and I wouldn’t even have the option to consider doing so except for the fact that I’m the developer of the class I was trying to bind.  If I were trying to bind an instance from a third-party library, I’d be out of luck completely.

I’m very disappointed in the Wicket project for dropping the ball on this issue.  I really have the utmost respect for those guys, which is why it’s so surprising to see something like this happen.  As it stands, WICKET-1130 is slated for 1.3.1, but given its track record of reassignment I’m not holding my breath.  Hopefully this posting will serve as a more prominent warning for those considering using Wicket and Guice together in their project.