
XMLBuilder: A Ruby DSL Case Study

10 Mar 2008

XML is probably the most ubiquitous and most recognizable format in the modern development landscape.  Its simple power in representing hierarchically structured data has made it the standard for representing everything from Word documents to databases.  It’s also one of the most verbose and meta-rich syntaxes known to man.

So in that sense, XML is a mixed blessing.  Its flexibility and intuitive nature allow developers to store just about any data in a human readable, easy-to-debug manner.  Unfortunately, its verbosity often makes generating the actual XML a very frustrating and boring foray into the land of boiler-plate.  Various techniques have been developed over the years to smooth this process (e.g. manipulating a DOM tree or reflectively marshalling objects directly to XML), but on the whole, generating XML in code is just as annoying as it has always been.  We’ve all written code like this in the past:

public String toXML() {
    final String INDENT = "    ";
    StringBuilder back = new StringBuilder(
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n\n");
 
    back.append("<person name=\"").append(getName()).append("\">\n");
    for (Book book : getBooks()) {
        back.append(INDENT).append("<book>").append(book.getTitle()).append("</book>\n");
    }
    back.append("</person>");
 
    return back.toString();
}

Not the most pleasant of algorithms.  Oh, there’s nothing complex or challenging about the code, it’s just annoying (as String manipulation often is).  Things would get a little more interesting if we actually had some sort of recursive hierarchy to traverse, but even then it would still be pretty straight-forward.  XML generation is tedious drudge work to which we all must submit from time to time.

Domain Specific Languages

There’s a new wave in programming (especially in the communities surrounding dynamic languages like Ruby and Groovy) called “Domain Specific Language”, or DSL for short.  The DSL technique has really been around for a long time, but it’s just now finding its way to the mainstream, “Citizen Joe” developers.  This is because for decades, domain specific languages have been separate languages unto themselves, with their own syntax, quirks, and utter lack of tool support.

For example, if a company wanted to specify their business logic in a more readable format using a DSL, they would have to spend months, sometimes years of effort to build an entirely new language, just for the one application.  While there was no need to generate a truly flexible and extensible syntax (or even one which was Turing-complete), such efforts would still require an enormous amount of work to be put into trivialities like parsers, AST walkers and core libraries.  Obviously this meant that DSLs were extremely uncommon.  There are very few use-cases for something which requires so much and gains you so little.

This variety of domain specific language is called an “External DSL”.  This name stems from the fact that the language is completely independent and external to other languages.  The innovation which made the DSL attainable for the common-man is that of the “Internal DSL”.

People often struggle to precisely define internal DSLs.  In the broadest sense, an internal DSL is a language API carefully structured to satisfy a particular syntax when used in code.  There is no syntax parsing involved in implementing an internal DSL.  Rather, effort is focused on the API itself.  The more flexible the syntax of the underlying language, the more powerful and potentially intuitive the syntax of the DSL can be.  It is for this reason that languages such as Ruby, Python, Groovy and company are often used to implement internal DSLs.  These languages are defined by extremely flexible and dynamic syntax, lending themselves perfectly to such efforts.

With this in mind, it should theoretically be possible to design an “API” for XML generation that’s simple and intuitive.  The DSL could be implemented using Ruby, though actually any sufficiently dynamic language would do.  Implementing an internal DSL is a notoriously difficult task, so perhaps a step-by-step walkthrough is in order.

Getting Started

The very first step in creating an internal DSL is to design the syntax.  Similar to how test-driven development starts with a set of unit tests and builds functionality which satisfies the tests, DSL development starts with a few code samples and builds an API to satisfy the syntax.  This step will guide all of the code we write as we implement the DSL.

One of the primary goals for our DSL syntax should be to reflect (as much as possible) the structure and essence of the XML output in the generating code.  One of the major shortcomings of the ad hoc “string concat” XML generation technique is an utter lack of logical structure in the code.  Sure your algorithm may be nicely organized and formatted, but does it really reflect the actual structure of the XML it’s generating?  Probably not.  Another major goal should be brevity.  XML generating code is extremely verbose, and the whole idea behind writing a DSL for XML generation is to alleviate some of this hassle.

So, we’ve got brevity and logical structure.  As minor goals, we also may want to spend some effort making the syntax versatile.  We don’t want the algorithm to perform incorrectly if we try to generate a document with the < character in some field.  With that said however, we don't need to be overly concerned with flexibility.  Yes, our DSL should be as capable as possible, but not to the detriment of brevity and clarity.  These concerns are paramount, functionality can take a back seat.

With these goals in mind, let’s try writing a static syntax for our Person/Book example above. Bear in mind that every construct used in the syntax must be valid Ruby which the interpreter will swallow.  How it will swallow is unimportant right now.  For this step, it’s helpful to remember that the ruby -c command will perform a syntax check on a source file without actually attempting to run any sort of interpretation.

xml.person :name => 'Daniel "Leroy" Spiewak' do
  book do
    'Godel, Escher, Bach'
  end
  book do
    'The Elegant Universe'
  end
  book do
    'Fabric of the Cosmos'
  end
  book do
    "The Hitchhiker's Guide to the Galaxy"
  end
  book    # book with no title
end
 
puts xml  # prints the generated XML

Note that this is all static, there’s no dynamically generated data to muddle the picture.  We can worry about that later.

This code sample gives us a rough idea of what the syntax should look like.  According to ruby -c, it’s valid syntax, so we’re ready to proceed.  Everything looks fairly unencumbered by meta-syntax, and the logical structure is certainly represented.  Now we do a quick mental pass through the syntax to ensure that all of our functional bases are covered.  In the example we’ve used attributes, nested elements and text data.  One of our attributes is making use of illegal XML characters (double quotes).  We haven’t tried mixing elements and text yet, but it’s fairly obvious how this could be done.

Strictly speaking, we haven’t rigorously defined anything.  What we have done is given ourselves a framework to build off of.  This sample will serve as an invaluable template to guide our coding.

Parsing the Syntax

Remember I said that internal DSLs do not involve any syntax parsing?  Well, in a way I was wrong.  The next thing we need to do is walk through the syntax ourselves and understand how Ruby will understand it.  This step requires some fairly in-depth knowledge of Ruby’s (or whatever language you’re using) syntax.  Hopefully the following diagram will help to clarify the process.

image1

I’ve annotated the areas of interest in cute, candy colors to try and illustrate roughly what the Ruby syntax parser will see when it looks at this code.  It’s important to understand this information since it is the key to translating our contrived syntax into a rigorous API.

For the sake of this discussion, I’m going to assume you have a working knowledge of Ruby and have already recognized some of the more basic syntax elements within the sample.  For example, the do/end block is just that, a Block object (roughly the Ruby equivalent of a closure).  The annotated String literal is an implicit return value for the inner block (the last statement of a block is implicitly the return value).  This syntax employs a Ruby “trick” which allows us to drop the return statement (an important trick, since a return statement inside a block would try to return from the enclosing method rather than from the block itself).
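To make the implicit return value concrete, here is a minimal sketch.  The block_value helper is a made-up name used only for illustration; it simply yields and hands back whatever the block evaluated to.

```ruby
# Hypothetical helper (not part of the DSL): yields to the block and
# returns whatever the block's last expression evaluated to.
def block_value
  yield
end

title = block_value do
  'Godel, Escher, Bach'   # last expression, so it becomes the block's value
end

puts title
```

No return keyword is needed; the String literal on the final line of the block is what the caller receives.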

Anyone familiar with Ruby on Rails should be aware of Ruby’s convenient Hash literal syntax employed for the element attributes.  As annotated, this literal is being passed as a method parameter (Ruby allows us to drop the parentheses).  Here again we’re using little oddities in the Ruby syntax to clean up our DSL and reduce unnecessary cruft.
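A small sketch of what the receiving method actually sees may help here.  The element method below is hypothetical and exists only to show how the arguments arrive.

```ruby
# Hypothetical method, used only to show how the arguments arrive.
def element(name, attrs = {})
  [name, attrs]
end

# Parentheses and the braces around the trailing Hash are both optional...
with_sugar = element :person, :name => 'Daniel'

# ...so the call above is equivalent to this fully punctuated version.
without_sugar = element(:person, { :name => 'Daniel' })

puts with_sugar == without_sugar
```

Either way, the method receives an ordinary Hash as its final argument.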

You should also be familiar with the blurred line between variable and method so common in Ruby.  As annotated, the xml syntax element could be either a local variable, or a method that we’re invoking without the parentheses.  To clear up this point, we need to refer to the full sample above.  In no place do we declare, nor do we provide for the declaration of a local variable xml.  Thus, xml() will be a method within our API available in the global scope.

Now we get to the really interesting stuff: those mysterious undefined methods.  As with other dynamic languages, Ruby allows method invocations upon non-existent methods.  To someone from a static language background, this seems rather peculiar, but it’s perfectly legal I assure you.  In a sense, it’s much like the “methods as messages” metaphor employed by Objective-C and Smalltalk.  When you call the method, you’re just broadcasting a message to the object, hoping someone answers the call.

In Ruby, when a method is called which has no corresponding handler, a special method called method_missing is invoked.  This method is what allows us to handle messages even without a pre-defined method body.  ActiveRecord makes extensive use of this feature.  Every time you access a database field, you’re calling a non-existent method and thus, method_missing.
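As a rough illustration of the mechanism (the Echo class is a made-up name, not part of the builder), a class can answer any message simply by defining method_missing:

```ruby
# Hypothetical class: answers every message it has no method for.
class Echo
  # Ruby calls this with the method name (a Symbol) and the arguments
  # whenever no matching method is found.
  def method_missing(name, *args)
    [name, args]
  end

  # Conventional companion to method_missing, so that respond_to?
  # agrees with what the object will actually answer.
  def respond_to_missing?(name, include_private = false)
    true
  end
end

echo = Echo.new
answer = echo.person('Daniel')   # no person method exists anywhere
puts answer.inspect
```

The call to person never fails; it is routed to method_missing, which receives :person and the argument list.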

So based on the first non-existent method we’re calling (person(…)), we can already say something about our API.  Somewhere in the global scope, we’re going to need to have a method called xml().  This method will return an instance of a class which defines method_missing and so can handle our person(…) invocation.  It seems this implementation of method_missing will also need to optionally take a Block as the final parameter, allowing it to pass execution on to its logical children.  When referring back to our original sample, the final line seems to indicate that the instance must implement the to_s() method, allowing puts() to implicitly convert it to a String.

Implementation

All that work and we still haven’t even written a solid line of code.  What we have done though is given ourselves a clear idea of where we need to go, and a rough outline for how it can be done.  We’ve laid the foundation for the implementation by giving ourselves two critical starting points: xml() and that mysterious outer-scoped method_missing.  We may not know how we’re going to implement all this yet, but we at least have an idea on where to begin.

Starting with the easy one, let’s implement a basic framework around xml().

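module XML
  class XMLBuilder
    attr_reader :last

    # Every message without a matching method lands here;
    # the body is filled in below.
    def method_missing(sym, *args, &block)
      # ...
    end

    # The XML generated so far, as a String.
    def to_s
      @last.to_s
    end

    # The one shared instance handed out to DSL users.  new stays
    # callable so the implementation can build private instances.
    def self.instance
      @@instance ||= new
    end
  end
end

def xml
  XML::XMLBuilder.instance
end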

Notice how we’re using a singleton instance of XMLBuilder?  That’s because there’s no need for us to have more than one instance exposed to the DSL users.  XMLBuilder is just a placeholder class that dispatches the first level of commands for the DSL and will assemble the result for us as it executes.  Thus, XMLBuilder can be considered the root of our DSL, the corner-stone upon which the entire API is bootstrapped.  We do however need to allow for other, private instances as we’ll see later on.

Another item worthy of note in this snippet is the non-standard method_missing signature.  This is because we will actually need the block as a proper Proc object down the line.  A block parameter (prefixed with &) must come last in the parameter list, after the varargs parameter (prefixed with *), and there can only be one of them.
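The effect of that signature can be seen in isolation.  In this sketch, capture is a hypothetical method that just reports what it was given:

```ruby
# Hypothetical method showing what *args and &block collect.
def capture(name, *args, &block)
  { :name => name, :args => args, :block => block }
end

with_block    = capture(:person, 1, 2) { 'child content' }
without_block = capture(:book)

puts with_block[:block].class        # the block arrives as a Proc
puts with_block[:block].call         # and can be called later
puts without_block[:block].inspect   # nil when no block was given
```

This is why the builder can test block.nil? to decide whether an element has children.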

We can now try a first-pass implementation of method_missing.  This implementation is really just a sample with a very significant shortcoming.  The actual implementation is quite a bit more complex.

def method_missing(sym, *args, &block)
  # start a fresh tag; note that this overwrites whatever @last held before
  @last = "<#{sym}"

  if args.size > 0 and args[0].is_a? Hash  # never hurts to be sure
    args[0].each do |key, value|
      @last += " #{key}=\"#{value}\""
    end
  end

  if block.nil?
    @last += "/>"   # if there are no children, just close the tag
  else
    @last += ">"

    # evaluate the child block with a private builder as self
    builder = XMLBuilder.new
    builder.instance_eval(&block)

    @last += builder.last.to_s   # last is nil if the block called no methods

    @last += "</#{sym}>"
  end
end

Again, this is just the rough idea.  In our actual implementation we need to be concerned about things like valid attribute values (as demonstrated in our sample), proper element names, etc.
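As one example of that extra care, attribute values and text need their special characters replaced with XML's predefined entities.  The helper below is a minimal sketch; the names XML_ESCAPES and escape_xml are assumptions for illustration and are not taken from the finished builder.

```ruby
# Hypothetical helper: maps XML's special characters to predefined entities.
XML_ESCAPES = {
  '&' => '&amp;',
  '<' => '&lt;',
  '>' => '&gt;',
  '"' => '&quot;',
  "'" => '&apos;'
}.freeze

def escape_xml(value)
  # to_s lets Symbols, numbers and nil pass through as well
  value.to_s.gsub(/[&<>"']/) { |char| XML_ESCAPES[char] }
end

puts escape_xml('Daniel "Leroy" Spiewak')
```

Calling something like this on each attribute value before interpolating it would keep the double quotes in our sample from breaking the output.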

The key to the whole algorithm is the instance_eval invocation.  This single statement passes control to the next block down and starts the process all over again.  The important thing about this is evaluating the block within the context of an XMLBuilder instance, rather than just its enclosing context.  This allows the nested block to take advantage of the same method_missing implementation, hence implicitly recursing further into the XML tree.  This technique is extremely powerful and absolutely critical to a lot of DSL implementations.
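The change of context is easier to see outside the builder.  In this sketch, Recorder is a hypothetical class that merely logs the names of the messages it receives:

```ruby
# Hypothetical class: records the name of every unknown message.
class Recorder
  attr_reader :calls

  def initialize
    @calls = []
  end

  def method_missing(name, *args)
    @calls << name
  end
end

# Written at the top level, where neither book nor magazine exists.
block = proc do
  book
  magazine
end

recorder = Recorder.new
recorder.instance_eval(&block)   # run the block with recorder as self

puts recorder.calls.inspect
```

Because self is the Recorder while the block runs, the bare calls to book and magazine are sent to it and land in its method_missing, which is exactly how nested elements reach the child builder.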

You’ll also notice that this is breaking the singleton pattern we established in our class design.  This is because a separate instance of XMLBuilder is required to handle each nested block within the DSL tree.  It’s very important to remember that we’re not actually exposing this instance in the public API, it’s just a tool we use within the implementation.  The API users will still only see the singleton XMLBuilder instance.

So a bit more semantically, what we’re doing is the following:

  1. Handle the root non-existent method invocation
  2. Deal with any attributes in the single Hash parameter (if one exists)
  3. Check for a nested block. If found, create a new builder instance and use its context to evaluate the child block
  4. Within child block evaluation, recurse to step 1 for each method invocation
  5. Accumulate the result from child block evaluation and return

Of course, this doesn’t deal with nested text content within elements.  However, the principles of the implementation are fairly clear and the rest is just code.
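For readers who want to see those steps run end to end, here is one possible sketch.  It is not the finished XMLBuilder: the TinyXML class, its buffer, and its rule for text content (a block that creates no elements contributes its return value as text) are all assumptions made for this illustration.

```ruby
# A sketch only: TinyXML is a hypothetical stand-in for the real builder.
class TinyXML
  ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  def initialize
    @output = String.new   # appended to, so siblings are not overwritten
  end

  # Steps 1 and 2: handle the unknown method and its optional Hash.
  def method_missing(name, attrs = {}, &block)
    attr_text = attrs.map { |key, value| " #{key}=\"#{escape(value)}\"" }.join

    if block.nil?
      @output << "<#{name}#{attr_text}/>"
    else
      # Steps 3 and 4: evaluate the block against a fresh child builder.
      child = TinyXML.new
      result = child.instance_eval(&block)

      # Assumed rule: no child elements means the block's value is text.
      body = child.to_s.empty? ? escape(result) : child.to_s

      # Step 5: accumulate the child's output inside this element.
      @output << "<#{name}#{attr_text}>#{body}</#{name}>"
    end
    self
  end

  def to_s
    @output
  end

  private

  def escape(value)
    value.to_s.gsub(/[&<>"]/) { |char| ESCAPES[char] }
  end
end

doc = TinyXML.new
doc.person :name => 'Daniel "Leroy" Spiewak' do
  book { 'Godel, Escher, Bach' }
  book { 'Q & A' }
  book    # book with no title
end

# to_s is called explicitly so that puts is handed a plain String.
puts doc.to_s
```

Element names that collide with existing methods (to_s or escape here, or anything inherited from Object) would never reach method_missing, which is one of the details a fuller implementation has to address.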

As with all internal DSLs, the user-friendly DSL syntax is supported by an API that’s ugly, hacky and heavily dependent on language quirks (such as the super-critical instance_eval).  In fact, it has been said that internal DSLs encourage the use of language quirks if it simplifies the API from the end-developer standpoint.  Of course this makes the end-developer code very clean and easy to maintain at the cost of the DSL developer’s code, which is nightmarish and horrible to work with.  It’s a tradeoff that must be considered when the decision is made to go with a DSL-style API.

Conclusion

Hopefully this was a worthwhile trek into the gory innards of implementing an internal DSL.  I leave you with one final code sample to whet your appetite for the fully-implemented API.  This is an extract from the ActiveObjects Ant build file converted to use our new DSL.  It’s interesting that this converted version is significantly cleaner than the original, XML form.

require 'xmlbuilder'
 
xml.project :name => 'ActiveObjects', :default => :build do
  dirname :property => 'activeobjects.dir', :file => '${ant.file.ActiveObjects}'
  property :file => '${activeobjects.dir}/build.properties'
 
  target :name => :init do
    mkdir :dir => '${activeobjects.dir}/bin'
  end
 
  target :name => :build, :depends => :init do
    javac :srcdir => '${activeobjects.dir}/src', :source => 1.5, :debug => true
  end
 
  target :name => :build_test, :depends => [:init, :build] do
    property :name => 'javadoc.intern.path', :value => '${activeobjects.dir}/${javadoc.path}'
  end
end
 
puts xml

This will print the following XML:

<?xml version="1.0" encoding="UTF-8"?>
 
<project name="ActiveObjects" default="build">
  <dirname property="activeobjects.dir" file="${ant.file.ActiveObjects}"/>
  <property file="${activeobjects.dir}/build.properties"></property>
  <target name="init">
    <mkdir dir="${activeobjects.dir}/bin" />
  </target>
  <target name="build" depends="init">
    <javac source="1.5" debug="yes" srcdir="${activeobjects.dir}/src"/>
  </target>
  <target name="build_test" depends="init,build">
    <property name="javadoc.intern.path" value="${activeobjects.dir}/${javadoc.path}"></property>
  </target>
</project>

The fully implemented DSL is available in a single Ruby file. Also linked are some examples to provide a more balanced view of the capabilities.

Interface "Wow" Factor

7 Mar 2008

I’ve been using Firefox 3.0 since b1 on my main development machine under Windows Vista and to be honest, I’ve been floored by all of the improvements the Firefox team has managed to squeeze into this release.  For starters, performance is improved easily 10 fold over 2.0, especially when dealing with scads of tabs.  That alone makes the upgrade worthwhile in my book. 

Even beyond that though, the entire application just has a more polished “feel” about it.   Beta 3 saw the introduction of an improved theme on all platforms.  This new theme was most noticeable on Mac, where it made Firefox suddenly jump from a MacPorts-style outsider to running right alongside the “iApps” in terms of style.  The improvement on Vista was a little more subtle, but nonetheless impressive:

 image

It looks a little weird when you first examine it, but after using this UI for a while I’ve come to really appreciate its simple devotion to usability patterns.  I’ve followed with passing interest the Firefox development cycle since v0.5, so I knew that they were spending a lot of time polishing the UI.  I was actually one of those aware of the new “keyhole” look long before beta1, so when it arrived I was somewhat nonplussed.  Oh, it’s certainly slick and a vastly more usable (and more attractive) UI than its super-power competitor, IE7.  But let’s face it, slick only gets you so far these days.  Windows Vista has about the slickest look of any operating system, and just look at how much people malign it.

Returning to Firefox though, I’ve been extremely happy with how well “put together” the whole app seems.  It’s really become a coherent power application with a flair for the indulgent user.  It’s come a long way since those early releases on Windows.  I didn’t realize just how far it had come though until I installed 3.0b3 on my Ubuntu Linux virtual machine (click for larger shot).

image

My first reaction was, “Impressive, they finally managed to make it look like a real Gnome2 application.”  Then I looked down.

firefox-wow-thumb

I nearly wept when I saw these controls.  GTK Linux has been without a browser which could do this literally since the beginning of time.  I was certainly aware that the team was working on this, but I had no idea that it had been activated in 3.0.  Native controls in HTML are incredibly important to the user perception of how well the browser integrates with the platform.

That’s what it really all comes down to: user perception.  Jeff Atwood harps on about this constantly, but just because it’s oft-repeated doesn’t make it less true.  It doesn’t matter what your application can do, just what your users think it can do.  It’s all an elaborate illusion anyway, we just have to realize how complete that illusion really is.  If a user looks at your application and thinks, “Wow!  I don’t know what it is, but it looks powerful,” then you have succeeded as a developer.  Your application could do nothing more than print “Hello, World!” an infinite number of times; so long as it is impressive looking, it will be a success (think iPhoto 1.0).  Likewise, your application may desalinate water and hold the key to world peace, but if it looks wimpy, users will never give it a chance.

Now by “impressive looking” I certainly don’t mean just flashy.  Anyone can Photoshop a fancy interface with lickable buttons and endlessly translucent animations, but that doesn’t mean the interface will “feel powerful”.  The really important test is in those critical first seconds as the user makes his first few clicks through the application.  The user needs to instantly understand the core functionality of the application and why it is better than the competition.  Their eyes need to be drawn to the critical areas and they need to be comfortable resting there for long periods at a time.  They should feel an immediate sense of cooperation and team-spirit in the application.  Things should progress as quickly as they can think (but no faster) and transition smoothly from state A to state B.  Oh, and the application should look good.

Firefox on Gnome2 isn’t particularly flashy; it isn’t very lickable, and there’s no translucency.  It does make a statement however, one which is immediate and unmistakably readable by the user: I can do whatever you need me to, and you’re going to like the way I do it.

Wow.

Should We Really Study Other Languages?

4 Mar 2008

The practice of learning multiple languages has really become dogma in the modern developer community.  Everyone knows that you should study different languages, different paradigms and various techniques for accomplishing the same thing.  Why?  The canonical answer is that it helps you grow into a better programmer in your primary language.  After all, if you know how to compose functions in Lisp, naturally that must make it easier to design flexible systems in Java.  I would beg to differ.

I must admit, I used to be a card-carrying member of the “one language per year” club.  I used to push myself and other developers to get their feet wet in alternative languages, even ones which had no practical application (I learned Ruby back when it was just a toy for Asian folks).  Recently though, I’ve been coming more to the conclusion that perhaps this frenetic dash to learn the most languages might not be the “ultimate answer” after all.  A commenter on Reddit relates a story which is (I believe) quite applicable to the subject:

When I was in college, one of the jobs I had was a TA for an intro programming class. For one of their projects, I was asked to whip up a kind of web browser “shell” in Java. The basic idea was to make a browser that would be highly extendable by students, while freeing them from worrying about the overall rendering framework.

Now, the first language that I learned was Smalltalk, which has historically been relatively design-pattern-free due to blocks and whatnot, and I had only learned Java as an afterthought while in college, so I coded in a slightly unorthodox way, making use of anonymous inner classes (i.e., shitty lambdas), reflection, and the like. I ended up with an extremely loosely coupled design that was extremely easy to extend; just unorthodox.

When I gave it to the prof, his first reaction, on reading the code, was…utter bafflement. He and another TA actually went over what I wrote in an attempt to find and catalog the GoF patterns that I’d used when coding the application. Their conclusion after a fairly thorough review was that my main pattern was, “Code well.”

You may laugh (I did), but I can’t tell you how much code I’ve had to work with which reminds me of this tale.  Developers these days pick up patterns and best-practices from dozens of languages and try to apply them to a language for which they are ill suited.  Consider the following code:

public class SortUtils {
    public static <T> List<T> mergeSort(final List<T> list) {
        return new Object() {
            private final List<T>[] divided = divide(list);
 
            public List<T> run() {
                return merge(mergeSort(divided[0]), mergeSort(divided[1]));
            }
        }.run();
    }
 
    public static <T> List<T>[] divide(final List<T> list) {
        if (list.size() == 0) {
            return new List<T>[] {new ArrayList<T>(), new ArrayList<T>()};
        } else if (list.size() == 1) {
            return new List<T>[] {list, new ArrayList<T>()};
        } else {
            return new Object() {
                // this part doesn't really work, it's just illustrative
                private final T first = list.remove();
                private final T second = list.remove();
                private final List<T>[] sub = divide(list);
 
                public List<T>[] run() {
                    return new List<T>[] {new ArrayList<T>() {
                        {
                            addAll(sub[0]);
                            add(first);
                        }
                    }, new ArrayList<T>() {
                        {
                            addAll(sub[1]);
                            add(second);
                        }
                    }};
                }
            }.run();
        }
    }
 
    public static <T> List<T> merge(final List<T> left, final List<T> right) {
        if (left.size() == 0) {
            return right;
        } else if (right.size() == 0) {
            return left;
        } else {
            if (left.get(0) < right.get(0)) {
                return new ArrayList<T>() {
                    {
                        add(left.remove(0));
                        addAll(merge(left, right));
                    }
                };
            } else {
                return new ArrayList<T>() {
                    {
                        add(right.remove(0));
                        addAll(merge(left, right));
                    }
                };
            }
        }
    }
}

I know what you’re thinking: Whoever wrote this is a sick, sick programmer.  Actually I wrote it, but that’s beside the point…  :-)

The point is that taking knowledge gained in one language and directly applying it to another is almost never the right approach.  This isn’t how you sort a list in Java, it’s how you would do it in ML, or maybe even Lisp.  By trying to apply functional idioms directly to an object-oriented language, I’ve produced code which does two things.  First, it’s unmaintainable.  No sane-minded Java developer is ever going to be able to take this code and make incremental improvements.  Second (and just as important), it is extremely slow.  Pure-functional languages are written in such a way as to make function calls, recursion and immutable data very performant.  Java just doesn’t work like that.  This code is creating objects left and right, recursing and copying data back and forth so many times it’s a wonder the whole machine doesn’t crash.

What’s ironic is even with all my efforts, I still couldn’t eliminate mutable data and sequential statements entirely.  Java’s data structures are mutable by design, something which makes it very inefficient to try to do things immutably.  Java’s APIs just aren’t built to comply with expression-based algorithms, something which is an absolute must when working with immutable data.

Throwing away Java’s utter lack of functional constructs like cons (::) and pattern matching, this code is still horrible because it doesn’t follow the Java idioms.  Not only is the Java language built to perform better when used imperatively, but the constructs are such that the code is more concise and maintainable.  Would it have killed readability to use a loop?  Hardly.  In fact, considering the mess we got when trying to avoid imperative constructs, it’s difficult to imagine things getting less readable.

This is certainly an extreme example; no right-minded developer would ever try to do something like this in the real world, but the point remains.  The same problems can be seen even in examples like applying Java’s for-loop idiom for iterating over arrays in Ruby.  Language-specific idioms are important, and we only cripple ourselves if we ignore them.
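To make the for-loop case concrete, here is a small sketch with made-up data.  Both versions are legal Ruby and produce the same result, but only the second reads the way Ruby developers expect.

```ruby
titles = ['Godel, Escher, Bach', 'The Elegant Universe']

# Java-style: manual index bookkeeping carried over from another language.
java_style = []
for i in 0...titles.length
  java_style << titles[i].upcase
end

# Idiomatic Ruby: let the collection drive the iteration.
ruby_style = titles.map { |title| title.upcase }

puts java_style == ruby_style
```

Nothing is broken in the first version; it simply ignores the idiom the language was built around.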

So at the end of the day, does this really mean we should stop learning new languages?  Absolutely not.  As “Pragmatic Dave” famously pointed out, learning a new language opens our mind to new ways of approaching problems.  It’s not so much how the code is written as much as gaining a deeper insight into the process of solving the problem.  Learning Scala has helped me refine my problem-solving abilities, allowing me to more effectively approach tasks in any language.

However, I think that this “perspective improvement” is a bit over-stressed.  Yes, problem solving is an important skill, one which should be refined through experience with multiple approaches, but I don’t think it’s as critical as many people make it out to be.  I know some really phenomenal developers who only know one language.  Could they be better for learning another?  Probably, but the point is they’re doing just fine right now.  Being multi-lingual is not critical, and as my facetious example illustrates, it can be very harmful.

In the end, the best reason for learning a new language stands unchallenged and unassailable: it’s fun.  I derive a great deal of satisfaction from learning how to do things in a new language, applying my old skills to new tools.  At the end of the day, I don’t really care whether I’m learning anything important to my career.  This may be the case, but the only thing which really matters to me is the challenge of the new puzzle, a new riddle to crack.  I certainly believe that most good developers will derive the same enjoyment, and as long as they keep their idioms straight, I’m perfectly content with that.

Defining High, Mid and Low-Level Languages

27 Feb 2008

I’ve been writing quite a bit recently about the differences between languages.  Mostly I’ve just been whining about how annoying it is that everyone keeps searching for the “one language to rule them all”, the Aryan Language if you will.  Over the course of some of these articles, I’ve made some rather loosely defined references to terms like “general purpose” and “mid-level” when trying to describe these languages. 

Several people have (rightly) called me out on these terms, arguing that I haven’t really defined what they mean, so I shouldn’t be using them to try to argue a certain point.  In the case of “general purpose language”, I have to admit that I tend to horribly misuse the term and any instances within my writing should be discarded without thought.  However, I think with a little bit of reflection, we can come to some reasonable definitions for high-, mid- and low-level languages.  To that end, I present the “Language Spectrum of Science!” (cue reverb)

Language Spectrum of Science 

This scale is admittedly arbitrary and rather loosely defined in and of itself, but I think it should be a sufficient visual aid in conveying my point.  In case you hadn’t guessed, red languages are low-level, green languages are high-level and that narrow strip of yellow represents the mid-level languages.  Obviously I’m leaving out a large number of languages which could be represented with equal validity, but I only have a finite number of pixels in page-width.

The scale is also somewhat myopic.  It defines Ruby as the highest of the high-level languages.  Very few could argue the other side of the scale since there’s not really anything lower than the hardware, but claiming that Ruby is the most high-level language in history seems somewhat odd.  In truth, I picked Ruby as the super high-level language mainly because it’s a) more dynamic than both JavaScript and Perl, b) more prone to RAD frameworks like Rails and c) it’s the most significant high-level language which I’m really familiar with.

It’s also important to note that languages aren’t really points on the spectrum, but rather they span ranges which are more or less wide, depending on the capabilities.  These ranges may overlap considerably (as in the case of Java and Scala) or may be entirely disjoint (Assembly and Ruby).  In short, the scale is somewhat blurry and shouldn’t be taken as a canonical reference.

Low-Level

Of all of the categories, it’s probably easiest to define what it means to be a low-level language.  Machine code is low level because it runs directly on the processor.  Low-level languages are appropriate for writing operating systems or firmware for micro-controllers.  They can do just about anything with a little bit of work, but obviously you wouldn’t want to write the next major web framework in one of them (I can see it now, “Assembly on Rails”).

Characteristics

  • Direct memory management
  • Little-to-no abstraction from the hardware
  • Register access
  • Statements usually have an obvious correspondence with clock cycles
  • Superb performance

C is actually a very interesting language in this category (C++ even more so) because of how broad its range happens to be.  C allows you direct access to registers and memory locations, but it also has a number of constructs which allow significant abstraction from the hardware itself.  Really, C and C++ are probably the broadest-spectrum languages in existence, which makes them quite interesting from a theoretical standpoint.  In practice, both C and C++ are too low-level to do anything “enterprisey”.

Mid-Level

This is where things start getting vague.  Most high-level languages are well defined, as are low-level languages, but mid-level languages tend to be a bit difficult to box.  I really define the category by the size of application I would be willing to write using a given language.  I would have no problem writing and maintaining a large desktop application in a mid-level language (such as Java), whereas to do so in a low-level language (like Assembly) would lead to unending pain.

This is really the level at which virtual machines start to become commonplace.  Java, Scala, C#, etc. all use a virtual machine to provide an execution environment.  Thus, many mid-level languages don’t compile directly down to the metal (at least, not right away) but represent a blurring between interpreted and compiled languages.  Mid-level languages are almost always defined in terms of low-level languages (e.g. the Java compiler is bootstrapped from C).

Characteristics

  • High-level abstractions such as objects (or functions)
  • Static typing
  • Extremely commonplace (mid-level languages are by far the most widely used)
  • Virtual machines
  • Garbage collection
  • Easy to reason about program flow

High-Level

High-level languages are really interesting if you think about it.  They are essentially mid-level languages which just take the concepts of abstraction and high-level constructs to the extreme.  For example, Java is mostly object-oriented, but it still relies on primitives which are represented directly in memory.  Ruby on the other hand is completely object-oriented.  It has no primitives (outside of the runtime implementation) and everything can be treated as an object.
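As a minimal sketch of the “everything is an object” claim, the Ruby snippet below asks a handful of literal values for their classes and calls methods directly on them.  The variable names are purely illustrative, and the exact class reported for an integer depends on the Ruby version in use.

```ruby
# Literals that would be primitives (or null) in Java.
values = [42, 3.14, "text", :symbol, true, nil]

# Each value can be asked for its class, like any other object.
class_names = values.map { |v| v.class.name }

# Every one of them is an instance of Object.
all_objects = values.all? { |v| v.is_a?(Object) }

# Integer addition is just a method call on the receiver.
sum = 1.+(2)

# Even nil responds to messages.
nil_text = nil.to_s

puts class_names.join(", ")
puts "all objects? #{all_objects}, 1.+(2) = #{sum}"
```

In Java, the equivalent of `1.+(2)` is not expressible, because `int` values have no methods at all.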

In short, high-level languages are the logical semantic evolution of mid-level languages.  It makes a lot of sense when you consider the philosophy of simplification and increase of abstraction.  After all, people were n times more productive switching from C to Java with all of its abstractions.  If that really was the case, then can’t we just add more and more layers of abstraction to increase productivity exponentially?

High-level languages tend to be extremely dynamic.  Runtime flow is changed on the fly through the use of things like dynamic typing, open classes, etc.  This sort of technique provides a tremendous amount of flexibility in algorithm design.  However, this sort of mucking about with execution also tends to make the programs harder to reason about.  It can be very difficult to follow the flow of an algorithm written in Ruby.  This “obfuscation of flow” is precisely why I don’t think high-level languages like Ruby are suitable for large applications.  That’s just my opinion though.  :-)
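To make the “obfuscation of flow” concrete, here is a small sketch of an open class in Ruby.  The `Greeter` class is hypothetical; the point is only that the same object, receiving the same message, behaves differently after the class is reopened somewhere else in the program.

```ruby
# Original definition, as a reader might find it in one file.
class Greeter
  def greet(name)
    "Hello, #{name}"
  end
end

greeter = Greeter.new
before = greeter.greet("world")

# Any other file loaded later may reopen the class and replace the method.
class Greeter
  def greet(name)
    "Goodbye, #{name}"
  end
end

# Same object, same call, different result.
after = greeter.greet("world")

puts before
puts after
```

Nothing at the call site hints that the method was replaced, which is exactly what makes this style flexible and also hard to reason about in a large code base.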

Characteristics

  • Interpreted
  • Dynamic constructs (open classes, message-style methods, etc.)
  • Poor performance
  • Concise code
  • Flexible syntax (good for internal DSLs)
  • Hybrid paradigm (object-oriented and functional)
  • Fanatic community

Oddly enough, high-level language developers seem to be much more passionate about their favorite language than low- or mid-level developers.  I’m not entirely sure why it has to be this way, but the trend has been far too universal to ignore (Python, Perl, Ruby, etc.).  Ruby is of course the canonical example of this, primarily because of the skyrocketing popularity of Rails, but any high-level language has its fanatic evangelists.

What’s really interesting about many high-level languages is the tendency to fall into a hybrid paradigm category.  Python for example is extremely object-oriented, but also allows things like closures and first-class functions.  It’s not as powerful in this respect as a language like Scala (which allows methods within methods within methods), but nevertheless it is capable of representing most elements of a pure-functional language.
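The paragraph above uses Python as its example; the same hybrid style can be sketched in Ruby, which is the language used for the other examples here.  The names `make_counter` and `compose` are illustrative only.

```ruby
# A closure: the returned lambda captures and mutates the local `count`.
def make_counter
  count = 0
  -> { count += 1 }
end

counter = make_counter
counter.call
second = counter.call

# First-class functions: lambdas are passed in and a new lambda is returned.
compose = ->(f, g) { ->(x) { f.call(g.call(x)) } }
double  = ->(x) { x * 2 }
inc     = ->(x) { x + 1 }

double_then_inc = compose.call(inc, double)
result = double_then_inc.call(5)

puts second
puts result
```

Here `second` is 2 because the closure remembers its own `count` between calls, and `result` is 11 because `double` runs before `inc`.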

As an aside, high-level languages usually perform poorly compared with low- or even mid-level languages.  This is merely a function of the many layers of abstraction between the code and the machine itself.  One statement in Ruby may translate into literally thousands of machine instructions.  Of course, high-level languages are almost exclusively used in situations where such “raw-metal” performance is unnecessary, but it’s still a language trait worth remembering.

Conclusion

It’s important to remember that I’m absolutely not recommending one language or “level” over another for the general case.  The very reason we have such a graduated variety of language designs is that there is a need for all of them at some point.  The Linux kernel could never be written in Ruby, and I would never want to write an incremental backup system in Assembly.  All of these languages have their uses; it’s just a matter of identifying which language matches your current problem most closely.

Should ORMs Insulate Developers from SQL?

25
Feb
2008

This is a question which is fundamental to any ORM design.  And really from a philosophical standpoint, how should ORMs deal with SQL?  Isn’t the whole point of the ORM to sit between the developer and the database as an all-encompassing, object oriented layer?

A long time ago in an office far, far away, a very smart cookie named Gavin King got to work on what would become the seminal reference implementation for object relational mapping frameworks the world over (or so Java developers would like to think).  This project was to be bundled with JBoss, possibly the most popular enterprise application server, and would support dozens of databases out of the box.  It was to offer heady benefits such as totally object-oriented database access, transparent multi-tier caching and a flexible transaction model.  At its core though, Hibernate was designed to resolve a single problem: application developers hate SQL.

No really, it’s true!  Bread-and-butter application developers really dislike accessing data with SQL.  This has led to endless conflict (and bad jokes) between application developers and database administrators.  Oftentimes the development team would write a set of boilerplate lines in Java and then copy/paste these arbitrarily throughout their code, swapping in the relevant query as supplied by the DBA.  For obvious reasons, this would become very hard to maintain and just intensified the bad blood between developer and database.

If you think about it though, it’s a bit odd that this intense dislike would mutate from just hating the insanity of JDBC to hating JDBC, SQL and RDBMS in general.  SQL is a very nice, almost mathematical language which allows phenomenally powerful queries to be expressed simply and elegantly.  It abstracts developers from the headache of database-specific hashing APIs and algorithms which are almost filesystem-like in complexity.  The language was designed to make it as easy as possible to get data out of a relational database.  The fact that this effort backfired so utterly is a source of endless confusion to me.

But regardless, we were talking about ORMs.  When it was first introduced, Hibernate held out the promise that developers would never again have to wade knee deep through a sea of half-set SQL.  Instead, developers would pass around POJOs (Plain Old Java Objects), modifying their values like any other Java bean and then handing these objects off to the data mapper, which would handle the details of persistence.  Furthermore, Hibernate promised that developers would never again have to worry about which databases support which non-standard SQL extensions.  Since developers would never have to work with SQL, anything database-specific could be handled within the persistence manager deep in the bowels of Hibernate itself.

This all seems lovely and wonderful, but there’s a catch: it doesn’t work so well in practice.  Now before you stone me, I’m not talking about Hibernate specifically now, but ORMs in general.  It turns out to be completely impossible to interact with a relational database solely through an object-oriented filter.  This is easily seen with a simple example:

SELECT * FROM people WHERE age > 21 GROUP BY lastName

How in the world are you going to represent that in an object model?  Sure, maybe you can provide a little abstraction for the query details, but it starts to get complex if you try to handle things like grouping non-declaratively.  The developers working on Hibernate quickly realized this problem and came up with an innovative solution: write their own query language!  After all, SQL is too confusing, so why not invent an entirely new query language with the “feel” of SQL (to keep the DBAs happy) but without all of the database-specific wrinkles?

This query language is now called “HQL”, and as the name implies, it’s really SQL, but not quite.  Here’s how the aforementioned example would look in HQL (disclaimer: I’m not a Hibernate expert, so I may have gotten the syntax wrong):

FROM Person WHERE age > 21 GROUP BY lastName

Remarkably similar, that.  Executing this query in a Hibernate persistence manager yields an ordered list of Person entities pre-populated with data from the query.  It seems to make a lot of sense, but there are a number of problems with this approach.  First, it requires Hibernate to literally have its own compiler to translate HQL queries into database-specific SQL.  Second, it hasn’t really solved the core problem that many developers have with SQL: it’s a declarative query language.  As you can see, HQL is really just SQL in disguise, so it really doesn’t eliminate SQL from your database access, just dresses it in a funny hat.

Other ORMs have appeared over the years, taking alternative approaches to the problem of object-relational mapping, but none of them has quite eliminated the query language.  Even DSL-based ORMs like ActiveRecord fail to remove SQL entirely:

class Person < ActiveRecord::Base; end
 
Person.find(:all, :conditions => 'age > 21', :group => 'lastName')

It’s sort of SQL-free, but you can still see bits and pieces of a query language around the edges.  In fact, what ActiveRecord is actually doing here is building a proper SQL query around the SQL fragments which are passed as parameters.  It’s a system which is ripe for SQL injection, but surprisingly leads to very few problems in real-world applications.  This is the approach which is also taken by ActiveObjects for its database query API.
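The injection risk mentioned above can be sketched without any ORM at all.  Both helpers below (`naive_query` and `placeholder_query`) are hypothetical stand-ins written for illustration, not ActiveRecord’s internals; they only show the difference between splicing a value into the SQL text and keeping it separate for the database driver to bind.

```ruby
# Hypothetical helper: wraps a caller-supplied fragment in a full query.
def naive_query(table, conditions)
  "SELECT * FROM #{table} WHERE #{conditions}"
end

# Imagine this value arrived from a web form instead of being the number 21.
min_age = "21 OR 1=1"

# The fragment is built by string interpolation, so the input becomes SQL.
unsafe_sql = naive_query("people", "age > #{min_age}")

# Hypothetical alternative: the fragment keeps a `?` placeholder and the
# value travels alongside it, to be bound by the driver rather than parsed.
def placeholder_query(table, conditions, *params)
  ["SELECT * FROM #{table} WHERE #{conditions}", params]
end

safe_sql, bound_params = placeholder_query("people", "age > ?", min_age)

puts unsafe_sql
puts safe_sql
puts bound_params.inspect
```

The first query now matches every row in the table, while the second still contains only the fragment the developer wrote.  ActiveRecord of this era also accepts an array for `:conditions` (a fragment with `?` placeholders followed by the values), which follows the same idea.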

So ORMs in and of themselves seem to have failed to entirely eliminate SQL from the picture, but what about other frameworks?  There are a few quite recent efforts which seem to have nearly succeeded in eliminating the direct use of SQL completely from application code.  Ambition is perhaps the best (and most clever) example of this, though others like scala-rel are catching up fast.  Ambition is designed from the ground up to interact naturally with ActiveRecord, so the two combined perhaps represent the first “true” ORM: one which does not require the developer at any point to deal with any SQL whatsoever.
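To give a feel for how a library might hide SQL behind ordinary Ruby expressions, here is a deliberately tiny toy.  It is not Ambition’s API or implementation; the `Column`, `Row` and `Query` classes are invented for this sketch, and it handles only a single `>` comparison and a single grouping column.

```ruby
# Hypothetical toy: a column whose comparison operator emits SQL text.
class Column
  def initialize(name)
    @name = name
  end

  # Returns a SQL fragment instead of a boolean.
  def >(value)
    "#{@name} > #{Integer(value)}"
  end

  def to_s
    @name
  end
end

# Hypothetical toy: any unknown method name is treated as a column name.
class Row
  def method_missing(name, *args)
    Column.new(name.to_s)
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

# Hypothetical toy: collects the fragments produced by the blocks.
class Query
  def initialize(table)
    @table = table
    @where = nil
    @group = nil
  end

  def where
    @where = yield(Row.new)
    self
  end

  def group_by
    @group = yield(Row.new).to_s
    self
  end

  def to_sql
    sql = "SELECT * FROM #{@table}"
    sql += " WHERE #{@where}" if @where
    sql += " GROUP BY #{@group}" if @group
    sql
  end
end

sql = Query.new("people")
           .where { |p| p.age > 21 }
           .group_by { |p| p.lastName }
           .to_sql

puts sql
```

The blocks look like plain Ruby, yet the result is the same query used as the running example in this article.  A real library has to cover far more of the language than this, which is part of why the approach is so clever and also so much work.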

But was it really worthwhile?  As clever as things like Ambition are, is it really that much easier than just writing queries in SQL?  As Nathan Hamblen so eloquently said (when referring to a totally different topic):

…is the end of the ORM rainbow.  You get there, throw yourself a party and realize that important things are broken.

A quote taken out of context perhaps, but I think it applies to the “cult of SQL genocide” with as much validity.  In the end, by denying yourself access to the powerful and well-understood mechanism that is SQL, you’re just crippling your own application and forcing yourself to write more code instead of less.

So what’s the “right” approach?  Is there a happy medium between ActiveRecord+Ambition and full-blown SQL on Rails?  I think so, and that is the approach I have been trying to implement with ActiveObjects.  As I’m sure you know, ActiveObjects takes a lot of its inspiration from ActiveRecord, so the syntax for querying the database is very similar:

EntityManager em = ...
em.find(Person.class, Query.select().where("age > 21").group("lastName"));
 
// ...or
em.find(Person.class, "age > 21");   // no grouping

You still have the full power of SQL available to you.  You can still write complex, nested boolean conditionals and funky subqueries, but there’s no longer any need to be burdened with the whole of SQL’s verbosity.  As with vanilla ActiveRecord, this code intends to be a bit of a hand-holder, shielding innocent application developers from the fierce world of RDBMS.

Is this the right way to go?  I’m honestly not sure.  I’ve met a lot of developers who would give their left eye to never have to look at another SQL statement again (for developers already missing a right eye, this isn’t much of a stretch).  On the other hand, there are purists like myself who revel in the freedom afforded by a powerful, declarative language.  It’s hard to say which path is better, but at the end of the day, it’s really the question itself that matters.  Giving application developers the choice to select whichever approach they feel is most appropriate: that is the solution.