Skip to content

The Brilliance of BDD

9
Jun
2008

As I have previously written, I have recently been spending some time experimenting with various aspects of Scala, including some of the frameworks which have become available.  One of the frameworks I have had the privilege of using is the somewhat unassumingly-titled Specs, and implementation of the behavior-driven development methodology in Scala.

Specs takes full advantage of Scala’s flexible syntax, offering a very natural format for structuring tests.  For example, we could write a simple specification for a hypothetical add method in the following way:

object AddSpec extends Specification {
  "add method" should {
    "handle simple positives" in {
      add(1, 2) mustEqual 3
    }
 
    "handle simple negatives" in {
      add(-1, -2) mustEqual -3
    }
 
    "handle mixed signs" in {
      add(1, -2) mustEqual -1
      add(-1, 2) mustEqual 1
    }
  }
}

We could go on, of course, but you get the picture.  This code will lead to the execution of four separate assertions in three tests (to put things into JUnit terminology).  Fundamentally, this isn’t too much different than a standard series of unit tests, just with a slightly nicer syntax.

Specs defines a domain-specific language for structuring test assertions in a simple and intuitive way.  However, this is hardly the only framework for BDD.  Perhaps the most well-known such framework is RSpec, which answers a similar use-case in the Ruby programming language.  Our previous specification could be rewritten using RSpec as follows:

describe AddLib do
  it 'should handle simple positives' do
    add(1, 2).should == 3
  end
 
  it 'should handle simple negatives' do
    add(-1, -2).should == -3
  end
 
  it 'should handle mixed signs' do
    add(1, -2).should == -1
    add(-1, 2).should == 1
  end
end

The end-result is basically the same: the add method will be tested against the given assertions (all four of them) and the results printed in some sort of report form.  In this area, RSpec is significantly more mature than Specs, generating very slick HTML reports and nicely formatted console output.  This isn’t really a fundamental weakness of the Specs framework however, just indicative of the fact that RSpec has been around for a lot longer.

These two frameworks are interesting of course, but they are merely implementations of a much larger concept: behavior-driven development.  I’ve never been much of a fan of unit testing.  It’s always seemed to be incredibly dull and a very nearly fruitless waste of effort.  As much as I hate it though, I have to bow to the benefits of a self-contained test suite; and so I press on, cursing JUnit every step of the way.

BDD provides a nice alternative to unit testing.  At its core, it is not much different in that test groupings and primitive assertions are used to check all aspects of a test unit against predefined data.  However, there is something about the “flow” of a behavioral spec that is considerably easier to deal with.  For some reason, it is far less painful to devise a comprehensive test suite using BDD principles than conventional unit testing.  It seems a little far-fetched, but BDD actually makes it easier to write (and more importantly, formulate) exactly the same tests.

It’s an odd phenomenon, one which can only be caused by the storyboard flow of the code itself.  It is very natural to think of distinct requirements for a test unit when each of these requirements are being labeled and entered in a logical sequence.  Moreover, the syntax of both Specs and RSpec is such that there is very little boiler-plate required to setup an additional test.  Compare the previous BDD specs with the following JUnit4 example:

public class MathTest {
    @Test
    public void testSimplePositives() {
        assertEquals(3, add(1, 2));
    }
 
    @Test
    public void testSimpleNegatives() {
        assertEquals(-3, add(-1, -2);
    }
 
    @Test
    public void testMixedSigns() {
        assertEquals(-1, add(1, -2));
        assertEquals(1, add(-1, 2));
    }
}

JUnit just requires that much more syntax.  It breaks up the logical flow of the tests and (more importantly) the developer train of thought.  What can be worse is this syntax bloat makes it very tempting to just group all of the assertions into a single test - to save typing if nothing else.  This is problematic because one assertion may shadow all the others in the case of a failure, preventing them from ever being executed.  This can make certain problems much more difficult to isolate.

Logical flow is extremely important to test structure.  BDD frameworks provide a very nice syntax for painlessly defining comprehensive test suites.  The really wonderful thing about all of this is that BDD is available on the JVM, right now.  There’s nothing stopping you from writing your code in Java as you normally would, then creating your test suite in Scala using Specs rather than JUnit.  Alternatively, you could use RSpec on top of JRuby, or Gspec with Groovy.  All of these are seamless replacements for a test framework like JUnit, and requiring of far less syntactic overhead.

The growing move toward polyglot programming encourages the use of a separate language when it is best suited to a particular task.  In this case, several languages are available which offer far more powerful test frameworks than those which can be found in Java.  Why not take advantage of them?

The Problem of Perspective Multiplicity

2
Jun
2008

Some six years ago, I switched my primary IDE from NetBeans to Eclipse JDT (then 2.0).  At the time, I did this primarily because NetBeans was too much of a resource hog for my pathetic development machine, but I quickly learned to appreciate the power of the Eclipse development environment.  NetBeans has since made great strides of course, but at the time, Eclipse was lightyears beyond it in both features and polish.

One of the more interesting features offered by Eclipse was the concept of a “perspective”, a collection of views in a specific layout conducive to performing a specific series of tasks.  The major upshot of this was instead of the debugger views popping in and out, they simply remained hidden in a separate perspective, ready to restore to your customized configuration as necessary.  This innovation was also present in other areas, such as the CVS Team view and the Update Manager (yes, the Eclipse update system was once a set of views and editors).

You could switch between these perspectives manually of course, but most of the time Eclipse was able to just detect which perspective you needed and make the switch automatically.  If you were to launch an application in debug mode for example, the “Debug” perspective would be opened automatically, bringing useful views to the fore.  Once you were done debugging, it was easy to switch back to the “Java” perspective for more streamlined editing.  It was a good system, and it worked well.

Unfortunately, times have changed.  Don’t get me wrong, I still love having all my debug views and layout saved for me in a discrete section of the app, ready to access on a moment’s notice.  But Eclipse is no longer the single-purpose application it once was.  Yes, I know that it has always been billed as “an open tool platform for everything and nothing in particular”, but back in the day (and especially before OSGi) most people had yet to realize this.  The only language supported by Eclipse on any serious level was Java, thus the perspective system worked extremely well for organizing IDE views.  Now, Eclipse serves as the foundation for IDE frameworks supporting dozens of different languages, requiring an equal (if not greater) number of perspectives.

 image

Even in this screenshot, I’m still hiding easily 70% of the perspectives available to Eclipse.  With all of these different view collections and configurations, it’s no wonder that people often find Eclipse to be confusing compared to other IDEs.  In NetBeans (for example) you can work with as many languages as you want within a single perspective/layout/configuration.  The outline shows the relevant information for whatever file you have open, and the project explorer view is fully integrated with each language, showing all available projects and their associated structure.  Most importantly, this view is able to show project logical structure as dictated by the support module for that language (e.g. src/, test/, etc).

Effectively, other IDEs have evolved a single “Development” perspective, one which shows a generic set of views common to all languages.  Unlike Eclipse, which requires switching to the Ruby perspective or the C/C++ perspective to get the appropriate project viewer, NetBeans has one project viewer which is extensible by any module.  Eclipse has some of this with the Package Explorer, but some plugins like DLTK don’t properly integrate and so the view isn’t as streamlined.  Additionally, some functions like “Open Type” don’t work appropriately unless in the corresponding perspective for a given language.

Yes, I am aware that I could simply open any views I want within a single perspective, but that’s not what I’m looking for.  I don’t want to open five different views for navigating project files, I want to have one master view which shows me everything through the filter of whatever language is relevant to the project.  Project Explorer comes close, but it fails to handle the tighter integration (such as the “Referenced Libraries” in JDT or script outlines for DLTK).

Theoretically, Eclipse only needs four or five perspectives for the average developer working with any number of languages: Develop, Debug, Test, Repositories, Synchronize.  Obviously, more perspectives would be needed for functionality which does not conform to normal development conventions (such as “Planning” or even “Email”), but I think that these core perspectives could provide a consistent, generic framework to which any language IDE could conform.  We can already see something similar happening with the Debug perspective, which is used by Java, Ruby, Scala and C/C++ alike.

What is needed is a common super-framework to be extended by actual language implementations such as JDT, CDT and the like (similar to what DLTK provides but more encompassing).  This framework should provide a common platform with features such as project viewing, outline, documentation, type hierarchy, call hierarchy, open type, etc.  This platform would then be specialized by the relevant IDE and the same views would allow extension to fit the needs of the language in question.  This already happens with the Outline view, but it needs to occur with other common functions as enumerated.  Views which are not common to different languages (such as Ant Build or Make Targets) would of course not be contained within this super-framework, but would be separate views as they are now.  This framework would allow a developer to use a single set of views for any language, never requiring a workflow-disrupting change of perspective.

The building blocks are all in place, and such an effort would still be in line with the Eclipse philosophy of total extensibility, it’s merely a question of implementation and opinion.  The implementation is simple, as I said, most of the functionality is already available (often redundantly) in any one of the many IDE packages.  The bigger challenge is to convince those who have the power to make the decision.  Eclipse 4.0 is coming, it should be an interesting road to follow.

The Plague of Polyglotism

28
Apr
2008

For those of you who don’t know, polyglotism is not some weird religion but actually a growing trend in the programming industry.  In essence, it is the concept that one should not be confined to a single language for a given system or even a specific application.  With polyglot programming, a single project could use dozens of different languages, each for a different task to which they are uniquely well-suited.

As a basic example, we could write a Wicket component which makes use of Ruby’s RedCloth library for working with Textile.  Because of Scala’s flexible syntax, we can use it to perform the interop between Wicket and Ruby using an internal DSL:

class TextileLabel(id:String, model:IModel) extends WebComponent(id, model) with JRuby {
  require("textile_utils")
 
  override def onComponentTagBody(stream:MarkupStream, openTag:ComponentTag) {
    replaceComponentTagBody(markupStream, openTag, 
        'textilize(model.getObject().toString()))
  }
}
# textile_utils.rb
require 'redcloth'
 
def textilize(text)
  doc = RedCloth.new text
  doc.to_html
end

Warning: Untested code

We’re actually using three languages here, even though we only have source for two of them.  The Wicket library itself is written in Java, our component is written in Scala and we work with the RedCloth library in Ruby.    This is hardly the best example of polyglotism, but it suffices for a simple illustration.  The general idea is that you would apply this concept to a more serious project and perform more significant tasks in each of the various languages.

The Bad News

This is all well and good, but there’s a glaring problem with this philosophy of design: not everyone knows every language.  You may be a language aficionado, picking up everything from Scheme to Objective-C, but it’s only a very small percentage of developers who share that passion.  Many projects are composed of developers without extensive knowledge of diverse languages.  In fact, even with a really good sampling of talent, it’s doubtful you’ll have more than one or two people fluent in more than two languages.  And unfortunately, there’s this pesky concern we all love called “maintainability”.

Let’s pretend that Slava Pestov comes into your project as a consultant and decides that he’s going to apply the polyglot programming philosophy.  He writes a good portion of your application in Java, Lisp and some language called Factor, pockets his consultant’s fee and then moves on.  Now the code he wrote may have been phenomenally well-designed and really perfect for the task, but you’re going to have a very hard time finding a developer who can maintain it.  Let’s say that six months down the road, you decide that your widget really needs a red push button, rather than a green radio selector.  Either you need a developer who knows Factor (hint: there aren’t very many), or you need a developer who’s willing to learn it.  The thing is that most developers with the knowledge and motivation to learn a language have either already done so, or are familiar enough with the base concepts as to be capable of jumping right in.  These developers fall into that limited group of people fluent in many different languages, and as such are a rare find.

Now I’m not picking on Factor in any way, it’s a very interesting language, but it still isn’t very widespread in terms of developer expertise.  That’s really what this all comes down to: developer expertise.  Every time you make a language choice, you limit the pool of developers who are even capable of groking your code.  If I decide to build an application in Java, even assuming that’s the only language I use, I have still eliminated maybe 20% of all developers from ever touching the project.  If I make the decision to use Ruby for some parts of the application, while still using Java for the others, I’ve now factored that 80% down to maybe 35% (developers who know Java and Ruby).  Once I throw in Scala, that cuts it down still further (maybe at 15% now).  If I add a fourth language - for example, Haskell - I’ve now narrowed the field so far, that it’s doubtful I’ll find anyone capable of handling all aspects within a reasonable price range.  It’s the same problem as with framework choice, except that frameworks are much easier to learn than languages.

The polyglot ideal was really devised by a bunch of nerdy folks like me.  I love languages and would like nothing better than to get paid to learn half a dozen new ones (assuming I’m coming into a project with a strange combination I haven’t seen before).  However, as I understand the industry, that’s not a common sentiment.  So a very loud minority of developers (/me waves) has managed to forge a very hot methodology, one which excludes almost all of the hard-working developer community.  If I didn’t know better, I would be tempted to say that it was a self-serving industry ploy to foster exclusivity in the job market.

I want to work on multi-language projects as much as anyone, but I really don’t think it’s the best thing right now.  I’m working on a project now which has an aspect for which Scala would be absolutely perfect, but since I’m the only developer on hand who is remotely familiar with the language, I’m probably going to end up recommending against its adoption.  Consider carefully the ramifications of trying new languages on your own projects, you may not be doing future developers any favors by going down that path.

Algorithm Proof Inference in Scala

1
Apr
2008

Anyone who’s written any sort of program or framework knows first-hand the traumas of testing.  The story always goes something like this.  First, you spend six months writing two hundred thousand lines of code which elegantly expresses your intent.  Next, you spend six years writing two hundred million lines of code which tests that your program actually does what you think it does.  Of course, even then you’re not entirely sure you’ve caught everything, so you throw in a few more years writing code which tests your tests.  Needless to say, this is a vicious cycle which usually ends badly for all involved (especially the folks in marketing who promised the client that you would have the app done in six hours).  The solution to this “test overload cycle” is remarkably simple and well-known, but certain problems have constrained its penetration into the enterprise world (that is, until now).  The solution is program proving.

A More Civilized Age

Program proofs are basically a (usually large) set of mathematical expressions which rigorously prove that all accepted program outcomes are correct.  A program prover takes these expressions and performs the appropriate static analysis on your code, spitting out either a “yes” or a “no”.  It’s far more effective than boring old unit testing since you can be absolutely certain that all the bugs have been caught.  More than that, it doesn’t require you to write reams of test code.  All that is required is the series of expressions defining program intent:

\Gamma_m \to \{\Gamma_0 = \textit{input} | m \sub \Gamma_0\} \cup \Sigma^\star

\epsilon = e^{\epsilon \pm \delta}

f \colon \{ x \in [\epsilon, \infty) | x \to \Gamma_x \}

\rho_\delta = f(\Gamma_\delta \bullet \Gamma_\alpha) \Rightarrow \alpha \in \Sigma^\star

\mbox{prove}(\Omega) = \Omega \in \Gamma_\Beta \Rightarrow (\Omega \times \epsilon) \vee \rho_\omega

There, isn’t that elegant?  Much better than some nasty JUnit test suite.  In a happier world, all tests would be replaced by this simple, easy-to-read representation of intent.  Unfortunately, program provers have yet to overcome the one significant stumbling block that has barred them from general adoption: the limitations of the ASCII character set.

Sadly, the designers of ASCII never anticipated the widespread need to express a Greek delta, or to properly render an implication.  Of course, we could always fall back on Unicode, but support for that character set is somewhat lacking, even in modern programming languages.  And so program provers languish in the outer darkness, unable to see the wide-scale adoption they so richly deserve.

An Elegant Weapon

Fortunately, Scala can be the salvation for the program prover.  It’s hybrid functional / object-oriented nature lends itself beautifully to the expression of highly mathematical concepts of intense precision.  Theoreticians have long suspected this, but the research simply has not been there to back it up.  After all, most PhDs write all their proofs on a blackboard, making use of a character set extension to ASCII.  Fortunately for the world, that is no longer an issue.

The answer is to to turn the problem on its ear (as it were) and eschew mathematical expressions altogether.  Instead of the developer expressing intent in terms of epsilon, delta and gamma-transitions, a simple framework in Scala will infer intent just based on the input code.  All of the rules will be built dynamically using an internal DSL, without any need to mess about in low-level character encodings.  Scala is the perfect language for this effort.  Not only does its flexible syntax allow for powerful expressiveness, but it even supports UTF-8 string literals!  (allowing us to fall back on plan B when necessary)

Note that while Ruby is used in the following sample, the proof inference is actually language agnostic.  This is because the parsed ASTs for any language are virtually identical (which is what allows so many languages to run on the JVM).

class MyTestClass
  def multiply(a, b)
    a * b
  end
end
 
obj = MyTestClass.new
puts obj.multiply(5, 6)

Such a simple example, but so many possible bugs.  For example, we could easily misspell the multiply method name, leading to a runtime error.  Also, we could add instead of multiply in the method definition.  There are truly hundreds of ways this can go wrong.  That’s where the program prover steps in.

We define a simple Scala driver which reads the input from stdin and drives the proof inference framework.  The framework then returns output which we print to stdout.

object Driver extends Application {
  val ast = Parser.parseAST(System.in)
 
  val infer = new ProofInference(InferenceStyle.AGRESSIVE)
  val prover = new ProgramProver(infer.createInference(ast))
 
  val output = prover.prove(ast)
  println(output)
}

It’s as simple as that!  When we run this driver against our sample application, we get the following result:

image

Notice how the output is automatically formatted for easy reading?  This feature dramatically improves developer productivity by reducing the time devoted to understanding proof output.  One of the many criticisms leveled against program provers is that their output is too hard to read.  Not anymore!

Of course, anyone can write a program which outputs fancy ASCII art, the real trick is making the output actually mean something.  If there’s a bug in our program, we want the prover to find it and notify us.  To see how this works, let’s write a new Ruby sample with a minor bug:

class Person < AR::Base
end
 
me = Person.find(1)
me.ssn = '123-45-6789'
me.save

It’s an extremely subtle flaw.  The problem here is that the ssn field does not exist in the database.  This Ruby snippet will parse correctly and the Ruby interpreter will blithely execute it until the critical point in code, when the entire runtime will crash.  This is exactly the sort of bug that Ruby on Rails adopters have had to deal with constantly.

No IDE in the world will be able to check this code for you, but fortunately our prover can.  We feed the test program into stdin and watch the prover do its thing:

image

Once again, clear and to the point.  Notice how the output is entirely uncluttered by useless debug traces or line numbers.  The only thing we need to know is that something is wrong, and the prover can tell us that.

Conclusion

I can speak from experience when I say that this simple tool can work wonders on any project.  Catching bugs early in the development cycle is the Holy Grail of software engineering.  By learning there’s a problem early on, effort can be devoted to finding the bug and correcting it.  I strongly recommend that you take the time to check out this valuable aid.  By integrating this framework into your development process, you may save thousands of hours in QA and testing.

JRuby Interop DSL in Scala

24
Mar
2008

JRuby is an amazing bit of programming.  It has managed to rise from its humble beginnings as a hobby project on SourceForge to the most viable third-party Ruby implementation currently available.  As far as I am aware, JRuby is the only Ruby implementation other than MRI which is capable of running an unmodified Rails application.  But JRuby’s innovation is not just limited to a rock-solid Ruby interpreter, it also provides tight integration between Java and Ruby.

There’s a lot of material out there on how to replace Java with Ruby “glue code” in your application.  The so-called “polyglot programming” technique states that we should embrace multiplicity of language in our applications.  Java may be very suitable for the core business logic of the application, but for actually driving the frontend UI, we may want to use something more expressive (like Ruby).  JRuby provides some powerful constructs which allow access to Java classes from within any Ruby application.  For example:

require 'java'
 
JFrame = javax.swing.JFrame
JButton = javax.swing.JButton
JLabel = javax.swing.JLabel
 
BorderLayout = java.awt.BorderLayout
 
class MainWindow < JFrame
  def initialize
    super 'My Test Window'
 
    setSize(300, 200)
    setDefaultCloseOperation EXIT_ON_CLOSE
 
    label = JLabel.new('You pushed the button', JLabel::CENTER)
      label.visible = false
    add label
 
    button = JButton.new 'Push Me'
    button.add_action_listener do
      label.visible = true
    end
    add(button, BorderLayout::SOUTH)
  end
end
 
window = MainWindow.new
window.visible = true

sshot-1  sshot-2 

Not a terribly complex example, but it illustrates some of the major advantages of JRuby.  Notice how clean and concise this code is.  It wouldn’t have been much longer had I done this using Java, but it would certainly have been less readable.  Ruby is absolutely perfect for this sort of use case (driving a UI).

As I said though, there are a myriad of examples showing this sort of thing.  As such, it’s not a very interesting topic for a posting.  What the masses have failed to cover, however, is how to accomplish the opposite: calling from Java into Ruby.

Likely the reason this topic has received less attention is because Java is the language will the veritable zoo of libraries and frameworks.  The amount of effort and research that has been put into Java simply dwarfs the comparative immaturity of the Ruby offerings.  Given the disparity, why would you even want to call into Ruby from Java?  This conclusion seems logical until one remembers that almost any application which uses Ruby for the frontend must actually pass flow control to Ruby at some point.  This means calling some sort of Ruby code.

The Java Way

There is some information available on the JRuby Wiki.  The wiki article really should include the caveat that “some experimentation may be required.”  Sufficient information is available, but it is neither intuitive nor convenient.  From Java, the syntax for executing an arbitrary Ruby statement looks like this:

ScriptEngineManager m = new ScriptEngineManager();
ScriptEngine rubyEngine = m.getEngineByName("jruby");
ScriptContext context = engine.getContext();
 
context.setAttribute("label", new Integer(4), ScriptContext.ENGINE_SCOPE);
 
try {
    rubyEngine.eval("puts 2 + $label", context);
} catch (ScriptException e) {
    e.printStackTrace();
}

It’s a typical Java API: over-bloated, over-designed and over-generic.  What would be really nice is to have a syntax for accessing Ruby objects that is as seamless as accessing Java from Ruby.  I want to be able to call Ruby methods and use Ruby classes with the same ease that I can use Java methods and classes.  In short, I want an internal DSL for Ruby.

Unfortunately, Java is a bit constrained in this regard.  Java’s syntax is extremely rigid and does not lend itself well to DSL construction.  It’s certainly possible, but the result is usually less than satisfactory.  We could certainly construct an API around the the Java Scripting API (JSR-233) which provides more high-level access (such as direct method calls and object wrappers), but it would be clunky and only a marginal improvement over the original.

The good news is there’s another language tightly integrated with Java that has a far more flexible syntax.  Rather than building our JSR-233 wrapper in Java, we can avail ourselves of Scala’s power and flexibility, hopefully arriving at a DSL which approaches native “feel” in its syntax.

The Scala Way

Since we’re attempting to construct a tightly-integrated API for language calls, the most effective route would be to apply techniques already discussed in the context of DSL design.  As always, we start with the syntax and allow it to drive the implementation:

// syntax.scala
import com.codecommit.scalaruby._
 
object Main extends Application with JRuby {
  require("test")
 
  associate('Person)(new Person(""))
 
  println("Received from multiply: " + 'multiply(123, 23))
  println("Functional test: " + funcTest('test_string))
 
  val me = new Person("Daniel Spiewak")
  println("Name1: " + me.name)
  println("Name2: " + (me->'name)())
 
  me.name = "Daniel"
  println("New Name: " + me.name)
 
  println("Person#toString(): " + me)
 
  val otherPerson = 'create_person().asInstanceOf[AnyRef]
  println("create_person type: " + otherPerson.getClass())
  println("create_person value: " + otherPerson.send[String]("name")())
 
  eval("puts 'Ruby integration is amazing'")
 
  def funcTest(fun:(Any*)=>String) = fun()
}
 
class Person(name:String) extends RubyClass('Person, name) {
  def name = send[String]("name")()
  def name_=(n:String) = 'name = n
}

And the associated Ruby code:

# test.rb
class Person
  attr_reader :name
  attr_writer :name
 
  def initialize(name)
    @name = name
  end
 
  def to_s
    "Person: {name=#{name}}"
  end
end
 
def test_string
  'Daniel Spiewak'
end
 
def multiply(a, b)
  a * b
end
 
def create_person
  Person.new 'Test Person'
end

Obviously we’re going to need some heavy implicit type conversions.  The important thing to note is that we don’t see any residue of the Java Scripting API, it’s all been encapsulated by our DSL.  We’ve taken an API which is oriented around single-call, low-level invocations and created a high-level wrapper framework which allows method calls, instantiation and even some form of type-checking.

Starting from the top, we see a call which should be familiar to Rubyists, the require statement.  In our framework, this method call is just a bit of syntactic sugar around a call to eval(String).  This semantics are basically the same as within Ruby directly, with the exception of how Ruby source files are resolved.  Any script file on the CLASSPATH is fair game, in addition to the normal Ruby locations.  This allows us to easily embed Ruby scripts within application JARs, libraries and other Java distributables.

Moving down a bit further, we find a somewhat mysterious call to the curried associate(Symbol)(RubyObject) method.  The purpose of this invocation will become more apparent later on.  Suffice it to say that this step is necessary to allow Scala class wrappers around existing Ruby classes.

On the next line of interest, we see for the first time how the framework allows for seamless Ruby method invocation.  Unlike Ruby, Scala doesn’t allow us to simply handle calls to non-existent methods.  Because of this limitation, we have to be a bit more clever in how we structure the syntax.  In this case, we use Scala symbols to represent the method.  There doesn’t seem to be a terribly good explanation of symbols in Scala, but there’s plenty of information regarding how they work in Ruby.  Since the concepts are virtually identical, techniques are cross-applicable.

The key to the whole “symbols as methods” idea is implicit type conversion.  The JRuby trait inherits a set of conversions which look something like this:

implicit def sym2Method[R](sym:Symbol):(Any*)=>R = send[R](sym2string(sym))
implicit def sym2MethodAssign[R](sym:Symbol) = new SpecialMethodAssign[R](sym)
 
private[scalaruby] class SpecialMethodAssign[R](sym:Symbol) {
  def intern_=(param:Any) = new RubyMethod[R](str2sym(sym2string(sym) + "="))(param)
}

Though we haven’t looked at it yet, it is possible to infer the purpose of the send(String) method.  It’s function is to prepare a call to a Ruby method without actually invoking it.  This distinction allows us to pass Ruby methods around as method parameters, just like standard Scala methods.  The method returned is actually an instance of class RubyMethod[R] (where R is the return type).  Scala allows classes to extend structural types like methods, allowing us to redefine the method invocation semantics for wrapped Ruby calls.

class RubyMethod[R](method:Symbol) extends ((Any*)=>R) {
  import JRuby.engine
 
  override def apply(params:Any*) = call(params.toArray)
 
  private[scalaruby] def call(params:Array[Any]):R = {
    val context = engine.getContext()
    val plist = new Array[String](params.length)
 
    for (i <- 0 until params.length) {
      plist(i) = "res" + i
      context.setAttribute(plist(i), JRuby.resolveValue(params(i)), ScriptContext.ENGINE_SCOPE)
      plist(i) = "$" + plist(i)
    }
 
    evaluate(() => if (plist.length > 0) {
      sym2string(method) + "(" + plist.reduceLeft[String](_ + ", " + _) + ")"
    } else {
      sym2string(method) + "()"
    })
  }
 
  protected def evaluate(invoke:()=>String):R = {
    val toRun = invoke()
    Logger.getLogger("com.codecommit.scalaruby").info(toRun)
 
    JRuby.handleExcept(JRuby.wrapValue[R](engine, engine.eval(toRun, engine.getContext())))
  }
}

The gist of this code is simply to assign every parameter value to an attribute in the Ruby runtime.  Attributes of ENGINE_SCOPE (as defined by JSR-233) are represented as global variables within Ruby.  These variables are named sequentially starting from zero.  (e.g. $res0, $res1, …)  As you can imagine, this technique tends to be a bit of a concurrency killer.  To keep things simple, I decided to completely ignore the issues associated with asynchronous execution.  It is certainly possible to adapt the framework to function in a multi-threaded environment, but I didn’t bother to do it.  (one of the perks of blogging is a license to extreme laziness)

Once these parameters are assigned, the method call is evaluated within the context of the runtime.  This is done by literally generating the corresponding Ruby code (done in the anonymous method) and then wrapping the return value in an instance of RubyObject (if necessary).  Note that the send(String) method does not actually kick-start this invocation process at all.  Rather, it creates an instance of RubyMethod[R] which corresponds to the method name.  This class extends (Any*)=>R, so it may be used in the normal “method fashion” - by appending parentheses which enclose parameters (if any).

Supporting Cast

At this point, it’s worth taking a moment to examine the specifics of the framework class hierarchy.  A number of classes exist to wrap around Ruby objects and methods.  We’ve already seen a few of them (RubyMethod[R] and RubyObject), but it’s worth going into more detail as to their purpose and relation to one another. 

Note that these class names often conflict with existing classes in the JRuby implementation.  This odd coincidence is precipitated by the fact that the framework seems to deal with a lot of the same concepts as the JRuby runtime (go figure).  Rather than obfuscating my class naming to avoid conflict, I just assume that you will either make use of the enhanced Scala import feature (as I have in the implementation), or just avoid using the JRuby internal classes.

image

  • RubyObject - The root of the object hierarchy.  This abstract class is designed to encapsulate the core functionality of the generic object (roughly: send, -> and eval) as well as containing all of the implicit type conversions.  Most of the syntax-defining magic happens here (more on this later).
  • JRuby - This is the primary type interface between the developer and the framework.  Classes which wish to make use of Ruby integration must inherit from this trait.  This is where the Logger (for executed statements) is initialized and deactivated.  Within the corresponding object, all of the backend resources are managed.  This is where the actual ScriptEngineManager instance lives, as well as a set of utility methods to handle wrapping and unwrapping of framework-specific objects.
  • RubyWrapperObject - This implementation of RubyObject is designed to wrap around instances which already exist within the Ruby interpreter.  For example, if a Ruby method returns an instance of ActiveRecord::Base, it will be represented in Scala by a corresponding instance of RubyWrapperObject.  Note that objects which are equivalent in the Ruby interpreter are not guaranteed to be pointer-equivalent.  However, the equals(Object) method is well-defined within RubyObject, thus comparisons between RubyObject instances will return sane results.  The == method in Scala is defined in terms of equals(Object), so existing code will behave rationally.
  • RubyClass - With the exception of the JRuby trait, this is likely the only class within the framework which the developer will have to reference explicitly.  This class allows developer-defined Scala classes to wrap around existing Ruby classes, providing type-safe method calls and even extended functionality.  More on this feature later.
  • RubyMethod - We’ve already seen how this serves as a wrapper around calls to Ruby methods.  However, its default implementation assumes that the method is defined in the global namespace.  This is impractical for many method calls (such as dispatch on an object).
  • RubyInstanceMethod - This class solves the problem of object dispatch with RubyMethod.  All of the core functionality is identical to its superclass with the exception of the generated Ruby code.  Instead of just generating a method call passing parameters, this class will generate a method call on a given Ruby object.  Thus, this class depends upon RubyWrapperObject which maintains a reference to a corresponding Ruby instance.

Alternative Dispatch

Not every method call is made on the enclosing scope.  Sometimes it is necessary to call a method in an object to which you have a reference.  For example, a method may return an instance of a some Ruby class.  This instance will be automatically wrapped by a Scala instance of RubyWrappedObject.  Since this Scala class doesn’t actually define any methods which correspond to the Ruby class, it is necessary to once more utilize the “symbols as methods” trick in method dispatch.  There are two ways to call a method on an object like this: the send[R](String) method (where R is the return type), and the -> (arrow) operator.

Using the arrow operator is a lot like normal method calls, except with symbols instead of method names.  Just like dispatch on the enclosing scope, the call is converted into an instance of RubyMethod (actually, an instance of RubyInstanceMethod) which can then be used as a standard Scala method.  The difference between using arrow and dispatching on the enclosing scope is the syntax must be a little more contrived.

Parentheses have the second-highest priority of all the Scala operators (the dot operator (.) has the highest).  This means that if we simply “follow our nose” where the syntax is concerned, we will arrive at an order of invocation which leads to an undesirable result.  Consider the following sample:

val obj = 'create_person()
obj->'name()

The first call is a standard dispatch on the enclosing scope.  The second call is what is interesting to us.  Reading this line naturally (at least to old C/C++ programmers) we would arrive at the following sequence of events:

  1. Get a reference to the name method from the instance contained within obj
  2. Invoke the method, passing no parameters

Unfortunately, this is not how the compiler sees things.  Because parentheses bind tighter than the arrow operator, it actually resolves the expression in the following way:

  1. Get a reference to the name method contained within the enclosing scope
  2. Invoke the method, passing no parameters
  3. Invoke the -> method on the instance within obj, passing the result of name as a parameter

This is obviously not what we wanted.  Unfortunately, there’s no way to make the arrow operator bind tighter than parentheses.  This is a good thing from a language standpoint, but it causes problems for our syntax.

The solution is to enclose any “arrow dispatch” statement within parentheses so as to force the order of evaluation:

val obj = 'create_person()
(obj->'name)()

It looks a bit weird, but it’s the only way Scala will allow this to work.  This call now evaluates properly, calling the name method on the obj instance, passing no parameters.

There’s actually another problem associated with arrow dispatch in our DSL: Scala already has an implicit meaning for the arrow operator.  The following sample should look familiar to those of you who have worked with Scala in other applications:

val numbers = Map(1 -> "one", 2 -> "two", 3 -> "three")

By default, Scala defines the arrow operator as an alternative syntax for defining 2-tuples.  This is good for most things, but bad for us.  What we want is to define a new implicit type conversion which converts Any into a corresponding instance of RubyWrappedObject.  This would allow us to satisfy the syntax given above.  However, Scala’s 2-tuple syntax already defines an implicit type conversion for the Any type which deals with the arrow operator.  Rather than examining the context to attempt to disambiguate, the Scala compiler simply gives up and prints an error stating that the implicit type conversions are ambiguous.  This poses a bit of a problem and nearly killed the arrow operator idea in design.

The solution is actually to override Scala’s built-in conversion by defining our own conversion with the same name and signature but which provides us with the option of using our own arrow operator definition.  The behavior we want is to allow normal use of the arrow operator when dealing with Any -> Any, but convert to RubyWrappedObject and dispatch when dealing with Any -> Symbol.  After a little digging through the Scala standard library, I arrived at the following solution (defined in RubyObject):

implicit def any2ArrowAssoc[A](a:A) = new SpecialArrowAssoc(a)
 
private[scalaruby] class SpecialArrowAssoc[A](a:A) extends ArrowAssoc(a) {
  def ->(sym:Symbol) = (a match {
    case obj:RubyObject => obj
    case other => new RubyWrapperObject(other)
  })->sym
}

Notice that we extend Scala’s pre-existing ArrowAssoc[A] class (which handles the special 2-tuple syntax) and then overload the -> method to work differently with symbols.  This code now does precisely what we need.  By introducing this extra layer of indirection, as well as by overriding Scala’s existing conversion, we’re able to support the arrow syntax as shown in the above examples.

Sending Messages

There is one final form of dispatch which allows typed return values: send[R](String).  This is actually the method to which all the other dispatch forms delegate (as it is the most general).  This method is very similar to the Ruby send method which allows Smalltalk-style message passing on arbitrary objects.  The really important thing about this method though is that it will automatically cast the return value from the method to whatever type you specify, allowing you to define type-safe wrappers around existing Ruby methods in Scala:

def multiply(a:Int, b:Int) = send[Int]("multiply")(a, b)
 
val result:Int = multiply(123, 23)

send is effectively defined as a curried function since it takes a method name as a parameter and returns an instance of RubyMethod as a result.  This mimics the behavior of dispatch with symbol literals in that you can use send to generate type-safe partially-applied functions for corresponding Ruby methods.

Note that send could just as easily have taken a symbol as a parameter, rather than a string.  However, the metaphor throughout the DSL is “symbols as methods”, thus string was used to avoid logical conflict.  Scala itself was perfectly happy passing symbol literals around in addition to treating them as methods.

Class Wrapping

The final bit of code in the example now so far above us serves as a sample of how one might wrap an existing Ruby class within Scala.  Person is actually a class defined in Ruby (as you can see from the Ruby sources).  It has a read/write attribute, name, as well as an overridden to_s method.  RubyObject already contains the logic for handling calls to toString() and proxying them to Ruby’s to_s, but the name attribute must be handled explicitly in code.

The goal is basically to provide a type-safe wrapper around the Person Ruby class.  We could just as easily dispatch on the automatically wrapped instance of RubyWrappedObject using either syntax described above, but an explicit wrapper is a bit nicer.  The compiler can check things for us, and we can even add methods to the class (at least, as far as Scala is concerned) in true Ruby “open class” style.  All that is necessary to accomplish this wrapper is to extend RubyClass and to define the delegating wrapper methods:

class Person(name:String) extends RubyClass('Person, name) {
  def name = send[String]("name")()
  def name_=(n:String) = 'name = n
}

We specify which Ruby class we are wrapping as the first parameter in the constructor for RubyClass.  The parameters which follow are passed directly to the constructor of the corresponding Ruby class.  This Ruby constructor is invoked automatically, instantiating the corresponding wrapped Ruby object in the background.  Notice that we specify the name of the Ruby class using a symbol.  This is the one place in the framework that we break with the “symbols as methods” metaphor.  The consequence is a nice, clean syntax for Ruby class wrapping.  Unfortunately, it also means that wrapping a class within a non-included namespace (e.g. ActiveRecord::Base) can be a little clunky.  The only way to do it is to explicitly invoke the Symbol(String) constructor.  (this is required because Scala symbols can only contain alpha-numerics and underscores)

Once we have our wrapped class signature, it’s easy to define the delegate methods.  Scala encourages a blurring of field and method, similar to Ruby.  As such, it supports a very Ruby-esque syntax for accessor/mutator pairs.  This makes the wrapped syntax just a bit nicer.  For the accessor, we make a call to the send method, specifying the return type necessary for the wrapper.  The mutator allows us to be a bit more creative.

We don’t really need type-safe return values for a mutator.  We would normally just set the return type as Unit and ignore the result.  Thus we can once again use the symbol dispatch syntax.  Notice that this time we’re not directly treating a symbol as a method.  We’re apparently assigning a value to the symbol using the = operator (corresponds to the operator= assignment operator in C++).  This is possible through a separate implicit type conversion which generates a one-off utility instance:

private[scalaruby] class SpecialMethodAssign[R](sym:Symbol) {
  def intern_=(param:Any) = new RubyMethod[R](str2sym(sym2string(sym) + "="))(param)
}

As you can see, all this method does is generate a new symbol which includes the ‘=’ character and returns the result of dispatching on the corresponding Ruby method.  Note that mutators in Ruby are defined as “=“, thus appending “=” to the method name is the appropriate behavior.

Return Value Wrapping

There’s actually a slight problem involved in allowing Scala wrappers around existing Ruby types.  Well, not so much a problem as an inconsistency.  The problem is simply this: if a Ruby method creates an instance of a Ruby class for which there is a Scala wrapper and returns this value through the framework into Scala, one would expect this value would be wrapped into an instance of the Scala wrapper.  If you look in the example far above, there is an example of this in the create_person method.  The method creates an instance of Ruby class Person and returns it as a result.

Somehow, the framework must identify that there is a corresponding Scala wrapper and then properly create an instance.  This actually poses something of a dilemma in two ways.  Number one, Scala has no equivalent to Ruby’s ObjectSpace, so there’s no way to get a comprehensive list of all classes which have been defined.  Even if we could get this list, the corresponding Ruby class is specified in the constructor parameters to RubyClass, so there’s no way to obtain the information statically from outside the class.  Number two, we have to somehow create an instance of the Scala wrapper class without creating a corresponding instance of the wrapped Ruby class (since we already have one).  This means we need some sort of override in the RubyClass constructor.

The best solution to all of these problems is to introduce the associate method.  The usage is demonstrated at the top of the example where we associate the Person Ruby class with the Person Scala wrapper class.  More specifically, we associate the Ruby class with a pass-by-name parameter which defines how to instantiate the Scala class.  This is an important distinction as it solves our second problem of instance creation.  The framework has no way of knowing what parameters must be passed to the Scala wrapper constructor, so the instantiation itself must be passed:

associate('Person)(new Person(""))

As I mentioned previously, this is a pass-by-name parameter which means that it will not be immediately evaluated, but rather on-demand somewhere in the body of associate.  The associate method actually takes this value and wraps it in an anonymous method which invokes the instantiation each time a value of Ruby type Person must be wrapped.  Just prior to invoking the constructor, an override is put in place within the RubyClass singleton object (not shown in the class hierarchy) to prevent the creation of a corresponding Ruby instance.  This is what allows the new instance of Scala class Person to correspond with an existing Ruby value.  Here again we’re sacrificing concurrency for a hacky work-around to a complex problem.  Any sort of “proper” implementation would have to solve this problem in a more elegant way.

It Never Ends!

This post, that is.  There’s so much more I could ramble on about (I never even talked about how exceptions are handled), but this entry is already far too long.  Hopefully the material presented here only serves to whet your appetite for slicker JRuby-Scala integration and all the benefits it can bring.  I’ve packaged up the framework presented here as a downloadable archive.  The package includes the Ruby engine for the Java Scripting API as well as a jar-complete build made from the JRuby SVN.  The project may work with JRuby 1.0, but I doubt it.  Anyway, JRuby 1.1 is due shortly, so why bother.  Remember that this is extremely untested and very experimental.  (I did warn you about the concurrency issues, right?)  If this is interesting to people, I may do a proper release into an OSS project somewhere.  For right now, I just don’t have the time.  :-(

I hope this entry gives you an idea of what’s involved in Scala DSL implementation, as well as an idea of where such a technique may be useful in your own projects.  After all, what would be better than everyone being able to write their own Rails-killer and define highly fluid APIs!