Buildr Still Not Ready for Prime Time

12 May 2008

As some of you may know, I’ve been following the Buildr project with a fair degree of interest of late.  Just last week, Assaf announced their first release as an Apache project: Buildr 1.3.0.  Being of an inquisitive bent, I decided that this would be an excellent time to reevaluate its suitability for use in my projects, both commercial and personal.

A while back, I evaluated Buildr as a potential replacement for Ant and Maven2 and eventually came to the unfortunate conclusion that it just wasn’t ready.  At the time, Buildr was still hampered by a number of annoying limitations (such as being unable to reconfigure the default directory structure).  I’m happy to say that many of these problems have been fixed in the 1.3 release, which also adds a large number of features I hadn’t thought to request.  Despite all that, I’m still forced to conclude that Buildr simply isn’t (yet) suitable as my build tool of choice.

Now before you stone me for utter heresy, try to hear me out.  I’m not dissing Buildr in any way; I really want to use it, but I just can’t justify moving over to it in the face of all of its current issues.  What’s really unfortunate is that all of these issues can be summed up with a single word: transitivity.

One of Maven’s killer features is the ability to resolve the entire transitive dependency graph.  Now I’ll grant that it does this with varying degrees of success, but for the most part it’s pretty smart about how it fixes up your CLASSPATH.  As of 1.3, Buildr claims to have experimental support for transitive dependency resolution, but judging from the problems I encountered in my experimentation, it’s barely even deserving of mention as an experiment, much less a full-fledged feature.

To understand why transitive dependencies are such a problem, it is of course necessary to first understand the definition of such.  In a word (well, several anyway), a transitive dependency is an inherited dependency, an artifact which is not depended upon directly by the project, but rather by one of its dependencies.  This definition carries recursively to dependent parents, grand-parents and so on, thus defining a transitive dependency graph.  Consider it this way:

image

In the diagram, MyCoolProject is the artifact we are trying to compile.  The only dependencies we have actually specified for this artifact are the DBPool and Databinder artifacts.  However, the Databinder artifact has declared that it depends upon both the Wicket and the ActiveObjects artifacts.  ActiveObjects doesn’t depend upon anything, but Wicket has dependencies on SLF4J and on the Servlets API.  Thus, our original goal, MyCoolProject, has four transitive dependencies: Wicket, SLF4J, Servlets and ActiveObjects.  Quite a bit more than we thought we were asking for when we declared our dependency on Databinder.
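For the curious, the resolution described above can be sketched in a few lines of plain Ruby.  This is only an illustration of the graph walk, not how Maven or Buildr actually implement it.  The DIRECT_DEPS table and the transitive_deps method are names I made up for the example, and I am assuming that DBPool, SLF4J and Servlets have no dependencies of their own (the diagram doesn’t say either way).

```ruby
require 'set'

# Direct dependencies taken from the diagram. Artifact names only, no versions.
# Assumption: DBPool, SLF4J and Servlets are leaves.
DIRECT_DEPS = {
  'MyCoolProject' => %w[DBPool Databinder],
  'DBPool'        => [],
  'Databinder'    => %w[Wicket ActiveObjects],
  'ActiveObjects' => [],
  'Wicket'        => %w[SLF4J Servlets],
  'SLF4J'         => [],
  'Servlets'      => []
}.freeze

# Depth-first walk collecting every artifact reachable from `root`.
# `seen` doubles as the result and as a guard against visiting a node twice.
def transitive_deps(graph, root, seen = Set.new)
  graph.fetch(root, []).each do |dep|
    next if seen.include?(dep)
    seen << dep
    transitive_deps(graph, dep, seen)
  end
  seen
end

everything = transitive_deps(DIRECT_DEPS, 'MyCoolProject')
inherited  = everything - DIRECT_DEPS['MyCoolProject']

puts "Declared:  #{DIRECT_DEPS['MyCoolProject'].join(', ')}"
puts "Inherited: #{inherited.to_a.sort.join(', ')}"
```

Two declared dependencies go in, six artifacts come out; the four extras are exactly the ones we never asked for by name.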

In general, this sort of transitive resolution is a good thing.  It means that instead of specifying six dependencies, we only had to specify two.  Furthermore, we observed the DRY principle by not re-specifying dependency information already contained within the various packages.  As with any “good thing”, there’s definitely a valid argument regarding its down sides, but on the whole I’m quite fond of this feature.

In Maven, this sort of thing happens by default, which can lead to some very confusing and subtle CLASSPATH problems (conflicting packages, etc).  Buildr takes a slightly different approach in 1.3 by forcing the use of the transitive method:

repositories.remote << 'http://www.ibiblio.org/maven2'
 
define 'MyCoolProject' do
  project.version = '1.0'
 
  compile.with transitive('xmlrpc:xmlrpc-server:jar:3.0')
 
  package :jar
end

It’s easy to see why Buildr is raising such a ruckus in the build genre.  Its syntax is elegance itself, and since it’s actually a proper scripting language (Ruby), there’s really nothing you can’t do with it.  Having to remember to specify the default repository is a bit of a pain, but it’s certainly something I can live with.  The real problem with this example is a little less subtle: It doesn’t work.

If you create a buildfile with the above contents and then run the buildr command, the results will be something like the following:

Downloading org.apache.xmlrpc:xmlrpc:jar:3.0
rake aborted!
Failed to download org.apache.xmlrpc:xmlrpc:jar:3.0, tried the following repositories:
http://www.ibiblio.org/maven2/

(See full trace by running task with --trace)

After a fairly significant amount of digging, I managed to discover that this problem is caused by the fact that Buildr attempts to download a corresponding JAR file for every POM it resolves.  This seems logical until you consider that many large projects (including Databinder, Wicket, and Hibernate) define POM-only projects which exist for the sole purpose of creating transitive dependencies.  They’re organizational units, designed to allow projects to depend upon one super-artifact, rather than a dozen sub-projects.  It’s a very common practice, and one which Buildr completely fails to handle appropriately.
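To make the distinction concrete, here is a rough sketch in plain Ruby of a resolver that follows a POM’s dependencies but only asks for a JAR when the POM’s packaging is actually "jar".  To be clear, this is not Buildr’s code and not a proposed patch.  The POMS table, the example: coordinates and the jars_to_download method are all hypothetical; the only real-world detail is that Maven POMs declare a packaging type, and aggregator projects use "pom".

```ruby
# Hypothetical POM metadata: a packaging type plus direct dependencies.
# 'pom' packaging marks an organizational artifact with no JAR of its own.
POMS = {
  'example:super-artifact' => { packaging: 'pom', deps: %w[example:core example:web] },
  'example:core'           => { packaging: 'jar', deps: [] },
  'example:web'            => { packaging: 'jar', deps: %w[example:core] }
}.freeze

# Returns the artifacts whose JARs should be fetched. POM-only artifacts are
# traversed for their dependencies but never added to the download list.
def jars_to_download(poms, id, visited = {})
  return [] if visited[id]
  visited[id] = true

  pom = poms.fetch(id)
  own = pom[:packaging] == 'jar' ? [id] : []
  own + pom[:deps].flat_map { |dep| jars_to_download(poms, dep, visited) }
end

puts jars_to_download(POMS, 'example:super-artifact').inspect
```

The super-artifact contributes its children to the list without ever appearing in it, which is the behavior the failing download above is missing.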

After some prodding on the Buildr dev mailing-list, Assaf admitted that this is an issue worth looking at and provided a temporary workaround (pending a rework of the functionality in 1.4):

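# Workaround for POM-only artifacts: define the artifact's task ourselves and
# create a placeholder file at its path, rather than letting Buildr try to
# download a JAR that does not exist.
def pom_only(a)
  artifact a do |task|
    mkpath File.dirname(task.to_s)
    Zip::ZipOutputStream.open task.to_s
  end
end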

This was about the point that I started wondering if maybe Ant/Ivy would be a better bet, at least for the time being.  To use this workaround, you must call the pom_only method once for every POM-only project in the dependency graph.  Usually, this means you must invoke Buildr repeatedly and find the troublesome artifacts by trial and error.  Not exactly a “just works” solution.

Pressing forward however, I unearthed a deeper, even more insidious issue: an intermittent failure to generate Eclipse project metadata.  I’m not sure if this is due to the POM-only dependencies or just bad juju, but whatever the reason, it’s annoying.  I’ve raised the issue on the Buildr mailing lists, but so far no response.  Basically, what happens is something like this:

C:\Users\Daniel Spiewak\Desktop\MyCoolProject> buildr eclipse
(in C:/Users/Daniel Spiewak/Desktop/MyCoolProject, development)
Completed in 0.499s
C:\Users\Daniel Spiewak\Desktop\MyCoolProject>

Not exactly the most helpful output.  In case you were wondering, this process did not create the Eclipse metadata.  It’s interesting to note that calling buildr idea (to create project metadata for IntelliJ) seems to work just fine.  Whatever causes the bug, it seems to be specific to just the Eclipse project generator.

Buildr is a remarkable project.  It shows potential to someday become the de-facto build system, possibly even unseating Ant.  Unfortunately, that day is not today.  There are too many odd wrinkles and unpredictable errors to really call it a “finished product”.  Hopefully, the Buildr dev team will continue their excellent work, eventually producing a tool worthy of serious consideration.  Until then, I guess that I’m (still) stuck with Ant.

Is Scala Really the Next C++?

5 May 2008

I’ve been turning that question over in my head for a few months now.  It’s really a worthy thought.  At face value, it’s incredibly derogatory and implicative of an over-bulked, poorly designed language.  While I’m sure this is not how the concept was originally intended, it certainly comes across that way.

But the more I think about it, the more I realize that the parallels are uncanny.  Consider the situation back in 1983, when C++ first got started.  C had become the dominant language for serious development.  It was fast, commonly known and perhaps more importantly, structured.  C had successfully applied the concepts shown in Pascal to systems programming, revolutionizing the industry and becoming the lingua franca of developers everywhere.

When C++ arrived, one of its main selling points was as a “better C”.  It was possible to interoperate seamlessly between C++ and C, even to the point of compiling most C programs unmodified in C++.  But despite its roots, it still managed to introduce a number of features drawn from Smalltalk et al. (such as classes and virtual member functions).  It represented a paradigm shift in the way developers represented concepts.  In fact, I think it’s safe to say that the popular object-oriented design principles that we all take for granted would never have evolved to this level without the introduction of C++.  (Yes, I’m aware of Objective-C and other such efforts, but C++ was the one which caught on.)

So we’ve got a few catch-phrases here: “better C”, “seamless interop”, “backwards compatibility”, “paradigm shift”, etc.  Sound familiar?  (actually, it sounds a lot like Groovy)  The truth is that Scala seems to occupy a very similar place in history (if six months ago can be considered “history”).  Scala is almost an extension to Java.  It brings to the language things like higher-order functions, type inference and a type system of frightening power.  Scala represents a fundamental shift in the concepts and designs we use to model problems.  I truly believe that whatever language we’re using in a decade’s time, it will borrow heavily from the concepts introduced in Scala (in the same way that Java borrowed from C++).

But if Scala and C++ are so similar in historical inception, shouldn’t we view the language with a certain amount of distrust?  We all know what a mess C++ turned out to be, so why should Scala be any different?  I believe the answer has to do with Scala’s fundamental design principles.  Specifically, Scala is not trying to be source-compatible with Java.  You can’t just take Java sources and compile them with Scala.

This clean break with the progenitor language has a number of ramifications.  Most importantly, Scala is able to smooth many of the rough edges in Java without breaking existing libraries.  For example, Scala’s generics are far more consistent than Java’s, despite still being implemented using erasure.  This snippet, for example, fails to compile:

def doSomething(ls:List) = {
  ...
}

All we have done is omit the generic type parameter.  In Java, the equivalent would lead to a compiler warning at worst, because Java has to remain backwards compatible with code written before the introduction of generics.  This “error vs warning” distinction seems a bit trivial at first, but the distinction has massive implications throughout the rest of the type system.  Anyone who has ever tried to write a “generified” library in Java will know what I mean.

Scala represents a clean break from Java.  This is in sharp contrast to C++, which was trying to remain fully backward compatible with C sources.  This meant inheriting all of C’s weird wrinkles (pass-by-value, no forward referencing, etc).  If C++ had just abandoned its C legacy, it would have been a much nicer language.  Arguably, a language more like Java.  :-)

Perhaps the most important distinction between Scala and C++ is that Scala is being designed from the ground up with consistency in mind.  All of the major problems in C++ can be traced back to inconsistencies in syntax, semantics or both.  That’s not to say that the designers of C++ didn’t put a good deal of effort into keeping the language homogeneous, but the truth is that they ultimately failed.  Now we could argue until the cows come home about why they failed, but whatever the reasons, it’s done and it has given C++ a very bad reputation.  Scala on the other hand is being built by a close-knit team of academics who have spent a lifetime thinking about how to properly design a language.  I tend to think that they have a better chance of succeeding than the C++ folks did.

So the moral of this long and rambling post is that you shouldn’t be wary of the Scala language.  It’s not going to become the next evil emperor of the language world.  Far from it, Scala may just represent the next step forward into true programmatic enlightenment.

Useless Hackery: A Scala Quine

30 Apr 2008

Warning: This post has little-to-no practical value.  Waste time at your own risk…

While double-checking the terms for my previous post, I came across the Wikipedia definition of a polyglot program:

In the context of computing, a polyglot is a computer program or script written in a valid form of multiple programming languages, which performs the same operations or output independently of the programming language used to compile or interpret it.

Not precisely the same as the definition which has now come into common use (referring to the use of multiple languages in a single application).  The article goes on to give two examples of a polyglot, one in PHP/C/Bash and one in Haskell/OCaml/Scheme (I don’t count the Perl/DOS example since it doesn’t perform the same function in both languages).  These examples are quite interesting, but what really caught my eye were the additional properties of the second example: not only is it a polyglot, but it is also a quine:

In computing, a quine is a program, a form of metaprogram, that produces its complete source code as its only output.

Think about that for just a second: A program which produces itself as its only output.  I think that’s probably the most profound brain-teaser that I’ve run across in months.  Consider for a moment just how one would accomplish this.  For example, we could try a naive implementation in Ruby:

puts "puts \"puts \\\"..."

You’ll notice that we have run into a bit of a problem.  In fact, the infinitely recursive nature of the definition is precisely what makes quines so interesting.  Of course, I’m aware that there are already a number of very clever Ruby quines, but that’s not the point.  After all, what good is a puzzle if someone else gives you the solution?

By putting a little thought into this, we can devise a slightly more advanced attempt which brings us a bit closer to quine-ness:

s = "s = \"#{s}\"; puts s"
puts s

We’re getting closer, anyway.  We still have a serious problem in that string declaration (hint: it has something to do with the whole recursiveness thing).  We have to somehow include the string within itself once explicitly, but on the inner recursion only include a textual reference to itself.  This is by no means trivial to accomplish.

One technique we can employ is string formatting.  Old C salts will certainly be familiar with the printf function.  There’s a clever little trick we can employ which allows us to format a string using itself as the format string.  This is one way to provide single-level recursion in the string resolution:

char *s = "char *s = \"%s\"; printf(s, s);";
printf(s, s);

Note that I’m cheating a bit on the formatting to make things more readable.  There’s really nothing preventing this sample from formatting a bit more correctly (newline, etc).
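To see what that single substitution actually prints, we can simulate the C call in Ruby, whose format method treats %s the same way for this purpose.  This is just a sketch: C_STRING stands in for the contents of the C variable s after the compiler has processed the escapes, and both constant names are mine.

```ruby
# The characters the C variable `s` points at (escapes already resolved).
C_STRING = 'char *s = "%s"; printf(s, s);'

# One level of self-substitution, as printf(s, s) would perform it.
C_OUTPUT = format(C_STRING, C_STRING)

puts C_OUTPUT
# The real source line contains backslash-escaped quotes; the output has none.
puts C_OUTPUT.include?('\\')
```

Compare the first line of output against the C source and look closely at the inner double quotes.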

We’re almost there now.  Our only remaining problem is the fact that the second recursion of the string will have improperly quoted double-quotes.  Gary Thompson shows a fully fleshed-out C quine which gets around this problem by exploiting the int/char duality in the language.  However, this little trick isn’t precisely available in languages like Scala.  Well, it is, but there are problems with the printf formatting which rule it out in practice.  Specifically, Scala’s printf method does not allow for the standard %s-style formatting (even though the scaladoc claims that it does).  All that this function allows us are simple substitutions, but it turns out that this is enough to (finally) complete our quine in Scala (formatted for easy reading):

object Q extends Application {
  val s = "object Q extends Application'{'val s={0}{2}{0};printf(s,{0}{1}{0}{0},{0}{1}{1}{0},s)}"
  printf(s, "\"", "\\", s)
}

Even with the reformatting, the second line still overflows the formatting on most browsers (sorry about that).  I’ve uploaded the unformatted, “true quine” here.
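For comparison (and spoiler alert for the Ruby puzzle from earlier), the same self-formatting idea gets much shorter in a language whose format strings can do the quoting for you.  The sketch below checks a well-known one-line Ruby quine without spawning a second process: it holds the program text as data and recomputes what the program would print.  The constant names are mine; %p and %% are standard Ruby format directives.

```ruby
# The candidate quine: the complete one-line program, stored as data.
RUBY_QUINE = 's="s=%p;puts s%%s";puts s%s'

# What that program prints: its template formatted with itself.
# %p inserts the argument's #inspect form (quotes and escapes included),
# and %% collapses to a single literal %.
TEMPLATE = "s=%p;puts s%%s"
PRINTED  = TEMPLATE % TEMPLATE

puts PRINTED
puts PRINTED == RUBY_QUINE
```

Where the Scala version has to pass the quote and backslash characters in as separate arguments, %p takes care of the escaping in one step.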

It’s somewhat interesting that Scala’s syntax is concise enough (especially with type inference) that this sort of thing is possible in only 149 characters.  If you look around for Java quines (there are a few of them), you’ll see that most of them take a similar approach, but they usually have trouble with the encumbrances of Java’s highly-verbose syntax.  It’s sort of depressing to condense your code when you have to type “public static void main” regardless.

Anyway, I’m certainly not an expert on the ins-and-outs of the Scala library.  Suggestions welcome on how to golf this down a bit.   Or better yet, an extremely clever solution which eluded me entirely.

The Plague of Polyglotism

28 Apr 2008

For those of you who don’t know, polyglotism is not some weird religion but actually a growing trend in the programming industry.  In essence, it is the concept that one should not be confined to a single language for a given system or even a specific application.  With polyglot programming, a single project could use dozens of different languages, each for a different task to which they are uniquely well-suited.

As a basic example, we could write a Wicket component which makes use of Ruby’s RedCloth library for working with Textile.  Because of Scala’s flexible syntax, we can use it to perform the interop between Wicket and Ruby using an internal DSL:

class TextileLabel(id:String, model:IModel) extends WebComponent(id, model) with JRuby {
  require("textile_utils")
 
  override def onComponentTagBody(stream:MarkupStream, openTag:ComponentTag) {
    replaceComponentTagBody(stream, openTag,
        'textilize(model.getObject().toString()))
  }
}

# textile_utils.rb
require 'redcloth'
 
def textilize(text)
  doc = RedCloth.new text
  doc.to_html
end

Warning: Untested code

We’re actually using three languages here, even though we only have source for two of them.  The Wicket library itself is written in Java, our component is written in Scala and we work with the RedCloth library in Ruby.    This is hardly the best example of polyglotism, but it suffices for a simple illustration.  The general idea is that you would apply this concept to a more serious project and perform more significant tasks in each of the various languages.

The Bad News

This is all well and good, but there’s a glaring problem with this philosophy of design: not everyone knows every language.  You may be a language aficionado, picking up everything from Scheme to Objective-C, but it’s only a very small percentage of developers who share that passion.  Many projects are composed of developers without extensive knowledge of diverse languages.  In fact, even with a really good sampling of talent, it’s doubtful you’ll have more than one or two people fluent in more than two languages.  And unfortunately, there’s this pesky concern we all love called “maintainability”.

Let’s pretend that Slava Pestov comes into your project as a consultant and decides that he’s going to apply the polyglot programming philosophy.  He writes a good portion of your application in Java, Lisp and some language called Factor, pockets his consultant’s fee and then moves on.  Now the code he wrote may have been phenomenally well-designed and really perfect for the task, but you’re going to have a very hard time finding a developer who can maintain it.  Let’s say that six months down the road, you decide that your widget really needs a red push button, rather than a green radio selector.  Either you need a developer who knows Factor (hint: there aren’t very many), or you need a developer who’s willing to learn it.  The thing is that most developers with the knowledge and motivation to learn a language have either already done so, or are familiar enough with the base concepts as to be capable of jumping right in.  These developers fall into that limited group of people fluent in many different languages, and as such are a rare find.

Now I’m not picking on Factor in any way; it’s a very interesting language, but it still isn’t very widespread in terms of developer expertise.  That’s really what this all comes down to: developer expertise.  Every time you make a language choice, you limit the pool of developers who are even capable of grokking your code.  If I decide to build an application in Java, even assuming that’s the only language I use, I have still eliminated maybe 20% of all developers from ever touching the project.  If I make the decision to use Ruby for some parts of the application, while still using Java for the others, I’ve now factored that 80% down to maybe 35% (developers who know Java and Ruby).  Once I throw in Scala, that cuts it down still further (maybe to 15% now).  If I add a fourth language - for example, Haskell - I’ve now narrowed the field so far that it’s doubtful I’ll find anyone capable of handling all aspects within a reasonable price range.  It’s the same problem as with framework choice, except that frameworks are much easier to learn than languages.

The polyglot ideal was really devised by a bunch of nerdy folks like me.  I love languages and would like nothing better than to get paid to learn half a dozen new ones (assuming I’m coming into a project with a strange combination I haven’t seen before).  However, as I understand the industry, that’s not a common sentiment.  So a very loud minority of developers (/me waves) has managed to forge a very hot methodology, one which excludes almost all of the hard-working developer community.  If I didn’t know better, I would be tempted to say that it was a self-serving industry ploy to foster exclusivity in the job market.

I want to work on multi-language projects as much as anyone, but I really don’t think it’s the best thing right now.  I’m working on a project now which has an aspect for which Scala would be absolutely perfect, but since I’m the only developer on hand who is remotely familiar with the language, I’m probably going to end up recommending against its adoption.  Consider carefully the ramifications of trying new languages on your own projects; you may not be doing future developers any favors by going down that path.

Screencast: Introduction to the Scala Developer Tools

21 Apr 2008

Virtually everyone who has visited the Scala project page has seen the info page for the Scala plugin for Eclipse.  There are a few screenshots, an update site and very little instruction on how to proceed from there.  Those of you who have actually installed this plugin can vouch for how terribly it works as well as the remarkable lack of usefulness in its functionality.  It’s basically a very crude syntax highlighting editor for Scala embedded into Eclipse.  It has the ability to run programs and compile them within the IDE, but that’s about all.  Worse than that, it seems to make everything else about Eclipse less stable; somehow crashing random, unrelated plugins (such as DLTK).  Needless to say, it’s often a race to see how fast we can remove the Scala Eclipse plugin from our systems.

What is far less widely known is that there is a second Eclipse plugin which offers support for Scala development.  Basically, the guys at LAMP decided that it wasn’t worth trying to build out the original plugin any further.  Instead, they started from scratch and created a whole new implementation.  The result is entitled the “Scala Developer Tools” (or SDT, if you’re into short and phonetically confusing acronyms).  Basically, this plugin is a very unstable, very experimental attempt to build a first-class IDE for Scala on top of Eclipse.  Obviously, they still have a ways to go:

image

In case you were wondering, no, that isn’t my default editor font.  To say the least, the plugin suffers from an annoying plethora of UI-related bugs.  Behavior is inconsistent, and oftentimes changing a value doesn’t seem to be permanent (it took me several tries to get the syntax highlighting to stop shifting before my very eyes).  To make matters worse, it seems that installing the plugin in the first place is a bit like playing a game of hopscotch using un-anchored floats in the middle of a pool.  The update site has a nasty habit of throwing a 404 about 50% of the time.  You know what they say: if at first you don’t succeed…

The good news is that once you get the plugin installed, the preferences beaten into submission, and the UI bugs safely ignored, things become quite nice indeed.  The new editor is vastly improved over the old one, and it’s easy to see tremendous potential in the project.  Things are actually getting to a point where I would consider using the plugin rather than my current jEdit setup.

Of course, it’s hard to get a good idea of how a tool works until you see it in action.  That’s why I took the time to put together a small screencast which illustrates some of the highlights of the new editor.  I made no attempt to hide the bugs which cropped up during my testing, so this should give you a fair approximation of the current state of the plugin and whether it’s worth trying for your own projects.  The screencast has been produced at a reasonably high resolution (1024×732) in both Flash and downloadable AVI format.  Enjoy!
