
So Begins the Functional Revolution

19
May
2008

When I started learning Scala, I was convinced that its designers were positively the worst marketers I had ever seen.  The official project page was (and is) peppered with Scala examples of things like quicksort, factoring, prime number sieves and so on.  All of these examples were written in such a way as to be virtually incomprehensible to the average OOP developer.  In fact, they all had a very simple common denominator which really set me off: they were written with a very functional flair.

Functional programming is an ancient and venerable tradition, dating back all the way to the days of Lisp and APL.  In its purest form (ahem: Haskell), functional programming is the absence of side-effects.  Everything in the language is some sort of declarative expression, taking some values and returning a result.  This sweeping restriction has some fairly profound consequences.  Consider the following ML function:

fun search nil _ = false
  | search (hd::tail) x = (hd = x) orelse (search tail x)

For most developers sporting an imperative background, this is probably fairly difficult to read.  If you actually boil it down, all it does is walk through a list, returning true if it finds a certain element (x), otherwise false.  This implementation is a far cry from how we would do it in an imperative language:

public <T> boolean search(List<T> list, T element) {
    for (T hd : list) {
        if (element.equals(hd)) {
            return true;
        }
    }
 
    return false;
}

Arguably, this is harder to read, but it is much more familiar to most people.  Both functions do exactly the same thing, but one of them relies upon mutable state and the side effects imposed by iterators, while the other is completely declarative (in the mathematical sense).  The one critical thing to notice is that the Java version is less constrained.  In ML, you can only have one expression per function, and you certainly can’t have anything mutable floating around.  By contrast, Java offers a far greater sense of freedom in what you can do.  Want to modify an instance field?  Go right ahead; there’s nothing stopping you!  I believe that it is for this reason that imperative languages like C/C++, Java, Ruby and such have really become the dominant force in the industry.
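For what it’s worth, the same declarative style translates directly to Scala.  Here’s a hypothetical sketch (my own, not one of the official examples) mirroring the ML function above:

```scala
object Search {
  // Side-effect-free recursive search, mirroring the ML version:
  // an empty list yields false, otherwise compare the head and recurse
  def search[T](list: List[T], x: T): Boolean = list match {
    case Nil        => false
    case hd :: tail => hd == x || search(tail, x)
  }

  def main(args: Array[String]): Unit = {
    println(search(List(1, 2, 3), 2))  // true
    println(search(List(1, 2, 3), 5))  // false
  }
}
```

Like the ML version, the match expression replaces mutable iteration entirely; the only “state” is the call stack.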

Getting back to my original point though…  I have to admit that I used to be a firm believer in the One True (imperative) way of doing things.  The OOP mothership had landed and I was 100% convinced that it was here to stay.  However, after spending some time getting to know Scala, I’m beginning to sway more to the “academic” way of thinking: functional languages are pretty nice.  Consider the following Scala algorithm which takes a series of integers as its arguments and prints their sum:

object Main {
  def main(args: Array[String]) {
    println(args.map(_.toInt).foldLeft(0)(_ + _))
  }
}

Compare that to the equivalent Java code:

public class Main {
    public static void main(String[] args) {
        int sum = 0;
        for (String arg : args) {
            sum += Integer.parseInt(arg);
        }
 
        System.out.println(sum);
    }
}

Scala’s naturally concise syntax aside, the use of functional concepts definitely contributes to a more expressive solution.  Once you understand what the map and foldLeft methods accomplish, this code becomes startlingly readable.  What’s more, because this code is simpler and more expressive, the potential for subtle bugs and maintenance problems becomes virtually non-existent.  If something goes wrong in our Scala example and somehow the type checker doesn’t catch it, the problem will still be fairly easy to fix because of how straightforward and consistent the code is.
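If the one-liner seems too dense, it can be unrolled into named steps.  This is a hypothetical decomposition (the sum helper is my own invention, not part of the original):

```scala
object SumSteps {
  // Each stage of the one-liner, given a name
  def sum(args: Array[String]): Int = {
    val nums = args.map(_.toInt)   // "1" "2" "3" becomes 1 2 3
    nums.foldLeft(0)(_ + _)        // ((0 + 1) + 2) + 3
  }

  def main(args: Array[String]): Unit =
    println(sum(args))
}
```

foldLeft threads an accumulator (starting at 0) through the list, applying + at each step; map and foldLeft together replace the entire imperative loop.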

Contrast this with the Java solution.  It’s easy to imagine subtle errors slipping into the implementation.  Even something as simple as a typo can be difficult to track down.  Consider:

public class Main {
    public static void main(String[] args) {
        int sum = 0;
        for (String arg : args) {
            sum = Integer.parseInt(arg);
        }
 
        System.out.println(sum);
    }
}

Can you spot the error?  I’ll give you a hint: given the input “1 2 3 4 5”, the correct answer is 15.  Our revised Java solution prints “5”.

I can’t even figure out how to trivially break the Scala solution so that it does something bad.  This sort of stability happens time and time again with functional solutions.  You write the code, it’s expressive and elegant, and somehow it manages to do everything you wanted without fuss. 

This isn’t even limited to simple examples.  Somehow, this effect holds even in larger, more complicated systems.  I recently built a modest application in Scala using its trademark hybrid-functional style.  It’s hard to estimate of course, but judging by my experiences with similar projects in Java, I saved a great deal of time and effort just due to Scala’s functional expressiveness.

I doubt that the industry is going to change overnight, but it’s hard to deny the benefits of the FP approach.  Functional languages are on the rise; Scala is only the tip of the iceberg.  Languages like Erlang and F# are also gaining popularity as developers begin to recognize how expressive they can be.  It may take some time, but I predict that within a decade or so, the dominant paradigm in the industry will be functional, or more likely some hybrid thereof.  Welcome the revolution!

Weekend Fun: ActiveObjects Testability

18
May
2008

I’m not entirely sure what these metrics mean, but they give me a warm feeling inside.  :-)

      Analyzed classes:   136
 Excellent classes (.):   121  89.0%
      Good classes (=):     9   6.6%
Needs work classes (@):     6   4.4%
             Breakdown: [.............................................===@@]
       0                                                                    118
    10 |......................................................................:   118
    31 |..                                                                    :     3
    52 |                                                                      :     0
    73 |===                                                                   :     5
    94 |===                                                                   :     4
   115 |                                                                      :     0
   136 |@@                                                                    :     3
   157 |                                                                      :     0
   178 |                                                                      :     0
   199 |                                                                      :     0
   220 |                                                                      :     0
   241 |                                                                      :     0
   262 |                                                                      :     0
   283 |                                                                      :     0
   304 |@                                                                     :     1
   325 |                                                                      :     0
   346 |@                                                                     :     1
   367 |                                                                      :     0
   388 |                                                                      :     0
   409 |                                                                      :     0
   430 |                                                                      :     0
   451 |                                                                      :     0
   472 |                                                                      :     0
   493 |@                                                                     :     1
   514 |                                                                      :     0

Highest Cost
============
net.java.ao.EntityManager 501
net.java.ao.schema.SchemaGenerator 353
net.java.ao.EntityProxy 296
net.java.ao.schema.ddl.SchemaReader 141
net.java.ao.RelatedEntityImpl 127
net.java.ao.SearchableEntityManager 127
net.java.ao.Query 99
net.java.ao.types.EntityType 87
net.java.ao.db.HSQLDatabaseProvider 85
net.java.ao.db.OracleDatabaseProvider 85
net.java.ao.DatabaseProvider 83
net.java.ao.Common 82
net.java.ao.EntityManager$1 82
net.java.ao.db.PostgreSQLDatabaseProvider 82
net.java.ao.types.TypeManager 80
net.java.ao.schema.ddl.DDLAction 30
net.java.ao.SoftHashMap 28
net.java.ao.schema.AbstractFieldNameConverter 28
net.java.ao.SoftHashMap$HashIterator 20

Most of the badness seems to stem from EntityManager, which makes a lot of sense given the way it is designed.  EntityProxy also poses issues, but in practice this isn’t a real problem because of how extensive the JUnit tests are for just this class.  Overall, ActiveObjects’ testability isn’t anywhere near the Guice score, but it’s not as horrible as JRuby.

Buildr Still Not Ready for Prime Time

12
May
2008

As some of you may know, I’ve been following the Buildr project with a fair degree of interest of late.  Just last week, Assaf announced their first release as an Apache project: Buildr 1.3.0.  Being of an inquisitive bent, I decided that this would be an excellent time to reevaluate its suitability for use in my projects, both commercial and personal.

A while back, I evaluated Buildr as a potential replacement for Ant and Maven2 and eventually came to the unfortunate conclusion that it just wasn’t ready.  At the time, Buildr was still hampered by a number of annoying limitations (such as being unable to reconfigure the default directory structure).  I’m happy to say that many of these problems have been fixed in the 1.3 release, along with a large number of feature additions which I hadn’t thought to request.  Despite all that, I’m still forced to conclude that Buildr simply isn’t (yet) suitable as my build tool of choice.

Now before you stone me for utter heresy, try to hear me out.  I’m not dissing Buildr in any way, I really want to use it, but I just can’t justify moving over to it in the face of all of its current issues.  What’s really unfortunate is that all of these issues can be summed up with a single word: transitivity.

One of Maven’s killer features is the ability to resolve the entire transitive dependency graph.  Now I’ll grant that it does this with varying degrees of success, but for the most part it’s pretty smart about how it fixes up your CLASSPATH.  As of 1.3, Buildr claims to have experimental support for transitive dependency resolution, but judging from the problems I encountered in my experimentation, it’s barely even deserving of mention as an experiment, much less a full-fledged feature.

To understand why transitive dependencies are such a problem, it is of course necessary to first understand the definition of such.  In a word (well, several anyway), a transitive dependency is an inherited dependency: an artifact which is not depended upon directly by the project, but rather by one of its dependencies.  This definition carries recursively to dependent parents, grandparents and so on, thus defining a transitive dependency graph.  Consider it this way:

[Diagram: the transitive dependency graph for MyCoolProject, as described below]

In the diagram, MyCoolProject is the artifact we are trying to compile.  The only dependencies we have actually specified for this artifact are the DBPool and Databinder artifacts.  However, the Databinder artifact has declared that it depends upon both the Wicket and the ActiveObjects artifacts.  ActiveObjects doesn’t depend upon anything, but Wicket depends on both SLF4J and the Servlet API.  Thus, our original goal, MyCoolProject, has transitive dependencies Wicket, SLF4J, Servlets and ActiveObjects.  Quite a bit more than we thought we were asking for when we declared our dependency on Databinder.

In general, this sort of transitive resolution is a good thing.  It means that instead of specifying six dependencies, we only had to specify two.  Furthermore, we observed the DRY principle by not re-specifying dependency information already contained within the various packages.  As with any “good thing”, there’s definitely a valid argument regarding its down sides, but on the whole I’m quite fond of this feature.
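To make the resolution concrete, here’s a hypothetical Scala sketch (artifact names invented to match the diagram; this is an illustration, not how Maven or Buildr are implemented) that computes the transitive closure of a dependency map:

```scala
object Transitive {
  // Hypothetical graph from the diagram: artifact -> direct dependencies
  val deps: Map[String, List[String]] = Map(
    "MyCoolProject" -> List("DBPool", "Databinder"),
    "Databinder"    -> List("Wicket", "ActiveObjects"),
    "Wicket"        -> List("SLF4J", "Servlets")
  )

  // Recursively collect every direct and inherited dependency
  def resolve(artifact: String): Set[String] = {
    val direct = deps.getOrElse(artifact, Nil)
    direct.toSet ++ direct.flatMap(resolve)
  }

  def main(args: Array[String]): Unit =
    println(resolve("MyCoolProject").toList.sorted.mkString(", "))
}
```

Declaring two dependencies yields all six artifacts, exactly the inheritance described above.  (A real resolver also has to handle cycles and version conflicts, which this sketch ignores.)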

In Maven, this sort of thing happens by default, which can lead to some very confusing and subtle CLASSPATH problems (conflicting packages, etc).  Buildr takes a slightly different approach in 1.3 by forcing the use of the transitive method:

repositories.remote << 'http://www.ibiblio.org/maven2'
 
define 'MyCoolProject' do
  project.version = '1.0'
 
  compile.with transitive('xmlrpc:xmlrpc-server:jar:3.0')
 
  package :jar
end

It’s easy to see why Buildr is raising such a ruckus in the build genre.  Its syntax is elegance itself, and since it’s actually a proper scripting language (Ruby), there’s really nothing you can’t do with it.  Having to remember to specify the default repository is a bit of a pain, but it’s certainly something I can live with.  The real problem with this example is a little less subtle: it doesn’t work.

If you create a buildfile with the above contents and then run the buildr command, the results will be something like the following:

Downloading org.apache.xmlrpc:xmlrpc:jar:3.0
rake aborted!
Failed to download org.apache.xmlrpc:xmlrpc:jar:3.0, tried the following repositories:

http://www.ibiblio.org/maven2/

(See full trace by running task with --trace)

After a fairly significant amount of digging, I managed to discover that this problem is caused by the fact that Buildr attempts to download a corresponding JAR file for every POM it resolves.  This seems logical until you consider that many large projects (including Databinder, Wicket, and Hibernate) define POM-only projects which exist for the sole purpose of creating transitive dependencies.  They’re organizational units, designed to allow projects to depend upon one super-artifact, rather than a dozen sub-projects.  It’s a very common practice, and one which Buildr completely fails to handle appropriately.

After some prodding on the Buildr dev mailing-list, Assaf admitted that this is an issue worth looking at and provided a temporary workaround (pending a rework of the functionality in 1.4):

def pom_only(a)
  artifact a do |task|
    mkpath File.dirname(task.to_s)
    Zip::ZipOutputStream.open task.to_s
  end
end

This was about the point that I started wondering if maybe Ant/Ivy would be a better bet, at least for the time being.  To use this workaround, you must call the pom_only method once for every POM-only project in the dependency graph.  Usually, this means you must invoke Buildr repeatedly and find the troublesome artifacts by trial and error.  Not exactly a “just works” solution.

Pressing forward however, I unearthed a deeper, even more insidious issue: an intermittent failure to generate Eclipse project metadata.  I’m not sure if this is due to the POM-only dependencies or just bad juju, but whatever the reason, it’s annoying.  I’ve raised the issue on the Buildr mailing lists, but so far no response.  Basically, what happens is something like this:

C:\\Users\\Daniel Spiewak\\Desktop\\MyCoolProject> buildr eclipse
(in C:/Users/Daniel Spiewak/Desktop/MyCoolProject, development)
Completed in 0.499s
C:\\Users\\Daniel Spiewak\\Desktop\\MyCoolProject>

Not exactly the most helpful output.  In case you were wondering, this process did not create the Eclipse metadata.  It’s interesting to note that calling buildr idea (to create project metadata for IntelliJ) seems to work just fine.  Whatever causes the bug, it seems to be specific to just the Eclipse project generator.

Buildr is a remarkable project.  It shows potential to someday become the de-facto build system, possibly even unseating Ant.  Unfortunately, that day is not today.  There are too many odd wrinkles and unpredictable errors to really call it a “finished product”.  Hopefully, the Buildr dev team will continue their excellent work, eventually producing a tool worthy of serious consideration.  Until then, I guess that I’m (still) stuck with Ant.

Update: It seems that the issues with transitive dependencies have been resolved in the latest Buildr versions in their SVN.  I’m looking forward to when this all becomes stable and publicly consumable!

Is Scala Really the Next C++?

5
May
2008

I’ve been turning that question over in my head for a few months now.  It’s really a worthy thought.  At face value, it’s incredibly derogatory and implicative of an over-bulked, poorly designed language.  While I’m sure this is not how the concept was originally intended, it certainly comes across that way.

But the more I think about it, the more I realize that the parallels are uncanny.  Consider the situation back in 1983, when C++ first got started.  C had become the dominant language for serious development.  It was fast, commonly known and, perhaps more importantly, structured.  C had successfully applied the concepts shown in Pascal to systems programming, revolutionizing the industry and becoming the lingua franca of developers everywhere.

When C++ arrived, one of its main selling points was as a “better C”.  It was possible to interoperate seamlessly between C++ and C, even to the point of compiling most C programs unmodified in C++.  But despite its roots, it still managed to introduce a number of features drawn from Smalltalk et al. (such as classes and virtual member functions).  It represented a paradigm shift in the way developers represented concepts.  In fact, I think it’s safe to say that the popular object-oriented design principles that we all take for granted would never have evolved to this level without the introduction of C++.  (yes, I’m aware of Objective-C and other such efforts, but C++ was the one which caught on)

So we’ve got a few catch-phrases here: “better C”, “seamless interop”, “backwards compatibility”, “paradigm shift”, etc.  Sound familiar?  (actually, it sounds a lot like Groovy)  The truth is that Scala seems to occupy a very similar place in history (if six months ago can be considered “history”).  Scala is almost an extension to Java.  It brings to the language things like higher-order functions, type inference and a type system of frightening power.  Scala represents a fundamental shift in the concepts and designs we use to model problems.  I truly believe that whatever language we’re using in a decade’s time, it will borrow heavily from the concepts introduced in Scala (in the same way that Java borrowed from C++).
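A couple of tiny snippets (hypothetical examples of my own, with invented names like twice) illustrate two of those features, higher-order functions and type inference:

```scala
object Inference {
  // Type inference: no annotations needed on the result or on 'n'
  val doubled = List(1, 2, 3).map(n => n * 2)

  // A higher-order function: takes a function and returns a new one
  def twice(f: Int => Int): Int => Int = x => f(f(x))

  def main(args: Array[String]): Unit = {
    println(doubled)           // List(2, 4, 6)
    println(twice(_ + 3)(10))  // (10 + 3) + 3 = 16
  }
}
```

Java (as of 2008) has neither: every type must be spelled out, and functions can only be simulated with anonymous inner classes.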

But if Scala and C++ are so similar in historical inception, shouldn’t we view the language with a certain amount of distrust?  We all know what a mess C++ turned out to be, why should Scala be any different?  I believe the answer has to do with Scala’s fundamental design principles.  Specifically, Scala is not trying to be source-compatible with Java.  You can’t just take Java sources and compile them with Scala.

This clean break with the progenitor language has a number of ramifications.  Most importantly, Scala is able to smooth many of the rough edges in Java without breaking existing libraries.  For example, Scala’s generics are far more consistent than Java’s, despite still being implemented using erasure.  This snippet, for example, fails to compile:

def doSomething(ls:List) = {
  ...
}

All we have done is omit the generic type parameter.  In Java, the equivalent would lead to a compiler warning at worst, because Java has to remain backwards compatible with code written before the introduction of generics.  This “error vs warning” distinction seems a bit trivial at first, but the distinction has massive implications throughout the rest of the type system.  Anyone who has ever tried to write a “generified” library in Java will know what I mean.
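For completeness, the fix is simply to supply the type parameter.  This is a hypothetical completion of the snippet above; the method body (a length count) is a placeholder of my own:

```scala
object Generified {
  // With the type parameter declared, this compiles cleanly
  def doSomething[T](ls: List[T]): Int = ls.length

  def main(args: Array[String]): Unit =
    println(doSomething(List("a", "b", "c")))  // prints 3
}
```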

Scala represents a clean break from Java.  This is in sharp contrast to C++, which was trying to remain fully backward compatible with C sources.  This meant inheriting all of C’s weird wrinkles (pass-by-value, no forward referencing, etc).  If C++ had just abandoned its C legacy, it would have been a much nicer language.  Arguably, a language more like Java.  :-)

Perhaps the most important distinction between Scala and C++ is that Scala is being designed from the ground up with consistency in mind.  All of the major problems in C++ can be traced back to inconsistencies in syntax, semantics or both.  That’s not to say that the designers of C++ didn’t put a good deal of effort into keeping the language homogeneous, but the truth is that they ultimately failed.  Now we could argue until the cows come home about why they failed, but whatever the reasons, it’s done and it has given C++ a very bad reputation.  Scala on the other hand is being built by a close-knit team of academics who have spent a lifetime thinking about how to properly design a language.  I tend to think that they have a better chance of succeeding than the C++ folks did.

So the moral of this long and rambling post is that you shouldn’t be wary of the Scala language.  It’s not going to become the next evil emperor of the language world.  Far from it, Scala may just represent the next step forward into true programmatic enlightenment.

Useless Hackery: A Scala Quine

30
Apr
2008

Warning: This post has little-to-no practical value.  Waste time at your own risk…

While double-checking the terms for my previous post, I came across the Wikipedia definition of a polyglot program:

In the context of computing, a polyglot is a computer program or script written in a valid form of multiple programming languages, which performs the same operations or output independently of the programming language used to compile or interpret it.

Not precisely the same as the definition which has now come into common use (referring to the use of multiple languages in a single application).  The article goes on to give two examples of a polyglot, one in PHP/C/Bash and one in Haskell/OCaml/Scheme (I don’t count the Perl/DOS example since it doesn’t perform the same function in both languages).  These examples are quite interesting, but what really caught my eye were the additional properties of the second example: not only is it a polyglot, but it is also a quine:

In computing, a quine is a program, a form of metaprogram, that produces its complete source code as its only output.

Think about that for just a second: A program which produces itself as its only output.  I think that’s probably the most profound brain-teaser that I’ve run across in months.  Consider for a moment just how one would accomplish this.  For example, we could try a naive implementation in Ruby:

puts "puts \"puts \\\"..."

You’ll notice that we have run into a bit of a problem.  In fact, the infinitely recursive nature of the definition is precisely what makes quines so interesting.  Of course, I’m aware that there are already a number of very clever Ruby quines, but that’s not the point.  After all, what good is a puzzle if someone else gives you the solution?

By putting a little thought into this, we can devise a slightly more advanced attempt which brings us a bit closer to quine-ness:

s = "s = \"#{s}\"; puts s"
puts s

We’re getting closer, anyway.  We still have a serious problem in that string declaration (hint: it has something to do with the whole recursiveness thing).  We have to somehow include the string within itself once explicitly, but on the inner recursion only include a textual reference to itself.  This is by no means trivial to accomplish.

One technique we can employ is string formatting.  Old C salts will certainly be familiar with the printf function.  There’s a clever little trick we can employ which allows us to format a string using itself as the format string.  This is one way to provide single-level recursion in the string resolution:

char *s = "char *s = \"%s\"; printf(s, s);";
printf(s, s);

Note that I’m cheating a bit on the formatting to make things more readable.  There’s really nothing preventing this sample from formatting a bit more correctly (newline, etc).

We’re almost there now.  Our only remaining problem is the fact that the second recursion of the string will have improperly quoted double-quotes.  Gary Thompson shows a fully fleshed-out C quine which gets around this problem by exploiting the int/char duality in the language.  However, this little trick isn’t precisely available in languages like Scala.  Well, it is, but there are problems with the printf formatting which obviate the possibility.  Specifically, Scala’s printf method does not allow for the standard %s-style formatting (even though the scaladoc claims that it does).  All that this function allows us are simple substitutions, but it turns out that this is enough to (finally) complete our quine in Scala (formatted for easy reading):

object Q extends Application {
  val s = "object Q extends Application'{'val s={0}{2}{0};printf(s,{0}{1}{0}{0},{0}{1}{1}{0},s)}"
  printf(s, "\"", "\\", s)
}

Even with the reformatting, the second line still overflows the formatting on most browsers (sorry about that).  I’ve uploaded the unformatted, “true quine” here.

It’s somewhat interesting that Scala’s syntax is concise enough (especially with type inference) that this sort of thing is possible in only 149 characters.  If you look around for Java quines (there are a few of them), you’ll see that most of them take a similar approach, but they usually have trouble with the encumbrances of Java’s highly-verbose syntax.  It’s hard to condense your code when you have to type “public static void main” regardless.

Anyway, I’m certainly not an expert on the ins-and-outs of the Scala library.  Suggestions welcome on how to golf this down a bit.   Or better yet, an extremely clever solution which eluded me entirely.