
Diving into Scala

2 Jan 2008

I came to the realization this week that I really don’t know enough Scala (like I say, just enough to be dangerous).  After all, most experts agree that this is the general purpose language of the future, and it may just be a more productive language of the present.  With that in mind, I think it’s a fair conclusion that no Java developer should be without at least a working knowledge of the language.

In truth, I’ve been putting off learning more about Scala.  Yes, I’m sold on its benefits and what it could become, but I’m very comfortable in my cozy IDE and often it seems like there’s no immediate reason to change.  More than that, I was slightly put-off by all the functional goodies.  I mean, currying?  Function parameters?  Lazy evaluation?  It’s enough to make an upstanding imperative developer’s hair stand on end.  And every time I thought I had worked up the courage to try a more in-depth experiment, I’d browse over to the Scala site and there would be that demonic monocharacter-variabled quicksort implementation staring me in the face.  A word of advice, guys: that’s not the best example for how readable and intuitive Scala can be!

But in the end, my curiosity and dislike-for-anything-I-can’t-understand won the day.  A while back, I wrote an extremely naive chess engine in Java.  It was horribly inefficient (working recursively, rather than using a more conventional graph search algorithm) and it barely worked for even simple analyses, but it was at the time a diverting weekend project.  Remembering this project set me thinking about how I could have done it better.  Obviously some redesign would be needed on the core algorithm.  An astounding amount of graph theory comes into play (no pun intended) in chess analysis.  Graph search algorithms are usually profoundly functional in nature, which got me thinking about Scala again.  Working with closures, cheap recursion and first class functions can make such algorithms far more expressive, easy-to-read and efficient.  So I updated my Scala runtime, dusted off the “scala” mode definition for jEdit and got to work.

I won’t deny, I was somewhat leery of stepping out of my nice Eclipse environment.  Granted, I program in Ruby all the time and rarely use an IDE for it, but somehow that’s different.  Scala is a peer to Java from a language standpoint, so mentally I expected my experience to be similar to the time I wrote Java code in Notepad.  Nothing could be farther from the truth.

Writing code in Scala was like a breath of fresh air.  To my surprise, the code I was turning out was both more concise and more readable than the equivalent Java (a rare combination).  Add to that the fact that much of the Scala core API is based on (or actually is) the Java lang libraries and I found myself almost oblivious to the lack of code completion, inline documentation and incremental compilation (almost).  I did find myself keeping a Scala interpreter running in a separate shell (a habit I picked up doing Ruby with IRB), as well as using Google frequently to find little tidbits.  There’s no doubt that the experience would have been more fluid with a toolset of the JDT caliber, but I wasn’t totally dependent on my life support anymore.

Things were flowing smoothly, and I was just starting to pick up steam when my passenger train derailed on a tiny piece of misplaced syntactic jello: arrays.

In Scala, arrays are objects just like any other.  Yes, I’m sure you heard that when transitioning to Java, but Scala really carries the concept through to its full character (think Ruby arrays).  This is how you allocate an array of size 5 in Scala:

var array = new Array[String](5)

Or, if you just have a set of static values, you can use the factory method (ish, more on that later):

var names = Array("daniel", "chris", "joseph", "renee", "bethany", "grace")

It was at this point that my warning sirens began going off.  Any syntax which looks even remotely like C++ deserves extremely close scrutiny before it should be used.

The first invocation is fairly straightforward.  We’re creating a new instance of class Array parameterized on the String type.  Scala has type parameterization similar to Java generics except significantly more powerful.  In fact, Scala type parameters are really closer to lightweight templates (without the syntactic baggage) than to generics.  The JavaScript-style variable declaration is actually typesafe because the compiler statically infers the type from the assignment (yet another “well, duh!” feature that Scala handles beautifully).
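To make the inference point concrete, here is a minimal sketch (the variable names `inferred` and `annotated` are mine, chosen only for illustration). It assumes nothing beyond the standard Array class and shows that leaving the type off the declaration does not make the variable any less statically typed:

```scala
// The compiler infers Array[String] from the right-hand side...
var inferred = new Array[String](5)

// ...so this declaration with an explicit annotation means the same thing.
var annotated: Array[String] = new Array[String](5)

inferred(0) = "daniel"   // fine: a String goes into an Array[String]
// inferred(1) = 42      // would not compile: an Int is not a String

println(inferred(0))
```

The commented-out line is the interesting one: the mistake is rejected at compile time, not discovered at runtime.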

The second syntax is a bit more complex.  It is best understood once a bit of groundwork is laid analyzing the rest of the array syntax:

var array = new Array[String](5)
array(0) = "daniel"
array(1) = "chris"
println(array(0))

No, that’s not a typo. Scala uses parentheses to access array values, not the C-style square brackets which have become the standard.  By this time, the syntax was starting to look more and more like BASIC and the warning sirens had become full-blown air raid alerts.  My vision blurred, my knees got weak and I desperately groped for a fresh slice of pizza.  Any language that reminds me of C++ and VB6 successively within the span of 30 seconds deserves not just scrutiny, but all-out quarantine.

It turns out that somewhere in the Array class, there’s a set of declarations that look like this:

class Array[T](size:Int) {
  def apply(index:Int):T = {
    // return the value from the underlying data structure
  }
 
  def update(index:Int, value:T):Unit = {
    // set the value at the corresponding index
  }
 
  // ...
}

Ouch!  Seems the other shoe just dropped.  It turns out that the apply and update methods are special magic functions enabling the array access and modification syntax.  The same code in C++ might look like this:

template <class T>
class Array {
public:
    Array(int length);
 
    T& operator()(int index);
 
private:
    // ...
};

For those of you who don’t know, there’s a reason that syntax passed into “bad practice” years ago.  In short: it’s cryptic, relies heavily on magic syntax tricks (thus is quite fragile) and by its very nature redefines an operation that is understood to be basic to the language (parentheses).  It’s the absolute worst that operator overloading has to offer.
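For anyone who wants to see the apply/update rewriting in isolation, here is a rough sketch using a made-up Board class (hypothetical, not part of any library, and the 64-square layout is just an assumption borrowed from chess). The sugared and explicit spellings are meant to be interchangeable:

```scala
// Hypothetical 8x8 board, used only to illustrate apply/update.
class Board {
  // Backing storage: 64 squares, all initially empty ("").
  private val squares = Array.fill(64)("")

  // board(i) is rewritten by the compiler to board.apply(i)
  def apply(index: Int): String = squares(index)

  // board(i) = v is rewritten by the compiler to board.update(i, v)
  def update(index: Int, value: String): Unit = {
    squares(index) = value
  }
}

val board = new Board
board(0) = "R"           // same as board.update(0, "R")
board.update(1, "N")     // the explicit spelling
println(board(0))        // same as println(board.apply(0))
println(board.apply(1))
```

Nothing here is specific to arrays: any class that defines these two methods gets the parenthesized access and assignment syntax.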

But the fun doesn’t stop there!  Remember I said the (sort of) array factory constructor would make more sense after analyzing arrays a bit more?  You probably know where I’m going with this…

object Array {
  def apply[T](values:T*) = values
}
 
class Array[T](length:Int) {
  // ...
}

Try not to think about it too much, it’ll make your brain hurt.  Scala doesn’t really have a static scope, so any class that needs to do things statically (like factories) has to use the singleton construct (object).  Extend this concept just a bit and you can see how classes which must be both instance and static can run into some trouble.  Thus, Scala allows the definition of both the singleton and the class form of the same type.  In Java, the same concept might look like this:

public class Array<T> {
    public Array(int length) {
        // ...
    }
 
    // ...
 
    public static <V> Array<V> createInstance(V... values) {
        // ...
    }
}

Obviously the correspondence isn’t exact since Java array syntax can’t be replicated by mere mortals, but you get the gist of it.  It seems this is one time when Scala syntax isn’t more concise or more intuitive than Java.

So what does this all mean for our array factory construct?  It means we’re not actually looking at a method, at least, not a method named “Array” as we might reasonably expect.  The array factory construct can be written equivalently in this fashion:

var names = Array.apply("daniel", "chris", "joseph", "renee", "bethany", "grace")

Are you starting to see why overloading the parentheses operator is considered bad form in civilized lands?  The syntax may look nice, properly used, but it’s a major pain in the behind if you have to read someone else’s badly-designed code to understand what’s going on if you can’t rely on parentheses to be parentheses.
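The same trick can be reproduced with your own types. This is only a sketch, assuming a made-up Bag container (hypothetical, not in the Scala library), but it shows the class and its same-named singleton working together, with the factory call and the explicit apply call producing the same thing:

```scala
// Hypothetical container; not part of the Scala library.
class Bag[T](val items: List[T])

// Companion object: same name as the class, holds the "static" side.
object Bag {
  // Bag(1, 2, 3) is rewritten by the compiler to Bag.apply(1, 2, 3).
  // T* marks a variable-length argument list.
  def apply[T](values: T*): Bag[T] = new Bag(values.toList)
}

val viaSugar = Bag("daniel", "chris")
val viaApply = Bag.apply("daniel", "chris")
println(viaSugar.items == viaApply.items)   // prints true
```

Note that the class and the object need to live in the same source file (or be pasted together in the interpreter) for the compiler to treat them as companions.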

Things keep going from bad to worse when you consider all the trouble you can get into with method declarations.  Try this one on for size:

class Pawn {
  def getSymbol = "P"
}

Famously concise and readable…except it’s not going to do what you probably expect it to.  To use this class, you might write something like this:

var pawn = new Pawn
println(pawn.getSymbol())

The above code will fail with a compiler error on line 2, saying that we passed the wrong number of arguments to the getSymbol method.  Here’s a heads-up to those of you designing the Scala syntax: this confuses the hell out of any newbie trying the language for the first time.  It turns out that declaring a method without parentheses means that it can never be invoked with those parentheses.  Changing the invocation to the following alleviates the error:

println(pawn.getSymbol)

And just when you’re beginning to make sense of all this, you remember that declaring a method with parentheses means that it can be used both ways!  So the following code is all valid:

class Pawn {
  def getSymbol() = "P"
}
 
var pawn = new Pawn
 
// method 1
println(pawn.getSymbol())
 
// method 2
println(pawn.getSymbol)
 
// method 3
println(pawn getSymbol)
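One way to sidestep the ambiguity is to follow the convention mentioned in the comments: leave the parentheses off methods that have no side effects, keep them on methods that do, and call each method the way it was declared. The sketch below assumes a slightly extended Pawn (the `hasMoved` and `markMoved` members are my own invention for illustration). Later Scala releases are stricter about calling a `()` method without parentheses, so treat the permissive behaviour above as version-dependent; calling each method as declared works either way.

```scala
class Pawn {
  private var moved = false

  // No side effects: declared and called without parentheses.
  def symbol: String = "P"
  def hasMoved: Boolean = moved

  // Changes state: declared and called with parentheses.
  def markMoved(): Unit = { moved = true }
}

val pawn = new Pawn
println(pawn.symbol)
pawn.markMoved()
println(pawn.hasMoved)
```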

One final, related tidbit just in case you thought these were isolated examples.  It may come as a shock to those of you coming from C-derivative languages (such as Java), but Scala doesn’t support constructor overloading.  Seriously, there’s no way to define multiple constructors for a class in Scala.  Of course, like many other things Scala provides a way to emulate the syntax, but it’s far from pretty.  Let’s suppose you want to be able to construct a Person with either a full name, a first and last name or a first name, last name and age.

object Person {
  def apply(fullName:String):Person = {
    var names = fullName.split(' ')
    Person(names(0), names(1))
  }
 
  def apply(firstName:String, lastName:String):Person = Person(firstName, lastName, 20)
}
 
case class Person(firstName:String, lastName:String, age:Int) {
  def getFirstName() = firstName
  def getLastName() = lastName
  def getAge() = age
}
 
var person1 = Person("Daniel Spiewak")
var person2 = Person("Jonas", "Quinn")
var person3 = Person("Rebecca", "Valentine", 42)

“Where are your Rebel friends now?!”

Of course it all makes some sense from a theoretical standpoint, but in practice I can’t see this being anything but clumsy and annoying (not to mention visually ambiguous.  Try to trace the code that creates person1 to see what I mean).

Update: As has been pointed out in the comments, Scala does support a primitive sort of constructor overloading (using def this()) which allows for simple delegate constructors (like I demonstrated above).  However, the fact remains that Scala’s constructor overloading is neither as powerful nor as uniformly intuitive as the corresponding feature in Java.
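For reference, a minimal sketch of the def this() form on a plain (non-case) class, reusing the Person example and the same default age of 20. It assumes a full name is exactly two words separated by a single space. Each auxiliary constructor has to start by calling a constructor defined earlier in the class, which is why the two-argument form comes first:

```scala
class Person(val firstName: String, val lastName: String, val age: Int) {
  // Auxiliary constructor: must begin by calling another constructor.
  def this(firstName: String, lastName: String) = this(firstName, lastName, 20)

  // Splits "First Last" on the space; assumes exactly two words.
  def this(fullName: String) = this(fullName.split(' ')(0), fullName.split(' ')(1))
}

val person1 = new Person("Daniel Spiewak")
val person2 = new Person("Jonas", "Quinn")
val person3 = new Person("Rebecca", "Valentine", 42)
println(person1.lastName)
```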

Conclusion

I really don’t mean to be a pain.  After all, someone could write exactly the same sort of rant about any language.  Overall, I still think Scala is a big win over any imperative language I’ve ever seen.  The point I’m trying to get across is that there’s no silver bullet.  No language is perfect (not even Scala).

Scala’s at an interesting place right now.  When I work with it and when I read the buzz around the blogosphere, I get the same feeling as I did about Ruby 5 years ago.  Back then Ruby was a virtually unknown Asian language with some interesting dynamic constructs.  Now there’s so much hype built up behind the language it could choke a mule.  People are rallying to its cause and breaking out the ACME flame-throwers to deal with any dissidents.  I would hate to see that happen to Scala.

Final word to Java developers everywhere who haven’t tried Scala yet: you’re missing it!!  Learn Scala.  Try a few applications.  Trust me, you won’t regret it.  Just be sure to drink plenty of fluids, I’m sure you’ll run into your share of oddball syntax corners.

Comments

  1. You are mistaken: multiple constructors are possible, but they all have to call another constructor in their first statement and by association the primary constructor has to be called *always* (as I remember this is because of the linearization done when using mixins similar to multiple inheritance):
    case class Person(firstName:String, lastName:String, age:Int) {
      def getFirstName() = firstName
      def getLastName() = lastName
      def getAge() = age

      def this(firstName:String, lastName:String) = this(firstName, lastName, 20)

      def this(fullName:String) = {
        this(fullName.split(' ')(0), fullName.split(' ')(1))
      }
    }

    On the point that Scala isn’t perfect — I fully agree, but I think what’s not perfect about it may be completely different for everyone. For example I am very happy with the way apply() and update(), unapply() and other magic functions work. It just requires a little different thinking, but I got used to it pretty quickly. The array design did seem very odd at first, but it’s actually very consistent with the rest of Scala.

    Erkki Lindpere Wednesday, January 2, 2008 at 3:13 am
  2. I agree with Erkki. When I looked at your array access methods that were supposed to be confusing I wasn’t sure what point you were making, because I read them straight through. That’s not to say I’m some kind of Scala master, because I’m certainly not, but I have gotten used to its collection instantiation, access, and updating methods, and I’m pretty sure they are just different and not worse than the old square brackets. They have the advantage of being something any class can implement (and so List does, etc.), more than enough to make them worthwhile.

    n8han Wednesday, January 2, 2008 at 6:24 am
  3. People differ in their opinion about the magic behind () depending on how much legacy code they had to read in their life. Those with their share of legacy code have stared at the road to madness, the others hopefully never see it.

    Peace
    -stephan


    Stephan Schmidt :: stephan@reposita.org
    Reposita Open Source – Monitor your software development
    http://www.reposita.org
    Blog at http://stephan.reposita.org – No signal. No noise.

    Stephan Schmidt Wednesday, January 2, 2008 at 7:32 am
  4. @Erkki
    Ah, I figured I had to be missing something, though even the “def this” syntax seems a bit…strange. I’m adding an update to the article so I don’t lead hapless hoplites astray.

    @n8han
    It is actually all quite logical, it just strikes me as very odd and dangerous. The reasoning I’ve always subscribed to behind the square brackets is that it’s an operator that’s separate from the array parameter markers. If you start blurring the line between the two, you get odd code like this:

    def getArray() = Array(1, 2, 3, 4)
    println(getArray()(1))

    It’s not necessarily wrong, just a bit less intuitively readable.

    Daniel Spiewak Wednesday, January 2, 2008 at 8:46 am
  5. “It’s not necessarily wrong, just a bit less intuitively readable.”

    This is only true in the context of other languages. I’ve heard similar arguments against Python’s magic len()/__len__ style constructs too, but when something permeates the language, claiming that it is “less intuitively readable” is equivalent to saying, “this doesn’t work how it would in another language I know.” In practice this sort of thing is usually only weird for people that are new to the language and trying to draw syntactic analogies as they learn.

    Adam Wednesday, January 2, 2008 at 9:17 am
  6. :-) Point taken. However, the other qualification worth considering (for syntax intuitiveness) is that the syntax is consistent with the rest of the language. Just having a construct permeate the language is not sufficient to constitute consistency, as the construct may contradict another construct or lead to visually ambiguous character sequences (such as in this case).

    Daniel Spiewak Wednesday, January 2, 2008 at 9:21 am
  7. Scala doesn’t support operator overloading, because in Scala, there are no operators. Everything is a method in Scala. The dangers of supporting operator overloading don’t apply to Scala because it doesn’t support operator overloading.

    Isn’t it nice to have a compiler that ‘does what I mean, not what I say’? It’s neat, in my opinion, that the compiler automatically maps ‘anArray(3)’ to ‘anArray.apply(3)’, or ‘myObject.myProperty = 3’ to ‘myObject.myProperty_=(3)’ for me (instead of me having to explicitly type these longer versions). The compiler is guessing what I mean. Cool. In his Programming in Scala book, Martin Odersky, the Scala designer, is telling me that after a while I’ll learn what the compiler will guess in different situations. I’m willing to take him at his word for the moment because it cuts down on the number of characters I have to type.

    Regards,
    (Still a Scala newbie) John

    John Franey Wednesday, January 2, 2008 at 11:58 am
  8. Yes you’re right, Scala really supports operator overloading in the same way that Ruby does (they’re not really operators, just symbolic method names). This doesn’t alleviate the dangers of operator overloading though.

    When you redefine what an operator does for a certain type, you’re (potentially) breaking some very basic assumptions people will make when reading your code. The canonical example (done in Ruby just to get open classes):

    class Fixnum
      def +(other)
        self.-(other)
      end
    end

    puts 4 + 2 # prints 2

    Obviously most developers won’t be so obviously stupid, but you get my drift. In reality, operator overloading is no more dangerous than defining a method with a poorly descriptive name. The difference is that people believe they understand what operations operators are describing and it’s far more confusing to them when those assumptions are proven inaccurate by a badly overloaded operator.

    Daniel Spiewak Wednesday, January 2, 2008 at 12:07 pm
  9. The parentheses can be left out so there is a convention to stress that a method doesn’t have side effects (and is as well a historical remnant), more detailed discussion:
    http://www.nabble.com/-scala–%28Side-Effects%29-and-Parens-td13438418.html#a13438418

    lopex Wednesday, January 2, 2008 at 1:19 pm
  10. I agree [but I don't take the quoted Ruby code as an operator overload. It seems to me to be a clear-cut method override.]

    I think your point is that when developers see “a + b” they presume it means what it means in mathematics and they cannot be shaken from that presumption. They presume that this operator has all the qualities of the ‘+’ operator with numeric operands, i.e. associative, commutative, transitive and symmetric qualities in addition to the summation behaviour. Not all implementations of operator + shared these qualities and so developers made incorrect assumptions. Reuse of the + name for an operation therefore is more dangerous somehow than reuse of the name ‘plus’ instead.

    Elliotte Rusty Harold thinks about the same: http://cafe.elharo.com/java/operator-overloading-considered-harmful/

    Check out Bruce Eckel’s rebuttal on that page. He starts with: “Unfortunately, Gosling did not seem to understand the actual problems with operator overloading in C++, and so he condemned it out of hand. People have been echoing this condemnation ever since.” Bruce says the real problem of operator overloading was complexity of implementation due to memory management concerns.

    So, since Scala is not C++, and does not have these memory issues, maybe Scala developers have an opportunity that Java developers don’t have: to see the good side of operator overloading when it’s appropriate.

    By the way, Scala helps the developer with respect to precedence. If a method name starts with a ‘*’ then it has the precedence of a ‘multiplication’. If a method name starts with ‘+’, it has the precedence of an ‘addition’, and so on. So problems with precedence won’t be so easy to implement in Scala.

    Personally, I like operator overloading.

    Regards,
    John

    John Franey Wednesday, January 2, 2008 at 2:03 pm
  11. Nice post! I’m trying to like Scala, but I’ve run into exactly the same kind of “things that make me go hmmmm.” It’s good to hear that I’m not the only one. Until the Scala docs improve, I’ll be sticking with Java + Groovy.

    hohonuuli Wednesday, January 2, 2008 at 3:30 pm
  12. @hohonuuli
    Actually, I’ve run into fewer gotchas learning Scala than I did when I tried to pick up Groovy. Groovy basically started with Java’s syntax and scriptified it, which leads to tons of odd angles and corners. The result is face-polished and quite nice, but when you really dig into it things start to get hairy. Because Scala wasn’t trying to be another language at the same time, I think it’s a bit easier to learn.

    Dead on with the docs comment though. I was able to get on by myself more or less using an interactive Scala shell, the core scaladocs and the Scala wiki, but there were a few places that I just couldn’t fight my way through the woods. That’s when you contact your local scala expert (in my case Alex Blewitt) and beg advice. :-) It’s something the Scala language community really needs to work on though. Too much of the “documentation” is just left up to the imagination and ingenuity of the developer.

    Daniel Spiewak Wednesday, January 2, 2008 at 3:42 pm
  13. @Daniel
    Oh, I completely agree that Groovy has “odd angles and corners”; but it sure is fun to program with and I can get lots of work done with it. Scala on the other hand, well, here’s a few examples of why Scala “drives me crazy”:

    (1 until 1000).filter(n => n % 3 == 0 || n % 5 == 0).foldLeft(0)(_ + _)

    lazy val fib: Stream[Int] = Stream.cons(0, Stream.cons(1, fib.zip(fib.tail).map(p => p._1 + p._2)))

    These examples are from http://scala-blogs.org/2007/12/project-euler-fun-in-scala.html. They illustrate that it’s not enough to know what arguments a method or function takes. You have to know the “special details”, like what the underscores do in foldLeft and what p._1 and p._2 mean. Take a look at the docs for p._1 at http://www.scala-lang.org/docu/files/api/scala/Product2.html#_1. (They weren’t too helpful for me.) BTW, if you picked Scala right up then I bow down before you. ;-)

    hohonuuli Wednesday, January 2, 2008 at 7:44 pm
  14. Yeah, I think it’s safe to say that the examples on the scala-lang page were designed for people coming from the functional paradigm. What Scala needs is a good introductory page for Java/imperative developers. After all, that’s where most of the meat is that’s interested in the language.

    Daniel Spiewak Wednesday, January 2, 2008 at 7:52 pm
  15. I really enjoyed reading this post about Scala. Mostly because I have been thinking about looking at Scala myself, but have not really found the time (or real interest) in it yet. Until I got to your conclusion, my own conclusion was that this language (also) will not be the one that beats the Java platform.

    When you write “..Scala’s at an interesting place right now….”, I think: But will it change for the better? Isn’t it kind of rare that a language changes in areas like the ones you mention (constructors, static constructs, () “overloading”,…) when it has already been published? Changing such stuff radically would break a lot of stuff around already, not to mention piss off the current followers, who like the constructs?

    As the language stands right now, it seems to me, that it is a bit too different, from what the masses will jump onto. A big part of this is all the functional stuff. When clever people say, that the next big language will be functional, I tend to disagree. Maybe they are right, when you think about what we need to be able to write faster programs for the many cores of the future. But I think they are missing the migration path from the current big language (Java) to the upcoming one. I can’t really figure out if it is just me, that is getting old :-)

    Per Olesen Thursday, January 3, 2008 at 5:05 am
  16. It’s absolutely correct to say that the docs for Scala are designed for someone coming from a functional programming background. When I first looked at Scala I realized that I couldn’t get my head around the functional paradigm and so contented myself with Ruby and Groovy (Groovy has very different advantages and disadvantages from Scala, such as metaprogramming, and shouldn’t be thought of as a replacement; rather they both stand on their own).

    I then did some research on Lisp/Scheme, Haskell, Erlang and OCaml. A few days ago I bought the pre-release of the new Scala book and am blazing through it. The weirdnesses mentioned in the article above make *perfect* sense from a functional language and I am *loving* the language. As for things like the p._1 – as the book explains they are for tuples, which are a *core concept* of functional languages, and the syntax makes sense once you understand what they are (‘parameter 1’, ‘parameter 2’), with the reason they are not array syntax being because of varying return types. I suspect the syntax is far less confusing than the ‘Anonymous Classes’ would be for those who have never seen that either.

    No-one can say for certain that Scala will never replace Java on the JVM. There are two things to remember.

    1) As Scala compiles to bytecode, *parts* of projects can be replaced by Scala. For example, in a complex financial application, perhaps the calculation routines could be done in Scala by the senior developers and just used as a normal library by the other developers.

    2) Which languages are used in the future will depend on developments in concurrency. If we start using 80 core Intel chips in the future then threading is going to ‘separate the men from the boys’ as I’ve heard it described :) Scala’s functional subset and its Actor model are going to make it start looking pretty good compared to Java.

    btw, has everyone forgotten how weird Java’s constructor syntax looked compared to C++’s which we were using before Java came out?

    Cheers

    Nos Doughty Wednesday, April 30, 2008 at 1:57 am
  17. Is there a way to fake out the syntax for something other than case classes? Non-case classes require the “new” operator. I suppose I could duplicate the arguments in the object and create a delegation to the “new” constructor…

    Robert Fischer Thursday, May 20, 2010 at 1:30 pm
  18. > I suppose I could duplicate the arguments in the object and create a delegation to the “new” constructor…

    That’s basically what you have to do. :-) In fact, that’s almost exactly the same thing that case class is doing for you. What you’re going to end up with is something like this:

    class Foo(a: Int, b: String) {
      ...
    }
    
    object Foo {
      def apply(a: Int, b: String) = new Foo(a, b)
    }

    Daniel Spiewak Saturday, May 22, 2010 at 2:48 pm
