
Quick Explanation of Scala’s (_+_) Syntax

9 Jan 2008

It seems every time I turn around, someone else is railing against Scala for having an enormously cryptic syntax, citing (_+_) as an example.  Aside from the fact that it looks like an upside-down face, I think this little construct is probably catching more flak than it deserves.  To start with, let’s get a bit of context:

val numbers = Array(1, 2, 3, 4, 5)
val sum = numbers.reduceLeft[Int](_+_)
 
println("The sum of the numbers one through five is " + sum)

Block out the second line for just one second.  The example is really pretty understandable.  The first line creates a new value (not a variable, so it’s constant) of inferred type Array[Int] and populates it with the result of the factory method Array.apply[T](T*), which creates a new array from the given values.  The equivalent of the whole example in Java would be something like this:

final int[] numbers = {1, 2, 3, 4, 5};
int sum = 0;
for (int num : numbers) {
    sum += num;
}
 
System.out.println("The sum of the numbers from one to five is " + sum);

You’re probably already getting a bit of an idea on the meaning behind that magic (_+_) syntax.  Let’s go a bit deeper…

In Scala, arrays, lists, buffers, maps and so on all mix in the Iterable trait.  This trait defines a number of methods (such as foreach) which are common to all such classes.  This is precisely why the syntax for iterating over an array is the same as the syntax for iterating over a list.  One of the methods defined in this trait is the higher-order function reduceLeft.  This method combines the first two values in the array using a function parameter, then combines that result with the third value, and so on.  The process repeats until only a single value remains, which is returned.

Linearized, the function looks something like this:

val numbers = Array(1, 2, 3, 4, 5)
var sum = numbers(0)
sum += numbers(1)
sum += numbers(2)
sum += numbers(3)
sum += numbers(4)
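
The unrolled steps above generalize to a loop that takes the combining operation as a parameter.  What follows is only a minimal sketch under that assumption: reduceLeftSketch is a hypothetical helper written for illustration, not the standard library’s implementation of reduceLeft.

```scala
object ReduceSketch {
  // Hypothetical helper (not the library's code): combines the elements
  // of an array from left to right using the supplied function.
  def reduceLeftSketch[A](values: Array[A])(op: (A, A) => A): A = {
    require(values.nonEmpty, "cannot reduce an empty array")
    var acc = values(0)            // start with the first element
    var i = 1
    while (i < values.length) {
      acc = op(acc, values(i))     // merge the next element into the running result
      i += 1
    }
    acc
  }

  val numbers = Array(1, 2, 3, 4, 5)

  // Evaluates as ((((1 + 2) + 3) + 4) + 5)
  val sum = reduceLeftSketch(numbers)(_ + _)
}
```

Because the operation is a parameter, the same loop serves for any way of combining two values, not just addition.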

I mentioned that reduceLeft uses a function parameter to perform the actual operation, and this is where the (_+_) syntax comes into play.  Obviously such an operation would be less useful if it hard-coded addition as the method for combination, thus the function parameter.  You’ll find this is a common theme throughout the Scala core libraries: using function parameters to define discrete, repeated operations so as to make generic functions more flexible.  This pattern is also quite common in the C++ STL algorithms library.

(_+_) is actually a Scala shorthand for an anonymous function taking two parameters and returning the result obtained by invoking the + operator on the first, passing the second as its argument.  Less precisely: it’s addition.  Likewise, (_*_) would define multiplication.  With this in mind, we can rewrite our first example to use an explicitly defined closure, rather than the shorthand syntax:

val numbers = Array(1, 2, 3, 4, 5)
val sum = numbers.reduceLeft((a:Int, b:Int) => a + b)
 
println("The sum of the numbers one through five is " + sum)

The examples are functionally equivalent.  The main advantage to this form is clarity.  Also, because we’re now explicitly defining the anonymous function, we no longer need to specify the type parameter for the reduceLeft method (it can be inferred by the compiler).  The disadvantage is that we’ve added an extra 20 characters and muddied our otherwise clean code snippet.
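
To make the equivalence concrete, here is a small sketch that writes the same reduction several ways.  The object and value names (UnderscoreForms, shorthand, explicit, methodCall, add, named) are mine, chosen purely for illustration.  Note that each underscore stands for a separate parameter, taken in order, so _ + _ always means "first plus second".

```scala
object UnderscoreForms {
  val numbers = Array(1, 2, 3, 4, 5)

  // Placeholder shorthand: each underscore is a distinct parameter
  val shorthand = numbers.reduceLeft[Int](_ + _)

  // The same function with named, explicitly typed parameters
  val explicit = numbers.reduceLeft((a: Int, b: Int) => a + b)

  // "+" written as an ordinary method call on the first parameter
  val methodCall = numbers.reduceLeft((a: Int, b: Int) => a.+(b))

  // The function stored in a value first, then passed by name
  val add: (Int, Int) => Int = _ + _
  val named = numbers.reduceLeft(add)
}
```

All four values come out the same; the forms differ only in how much of the function is spelled out.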

Another bit that is worth clarifying is that Scala does not limit this syntax to the addition operator.  For example, we could just as easily rewrite the example to reduce to the factorial of the array:

val numbers = Array(1, 2, 3, 4, 5)
val prod = numbers.reduceLeft[Int](_*_)
 
println("The factorial of five is " + prod)

Notice we have changed the (_+_) construct to (_*_).  This is all that is required to change our reduction operation from a summation to a factorial.  Let’s see Java match that!
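
The same swap works for other two-argument operations.  The sketch below is illustrative only; the names OtherReductions, largest, smallest, difference and factorial are hypothetical, and the factorial helper assumes n is at least 1, since reduceLeft has nothing to return for an empty range.

```scala
object OtherReductions {
  val numbers = Array(1, 2, 3, 4, 5)

  // Methods written in operator position work with underscores too
  val largest  = numbers.reduceLeft[Int](_ max _)
  val smallest = numbers.reduceLeft[Int](_ min _)

  // Order matters for non-commutative operations:
  // ((((1 - 2) - 3) - 4) - 5) = -13
  val difference = numbers.reduceLeft[Int](_ - _)

  // Hypothetical helper: factorial for n >= 1, reducing the range 1..n
  def factorial(n: Int): Int = (1 to n).reduceLeft[Int](_ * _)
}
```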

Admittedly, the underscore syntax in Scala is a rather odd looking construct at first glance.  But once you get used to it, you can apply it to many complex and otherwise verbose operations without sacrificing clarity.  In fact, the only thing you sacrifice by using this syntax is your intern’s ability to read and modify your code; and really, is that such a bad thing to be rid of?  :-)

Comments

  1. Here’s my question: how come we don’t have to tell reduceLeft() to use 0 as the starting point? I always thought of reduce() as a method that took a list, a starting value, and a binary function but you’re only providing the list and the binary function.

    Michael Chermside Wednesday, January 9, 2008 at 8:13 am
  2. Good question! You’re actually thinking of foldLeft/foldRight. Scala separates the two functions, even though they are often discussed as one in computer science courses. The syntax for foldLeft would look like this:

    val numbers = Array(1, 2, 3, 4, 5)
    val sum = numbers.foldLeft(0)(_+_)

    The reason I didn’t pick foldLeft is I didn’t want to have to explain how a method signature could have two sets of parentheses. Still seems pretty odd to me, but it’s actually valid syntax.

    Anyway as you pointed out, foldLeft and reduceLeft do essentially the same thing, with the exception that foldLeft takes a starting value with which it “primes” the fold (so it merges the starting value with index 0, then the merged value with index 1 and so on).

    Daniel Spiewak Wednesday, January 9, 2008 at 9:04 am
  3. I agree that (_ + _) is not too cryptic. However, as much as I’ve been enjoying Scala, I admit I’ve been a little disappointed with its syntax for stuff like this. I’ve also been dabbling with Haskell recently, and its version looks like:

    foldl1 (+) [1, 2, 3, 4, 5]

    Further, there are other places in Scala where the closure syntax seems too verbose because of the need to declare types. I think this makes anonymous functions a little bit clunky. Dynamic languages don’t suffer from this, of course, and Haskell doesn’t either, because its type inference is more universal. It seems Scala has had to sacrifice some of this in order to reconcile OO and FP in a static language. Still, I can live with it!

    Matt Wednesday, January 9, 2008 at 4:27 pm
  4. Nice post, very informative. This kinda gets into why I think Scala’s a bad idea for some, like for example, Java programmers. The fact that _+_ is a function that applies the ‘+’ operation to two parametric types is completely obvious if you’re familiar with SML. But to an imperative programmer trying to wrap their head around first class functions AND unfamiliar syntax, well, I expect to see Java programmers writing a lot of Java code in Scala.

    Mark Wednesday, January 9, 2008 at 5:16 pm
  5. Oh, I see. So reduceLeft() is an odd function that treats various elements of the Iterable differently. Let me see if I can write what it does:

    * if the iterable is of length 0, then it throws an exception
    * if the iterable is of length 1, then it returns the single element
    * if the iterable is of length 2+ then it stores the first element in a buffer, then repeatedly performs the binary operation on the buffer and an iterable element, storing the results in the buffer.

    Okay, I guess only the FIRST element is being treated differently from the rest. Odd… but probably fairly useful. Handy, for instance, for cases like this:

    val sqlCols = columnNames.reduceLeft[String](_ + ", " + _)
    val sqlQuery = "SELECT " + sqlCols + " FROM tableName WHERE " + sqlWhere

    – Michael Chermside

    Michael Chermside Thursday, January 10, 2008 at 8:03 pm
  6. Yeah, if you think about use-cases, reduceLeft really turns out to be more commonly useful than foldLeft (though they can be used for the same purpose). One of the problems with fold is you have to account for the empty starting value. Example:

    val sqlCols = columnNames.foldLeft("")(_ + ", " + _)
    assert(sqlCols == ", name, id, value")

    The leading comma won’t occur if reduce is used.

    Daniel Spiewak Friday, January 11, 2008 at 7:08 am
  7. Hey, very good post. I like Scala very much, but over the last few months I thought it was becoming more cryptic from release to release. The real problem is that the original example does not explain this as well as yours does. Now I understand it. Thank you!

    Perhaps you could add this to the scala wiki.

    Joerg Gottschling Thursday, February 14, 2008 at 12:19 am
  8. Two remarks:
    a) (Question) The [Int] in the first Scala-block:
    val sum = numbers.reduceLeft[Int](_+_)
    Is it type information for ‘reduceLeft’, which is operating on a collection of arbitrary but homogeneous type?
    It’s not the return type?

    b) (Not a question) Without operator-overloading or self defined operators, it wasn’t perfectly clear to me that I can use any self defined function instead of +, for example max:
    scala> def max (a:Int, b:Int):Int = { if (a>=b) return a ; return b}
    max: (Int,Int)Int

    scala> val mmax = numbers.reduceRight[Int](_max_)
    :5: error: not found: value _max_
    val mmax = numbers.reduceRight[Int](_max_)
    ^
    scala> val mmax = numbers.reduceRight[Int](_ max _)
    mmax: Int = 5

    Ah! :)

    Stefan Wagner Tuesday, March 18, 2008 at 9:15 pm
  9. Minor meta-hint:
    The navigation here shows:

    I would expect:

    bye.

    Stefan Wagner Tuesday, March 18, 2008 at 9:17 pm
  10. Actually, your max function isn’t being called at all in that example. Scala’s operators are not infix, but actually left-associative method calls. Another way of writing your reduction would be as follows:

    val mmax = numbers.reduceRight((a:Int, b:Int) => a.max(b))

    max(Int) is a method within class RichInt, to which any Int value can be implicitly coerced.

    The type parameter for reduceRight is actually what allows us to use the underscores rather than properly typed values. The trick is that Scala cannot infer the types of the underscores without some sort of specified initial type. In this case, we’re giving Scala the return type explicitly, so it is able to infer the rest on its own. I considered using foldLeft instead of reduceLeft, but I thought that the added confusion of a curried function would obscure the point I was trying to make.

    This is also a valid way of performing the reduce in the article:

    numbers.foldLeft(0)(_+_)

    Daniel Spiewak Tuesday, March 18, 2008 at 9:25 pm
  11. Daniel wrote:

    >Another way of writing your reduction would be as follows:
    >val mmax = numbers.reduceRight((a:Int, b:Int) => a.max(b))

    Yet another way to phrase this is to use the original function more directly.
    Assuming the original author’s function should have been called something
    besides ‘max’ to avoid the name collision:

    def mmax (a:Int, b:Int):Int = { if (a>=b) return a ; return b}

    val nums = Array(1, 9, 3, -1, 11, 0)

    then the reduceRight could also be applied as:

    nums.reduceRight[Int](mmax(_,_))

    Tom Hicks Wednesday, April 9, 2008 at 10:32 am
  12. I think the premise of this post is a bit misguided. Nobody’s saying that understanding this construct is rocket science.

    The problem is that this construct isn’t self-explanatory… even if one has learnt the basic feel of Scala. The more stuff like this a language has, the longer the learning curve is. That’s the issue. You’re asking someone new to find this web page for this cryptic construct, another page for another cryptic construct, etc., etc. Then, you have to find this page again when you forget how this non-intuitive thing works.

    It’s a gradient scale but, eventually, newcomers will start to say, “There’s too much arcana here.” The more you put in, the worse it is.

    I’m writing a natural language parser for my PhD project (I think Scala’s the best/most fun/most productive language). I want others to extend it eventually. So, I take the time to *type out* certain things in long form so that my intent is clear to another person trying to learn the language and my code *at the same time*, which is inevitable given Scala’s marginal status.

    Scala is great but not perfect. I wish everyone who likes it didn’t feel the need to defend every part of it, even when they don’t agree, e.g. “The reason I didn’t pick foldLeft is I didn’t want to have to explain how a method signature could have two sets of parentheses. Still seems pretty odd to me, but it’s actually valid syntax.”

    greg Tuesday, October 26, 2010 at 7:57 am
  13. Hmm, that underscore is a bit too magic for me. I prefer the Boost.Lambda way where they’re numbered, so you can tell multiplication _1 * _2 from squaring _1 * _1.

    Scott Friday, October 21, 2011 at 11:08 pm
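
The fold-versus-reduce behavior discussed in the comments above can be checked with a short sketch.  The column names are the ones implied by the assertion in comment 6; the object and value names are hypothetical, and mkString is shown only as a standard-library alternative for joining strings, not as something the original discussion used.

```scala
import scala.util.Try

object FoldVersusReduce {
  val columnNames = List("name", "id", "value")

  // reduceLeft starts from the first element, so there is no leading separator
  val reduced = columnNames.reduceLeft[String](_ + ", " + _)

  // foldLeft starts from "", so the separator is also placed before "name"
  val folded = columnNames.foldLeft("")(_ + ", " + _)

  // mkString joins elements with a separator and also copes with empty lists
  val joined = columnNames.mkString(", ")

  // A single element is returned as-is; the function is never called
  val single = List("only").reduceLeft[String](_ + ", " + _)

  // An empty collection has no first element, so reduceLeft throws;
  // Try captures the exception instead of letting it propagate
  val emptyReduce = Try(List.empty[String].reduceLeft[String](_ + ", " + _))

  // foldLeft on an empty collection simply returns the starting value
  val emptyFold = List.empty[Int].foldLeft(0)(_ + _)
}
```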
