The Benefits of Type Inference

16 Jan 2008

There’s been some complaining around the blogosphere about the use of Scala’s type inference mechanism, how it makes it difficult to know what type you’re dealing with since the only explicit declaration of type could be several steps up the call stack.  This is true.  What is also true (and really more important) is that Scala almost completely eliminates the need to worry about what type you’re dealing with; and it does this without sacrificing the all-important compile-time type checking.

Developers with experience in dynamic languages such as Ruby or Python will know what I mean when I say that there comes a time when you stop worrying about what type a value is and concern yourself more with how you can use it.  For example, when I code in Ruby, I often write code like this:

def operate(v)
  if v < "daniel"
    v += "spiewak"
    return v[0..(v.size - 7)]
  end

  v
end

When I wrote this method, I wasn’t necessarily thinking about v as a String, but rather as some value which defines the less-than operator against a String, the + operator taking a String, and the square brackets operator taking a Range.  I also never documented anywhere that this method expects a String.  I probably could document it that way, and many people would do so.  But really, all that needs to be documented is that the value passed to the method defines those three functions.

Scala enables the same sort of “duck typing” but adds static checking.  It has constructs which enable values to be typed based on defined functions and such, but I’m not referring to these.  The feature of Scala which really enables one to almost forget about type is the static type inference.  Without it, not only would the language be far more verbose, but also every algorithm, every function would have to pedantically spell out precisely the type expected in every situation.  This sort of restriction makes algorithms far more rigid and far more difficult to manage in the long run.  This is the reason interfaces are used for everything in Java, to satisfy the compiler just enough to allow an arbitrary type to be passed under the surface.
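
For the curious, one of those constructs is the structural type, which describes a value purely by the methods it must define.  A rough sketch (the class and method names here are purely illustrative):

object StructuralSketch {
  class Invoice { def render: String = "invoice" }
  class Receipt { def render: String = "receipt" }

  // doc may be any value that defines a render method returning a String
  def printIt(doc: { def render: String }): Unit = println(doc.render)

  def main(args: Array[String]): Unit = {
    printIt(new Invoice)
    printIt(new Receipt)   // no common interface required
  }
}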

Think about it.  How many times have you written code to use the java.util.Map interface rather than HashMap directly to avoid strongly coupling yourself to one Map implementation?  In Java, one usually writes code like this:

import java.util.Map;
import java.util.HashMap;
 
public class MapTest {
    private Map<Integer, String> myMap;    // using the interface type
 
    public MapTest() {
        myMap = createMap();
    }
 
    public Map<Integer, String> createMap() {
        return new HashMap<Integer, String>();
    }
}

Granted, this is needlessly verbose, but it’s not too far off the mark when it comes to Java design patterns.  This code makes it fairly trivial to change the underlying map implementation used throughout the class.  The implementation can even be changed by a subclass that overrides the createMap() method.

In this code, we’re specifying semantically what sort of operations we need.  We’re being forced to tell the compiler that we want a map, but we’re not particular about what sort of map we get.  In short, we’re typing myMap less strongly than if we used the HashMap type directly.  This sort of “loose typing” is good, because we don’t want to rely on a particular implementation.  Unfortunately, the Java way of implementing this typing has given rise to the ubiquitous use of what I call “the I pattern”.  Just look at the Eclipse RCP hierarchy or virtually any of the classes in the Wicket API.  Almost all of them are typed against their almost-pointless interface definition; IModel for example.

Scala avoids this problem with type inference.  Type inference is inherently “loose typing” because it’s constraining the type based on the value which defines it.  Translating the Java sample into Scala yields the following:

import scala.collection.mutable.HashMap
 
class MapTest {
  var myMap = createMap()
 
  def createMap() = new HashMap[Int, String]
}

The myMap variable is actually of type HashMap, and the compiler knows and enforces this.  However, we can change the return value from createMap() and it will be immediately reflected in the type of myMap.  Thus there is no need to type the variable against its superinterface, since the flexibility we want is already available through type inference.

The only thing we do need to worry about here is not using any methods which are specific to HashMap.  If we do use such methods, we’ll have to change our code in more than one place when we want to swap the map implementation.  However, when’s the last time you used a method that was specific to a collection implementation?
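
To make the flexibility concrete, here is what that single change might look like (LinkedHashMap is simply a stand-in for whichever alternative implementation you might pick):

import scala.collection.mutable.LinkedHashMap

class MapTest {
  var myMap = createMap()

  // the only line that changes; myMap's inferred type follows automatically
  def createMap() = new LinkedHashMap[Int, String]
}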

The point of the original article is actually that type inference makes it harder to see what type your variable (or even your method) is going to be, since the only explicit declaration of type is tucked away out of sight.  This is true.  In the above example, to know what type myMap is, we need to remember that createMap() actually returns a HashMap[Int, String] value.  The point I’m trying to make though is that worrying about precisely what type each variable may be is the wrong approach.  All you need to know is what they can do, and this is something which is easy enough to handle.

So, does type inference make you do more work than necessary?  Yes, but only if you’re so rigidly dogmatic in your mindset that you can’t accept that the precise type doesn’t matter.  To those of you who have an aversion to this sort of pattern: deal with it (you’ll thank me later).

Comments

  1. Good series Daniel!

    Must have been an editor in another life …
    myMap = createMap() => missing semicolon, Scala is getting you ! ;)
    but a subclass => by a subclass

    javier Wednesday, January 16, 2008 at 5:19 pm
  2. In a perfect world, this is so. My concern was that when things weren’t working right, I was having the darnedest time trying to figure out *which* code was running (and needed to be fixed). In Java things are simplified because the method being executed has to be in the specific class or somewhere up the “extends” chain – unless the method is abstract, in which case it will be specified in a child class somewhere down the “extends” chain. These are usually tracked down quite easily by an IDE.

    In Scala, the method being executed could also be a nested method within the current one (not a big deal, really, just something to get used to looking for) and could also be anywhere in the tree of traits rooted in the actual type of the object – which is the fun part. Determining the actual type of the object, then recursively enumerating all of its parent classes, traits, and objects, which generally requires locating the source code for those classes, traits, and objects, along with any companion objects which might or might not exist, can be a problem.

    The process of tracking these down is exacerbated by Scala’s not having any requirements for source file naming and location, and (to an extent) by Scala’s allowing multiple classes, traits, and objects in each source file whether related to each other or not. In theory it’s also exacerbated by Scala’s ability to rename on import, but that feature isn’t used enough to be a real problem.

    I was reminded of the bad old days when I worked in C++, and some twit would add a new method to foo.hpp, then put the actual implementation code in bar.cpp because they happened to be writing the Bar class when they discovered they needed a new method on Foo. At least there you could grep for Foo::NewMethod fairly effectively. Doing a grep for, say, ‘apply’ gives quite a number of choices to pick from.

    A good IDE could probably help a lot, as it does with Java.

    Anyway, I like type inference. That’s not the source of the problem. The problem is caused by the ability to achieve Separation of Concerns by splattering tiny scraps of code across dozens of classes, traits, objects, and companion objects, then bringing them all together at the point of use (in different combinations at each point, if you like). It sounds like a great idea until you have to try to debug it.

    Doug Pardee Wednesday, January 16, 2008 at 5:31 pm
  3. @Doug

    Definitely valid points. TBH, I agree with you fully on this. With my article, I was just trying to nip in the bud any anti-hype about type inference. A lot of people nowhere near as smart as you are going to half-read your article, try out Scala for ten seconds, and decide that type inference makes life impossible and they need the “crutch” of explicit types. (starting to sound like the dynamic typing wars all over again) It really would be nice to see a decent Scala IDE, one which could keep track of all this stuff for us and remind us when necessary. Someday soon I hope…

    @javier

    Obviously not a very good editor since I keep missing all these obvious mistakes. Thanks!

    Daniel Spiewak Wednesday, January 16, 2008 at 5:54 pm
  4. I just realized that I forgot to mention one other place that one needs to look when chasing down the code for a method: implicit conversions. I haven’t really used those yet, but I did have the newbie problem of being totally confused about where the heck the method String.last came from. [For the other newbies: there's an implicit conversion from java.lang.String to scala.RichString in the scala.Predef object, and RichString inherits the method scala.Seq[Char].last via the RandomAccessSeq[Char] mixin.]
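
    A tiny REPL-style sketch of how such a conversion gets defined (the names here are made up; this is not the actual Predef/RichString code):

    class RichGreeting(s: String) {
      def shout = s.toUpperCase + "!"
    }
    implicit def greet(s: String) = new RichGreeting(s)

    "hello".shout   // the compiler inserts greet("hello") around the receiver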

    Doug Pardee Wednesday, January 16, 2008 at 7:01 pm
  5. “….when’s the last time you used any function that was specific to a collection implementation?…”

    addFirst()/addLast()/getFirst()/getLast() of LinkedList, as far as I remember. And I don’t think it was that long ago :-) But, granted, it does not happen that often.

    BTW: This type-inference thing of Scala sounds an awful lot like what they introduced in C# in .Net3. M$ does not hold back on the evolution of the language. Java could get it too, although I guess it won’t.

    But the technique must have its limitations too. What if the MapTest scala class in the example was part of a public API, which was compiled and released as a lib in binary form. What can Scala infer from that, when someone links against the api? What would the inferred return type of createMap() be then?

    Per Olesen Thursday, January 17, 2008 at 3:57 am
  6. If you want to prevent people from depending on a specific implementation, you can add the type to the method or variable.

    def createMap: Map[Int, String] = new HashMap[Int, String]
    val map = createMap

    // HashMap-specific methods are not accessible

    Steve Bendiola Thursday, January 17, 2008 at 7:11 am
  7. Maybe I’m missing the point but you say:

    “worrying about precisely what type each variable may be is the wrong approach. All you need to know is what they can do, and this is something which is easy enough to handle.”

    Correct me if I’m wrong but this is the exact reason we program to interfaces in Java. We don’t care about the underlying implementation or the precise type of an object, we only care about those methods exposed by the interface and what the documentation or specs say those methods do.

    Brian Thursday, January 17, 2008 at 7:42 am
  8. @Brian

    In principle, you’re correct. Interfaces are designed to be precisely that, a type-agnostic constraint defining what method signatures will be available. The problem is that interfaces still are technically types and working with them explicitly imposes more constraints than just the method signatures. In the post I think I coined the phrase “loose typing”, meaning we’re not really tying ourselves to the specific implementation of the method signatures, but we *are* tying ourselves to the specific Map hierarchy. If we had our own Tree implementation that didn’t implement Map, even if it correctly implemented every method in question, we still couldn’t use it in place of our HashMap because it isn’t part of the Map type hierarchy. Type inference allows the type to logically float a bit, removing the explicit binding to a specific hierarchy.
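
    Sketching that idea with a made-up PrefixTree (purely illustrative, not a real collection class): as long as callers stick to get and update, the inferred type lets it drop straight in where the HashMap used to be.

    class PrefixTree[K, V] {
      private var entries = Map[K, V]()
      def get(key: K): Option[V] = entries.get(key)
      def update(key: K, value: V): Unit = { entries += (key -> value) }
    }

    class MapTest {
      var myMap = createMap()   // inferred as PrefixTree[Int, String]
      def createMap() = new PrefixTree[Int, String]
    }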

    @Steve

    Very correct. Actually, if I were doing this in a real app, I probably would have declared createMap() like so:

    def createMap():Map[Int, String] = new HashMap[Int, String]

    As you say, this would constrain us only to methods that are in Map, though it somewhat defeats the point I was trying to make. :-) What’s really going on here is I was being a bit lazy and didn’t come up with a more thorough example.

    Daniel Spiewak Thursday, January 17, 2008 at 9:52 am
  9. @Per Olesen

    Type information is compiled into the binaries (otherwise the compiler wouldn’t be able to type check anything). This holds true for both JVM and CLR binaries.

    Daniel Spiewak Thursday, January 17, 2008 at 9:55 am
  10. So, in the “loose typing” case of createMap(), the JVM bytecode would contain a HashMap, implementation specific, type?

    Per Olesen Thursday, January 17, 2008 at 12:55 pm
  11. Yes it would (again, a consequence of my laziness in example choice).

    Daniel Spiewak Thursday, January 17, 2008 at 12:58 pm
  12. Something helpful to illustrate the static/dynamic typing difference is something like:

    var myString = "Init"
    myString = 8

    Static typing like Scala’s, whether inferred or not, will not compile this. Dynamic typing like Ruby’s or Python’s will just change the myString alias/reference to point to an integer object instead of a string object. By the way, C# 3.0 has (local) type inference through the “var” keyword also…

    Reedo Thursday, January 17, 2008 at 1:41 pm
  13. You wrote: “What is also true (and really more important) is that Scala almost completely eliminates the need to worry about what type you’re dealing with…”

    I just asked Tony Morris how often Scala’s type inference was successful. He gave a very different answer: “Scala’s type-inferencing algorithm is very different to Haskell’s — where in Haskell we rarely need to write type-annotations (though we may choose to), in Scala it is pretty much the other way around — very rarely do you not need to.”

    So what’s the scoop?

    My comment is at the bottom of Tony’s blog article, “Monads do not compose”. http://blog.tmorris.net/monads-do-not-compose/

    Frank Atanassow Tuesday, April 5, 2011 at 5:08 am
