Case Classes Are Cool

11 Aug 2008

Of all of Scala’s many features, case classes have probably taken the most flak over the past year or so.  Not immutable data structures or even structural types, but rather a minor variation on a standard object-oriented construct.  This is more than a little surprising, especially considering how much work case classes can save when properly employed.

Quick Primer

Before we get into why they’re so nice, we should probably look at what they are and how to use them.  Syntactically, case classes are standard classes with a special modifier: case.  This modifier signals the compiler to assume certain things about the class and to define certain boiler-plate based on those assumptions.  Specifically:

  • Constructor parameters become public “fields” (Scala-style, which means that they really just have an associated accessor/mutator method pair)
  • Methods toString(), equals() and hashCode() are defined based on the constructor fields
  • A companion object containing:
    • An apply() constructor based on the class constructor
    • An extractor based on constructor fields

What this means is that we can write code like the following:

case class Person(firstName: String, lastName: String)
 
val me = Person("Daniel", "Spiewak")
val first = me.firstName
val last = me.lastName
 
if (me == Person(first, last)) {
  println("Found myself!")
  println(me)
}

The output of the above is as follows:

Found myself!
Person(Daniel,Spiewak)

Notice that we’re glossing over the issue of pattern matching and extractors for the moment.  To the regular-Joe object-oriented developer, the really interesting bits are the equals() method and the automatic conversion of the constructor parameters into fields.  Considering how many times I have built “Java Bean” classes solely for the purpose of wrapping data up in a nice neat package, it is easy to see where this sort of syntax sugar could be useful.
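
For the curious, here is a minimal sketch of what that glossed-over extractor buys us. It reuses the Person class from above; the ExtractorDemo object and its describe method are names invented purely for this illustration.

```scala
case class Person(firstName: String, lastName: String)

object ExtractorDemo {
  // The compiler-generated extractor lets a pattern take a Person apart
  // and bind its constructor fields in a single step.
  def describe(p: Person): String = p match {
    case Person("Daniel", last) => "Daniel of the " + last + " family"
    case Person(first, _)       => "someone named " + first
  }
}
```

The first case both tests the first name and binds the last name; the second binds the first name and ignores the rest.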

However, the above does deserve some qualification: the compiler hasn’t actually generated both the accessors and the mutators for the constructor fields, only the accessors.  This comes back to Scala’s convention of “immutability first”.  As we all know, Scala is more than capable of expressing standard imperative idioms with all of their mutable gore, but it tries to encourage the use of a more functional style.  In a sense, case classes are really more of a counterpart to type constructors in languages like ML or Haskell than they are to Java Beans.  Nevertheless, it is still possible to make use of the syntax sugar provided by case classes without giving up mutability:

case class Person(var firstName: String, var lastName: String)
 
val me = Person("Daniel", "Spiewak")
me.firstName = "Christopher"   // call to a mutator

By prefixing each constructor field with the var keyword, we are effectively instructing the compiler to generate a mutator as well as an accessor method.  It does require a bit more syntactic bulk than the immutable default, but it also provides more flexibility.  Note that we may also use this var-prefixed parameter syntax on standard classes to define constructor fields, but the compiler will only auto-generate an equals() (as well as hashCode() and toString()) method on a case class.
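
A small sketch of that last difference, assuming two throwaway classes (PlainPerson and CasePerson are hypothetical names chosen so as not to clash with Person above):

```scala
// Both classes get accessors and mutators from the var-prefixed parameters,
// but only the case class gets equals(), hashCode() and toString().
class PlainPerson(var firstName: String, var lastName: String)
case class CasePerson(var firstName: String, var lastName: String)

object EqualityDemo {
  val plainA = new PlainPerson("Daniel", "Spiewak")
  val plainB = new PlainPerson("Christopher", "Spiewak")
  plainB.firstName = "Daniel" // the mutator exists on a standard class too

  val caseA = CasePerson("Daniel", "Spiewak")
  val caseB = CasePerson("Daniel", "Spiewak")

  // Standard class: equals() is inherited, so this compares references
  val plainEqual = plainA == plainB
  // Case class: equals() is generated from the constructor fields
  val caseEqual = caseA == caseB
}
```

Here plainEqual comes out false even though both objects hold the same names, while caseEqual is true.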

Why Are They Useful?

All of this sounds quite nice, so why are case classes so overly-maligned?  Cedric Beust, the creator of the TestNG framework, even went so far as to call case classes “…a failed experiment”.

From my understanding of Scala’s history, case classes were added in an attempt to support pattern matching, but after thinking about the consequences of the points I just gave, it’s hard for me to see case classes as anything but a failure. Not only do they fail to capture the powerful pattern matching mechanisms that Prolog and Haskell have made popular, but they are actually a step backward from an OO standpoint, something that I know Martin [Odersky] feels very strongly about and that is a full part of Scala’s mission statement.

Well, he’s right…at least as far as the pattern matching bit is concerned.  Case classes are almost essential for useful pattern matching.  I say “almost” because it is possible to have pattern matching in Scala without ever using a single case class, thanks to the powerful extractors mechanism.  Case classes just provide some nice, auto-generated magic to speed things along, as well as allowing the compiler to do a bit more checking than would be otherwise possible.
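
To make the “almost” concrete, here is a rough sketch of pattern matching against a standard class with a hand-written extractor. The Email class and the ExtractorByHand object are made up for this example; nothing here comes from a real library.

```scala
// A standard (non-case) class
class Email(val user: String, val domain: String)

object Email {
  // Hand-written factory, standing in for what `case` would generate
  def apply(user: String, domain: String): Email = new Email(user, domain)

  // Hand-written extractor: the one thing a constructor pattern requires
  def unapply(e: Email): Option[(String, String)] = Some((e.user, e.domain))
}

object ExtractorByHand {
  def domainOf(value: Any): String = value match {
    case Email(_, domain) => domain
    case _                => "unknown"
  }
}
```

It works, but every line of the companion object is boiler-plate that the case modifier would have written for us, and the class still has no structural equals(), hashCode() or toString().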

The point that I think Cedric (and others) have missed entirely is that case classes are far more than just a means to get at pattern matching.  Even the most stringent object-oriented developer has to admit that a slick syntax for declaring a data container (like a bean) would be a nice thing to have.  What’s more, Scala’s automatic generation of a companion object for every case class lends itself very nicely to some convenient abstractions.  Consider a scenario I ran into a few months back:

class MainWindow(parent: Shell) extends Composite(parent, SWT.NONE) {
  private lazy val display = parent.getDisplay
 
  private val panels = Map("Foreground" -> ForegroundPanel, 
                           "Background" -> BackgroundPanel, 
                           "Font" -> FontPanel)
 
  setLayout(new FillLayout())
 
  val folder = new TabFolder(this, SWT.BORDER)
  for ((text, make) <- panels) {
    val item = new TabItem(folder, SWT.NONE)
    val panel = make(folder)
 
    item.setText(text)
    item.setControl(panel)
  }
 
  def this() = this(new Shell(new Display()))
 
  def open(): Unit = {
    parent.open()
    layout()
 
    while (!parent.isDisposed) {
      if (!display.readAndDispatch()) {
        display.sleep()
      }
    }
  }
}
 
case class ForegroundPanel(parent: Composite) extends Composite(parent, SWT.NONE) {
  ...
}
 
case class BackgroundPanel(parent: Composite) extends Composite(parent, SWT.NONE) {
  ...
}
 
case class FontPanel(parent: Composite) extends Composite(parent, SWT.NONE) {
  ...
}

If you ignore the SWT boiler-plate, the really interesting bits here are the Map of panels and the initialization loop for the TabItem(s).  In essence, I am making use of a cute little trick with the companion objects of each of the panel case classes.  The compiler automatically generates each of these objects as an extension of the function type (Composite)=>ForegroundPanel, where ForegroundPanel is replaced by the case class in question.  Because each of these classes extends Composite, the inferred type of panels will be Map[String, (Composite)=>Composite] (actually, I’m cheating a bit and not giving the precise inference, only its effective equivalent).

This definition lets us iterate over the elements of panels, generating a new instance by using each value as a function taking a Composite and returning a new Composite instance: the desired child panel.  It’s all statically typed without giving up either the convenience of a natural configuration syntax (in the panels declaration) or the familiarity of a class definition for each panel.  This sort of thing would certainly be possible without case classes, but more work would be required on my part to properly declare each companion object by hand.
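
To give a feel for that extra work, here is a rough sketch of the by-hand version. To keep it free of SWT it uses stand-in types: Widget, FontPanel, ColorPanel and PanelDemo are all hypothetical names, not the classes from my application.

```scala
// Stand-in for Composite, so the sketch has no dependencies
class Widget(val label: String)

class FontPanel(parent: Widget) extends Widget("font panel in " + parent.label)

// Without `case` we declare the companion ourselves: it extends the
// function type and forwards apply() to the constructor.
object FontPanel extends (Widget => FontPanel) {
  def apply(parent: Widget): FontPanel = new FontPanel(parent)
}

class ColorPanel(parent: Widget) extends Widget("color panel in " + parent.label)

object ColorPanel extends (Widget => ColorPanel) {
  def apply(parent: Widget): ColorPanel = new ColorPanel(parent)
}

object PanelDemo {
  // The companion objects themselves serve as the factory functions
  val panels: Map[String, Widget => Widget] =
    Map("Font" -> FontPanel, "Color" -> ColorPanel)

  // Each panel is only constructed when its factory is applied
  def build(root: Widget): Map[String, Widget] =
    panels.map { case (text, make) => text -> make(root) }
}
```

That is three extra lines per panel which exist only to restate the constructor signature, which is exactly the part the case modifier takes off our hands.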

Conclusion

I think the reason that a lot of staid object-oriented developers tend to frown on case classes is their close connection to pattern matching, a more powerful relative of the much-despised switch/case mechanism.  What these developers fail to realize is that case classes are really much more than that, freeing us from the boiler-plate tyranny of endless getter/setter declarations and the manual labor of proper equals() and toString() methods.  Case classes are the object-oriented developer’s best friend; it’s just that no one seems to realize it yet.

Comments

  1. I would recommend against using case classes with var fields as the generated hashCode() and equals() methods are time variant, causing all kinds of problems.

    Jesper Nordenberg Monday, August 11, 2008 at 1:29 am
  2. I would recommend against using var fields, as their values are time variant, causing all kinds of problems.

    Ricky Clarkson Monday, August 11, 2008 at 7:04 am
  3. Hi Dan,

    I’m still very open to being convinced that case classes are useful, so I’ve been trying hard to understand your example but I must be missing something because I don’t see why you need to declare your three panel classes as case classes.

    If it’s just the creation of the map, you can obtain the same result in Java by creating your map as an association of , so I don’t see what this approach buys you…


    Cedric

    Cedric Monday, August 11, 2008 at 9:01 am
  4. I think that a good part of the reason that many OO developers don’t understand case classes is that they still don’t understand the difference between a data type (identity defined by value, objects can be shared by copying) and an entity class (identity defined by address, objects can only be shared by reference). As a prime example, the difference between String which is a data type and StringBuffer/StringBuilder which is an entity. Case classes are a fine tool for defining data types, which is pretty much what you were saying.

    And yeah, mutable fields in data types are problematic.

    Doug Monday, August 11, 2008 at 9:02 am
  5. HTML escaping wins again.

    Here is what I meant (replace the brackets accordingly)

    If it’s just the creation of the map, you can obtain the same result in Java by creating your map as an association of [string, instance of panel].

    Cedric Monday, August 11, 2008 at 9:03 am
  6. @Jesper and @Ricky

    ========================================================
    scala> case class Person(var firstName: String, var lastName: String)
    defined class Person

    scala> val me = Person("Christopher", "Spiewak")
    me: Person = Person(Christopher,Spiewak)

    scala> me.firstName = "Daniel"

    scala> me == Person("Daniel", "Spiewak")
    res5: Boolean = true

    scala> me.hashCode
    res6: Int = -601825497

    scala> Person("Daniel", "Spiewak").hashCode
    res7: Int = -601825497
    ========================================================

    Variant fields work fine with case classes; they just aren’t as elegant. I definitely agree that immutability is the way to go, but there are many scenarios that call for a “bean” of sorts: just passing data from one part of the program to another, or perhaps storing it temporarily. Not all of those *require* mutability, but many of them are more convenient because of it.

    @Cedric

    Yeah, I probably should have picked a different example than the panels and the Map. I was trying to show how the unique properties of case classes allow some very nice idioms that are not possible (or are too bulky) without them.

    The reason the panels must be case classes is that way the compiler automatically generates a companion object as a subclass of a function type. This is different from storing a map to the instances for two reasons. First: it allows lazy initialization of the panels. You wouldn’t want to load 30 panels when the user will only view two of them. This isn’t relevant to a TabFolder, but in my original application (which was more complex), it was significant. Also, this technique allows parameters to be passed to the panel constructors upon initialization based on values in the UI itself. This might be useful in a situation like a wizard where the user fills out firstname/lastname and that data must be passed to the next step. This is only possible if the panels are created *after* the UI is populated and input readied.

    Literally, the compiler is auto-generating a factory for every case class. This factory singleton is what we are mapping. We could of course create the factory ourselves, but it is ever-so-much more convenient to allow the compiler to do it for us, especially when we can take advantage of function types and inference to operate on the factories polymorphically. Again, something which could have been done by hand using an interface/trait or by extending the function type ourselves, but why bother when the compiler can do it for us?

    Daniel Spiewak Monday, August 11, 2008 at 11:26 am
  7. Daniel,

    You can still achieve a very similar effect by mapping your string keys to Callables that will create the right panels with the correct parameters lazily (and it’s even type safe with generics!). It’s actually even more flexible than your case class approach since any kind of objects can be created this way and not just subtypes of panels.

    I’m obviously still skeptical, but if you want to come up with a convincing use of case classes, I would suggest to find an example that actually does some case/switch, which is an area where Java might not be able to follow (but I’ll still try :-) ).


    Cedric

    Cedric Monday, August 11, 2008 at 11:33 am
  8. True, as you said, the map could have been to Callable(s), or really any other factory implementation for that matter. The advantage to case classes in this instance is that all of the boiler-plate of the factories is generated for us by the compiler. More of a “slickness” feature than anything else, but it does reduce the LoC (and the chances for a typographical error) somewhat.

    I had been trying to avoid using pattern matching as motivation for case classes, but since you insist… :-) We really don’t have to look further than the bog-standard monad example, Option:

    def accessDatabase(id: Int): Option[Person] = …

    val res = accessDatabase(123)
    res match {
      case Some(p) => // do something with the Person
      case None => // report problem to user
    }

    Java can support this, but we would have to do something using null as a signal value. I suppose there’s no real harm in doing things that way, but explicit checks for null have always struck me as a little ugly (not to mention the risk of missing something and triggering a NPE).

    To me, the most convincing case (no pun intended) for pattern matching (and by extension case classes) is the wide world of monads. Option happens to be an example which can be easily countered by the use of null, but there are more elaborate structures which are more bulky in the absence of pattern matching. I’m planning on dealing with this in next week’s post, so I won’t try to condense it all into a comment, but see this scaladoc for a taste: http://www.scala-lang.org/docu/files/api/scala/util/parsing/combinator/Parsers.html Not trying to duck out of the argument, just postponing it so that I can give you a proper response.

    Daniel Spiewak Monday, August 11, 2008 at 12:06 pm
  9. Daniel, what I’m trying to say is that equals() and hashCode() should not be time variant functions, so whenever you have mutable fields you shouldn’t override equals() and hashCode(), but rather use the default implementation. That’s why I recommend against using case classes for mutable objects.

    Jesper Nordenberg Monday, August 11, 2008 at 1:24 pm
  10. @Jesper

    Oh, I see what you mean. In principle, I agree with you. Unfortunately, there have been situations where I have needed that time variant property in the object identity. I can’t remember an exact scenario, but I believe it had something to do with caching and multi-maps.

    In general though, mutable containers should use object id-based implementations of equals and hashCode, while containers which require equals/hashCode to be formed based on contents should ensure immutability. As Doug pointed out, it is the difference between a datatype and an entity.

    Daniel Spiewak Monday, August 11, 2008 at 2:16 pm
  11. There’s been a good discussion on Lambda the Ultimate about Visitor vs Pattern Matching. The discussion starts about here: http://lambda-the-ultimate.org/node/2927#comment-43220.

    James Iry Monday, August 11, 2008 at 5:12 pm
  12. I’m curious about the runtime characteristics of case classes. In your first example, if I do this:

    val me = Person("Doug", "Clinton")
    val meAgain = Person("Doug", "Clinton")

    and assuming that the immutable form of case class is used, will this result in the creation of two objects, or will me and meAgain refer to the same instance of Person?

    Doug Clinton Tuesday, August 12, 2008 at 7:53 am
  13. Two different instances. Scala doesn’t currently know in any deep sense the difference between immutable classes which could share the same instance and mutable classes which can’t.

    Even if it did, there would be practical limitations similar to the way string values can be shared. For instance, it would probably only be able to share instances that were created from the same literals (or things that the compiler knew how to fold into the same literal like “Do” + “ug”). Trying to share instances created from variables would require a runtime cache which is normally considered something to be left in the hands of a programmer because it creates important differences in runtime space/time trade-offs.

    James Iry Tuesday, August 12, 2008 at 9:04 am
  14. James,

    Fair enough, but the caching is something which could be handled by the companion object. Having to build your own companion object to do this would seem to me to defeat half the benefit of using case classes in this way. The compiler could probably determine if the case class was deeply immutable (i.e. all its instance variables were val and deeply immutable) and provide an option in some way to say to use common instances. I’m thinking of a particular case I’ve dealt with in a Java program where I built the cache myself to optimise away the generation of vast numbers of immutable objects with the same values, which was causing serious overhead. It just seems a missed opportunity not to let the language take care of this automatically in some way.

    BTW, if I want to override the auto-generated companion object is it just a matter of defining it and the compiler will use my definition instead?

    Doug Clinton Tuesday, August 12, 2008 at 12:24 pm
  15. If you define a companion object for your case class, then the compiler will use that and add to it as necessary to satisfy the case class specification. For example, if you define your own case class companion object with an apply() method, the compiler will *add* an unapply() method to allow pattern matching with the case class. It won’t overwrite anything you define, but it will ensure that the necessary signatures are present.

    You can override apply() in your companion object to perform caching, and that’s precisely what we’re suggesting you do for immutable objects. The compiler could certainly perform deep analysis to determine immutability, but as James pointed out, actually implementing a caching mechanism based on that analysis alone would be presumptuous to say the least. How is the compiler to know what your program should be optimized for (space vs time)? At least GCC and friends define a -O parameter to hint at this sort of thing, but even that does not control transparent caching or its ilk.

    As an extreme example of how such transparent caching could go wrong, let’s assume that you’re working on some sort of data-mining application involving the creation of *billions* of immutable objects, each slightly different from the other. Each of these objects is used briefly, then goes out of scope. What happens if the compiler injects caching into this scenario? Answer: memory usage shoots through the roof, the heap overflows and your application crashes after only a few hundred thousand objects. Even the smartest analysis can’t be trusted to make the final decision on *what* scenario you want to optimize your application for, it can only make guesses. Basic optimizations (constant folding, tail-call folding, etc) are almost “no-brainers” that apply in all cases, but something as high-level as an object cache would be very dangerous to try to do automatically.

    Daniel Spiewak Tuesday, August 12, 2008 at 12:37 pm
  16. Daniel,

    Fair enough. I wasn’t aware that the compiler will supply the necessary bits of the companion object, so that keeps the overhead down and gives the flexibility one might need. I should have known. Scala seems particularly good at just letting you express the things you need and dealing with all the rest automatically.

    Doug Clinton Tuesday, August 12, 2008 at 1:22 pm
  17. @Cedric

    Ok, so it looks like my article example is going to have to wait a bit longer (not happening this week). :-S I’ll try to summarize what I was thinking:

    Pattern matching allows boolean testing *and* extraction of data in a single operation. It’s like the old-fashioned dynamic_cast in C++, which combines Java’s instanceof and cast operations. This saves on the amount of typing of course, but it also makes things a little less typo-prone. For example, consider the example of the immutable List. In Scala, this is implemented using a case class (::) and a case object (Nil):

    def sum(list: List[Int]): Int = list match {
      case hd :: tail => hd + sum(tail)
      case Nil => 0
    }

    The pattern matching handles two operations in one: it checks to see if the list has any remaining elements and binds the first such element and the remainder of the list to the hd and tail constants (respectively). If we tried to write this without pattern matching, it would look something like this:

    def sum(list: List[Int]): Int = {
      if (list.length > 0) {
        list.head + sum(list.tail)
      } else 0
    }

    The problem here is that our conditional body is not *compile time* constrained to only deal with the list as a head and tail. All we know about this list is that it has at least one element, so we should only work with the element we know exists (at index 0). Yet nothing at compile time stops us from accessing the list more deeply. While this is of course also possible in our first example (by accessing list directly), it is not as easy to do by accident (in terms of typographical or even logical errors).

    Pattern matching is inextricably linked to the more compelling uses for case classes, so I suppose that it is true that you cannot deeply consider one without the other. Neither is a hard necessity, since it is certainly possible in Scala to write code which avoids pattern matching (and case classes) altogether. The question is: why would you want to? As language features, they save on code noise, they reduce the potential for subtle logic errors (like selecting the wrong index), in every way I can think of they represent a benefit. Can the technique be over-used and applied to situations where polymorphism is better suited? Yes it can, but that would not be a “correct” application of the construct. It’s just another tool which simplifies some unfortunately common scenarios.

    Daniel Spiewak Sunday, August 17, 2008 at 1:04 pm
  18. Sorry to come into the discussion so late…

    I wonder if the angst about case classes stems simply from the name. It is true that they can be used in case clauses of match expressions. However, to say that case classes are simply classes to be used in case clauses is missing the bigger picture. That’s essentially Daniel’s point.

    What if everybody did a global, mental replacement of “case” with “bean”. Now we’re talking about bean classes. Does that make things any better?

    Dan Yankowsky Monday, March 16, 2009 at 6:20 am
