Thursday, December 3, 2009

Improve Your Releases

There is more to a good software release than a good program.

If you are releasing software and want it to be successful, you have to do more than just write a good program. You need to consider all of the things the user will want to do with your software before and after actually using it.

You can look at the steps below as an interpretation of the phases of the software lifecycle from the perspective of the user. Depending on how well you do your job, the user will have a better or worse experience with each of these phases. If, for one of these phases, you do nothing, the user is likely to have an unpleasant experience when he gets to that phase.

Develop

This is the phase that most open-source developers focus on. If we were looking at the software life cycle in a little more detail, we would split this phase into three separate phases: design, code, and test. In commercial development these three phases are often handled by separate groups, but from the user's perspective they can be lumped together as being the factors that contribute to the overall quality and usability of the software.

This is the time in which to consider all of the points below so that you can create your system in a way that makes it easy to do the right thing for all of the other phases of the software life cycle.

Release

Once the software comes out of test, it must be packaged up as a Release. It is useful for a user to be able to tell easily what released artifact he has acquired and what version of that artifact he has. The simplest way to handle this is to define a released artifact as being a single file. If you think you need to release a collection of files, they should be packaged up as a single file, such as in zip, tgz (tar-gzip), dmg or iso format. You can then give each file a name and version number to allow the user to identify it.

You may have a product that is composed of a number of other released artifacts. You can bundle these into one larger artifact that is a collection of the other artifacts plus an installer that can invoke the installers of the other artifacts, or that knows how to install those other artifacts directly. Operating system installers work in this way.

Once your released artifact is in a single file and appropriately labeled, it is easy to take the next step: generate and publish a checksum for that file. If, for any reason, the user is unsure about what artifact and version he has, he can then run a checksum on his file and compare it against your official list of checksums. Using a cryptographic checksum provides protection not only against accidental corruption, but against intentional modification (hacking) of the artifact as well. Depending on the level of security desired, you can use md5, sha1, or sha256 for your checksum, but note that md5 and sha1 are no longer considered resistant to deliberately constructed collisions, so prefer sha256 when protection against intentional modification matters. Operating system distributions such as Fedora do this, including the additional security step that the published list of checksum values is digitally signed.
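As a rough illustration of the verification step, here is a minimal Scala sketch that computes a SHA-256 checksum with the standard java.security.MessageDigest class and compares it against a published value. The object name ReleaseChecksum and its method names are made up for this example, and it reads the whole file into memory, which a real tool would avoid for large artifacts.

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

// Hypothetical helper, for illustration only.
object ReleaseChecksum {
  /** Hex-encoded SHA-256 digest of a byte array. */
  def sha256Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256")
      .digest(bytes)
      .map(b => "%02x".format(b))
      .mkString

  /** Digest of a file; reads the whole file into memory (fine for a sketch). */
  def sha256Hex(file: Path): String = sha256Hex(Files.readAllBytes(file))

  /** True if the file's digest matches the published value, ignoring case and padding. */
  def matches(file: Path, published: String): Boolean =
    sha256Hex(file).equalsIgnoreCase(published.trim)
}
```

A user-side check would call ReleaseChecksum.matches with the downloaded file and the value copied from your published list.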

Distribute

Your user needs to get your released artifacts. Long ago this used to be done by distributing physical media such as DVDs, CDs, floppies, or tapes. Today most distribution is done over the internet, which makes this step far simpler than it used to be.

Open source projects have easy solutions available through such services as sourceforge and github. Many commercial providers also distribute their software from download pages on their web sites, often with additional security such as restricting web access to customers with accounts, and using license files to enable specific functionality in the installed software.

Given how widespread and well-understood this model is, it makes sense to use it for internal software as well: set up a web site where your users can find all of your released artifacts. If the list of artifacts is small, you can just set up a few directories with files in them and serve up those files with your web server. When the number of artifacts gets large enough so that browsing listings gets cumbersome, you can add a search form. If you need to restrict which of your internal users have access to your downloads, you can do that in the same way as commercial vendors do, with password access or license-file control of the installed application.

Install

The user should be able to install the complete application from the downloaded artifact with a single command, or at most two commands (an unpack command followed by execution of a setup script). Installing a Windows application is generally done by downloading an exe file and then executing it; a Mac install is generally done by downloading a dmg file, double clicking on it, and dragging the app into another folder; a Java install is often done by downloading a jar file and then executing it (such as by running java -jar on it). These are all examples of simple installation mechanisms. Once running, an installer can direct the user to select values for options and installation paths.

In particular, you should not require the user to unpack the software and then manually execute a number of other steps such as moving files around or editing config files. These are steps that should be handled by an installer.

When the files are installed on the user's system, there should be an easy way to determine what version is installed and in use. This information should be easily available in the application, such as in an About menu in a GUI application. If a user might install multiple artifacts from your collection, there should be a simple way to get a list of all artifacts installed and in use along with their version numbers so that you can unambiguously tell what versions of your software are being used at that site.

Support

Finally, the user is using your software. If your software is well designed, well written and well tested, the user should have no problems using it and all will be well. In reality, it is unlikely that no user will ever have any problems with your software. When a user does have problems, what will he do (other than grumble or swear at your software, that is)? Assuming the user is motivated to solve the problem rather than just giving up, he will seek out resources that can provide him with the information he needs to solve his problem. You can make his life easier in this step by providing some or all of the following:
  • A User's Guide or set of guides (tutorial, reference).
  • In-application help (context-specific, page-specific, links to the manual, search, how-to).
  • On-line forums where users can share their problems and solutions.
  • Direct support, via telephone, email, or chat.
If a user experiences a crash or runs into a bug, it might be nice if he can easily submit a crash report or a bug report so that you can more effectively fix the problem. If so, you will want that report to automatically include the list of installed artifacts and versions, as discussed above in the Install section.

Upgrade

If your software is successful you will probably release new versions of it. A user who is already using your software should be able to start using the new version of your software with minimal hassle. As with the initial install, installing an upgrade should be done with at most one or two commands, as it could be with an upgrader that guides the user through whatever questions need to be answered for the upgrade.

You can add an option to your application to check for upgrades and ask the user if he wants to download and install them, saving the user the hassle of separately doing those steps. If you choose to implement this, you should allow the user to disable it. There should also still be a way that the user can download an upgrade (as a single file, just as with an initial install), copy it to another machine, and install it there, in case he is running on a machine that is not connected to the network or is behind a firewall that prevents your automated download from working.

There are two ways in which an upgrade is different from an install, leading to two additional goals for the upgrader:
  1. If the user has any configuration or customization, that should be carried over to the new version.
  2. If the user starts running the new version and soon discovers that it is unusable for him, he should quickly be able to roll back to the previous version.
An approach to handle the first goal is to keep the configuration and customization in a separate directory, such as in the user's home directory, or (for Unix systems) in /etc or (for Windows systems) in the Registry. There can still be problems when upgrading if the format of the config and customization files changes, or if the items being configured and customized have changed between versions. Your upgrader should take care of this.

One relatively easy way to satisfy the second goal is to install each version of the application in a separate directory that contains the version number in the name, and then provide a current directory that is a link to the version to be used. Rolling back to a previous version might then be as simple as deleting the current link and recreating it to point to the previous version. Ideally, however, this rollback is also done by a program you provide, in case a rollback also requires any other changes such as to the configuration and customization files.
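The link-switching idea can be sketched in a few lines of Scala using java.nio.file. The VersionSwitch object, the layout (one directory per version under a common root), and the link name current are assumptions for illustration, and the code assumes a file system that supports symbolic links.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper, for illustration only.
object VersionSwitch {
  /** Points root/current at root/<version>, replacing any existing link. */
  def activate(root: Path, version: String): Path = {
    val target = root.resolve(version)
    require(Files.isDirectory(target), "no installed version at " + target)
    val current = root.resolve("current")
    // Removes the link itself, not the directory it points to.
    Files.deleteIfExists(current)
    // Relative link, so the install root can be moved as a whole.
    Files.createSymbolicLink(current, target.getFileName)
  }

  /** Name of the version directory that current points to, if any. */
  def active(root: Path): Option[String] = {
    val current = root.resolve("current")
    if (Files.isSymbolicLink(current))
      Some(Files.readSymbolicLink(current).getFileName.toString)
    else None
  }
}
```

Rolling back is then just another call to activate with the older version name. Deleting and recreating the link is not atomic, so a production tool would want to cover the brief window in which current does not exist.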

The upgrade and rollback should of course update the list of installed artifacts and current versions.

Patch

Occasionally you might want to deliver a minor update or bug fix to your software. You might send out one modified file and ask the user to install it in a specific location to fix a bug.

While this sounds like an easy mechanism for quick fixes, in the long run you will be better off ensuring that your upgrade process is streamlined enough that you can package up that one file in an upgrade and use your upgrade process.

The problem with sending out patch files and doing ad-hoc installs like this is that it makes it very difficult to keep track of what is installed at a customer site. If you send out four or five patches and then the customer starts reporting unique bugs, will you know what software is running at that site so you can track down those bugs? You could work on setting up a system to keep track of those patches, but you might as well invest that effort into making your upgrade process easier to use.

Perhaps you think that each customer will have a different set of patches, and you don't want to send the same patches to all of your customers, so you don't want to make them all standard upgrades. If it is really the case that you want to deliver different things to different customers, then you are not really delivering one artifact, you are delivering separate artifacts to each customer. In this case, you should just call them different artifacts, give them their own version numbers, and send out upgrades for those separate artifacts. In that way you can continue to use your standard upgrade process, and you can always know exactly what your customer has by collecting the list of artifacts and their version numbers for all of the artifacts installed at a customer site.

If you really think you need to send out patches, consider the following goals:
  • It should be easy for the user to install the patch with a single command.
  • It should be difficult for the user to make a mistake when installing the patch, such as could happen if he has to manually install files into specific directories or manually edit any files.
  • It should be easy for the user to roll back the patch if it doesn't work.
  • It should be possible for both you and the user to know exactly what version of software is installed at the site, including what patches have been applied, even if there is a patch of a patch.
If it seems to you that implementing a patch mechanism that does all of this is easier than adding some improvements to your upgrade process and perhaps dividing up a couple of your artifacts to more accurately reflect how you are actually installing them, then go for it.

Migrate

At some point one of your users might decide that he wants to stop using your software and move to some other package. If you are a commercial software provider you might think this is not something that should be in your list of goals - why should you help out a competitor? - but if you are interested in doing what is best for your user, you should at least recognize this phase of the software lifecycle and make a conscious decision about it. The better you treat a leaving customer, the more likely it is that he will some day be a returning customer.

To support your users in this step, you should provide export tools that allow the user to export all of his data from your application in a standard format. Depending on the application, this might mean exporting a CSV file, an XML file, an Open Document file, or something else.

If you also implement an import capability that reads the same standard file format as your export produces, this could help you in the future if you ever change your internal storage representation from one version to the next: just export from the old version into a file using the standard format, upgrade to the new version, and import that file.
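To make the export-then-import round trip concrete, here is a small Scala sketch for a made-up Contact record written as CSV. The Contact class and the ContactCsv object are hypothetical, every field is quoted on export, and the parser assumes that fields contain no line breaks.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical record type, for illustration only.
final case class Contact(name: String, email: String)

object ContactCsv {
  // Quote every field and double any embedded quote characters.
  private def quote(field: String): String =
    "\"" + field.replace("\"", "\"\"") + "\""

  def exportAll(contacts: Seq[Contact]): String =
    contacts.map(c => quote(c.name) + "," + quote(c.email)).mkString("\n")

  // Splits one CSV line into fields, honoring quotes and doubled quotes.
  private def parseLine(line: String): List[String] = {
    val fields = ListBuffer[String]()
    val sb = new StringBuilder
    var inQuotes = false
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (inQuotes) {
        if (c != '"') sb.append(c)
        else if (i + 1 < line.length && line.charAt(i + 1) == '"') { sb.append('"'); i += 1 }
        else inQuotes = false
      } else if (c == '"') inQuotes = true
      else if (c == ',') { fields += sb.toString; sb.clear() }
      else sb.append(c)
      i += 1
    }
    fields += sb.toString
    fields.toList
  }

  def importAll(csv: String): Seq[Contact] =
    csv.split("\n").toList.filter(_.nonEmpty).map { line =>
      parseLine(line) match {
        case List(name, email) => Contact(name, email)
        case other => sys.error("expected 2 fields but found " + other.size)
      }
    }
}
```

The property worth testing in a real application is the round trip itself: importing what was just exported should reproduce the original data.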

Uninstall

Whether or not a user chooses to move to a different product, he may eventually decide he is done using your software and he would like to remove it from his system. As with the install, it should be possible for the user to uninstall your software with a single command. If an application was installed simply by unpacking it, that single command might be to remove that unpacked directory. With a more complicated installation, uninstallation is likely also to be more complicated, making an uninstaller program more important.

If you have set up your application such that the user-customized portions are separate from the standard install, your uninstaller can give the user the option of keeping those portions. Similarly, if the application maintains user data in its own directories, you should get confirmation from the user before deleting those files and give the user the option of keeping them.

You might also want to consider how you want your installer to behave if the user runs the uninstaller, keeps his customizations and data, then runs the installer. A user might want to do this to downgrade to a previous version if you do not otherwise provide a simple solution for that. Or perhaps you treated a departing customer well enough that he is now returning to your product, in which case he might be pleased to find that his old preferences and customizations are still available.

Wednesday, November 4, 2009

Overriding vals as Optional Parameters

For simple cases you can use Scala vals, selectively overridden, as a way of implementing optional parameters. Overriding can also be used for other interesting tricks.

Optional Class Parameters

In Java, a typical idiom for initializing an object that has a large number of optional parameters, of which only a few usually get set, is to construct the object and then call setter functions to customize each of the optional parameters. While this technique can be convenient, it leaves open the possibility that the setter might get called later on in the object's lifecycle at a time when changing that value could cause problems.

One solution to this problem is to use the builder pattern. This solution is available in Scala as well, and can be taken a step farther than in Java by using the type-safe builder pattern.

The type-safe builder can be overly complicated for many situations. Sometimes it would be nice to have something simpler than even the simplest of builders.

Scala 2.8 will have named parameters with default values, which will make it pretty easy to create classes that have optional parameters, although you might not want to do this if you have 30 optional parameters. Meanwhile, there is another approach you can use: overriding vals.

The approach is pretty simple: you define a base class with a constructor that includes all of the required parameters, and you then add a val for each of the optional parameters. When you want to create an instance of that class that sets some of the optional parameters, you create an anonymous subclass by adding a set of braces after the new statement that creates the instance, and inside the braces you override each val that you want to set.

In this example we define a Car class that represents a few pieces of information about a car. model and color are required parameters and appear in our constructor. Our optional parameters are hasRadio and hasSunRoof, so we make those vals rather than constructor parameters, and we assign them their default values. We include a toString method so we can easily see the results.

class Car(model:String, color:String) {
    val hasRadio = false
    val hasSunRoof = false

    override def toString() = {
        "Car{"+
            "model="+model+ 
            ",color="+color+
            (if (hasRadio) ",hasRadio" else "")+
            (if (hasSunRoof) ",hasSunRoof" else "")+
        "}"
    }
}
The normal use would be to call the constructor with no additional arguments:
val c1 = new Car("Ford", "red")
println(c1)

//Car{model=Ford,color=red}
To specify one of our optional arguments, we add a code block to the new call, which creates an anonymous subclass in which our val overrides the default:
val c2 = new Car("Chevy", "blue") { 
    override val hasRadio = true 
}   
println(c2)

//Car{model=Chevy,color=blue,hasRadio}
We can pass in values from the caller's context rather than constants:
val myHasSunRoof = true
val c3 = new Car("Honda", "white") {
    override val hasSunRoof = myHasSunRoof
}
println(c3)

//Car{model=Honda,color=white,hasSunRoof}
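For comparison, the named parameters with default values mentioned earlier could express the same optional settings roughly as follows. This is only a sketch: the class name CarWithDefaults and the NamedDefaultsDemo object are made up so as not to clash with the Car class above, and it needs Scala 2.8 or later.

```scala
// Optional settings become constructor parameters with defaults.
class CarWithDefaults(model: String, color: String,
                      hasRadio: Boolean = false,
                      hasSunRoof: Boolean = false) {
  override def toString: String =
    "Car{" +
      "model=" + model +
      ",color=" + color +
      (if (hasRadio) ",hasRadio" else "") +
      (if (hasSunRoof) ",hasSunRoof" else "") +
    "}"
}

object NamedDefaultsDemo {
  // All defaults.
  val plain = new CarWithDefaults("Ford", "red")
  // Skip hasRadio and set only hasSunRoof, by name.
  val sunRoof = new CarWithDefaults("Honda", "white", hasSunRoof = true)
}
```

Unlike the anonymous-subclass approach, this creates no extra class per call site, but every optional value has to appear in the constructor signature.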

Optional Trait Parameters

You can use this same approach to pass in values for instance variables in traits, which don't have constructor parameters. For example, say we define a trait for an optional Touring package for our car:
trait Touring {
    val hasNavSystem = false
    val hasExtraSuspension = false
    val hasTowHitch = false
    val hasRunningBoards = false

    override def toString() = {
        super.toString()+
            "+Touring{"+
            (if (hasNavSystem) "navSystem," else "") +
            (if (hasExtraSuspension) "extraSuspension," else "") +
            (if (hasTowHitch) "towHitch," else "") +
            (if (hasRunningBoards) "runningBoards," else "") +
        "}"
    }
}
Now we can create an instance of a Car with Touring and pass in values for some of those "optional constructor parameters" defined in the Touring trait:
val c4 = new Car("Honda","white") with Touring {
    override val hasSunRoof = true      //from Car
    override val hasNavSystem = true    //from Touring
    override val hasRunningBoards = true  //from Touring
}

println(c4)

//Car{model=Honda,color=white,hasSunRoof}+Touring{navSystem,runningBoards,}
NOTE: Due to a bug in older versions of Scala, at least through 2.7.6, overriding a val on a trait as in the above example does not work. This does work properly in Scala 2.8.0 (at least it does in the 20091006 nightly build).

Early Definition

You may have a situation in which some of the vals that you are initializing in a trait or class depend on other vals. In this case, overriding a val as we did above may not give you the result you want: the initializer of the superclass runs to completion before the initializer of the subclass, which means all of the vals in the superclass get set before any of the overriding vals are evaluated.

For example, say we modify our Touring trait by adding a maxTowWeight value, as shown below (the new maxTowWeight val and the extra maxTowWeight line in toString):
trait Touring {
    val hasNavSystem = false
    val hasExtraSuspension = false
    val hasTowHitch = false
    val hasRunningBoards = false
    val maxTowWeight = if (!hasTowHitch) 0 else
        { if (hasExtraSuspension) 1500 else 1000 }

    override def toString() = {
        super.toString()+
            "+Touring{"+
            (if (hasNavSystem) "navSystem," else "") +
            (if (hasExtraSuspension) "extraSuspension," else "") +
            (if (hasTowHitch) "towHitch," else "") +
            (if (hasRunningBoards) "runningBoards," else "") +
            "maxTowWeight="+maxTowWeight +
        "}"
    }
}
When we instantiate a Car with Touring the constructor code for Touring executes before the constructor code for the new class. In particular, val maxTowWeight gets evaluated before the overriding values are evaluated, so it always ends up with a value of zero:
val c5 = new Car("Honda","white") with Touring { override val hasTowHitch = true }

println(c5)

//Car{model=Honda,color=white}+Touring{towHitch,maxTowWeight=0}
Scala provides a mechanism to address this issue: Early Definition (Scala Language Specification, section 5.1.6). The vals that you specify in the Early Definition block are evaluated in the context of the calling class, then that set of values is placed into the context of the new class being instantiated such that all of those values are available at the beginning of the process of instantiation, even before the initializer for Object is executed. In this way, any expression which uses one of those vals will have access to the value provided in the Early Definition.

It could be used with our Car example like this:
val c6 = new { override val hasTowHitch = true } with Car("Honda","white") with Touring

println(c6)

//Car{model=Honda,color=white}+Touring{towHitch,maxTowWeight=1000}
A class definition for the above example could look like this:
class TouringCarWithHitch(name:String, color:String) extends {
            override val hasTowHitch = true
        } with Car(name,color) with Touring {
    //normal class overrides and additional elements here
}

val c7 = new TouringCarWithHitch("Honda","white")
//c7 is the same as c6 (but we have not implemented ==)

Required Trait Parameters

If you want to define a trait that has required parameters rather than optional parameters, you can omit the value from the declarations and instead specify only the type, which causes the val to be abstract. For example, if we want to make the hasTowHitch and hasNavSystem parameters to our modified Touring trait be required, that would look like this:
trait Touring {
    val hasNavSystem:Boolean   //abstract (no value)
    val hasExtraSuspension = false
    val hasTowHitch:Boolean    //abstract (no value)
    val hasRunningBoards = false
    val maxTowWeight = if (!hasTowHitch) 0 else
        { if (hasExtraSuspension) 1500 else 1000 }

    override def toString() = {
        super.toString()+
            "+Touring{"+
            (if (hasNavSystem) "navSystem," else "") +
            (if (hasExtraSuspension) "extraSuspension," else "") +
            (if (hasTowHitch) "towHitch," else "") +
            (if (hasRunningBoards) "runningBoards," else "") +
            "maxTowWeight="+maxTowWeight +
        "}"
    }
}
Now when we create a concrete instance of this trait, we are required to define values for those two vals, or else we will get a compiler error. Since the base declaration is now abstract, we omit the override keyword on those vals:
val c8 = new Car("Honda","white") with Touring {
    override val hasSunRoof = true      //from Car
    val hasNavSystem = true             //from Touring; required
    override val hasRunningBoards = true  //from Touring; optional
    val hasTowHitch = false             //from Touring; required
}

println(c8)

//Car{model=Honda,color=white,hasSunRoof}+Touring{navSystem,runningBoards,maxTowWeight=0}

Abstract Class Parameters

Sometimes it is convenient to use an abstract val rather than a constructor parameter for abstract classes. For example, say you have a Service and you want to define a set of case classes for service messages. The base class should have a reference to the Service object so that it can easily be processed by generic service methods, but each case class should also have the same reference as a case value for easy matching. For consistency, since these are the same value, the name should be the same. You could do this by defining the base class with one parameter declared as a val to make it accessible, then define the case classes to override that value, like this:
abstract class Service
abstract class ServiceMessage(val service:Service)
case class ServiceStart(override service:Service) extends ServiceMessage(service)
case class ServiceStop(override service:Service) extends ServiceMessage(service)
The case class automatically adds a val keyword to each of our parameters, so we need to specify the override keyword, but can omit the val keyword.

We can simplify our case classes a bit by changing the base class val from a constructor parameter to an abstract val, like this:
abstract class Service
abstract class ServiceMessage { val service:Service }
case class ServiceStart(service:Service) extends ServiceMessage
case class ServiceStop(service:Service) extends ServiceMessage
Not only have we dropped the override keyword, but we are also not passing the service parameter to the superclass. The implied val keyword on the case class parameters creates a concrete instance of the service parameter that overrides the abstract value defined in the base class.

Type Parameters

Just as Scala has value parameters, concrete value members and abstract value members, it likewise has type parameters, concrete type members and abstract type members. The approach used above on values can generally be applied to types as well: rather than defining a class with a type parameter, you can often define that class with a type member. If the type is a required type that must be overridden by the extending class, make the type member abstract; if you want the subclass to be able to default to the type used in the superclass, use a concrete type and let the subclass use the override keyword if it wants to override that type.
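Here is a small sketch of the same container written both ways, once with a type parameter and once with an abstract type member. The names TypeMemberSketch, ParamStack and MemberStack are invented for this example, and only the abstract-member case is shown.

```scala
object TypeMemberSketch {
  // Type-parameter version: the element type is part of the class signature.
  class ParamStack[T] {
    private var items: List[T] = Nil
    def push(x: T): Unit = { items = x :: items }
    def pop(): T = { val h = items.head; items = items.tail; h }
  }

  // Type-member version: the element type is an abstract member that
  // must be defined by whoever creates a concrete instance.
  abstract class MemberStack {
    type Elem
    private var items: List[Elem] = Nil
    def push(x: Elem): Unit = { items = x :: items }
    def pop(): Elem = { val h = items.head; items = items.tail; h }
  }

  val byParam = new ParamStack[String]

  // The type member is supplied in the braces, like an overridden val.
  val byMember: MemberStack { type Elem = String } =
    new MemberStack { type Elem = String }
}
```

Both stacks are used identically by callers; the difference is only in where the element type is declared.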

Bill Venners has a nice blog post where he discusses the question of when to use a type parameter and when to use an abstract type member, with a reference to an interview with Martin Odersky where he talks about abstract type members in comparison to instance variables.

Caveats

Although in many ways you are free to choose between using a constructor parameter versus a class member, they are not entirely equivalent. In particular, once you start building up class hierarchies using abstract and concrete members with overrides, you have to be careful that the initialization order is what you expect. In the Early Definition section above I gave one example of how values can fail to initialize correctly due to ordering issues. That one is pretty easy to understand, but they can sometimes be far more subtle and hard to spot.

One thing you can do that will sometimes fix such problems is to use the lazy keyword on your value members in order to get lazy initialization. This causes initialization of the value to be delayed until the first time it is used, rather than being eagerly initialized when the class is initialized. Note that if you declare a concrete variable as lazy, then an overriding instance of that variable must also be declared as lazy; if the original concrete variable is not lazy, the overriding variable can not be lazy.
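As a minimal sketch of the difference, the two traits below are cut-down, hypothetical variants of the Touring trait (named Strict and Deferred here) that differ only in whether the dependent val is lazy.

```scala
object LazyInitSketch {
  trait Strict {
    val hasTowHitch = false
    // Evaluated while the trait initializes, before any override is set.
    val maxTowWeight = if (hasTowHitch) 1000 else 0
  }

  trait Deferred {
    val hasTowHitch = false
    // Evaluated on first access, after the instance is fully constructed.
    lazy val maxTowWeight = if (hasTowHitch) 1000 else 0
  }

  val strict = new Strict { override val hasTowHitch = true }
  val deferred = new Deferred { override val hasTowHitch = true }
}
```

With these definitions, strict.maxTowWeight is 0 while deferred.maxTowWeight is 1000, because the lazy version does not read hasTowHitch until the overriding value is in place.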

Note that overriding a val in Scala is not the same as declaring a variable of the same name in a subclass in Java. Consider this Java test program Test.java:
public class Test {
    public static void main(String[] args) {
        (new Test1()).test1();
        (new Test2()).test1();
        (new Test2()).test2();
    }
}

class Test1 {
    public int t = 1;

    public void test1() {
        System.out.println("t="+t);
    }
    public void test2() {
        System.out.println("t="+t);
    }
}

class Test2 extends Test1 {
    public int t = 2;

    public void test2() {
        System.out.println("t="+t);
    }
}
and the apparently equivalent Scala test program Test.scala (where I have used Java-like syntax where possible so that you can run "diff" on the two files):
object Test {
    def main(args: Array[String]) {
        (new Test1()).test1();
        (new Test2()).test1();
        (new Test2()).test2();
    }
}

class Test1 {
    val t = 1

    def test1() {
        System.out.println("t="+t);
    }
    def test2() {
        System.out.println("t="+t);
    }
}

class Test2 extends Test1 {
    override val t = 2

    override def test2() {
        System.out.println("t="+t);
    }
}
Copy these out to Test.java and Test.scala, then compile and run each one (don't try to compile both and then run both in the same directory, as the class files will collide). The Java test prints this out:
t=1
t=1
t=2
The Scala test prints this out:
t=1
t=2
t=2
Note the difference in the middle line, where we have called Test2.test1(). The Java program prints 1, but the Scala program prints 2. This is because the declaration of t in Test2 in Java does not override the value in Test1, it shadows it. The Test1 value of t is still there, and it is used by any method in Test1 that refers to that variable.

In Scala, by contrast, references to t in Test1 refer to the overridden value provided by Test2. Scala can do this because, consistent with the Uniform Access Principle, a member in Scala is accessed through accessor methods: a getter for a val, and a getter and setter pair for a var. When a val is overridden, the subclass gets a new accessor that overrides the accessor in the base class.

Sunday, October 11, 2009

Scala Case Statements As Partial Functions

A Scala case statement can be either a Function1 or a PartialFunction depending on the context.

In my previous post I presented a simple Publisher that I used to decouple my Swing actors from their targets. Reader nairb774 pointed out that the standard Scala library includes a Publisher class. In fact, there are two Publisher classes in Scala, scala.collection.mutable.Publisher and scala.swing.Publisher. Although I like my publisher class better, the swing publisher did have one feature that I thought was useful: it accepted as a callback a PartialFunction rather than, as mine did, a Function1. That would mean, I thought, that I could pass in a case statement as a callback.

For example, continuing the Mimprint example from my previous post, if I were only interested in Enabled events published by a particular publisher, rather than explicitly checking this in my callback with an isInstanceOf or a match statement that includes a case _ => clause, I could just use a one-line case statement:
showSingleViewerPublisher.subscribe { case e:Enabled => doSomething() }
My calling code in Publisher would call apply on the PartialFunction callback only if a call to its isDefinedAt method returned true, thus avoiding the MatchError that would occur if I treated it like a Function1 and called its apply method when the value was not Enabled. This seemed like useful functionality, so I decided to add it. I thought it would be easy, but unfortunately it was not.
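The dispatch described above might look something like the following. This is a simplified sketch rather than the Mimprint code: the Event types, the Publisher class and the demo method are all stand-ins invented for illustration.

```scala
object PartialPublisherSketch {
  // Stand-in event types, for illustration only.
  sealed trait Event
  final case class Enabled(source: String) extends Event
  final case class Resized(width: Int) extends Event

  class Publisher[E] {
    private var subscribers: List[PartialFunction[E, Unit]] = Nil

    def subscribe(callback: PartialFunction[E, Unit]): Unit = {
      subscribers = callback :: subscribers
    }

    // Call a subscriber only for events its case clauses cover,
    // so an unmatched event never triggers a MatchError.
    def publish(event: E): Unit =
      subscribers.foreach { cb => if (cb.isDefinedAt(event)) cb(event) }
  }

  /** Publishes one uninteresting and one interesting event; returns the callback count. */
  def demo(): Int = {
    val publisher = new Publisher[Event]
    var enabledCount = 0
    publisher.subscribe { case _: Enabled => enabledCount += 1 }
    publisher.publish(Resized(640))      // silently skipped
    publisher.publish(Enabled("viewer")) // handled
    enabledCount
  }
}
```

Because subscribe declares its parameter as a PartialFunction, the case block passed to it is compiled as one, which is what makes the isDefinedAt check possible.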

Consider the following three definitions that assign a case statement to a partial function, full function, or no explicit function type, respectively:
val pfv:PartialFunction[String,Unit] = { case "x" => println("Got x") }
val ffv:Function1[String,Unit] = { case "x" => println("Got x") }
val nfv = { case "x" => println("Got x") }
For the first line, the variable pfv gets assigned a value which is a PartialFunction representing the case statement. For the second line, you might think that, since PartialFunction extends Function1 and we are assigning the same value to ffv as we did to pfv, the variable ffv would also be assigned a value which is a PartialFunction. This is not the case.

The Scala Language Specification (SLS) explicitly states, in section 8.5, that the type of an anonymous function comprised of one or more case statements must be specified as either a FunctionK or a PartialFunction, and that the value generated by the compiler is different depending on that specified target type. So the value that gets assigned to ffv is a Function1, and ffv.isInstanceOf[PartialFunction[_,_]] evaluates to false. Note that we could assign the value pfv to the variable ffv, in which case ffv would have a value which is a PartialFunction and ffv.isInstanceOf[PartialFunction[_,_]] would evaluate to true.
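The difference can be checked directly. The sketch below is illustrative only: the object name and the ffv2 and ffvThrowsOn members are assumptions added for the demonstration, and the println bodies are replaced with () to keep it quiet.

```scala
object CaseLiteralTypeDemo {
  // The same case literal, compiled against two different target types.
  val pfv: PartialFunction[String, Unit] = { case "x" => () }
  val ffv: Function1[String, Unit] = { case "x" => () }

  // Assigning the existing PartialFunction value keeps its runtime class.
  val ffv2: Function1[String, Unit] = pfv

  val pfvIsPartial: Boolean = pfv.isInstanceOf[PartialFunction[_, _]]
  val ffvIsPartial: Boolean = ffv.isInstanceOf[PartialFunction[_, _]]
  val ffv2IsPartial: Boolean = ffv2.isInstanceOf[PartialFunction[_, _]]

  // The Function1 version has no isDefinedAt, so calling it with a value
  // the case statement does not cover throws MatchError.
  def ffvThrowsOn(s: String): Boolean =
    try { ffv(s); false } catch { case _: MatchError => true }
}
```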

What happens if you don't specify the type, as in the third line above where we assign the same value to nfv? You might think the compiler could infer the type of the resulting value, but since, as specified in the SLS, the type must be explicitly specified as either a FunctionK or a PartialFunction, our assignment to nfv is actually not a valid statement, and it fails to compile. It would be nice if the error message said something like "You must explicitly specify either a FunctionK or a PartialFunction for a case statement", but instead it gives this relatively unhelpful message:
<console>:4: error: missing parameter type for expanded function ((x0$1) => x0$1 match { case "x" => println("Got x") })
       val nfv = { case "x" => println("Got x") }
                 ^
In my case, the situation in which I encountered this message was a little different. Here is an example showing the problem I ran into:
class PF[T] {   //partial function type
    def sub(x:PartialFunction[T,Unit]) = x
}
class FF[T] {   //full function type
    def sub(x:Function[T,Unit]) = x
}
class NF[T] {   //no unique function type
    def sub(x:PartialFunction[T,Unit]) = x
    def sub(x:Function[T,Unit]) = x
}

val pf = new PF[String]
val ff = new FF[String]
val nf = new NF[String]

pf.sub{ case "x" => println("x") }  //works, result is PartialFunction
ff.sub{ case "x" => println("x") }  //works, result is Function1
nf.sub{ case "x" => println("x") }  //fails with compiler error msg
Calling the above method sub with a case statement works when there is only one method of that name, whether it takes a Function1 or a PartialFunction, but although the compiler has no problem compiling the overloaded pair of functions, once they both exist the compiler can no longer unambiguously determine the target type for the case statement, so it delivers that same error message "missing parameter type for expanded function".

In my case I was trying to modify the subscribe method in my Publisher class so that I could pass in either a regular function, such as println(_), or a PartialFunction, in particular an in-line case statement. The three options I tried are essentially classes PF, FF and NF listed above. When I used approach NF I was unable to directly pass in a case statement, but instead would get the compiler error mentioned above. When I used approach PF I could pass in a case statement as a PartialFunction, but I could not pass in a regular function. When I used approach FF I could pass in a regular function, and could pass in and properly deal with a PartialFunction, since it extends Function1, but when I used an in-line case statement it would get compiled as a Function1 rather than a PartialFunction, which would cause execution to fail when a value was passed to that case statement that it did not cover (since it was not a PartialFunction and thus did not have an isDefinedAt method to call first).

I don't like option FF because it would allow code (specifically, an in-line case statement) to compile but then not execute as expected. Options PF and NF are not very useful as is, since neither directly supports both case statements and full functions.

In a mailing list response to someone who was attempting to use option NF in his application, Paul Phillips suggested using option FF with a helper function pf that accepts a PartialFunction and returns the same value, then wrapping any case statements inside a call to that helper function; or, alternatively, assigning the case statement to a val declared as a PartialFunction before passing it to method sub. Unfortunately, if the user forgets to use either of these techniques on a case statement and just passes it directly to method sub in option FF, it will be handled as a Function1 rather than a PartialFunction, so it will compile but not behave as expected.
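A rough sketch of the helper approach follows. The signature of pf is an assumption (the exact code from the mailing list is not reproduced here), and the type arguments are written out explicitly so the example does not depend on type inference.

```scala
object PfHelperDemo {
  // Option FF: sub accepts any Function1.
  class FF[T] {
    def sub(x: T => Unit): T => Unit = x
  }

  // Identity helper. Its parameter type is what makes the compiler
  // build a PartialFunction from a case literal passed to it.
  def pf[A, B](f: PartialFunction[A, B]): PartialFunction[A, B] = f

  val ff = new FF[String]

  // Passed directly: compiled as a plain Function1.
  val bare: String => Unit = ff.sub { case "x" => () }

  // Wrapped in pf: compiled as a PartialFunction, then passed along.
  val wrapped: String => Unit = ff.sub(pf[String, Unit] { case "x" => () })

  // Alternative: assign to a val declared as a PartialFunction first.
  val declared: PartialFunction[String, Unit] = { case "x" => () }
  val viaVal: String => Unit = ff.sub(declared)
}
```

Only the wrapped and viaVal values can be tested with isDefinedAt inside sub; the bare one compiles without complaint, which is the hazard described above.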

Paul's suggestion would also work in option NF (and in option PF, although in that case it would be redundant), which would behave much the same as option FF from the user's perspective except that passing a bare case statement to the overloaded method sub would not compile, so we would no longer have the undesirable situation of something that compiles but behaves unexpectedly.

As an alternative to Paul's pf helper function, I could write a helper function ff that takes a Function1 and turns it into a PartialFunction with an isDefinedAt method that always returns true. I would then use this with option PF. This would allow me to directly pass in case statements, but I would have to wrap all regular functions in a call to ff.
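One possible shape for such an ff helper, as a sketch only. The names target, seen, fromCase and fromFunc are made up for the illustration.

```scala
object FfHelperDemo {
  // Option PF: sub accepts only a PartialFunction.
  class PF[T] {
    def sub(x: PartialFunction[T, Unit]): PartialFunction[T, Unit] = x
  }

  // Wrap a total function as a PartialFunction that is defined everywhere.
  def ff[A, B](f: A => B): PartialFunction[A, B] = new PartialFunction[A, B] {
    def isDefinedAt(a: A): Boolean = true
    def apply(a: A): B = f(a)
  }

  // Records which callbacks ran, for the demonstration only.
  val seen = scala.collection.mutable.ListBuffer[String]()

  val target = new PF[String]

  // A case statement can be passed directly.
  val fromCase = target.sub { case "x" => { seen += "case"; () } }

  // A regular function must be wrapped in ff.
  val fromFunc = target.sub(ff[String, Unit](s => { seen += ("func:" + s); () }))
}
```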

I have not yet made any changes to my Publisher class, since I don't particularly like either of the options and I don't currently really need the ability to use in-line case statements. Meanwhile, if I get the compiler error "missing parameter type for expanded function" while trying to use an in-line case statement, at least I now know one more thing to check for.

Wednesday, October 7, 2009

A Simple Publish/Subscribe Example in Scala

Here is an example where using a simple publish/subscribe mechanism allowed me to clean up some of my early Scala code.

My Mimprint program (now also on github) was originally written in Java, then ported to Scala soon after I first started learning that language. As such, much of that original ported code was "Java written in Scala". As I have continued to internalize the Scala approach I have gone back and modified various parts of the program to make it cleaner.

In one part of the program I set up a collection of menu checkboxes to allow the user to enable or disable various features. As those features are enabled or disabled, the states of other screen components change; sometimes a component is enabled or disabled, sometimes a component is hidden or made visible.

My original Java-ish Scala code to do this looked something like this (with irrelevant parts omitted):
class ViewListGroup ... {
    ...
    private var singleComp:Component = _
    private var mShowFileInfo:SCheckBoxMenuItem = _
    private var mShowFileIcons:SCheckBoxMenuItem = _
    private var mShowDirDates:SCheckBoxMenuItem = _
    private var mShowSingleViewer:SCheckBoxMenuItem = _

    def getComponent():Component = {
        ...
        singleComp = playViewSingle.getComponent()
        ...
        //Add our menu items
        mShowFileInfo = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowFileInfo")(
                showFileInfo(mShowFileInfo.getState))
        mShowFileInfo.setState(true)
        m.add(mShowFileInfo)

        mShowFileIcons = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowFileIcons")(
                showFileIcons(mShowFileIcons.getState))
        mShowFileIcons.setState(false)
        m.add(mShowFileIcons)

        mShowDirDates = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowDirDates")(
                showDirDates(mShowDirDates.getState))
        mShowDirDates.setState(playViewList.includeDirectoryDates)
        ...
        m.add(mShowDirDates)

        mShowSingleViewer = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowSingleViewer")(
                showSingleViewer(mShowSingleViewer.getState))
        mShowSingleViewer.setState(true)
        m.add(mShowSingleViewer)
        showSingleViewer(mShowSingleViewer.getState)
            //make sure window state is in sync with menu item state
        ...
    }
    ...
    def showFileInfo(b:Boolean) {
        playViewList.showFileInfo(b)
        mShowFileInfo.setState(b)
        mShowFileIcons.setEnabled(b)
        mShowDirDates.setEnabled(b)
    }

    def showFileIcons(b:Boolean) {
        playViewList.showFileIcons(b)
        playViewList.redisplayList()
    }

    def showDirDates(b:Boolean) {
        playViewList.includeDirectoryDates = b
        playViewList.redisplayList()
    }

    def showSingleViewer(b:Boolean) {
        singleComp.setVisible(b)
        singleComp.getParent.asInstanceOf[JSplitPane].resetToPreferredSizes()
        mShowSingleViewer.setState(b)
        playViewList.requestSelect
    }
    ...
}
There were two things about this code that I didn't like:
  1. Mutable instance variables using var, particularly since they were not really variable. These values were being assigned once, not at construction time, but had to be available to other methods.
  2. The close binding between the different UI components, since the action method called by one component directly modified attributes of possibly a number of other components.
After a recent conversation with a friend I realized that I could probably improve this code by using a publish/subscribe mechanism to loosen the coupling between the components. Mimprint already had an ActorPublisher class, where each subscriber is an Actor that accepts messages of the published object type, but in this case I wanted a lighter weight implementation, since I knew the subscriber actions would be quick. Also, this being Swing, the subscriber actions that update screen state should run in the Swing event thread, and the events being published are also coming from the event thread, so the simple thing to do is to run the subscriber actions directly from the publish method.

Writing a publish/subscribe handler in Scala is pretty easy, and for me it was even simpler, as I already had one. I grabbed my ListenerManager and modified it to use the publish/subscribe terminology. I also added synchronization to make it multi-thread safe, although for this app I don't really need it. It now looks like this:
package net.jimmc.util

/** Manage a subscriber list.
 * There are no guarantees on the order of subscribers in the list.
 * This code is a slightly modified version of ListenerManager
 * as published to my blog in April 2009.
 */
trait Publisher[E] {
    type S = (E) => Unit
    private var subscribers: List[S] = Nil
    private object lock
        //By using lock.synchronized rather than this.synchronized we reduce
        //the scope of our lock from the extending object (which might be
        //mixing us in with other classes) to just this trait.

    /** True if the subscriber is already in our list. */
    def isSubscribed(subscriber:S) = {
        val subs = lock.synchronized { subscribers }
        subs.exists(_==subscriber)
    }

    /** Add a subscriber to our list if it is not already there. */
    def subscribe(subscriber:S) = lock.synchronized {
        if (!isSubscribed(subscriber))
            subscribers = subscriber :: subscribers
    }

    /** Remove a subscriber from our list. If not in the list, ignored. */
    def unsubscribe(subscriber:S):Unit = lock.synchronized {
        subscribers = subscribers.filter(_!=subscriber)
    }

    /** Publish an event to all subscribers on the list. */
    def publish(event:E) = {
        val subs = lock.synchronized { subscribers }
        subs.foreach(_.apply(event))
    }
}
For each menu checkbox I would like to set up a publisher. In every case, I just need to publish whether that checkbox has just been enabled or disabled. I defined a simple case class hierarchy to represent the Enabled and Disabled messages:
sealed abstract class Abled
case object Enabled extends Abled
case object Disabled extends Abled
I then created a publisher class that uses that event type:
class AbledPublisher extends Publisher[Abled]
I want to easily publish the Enabled or Disabled object based on the current state of a checkbox, so I added an AbledPublisher companion object with an apply method to do that:
object AbledPublisher {
    object Abled {
        def apply(b:Boolean) = if (b) Enabled else Disabled
    }
}
Conversely, upon receiving an Abled event in a subscriber for a UI component I want to be able to enable or disable that component. I could use a match statement with cases for Enabled and Disabled, but a simpler way is to modify the Abled case class hierarchy to encode a boolean state value into the Abled case object to allow easy translation from an Abled object back to a state:
sealed abstract class Abled { val state:Boolean }
case object Enabled extends Abled { override val state = true }
case object Disabled extends Abled { override val state = false }
Finally, I packaged up the case class hierarchy inside the AbledPublisher object to control scoping. The final AbledPublisher file looks like this:
package net.jimmc.util

//For subscribers of things that turn on and off
class AbledPublisher extends Publisher[AbledPublisher.Abled]

// use "import AbledPublisher._" to pick up these definitions
object AbledPublisher {
    sealed abstract class Abled { val state:Boolean }
    case object Enabled extends Abled { override val state = true }
    case object Disabled extends Abled { override val state = false }
    object Abled {
        def apply(b:Boolean) = if (b) Enabled else Disabled
    }
}
Given the above AbledPublisher class and object, I modified my code so that the action method called by each menu checkbox publishes an Enabled or Disabled event that matches the new state of the checkbox, and for each place in the old code where an action method called a state-changing method on another component I set up that target component as a subscriber to the appropriate publisher that, when it receives a published event, takes appropriate action on itself.
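To see the wiring in isolation, here is a small self-contained sketch. It uses a simplified, unsynchronized stand-in for the Publisher trait and a hypothetical FakeComponent class in place of a Swing component, so it illustrates the pattern without being the Mimprint code itself.

```scala
object AbledPublisherDemo {
  // Simplified stand-in for the Publisher trait: no locking and no
  // duplicate check, just enough to show the subscribe/publish flow.
  trait Publisher[E] {
    private var subscribers: List[E => Unit] = Nil
    def subscribe(s: E => Unit): Unit = { subscribers = s :: subscribers }
    def publish(event: E): Unit = subscribers.foreach(_.apply(event))
  }

  sealed abstract class Abled { val state: Boolean }
  case object Enabled extends Abled { override val state = true }
  case object Disabled extends Abled { override val state = false }
  object Abled {
    def apply(b: Boolean): Abled = if (b) Enabled else Disabled
  }

  class AbledPublisher extends Publisher[Abled]

  // Hypothetical stand-in for a UI component with an enabled flag.
  class FakeComponent { var enabled = true }

  // Two components subscribe; one publish call updates both of them.
  def run(): (Boolean, Boolean) = {
    val publisher = new AbledPublisher
    val icons = new FakeComponent
    val dates = new FakeComponent
    publisher.subscribe(ev => icons.enabled = ev.state)
    publisher.subscribe(ev => dates.enabled = ev.state)
    publisher.publish(Abled(false))
    (icons.enabled, dates.enabled)
  }
}
```

The code that publishes never mentions icons or dates; each component's reaction lives next to the component itself.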

With the above changes, and a slight change to my SCheckBoxMenuItem class so that it passes itself to the action callback, the code now looks like this:
import net.jimmc.util.AbledPublisher
import net.jimmc.util.AbledPublisher._

class ViewListGroup ... {
    vlg:ViewListGroup =>
    ...
    private val showFileInfoPublisher = new AbledPublisher
    private val showSingleViewerPublisher = new AbledPublisher
    private val showDirectoriesPublisher = new AbledPublisher
    ...
    def getComponent():Component = {
        ...
        val singleComp = playViewSingle.getComponent()
        showSingleViewerPublisher.subscribe((ev)=> {
            singleComp.setVisible(ev.state)
            singleComp.getParent.asInstanceOf[JSplitPane].resetToPreferredSizes()
        })
        ...
        //Add our menu items
        val mShowFileInfo = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowFileInfo")((cb)=>
                showFileInfo(cb.getState))
        mShowFileInfo.setState(true)
        showFileInfoPublisher.subscribe((ev)=>
            mShowFileInfo.setState(ev.state)
        )
        m.add(mShowFileInfo)

        val mShowFileIcons = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowFileIcons")((cb)=>
                showFileIcons(cb.getState))
        mShowFileIcons.setState(false)
        showFileInfoPublisher.subscribe((ev)=>
            mShowFileIcons.setEnabled(ev.state)
        )
        m.add(mShowFileIcons)

        val mShowDirDates = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowDirDates")((cb)=>
                showDirDates(cb.getState))
        mShowDirDates.setState(playViewList.includeDirectoryDates)
        mShowDirDates.setVisible(includeDirectories)
        showFileInfoPublisher.subscribe((ev)=>
            mShowDirDates.setEnabled(ev.state)
        )
        showDirectoriesPublisher.subscribe((ev)=>
            mShowDirDates.setVisible(ev.state)
        )
        m.add(mShowDirDates)

        val mShowSingleViewer:SCheckBoxMenuItem = new SCheckBoxMenuItem(
                viewer,"menu.List.ShowSingleViewer")((cb)=>
                showSingleViewer(cb.getState))
        mShowSingleViewer.setState(true)
        showSingleViewerPublisher.subscribe((ev)=>
            mShowSingleViewer.setState(ev.state)
        )
        m.add(mShowSingleViewer)
        showSingleViewer(mShowSingleViewer.getState)
            //make sure window state is in sync with menu item state
        ...
    }
    ...
    def showFileInfo(b:Boolean) {
        playViewList.showFileInfo(b)
        showFileInfoPublisher.publish(Abled(b))
    }

    def showFileIcons(b:Boolean) {
        playViewList.showFileIcons(b)
        playViewList.redisplayList()
    }

    def showDirDates(b:Boolean) {
        playViewList.includeDirectoryDates = b
        playViewList.redisplayList()
    }

    def showSingleViewer(b:Boolean) {
        showSingleViewerPublisher.publish(Abled(b))
        playViewList.requestSelect
    }
    ...
}
The total number of lines of code in ViewListGroup is actually a bit more than before, but I find the code a little easier to understand because all of the code that acts on a UI component is now localized in one place in the source file. All of the vars that held pointers to those components are now gone, replaced by a few vals for the publishers. The publishers use vars to maintain internal state, but that state is simple and easily understood, well encapsulated and multi-thread safe.

There is still more cleanup work to be done in Mimprint. For example, in the above code the checkbox action methods such as showFileInfo and showFileIcons call methods on the playViewList object as well as publishing an Abled event. Instead, I could set up playViewList as a listener on each of the published events, then make the menu checkbox actions directly publish an event and get rid of the showXXX methods. I will leave that for another round of cleanup.

Thursday, October 1, 2009

Initializing Immutable Variables in Scala

One of the guidelines I picked up when I learned Scala is to use immutable variables as much as possible. Besides the trivial but satisfying detail of making the declaration of an immutable variable (val) take no more characters than a mutable one (var), Scala also provides some interesting ways to set the values into those immutable variables.

In Scala, immutable variables are identified by declaring them using the val keyword rather than var. In Java, immutable variables are identified by adding the final qualifier to the variable declaration. But a Java final variable has slightly different semantics than a Scala val: in Java, you can declare a final variable without specifying a value for it, then fill in the value later. Java allows the variable to be assigned once, after which it cannot be assigned again. In Scala, a concrete val must have its value assigned as part of the definition.

Consider this sample Java class, Interval, which represents an interval on the real number line. We want to allow the constructor to be called with endpoints in either order, but we want to store them internally in sorted order.
//Java code
public class Interval {
    final double start;
    final double end;   //invariant: end>=start

    public Interval(double x1, double x2) {
        if (x1>x2) {
            start = x2;
            end = x1;
        } else {
            start = x1;
            end = x2;
        }
    }
    //other methods that use start and end go here
}
If you try this idiom in Scala, by replacing each final variable with a val but continuing to use the same initialization construct, you will get a compiler error "reassignment to val". When using a concrete val in Scala, you must supply the value in the statement where you declare the val.

For relatively simple cases, as in this example, we can take advantage of the fact that Scala allows us to build expressions with if in them, so we can express the same functionality as in the above Java code as follows:
class Interval(x1:Double, x2:Double) {
    val start = if (x2>x1) x1 else x2
    val end = if (x2>x1) x2 else x1
    //other methods that use start and end go here
}
Sometimes the logic to calculate the values for the immutable variables is much more complicated than this and more expensive to calculate. Perhaps, as in our Java example, we don't want to recalculate that condition over again for each variable. We might also be more comfortable building up our values using mutable variables. We could take the easy and straightforward way and just use var rather than val for our variables, but it is worth a bit of effort to retain the immutability of our variables. Here is an approach I sometimes take:
class Interval(x1:Double, x2:Double) {
    val (start, end) = {
        def intervalNeedsReversing(a:Double,b:Double) = (a>b)
        if (intervalNeedsReversing(x1,x2))
            (x2, x1)
        else
            (x1, x2)
    }
    //other methods that use start and end go here
}
In the above approach, we have a block of code that calculates our values. Though not needed in this case, the intervalNeedsReversing function is an example of how you can define functions within a block in order to refactor that code or better organize it. The value of the block is a tuple, which we then assign using a tuple-assignment to our immutable variables start and end.

A tuple-assignment is a pattern-matching operation that pulls apart the tuple data and stores each piece into the separate variables. It looks like the second line in this example:
val t2 = (123, "abc")   //the type of t2 is Tuple2[Int, java.lang.String]
val (n, s) = t2         //assigns n=123, s="abc"
You can use any expression in place of t2 that has the same type, including a function call, a variable, a literal tuple, or a code block.
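For instance, the right-hand side can be a function call or a code block. In this sketch the minMax helper and all of the variable names are made up for illustration:

```scala
object TupleAssignDemo {
  // Hypothetical helper that returns two results at once as a Tuple2.
  def minMax(xs: Seq[Int]): (Int, Int) = (xs.min, xs.max)

  // Right-hand side is a function call.
  val (lo, hi) = minMax(Seq(3, 9, 4))

  // Right-hand side is a code block whose value is a tuple.
  val (n, s) = {
    val k = 120 + 3
    (k, "abc")
  }
}
```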

You can include a type on each variable name; if the types of the assigned variables don't match the corresponding types of the value on the right hand side, you will get a compiler error.
val (n:Int, s:String) = t2      //ok
val (s:String, n:Int) = t2      //error
The tuple syntax of parentheses around a comma-separated list of values is actually a shorthand for the TupleN class. For each pair of lines below, the first line is a shorthand ("syntactic sugar") for the second.
(a, b)
Tuple2(a, b)

(1, "x", "y")
Tuple3(1, "x", "y")

val (n, s) = t2
val Tuple2(n, s) = t2
The last of the three examples above is a pattern-matching assignment statement.

You can use the List pattern in an assignment as well:
val a :: b :: c = List(1,2,3,4)
//This assigns a:Int=1, b:Int=2, c:List[Int]=List(3,4)
The List and Tuple classes can be used in a pattern-matching assignment like this because they each have an extractor defined by the unapply method in their companion object. You can use any extractor (that is, any declared object that includes an unapply method) in this way. For example, a case class can be used:
case class Foo(num:Int, str:String)
val f = Foo(42,"ok")
val Foo(n,s) = f        //assigns n:Int=42, s:String="ok"
This works even if the case class happens to use mutable fields: the values at the time of the pattern match assignment are set into the new variables, which are immutable.
case class Bar(var num:Int, var str:String)
val b = Bar(42,"ok")
b.num += 1
b.str = "no"
val Bar(n,s) = b        //assigns n:Int=43, s:String="no"
b.num += 1              //does not change n
For example, if you have a large number of values to set at once, you could declare a case class to represent them, and match on that to assign the values:
class AnotherExample {
    case class MyArgs(var name:String, var pathPart:String, var someNumber:Int)
    val MyArgs(path, part, num) = {
        val m = MyArgs("/path/foo/bar", "partX", 123)
        //change values of fields in m as desired
        m
    }
}
You thus get the benefit of having immutable variables for use in your constructed object, but you can use mutable private data within the block to make it easier to do your construction.

You can use this technique to initialize immutable variables within a method as well. Effectively, you are using mutable variables only for the limited scope in which they are desired. By enclosing them in a block you prevent code outside that block from modifying those mutable values.
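A short sketch of the same idea inside a method. The positiveStats function is invented for this example and is not part of any library:

```scala
object MethodScopeDemo {
  // Returns the count and the mean of the positive entries in xs.
  def positiveStats(xs: Seq[Double]): (Int, Double) = {
    val (count, mean) = {
      // Mutable accumulators exist only inside this block.
      var n = 0
      var sum = 0.0
      for (x <- xs if x > 0) { n += 1; sum += x }
      (n, if (n == 0) 0.0 else sum / n)
    }
    // Here n and sum are out of scope, and count and mean are vals,
    // so nothing below this point can modify the computed values.
    (count, mean)
  }
}
```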

Since this technique is based on pattern matching, you can use it with any legal pattern. Pattern matching is typically used in the case clauses of match statements.
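As an illustration, the sketch below defines a hypothetical KeyValue extractor and uses the same pattern both in a val definition and in a case clause. One caveat: a pattern in a val definition that fails to match throws a MatchError at runtime, and newer Scala compilers may warn about refutable patterns such as this one.

```scala
object ExtractorDemo {
  // Hypothetical extractor that splits "key=value" strings.
  object KeyValue {
    def unapply(s: String): Option[(String, String)] = {
      val i = s.indexOf('=')
      if (i < 0) None
      else Some((s.substring(0, i), s.substring(i + 1)))
    }
  }

  // The pattern used in a val definition...
  val KeyValue(key, value) = "color=red"

  // ...and the same pattern used in a case clause of a match statement.
  def describe(s: String): String = s match {
    case KeyValue(k, v) => k + " is " + v
    case _              => "no key"
  }
}
```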

Patterns can include nested constructs, which allows you to pull out values from deep within a structure when that structure is known. By using the @ operator within a pattern you can extract the value of an entire subpattern:
case class Foo(n:Int, var s:String)
case class Baz(f:Foo, b:Option[Baz])
val data = 123 :: Baz(Foo(3,"c"),Some(Baz(Foo(4,"d"),None))) :: 456 :: Nil
val _ :: Baz(Foo(_,a),Some(b @ Baz(c @ Foo(d,e),_))) :: f :: _ = data
// The above val statement assigns these values:
//   a = "c"
//   b = Baz(Foo(4,"d"),None)
//   c = Foo(4,"d")
//   d = 4
//   e = "d"
//   f = 456
The underscore indicates a placeholder for a part of the pattern whose value we don't care about and don't want assigned to anything.

Note that the variable c refers to the same object as the Foo object that appears in variable b. We defined Foo with a var for s. If we change the value of the Foo object referenced by variable c, then we will see that change when we ask for the value of variable b:
scala> b
res0: Baz = Baz(Foo(4,d),None)

scala> c
res1: Foo = Foo(4,d)

scala> c.s = "x"

scala> c
res2: Foo = Foo(4,x)

scala> b
res3: Baz = Baz(Foo(4,x),None)
Although b and c are themselves immutable variables, if they point to the same mutable object then changes made to that object through one variable will be visible through the other variable.

As you learn Scala and see examples of case statements, remember that any syntax that is valid as the pattern match in a case statement is also valid as a pattern match in a val assignment.

Thursday, September 10, 2009

Type Safe Builder in Scala, Part 4

A type-safe builder with mutually exclusive parameters.

In my previous three posts I presented various versions of a type-safe builder that enforced, at compile time, that required setters were called exactly once and optional setters were called no more than once. In those examples there was a one-to-one correspondence between the setters (such as withBrand) and type variables that were used to enforce the number of calls to those methods (such as HAS_BRAND). We can use the type variables in different ways to change what combination of method calls is allowed. In particular, we can have multiple parameters which can be set in various combinations, only some of which are allowed. For example, we can have two mutually exclusive parameters, where you must set either one but not the other.

To show how this works, I have created a builder for a Pyramid calculator that can be used to calculate the physical parameters of a rectangular pyramid. After creating a builder, you can call various setter methods to set some of the physical parameters of the pyramid, including the length, width and area of the base, the height, and the length of an upright edge. You must set exactly two out of three of the length, width and area of the base, and you must set one but not both of the height and edge. Once you have done that, you call build to get back the calculator from which you can retrieve any of the five physical parameters listed above plus volume. If you call too few or too many of the setters in each group, the call to build will not compile.
object Pyramid {
    //A small collection of class types to define a state machine that counts
    abstract class COUNTER { type Count <: COUNTER }
    abstract class MANY extends COUNTER { type Count = MANY }
    abstract class TWO extends COUNTER { type Count = MANY }
    abstract class ZERO_OR_ONE extends COUNTER
    abstract class ONE extends ZERO_OR_ONE { type Count = TWO }
    abstract class ZERO extends ZERO_OR_ONE { type Count = ONE }

    //We require positive values for our calls
    class Positive(val d:Double) {
        if (d<=0)
            throw new IllegalArgumentException("non-positive value")
    }
    implicit def doubleToPositive(d:Double) = new Positive(d)
    implicit def intToPositive(n:Int) = new Positive(n)

    //The class that manages the state of our specification
    class Specs private[Pyramid]() {
        self:Specs =>

        //Caller must set exactly two out of three of these
        val length:Double = 0
        val width:Double = 0
        val area:Double = 0

        //Caller must set exactly one of these two heights
        val height:Double = 0   //vertical height to the tip
        val edge:Double = 0     //from base to tip along an edge

        //We maintain compiler-time state to count the two types of calls
        type TT <: {
            type COUNT_LENGTH <: COUNTER
            type COUNT_WIDTH <: COUNTER
            type COUNT_AREA <: COUNTER
            type COUNT_BASE <: COUNTER  // length, width or area

            type COUNT_HEIGHT <: COUNTER
            type COUNT_EDGE <: COUNTER
            type COUNT_VERT <: COUNTER  // height or edge
        }

        class SpecsWith(bb:Specs) extends Specs {
            override val length = bb.length
            override val width = bb.width
            override val area = bb.area
            override val height = bb.height
            override val edge = bb.edge
        }

        def setLength(d:Positive) = new SpecsWith(this) {
            override val length:Double = d.d
            type TT = self.TT {
                type COUNT_LENGTH = self.TT#COUNT_LENGTH#Count
                type COUNT_BASE = self.TT#COUNT_BASE#Count
            }
        }
        def setWidth(d:Positive) = new SpecsWith(this) {
            override val width:Double = d.d
            type TT = self.TT {
                type COUNT_WIDTH = self.TT#COUNT_WIDTH#Count
                type COUNT_BASE = self.TT#COUNT_BASE#Count
            }
        }
        def setArea(d:Positive) = new SpecsWith(this) {
            override val area:Double = d.d
            type TT = self.TT {
                type COUNT_AREA = self.TT#COUNT_AREA#Count
                type COUNT_BASE = self.TT#COUNT_BASE#Count
            }
        }

        def setHeight(d:Positive) = new SpecsWith(this) {
            override val height:Double = d.d
            type TT = self.TT {
                type COUNT_HEIGHT = self.TT#COUNT_HEIGHT#Count
                type COUNT_VERT = self.TT#COUNT_VERT#Count
            }
        }
        def setEdge(d:Positive) = new SpecsWith(this) {
            override val edge:Double = d.d
            type TT = self.TT {
                type COUNT_EDGE = self.TT#COUNT_EDGE#Count
                type COUNT_VERT = self.TT#COUNT_VERT#Count
            }
        }
    }

    //Starting point: nothing is set
    def apply() = new Specs {
        type TT = {
            type COUNT_LENGTH = ZERO
            type COUNT_WIDTH = ZERO
            type COUNT_AREA = ZERO
            type COUNT_BASE = ZERO

            type COUNT_HEIGHT = ZERO
            type COUNT_EDGE = ZERO
            type COUNT_VERT = ZERO
        }
    }

    //Required ending point: two base measures, one height measure,
    //no single parameter more than once
    type CompleteSpecs = Specs {
        type TT <: {
            type COUNT_LENGTH <: ZERO_OR_ONE
            type COUNT_WIDTH <: ZERO_OR_ONE
            type COUNT_AREA <: ZERO_OR_ONE
            type COUNT_BASE = TWO

            type COUNT_HEIGHT <: ZERO_OR_ONE
            type COUNT_EDGE <: ZERO_OR_ONE
            type COUNT_VERT = ONE
        }
    }

    //Calc1 includes the first set of values that can be calculated
    class Calc1 private[Pyramid](spec:CompleteSpecs) {
        import java.lang.Math.sqrt

        //The three related base measures
        lazy val length =
            if (spec.length!=0) spec.length
            else spec.area/spec.width
        lazy val width =
            if (spec.width!=0) spec.width
            else spec.area/spec.length
        lazy val area =
            if (spec.area!=0) spec.area
            else spec.length*spec.width

        //The two related height measures
        lazy val height =
            if (spec.height!=0) spec.height
            else sqrt(spec.edge*spec.edge-length*length/4-width*width/4)
        lazy val edge =
            if (spec.edge!=0) spec.edge
            else sqrt(length*length/4+width*width/4+spec.height*spec.height)

        lazy val volume = length * width * height / 3
    }

    implicit def specsOK(spec:CompleteSpecs) = new {
        def build = new Calc1(spec)
    }
}
As before, remember to import Pyramid._ when using this code.

Let's examine the code.

To start, I have a small type-based state machine, similar to what I used in my previous post. I changed the names of the classes to more accurately reflect the fact that I am counting the number of calls to methods, and I added a class to represent a count of two as distinct from larger numbers.
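The transitions of that state machine can be checked on their own. In this sketch the counter classes are copied from the listing, while the CounterDemo object and its val names are assumptions added for the demonstration. Each implicitly call compiles only if the stated type relationship holds:

```scala
object CounterDemo {
  abstract class COUNTER { type Count <: COUNTER }
  abstract class MANY extends COUNTER { type Count = MANY }
  abstract class TWO extends COUNTER { type Count = MANY }
  abstract class ZERO_OR_ONE extends COUNTER
  abstract class ONE extends ZERO_OR_ONE { type Count = TWO }
  abstract class ZERO extends ZERO_OR_ONE { type Count = ONE }

  // X#Count is "the count after one more call". The evidence values
  // below exist only if the compiler agrees the two types are equal.
  val zeroToOne = implicitly[ZERO#Count =:= ONE]
  val oneToTwo  = implicitly[ONE#Count =:= TWO]
  val twoToMany = implicitly[TWO#Count =:= MANY]
  val manyStays = implicitly[MANY#Count =:= MANY]

  // ZERO and ONE both conform to ZERO_OR_ONE; TWO and MANY do not.
  val zeroFits = implicitly[ZERO <:< ZERO_OR_ONE]
  val oneFits  = implicitly[ONE <:< ZERO_OR_ONE]

  // By contrast, a line such as implicitly[ZERO#Count =:= TWO]
  // would be rejected by the compiler.
}
```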

Next I define a class that ensures that the values passed to the setters are all strictly positive numbers (since they represent physical quantities), and I add a couple of implicit conversion methods to allow the caller to pass in ints or doubles. If the caller passes in zero or a negative number, the builder will throw a runtime exception.
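A reduced sketch of that runtime check. The half and rejects methods are hypothetical stand-ins for a setter and a test harness, and only the Double conversion is included to keep the example small:

```scala
object PositiveDemo {
  import scala.language.implicitConversions

  // Wrapper whose constructor rejects zero and negative values.
  class Positive(val d: Double) {
    if (d <= 0)
      throw new IllegalArgumentException("non-positive value")
  }
  implicit def doubleToPositive(d: Double): Positive = new Positive(d)

  // Hypothetical setter-like method that accepts only Positive values.
  def half(p: Positive): Double = p.d / 2

  // True if converting d to a Positive throws the expected exception.
  def rejects(d: Double): Boolean =
    try { half(d); false }
    catch { case _: IllegalArgumentException => true }
}
```

The check happens when the implicit conversion constructs the Positive, so it is a runtime check, unlike the call-count constraints, which are enforced by the compiler.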

The Specs class is where I maintain my state information as the builder is being constructed. I use the same basic approach as in my previous post, with a set of parameter values and a set of compile-time constraints, the latter represented by the TT compound type. Note that in addition to the type parameter values that are directly associated with the parameters, which are used to ensure that each individual setter is called not more than once, there are two additional type values that do not directly correspond to parameter values or individual setters. The COUNT_BASE type value is associated with the length, width and area parameters, while the COUNT_VERT type value is associated with the height and edge parameters. This association is specified in the setter methods.

The SpecsWith class allows me to default all values to the previous step in the builder chain, so that I can override just the value I want to change in each setter.

Each of the five setters sets its parameter value, sets a new value for its individual type value counter, and also sets a new type value for one of the non-individual counters. Note how the setters for length, width and area all refer to COUNT_BASE, while the setters for height and edge both refer to COUNT_VERT.

I chose to define an apply() method rather than call it builder. This allows me to start my builder chain by specifying just Pyramid().

The CompleteSpecs type definition defines the end point that will be valid for a call to build. You can see here how it requires that COUNT_BASE be TWO and COUNT_VERT be ONE. The other call counts can be zero or one.

Calc1 is the class that actually does the calculation of the physical parameters.

Finally, the specsOK implicit method provides the link that allows only a complete builder to call the build method that returns the calculator.
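The gating idea can be sketched by itself as follows. This is an illustration under assumptions, not the code from this post: it uses a phantom type parameter for the state, and a named wrapper class (the hypothetical Buildable) instead of an anonymous structural type, which avoids the reflective call that Scala 2 uses for structural types. All names here are made up for the example.

```scala
import scala.language.implicitConversions

object GateSketch {
  sealed trait State
  sealed trait NotReady extends State
  sealed trait Ready extends State

  // The type parameter S records, at compile time, whether we may build
  class Builder[S <: State](val value: Int) {
    def ready: Builder[Ready] = new Builder[Ready](value)
  }

  // Only this wrapper has a build method
  final class Buildable(b: Builder[Ready]) {
    def build: Int = b.value * 2
  }

  // The conversion applies only to Builder[Ready], so build is
  // unavailable on a builder in any other state
  implicit def builderOK(b: Builder[Ready]): Buildable = new Buildable(b)

  val answer = new Builder[NotReady](21).ready.build
  // new Builder[NotReady](21).build   // does not compile
}
```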

Here is an example of how you use it:
    import Pyramid._  //we need the implicit conversion to be in scope
    val p = Pyramid().setLength(10).setWidth(8).setHeight(6).build
    p.length  //returns 10
    p.area    //returns 80
    p.volume  //returns 160
Each of the following will give a compiler error:
    Pyramid().setWidth(2).setHeight(2).build  //only one BASE param, need 2
    Pyramid().setWidth(2).setLength(3).setArea(6).setHeight(2).build  //too many BASE params
    Pyramid().setWidth(2).setWidth(3).setHeight(2).build  //setWidth called twice
Note the second example: even though the base area (6) is compatible with the width and length, this line gives a compiler error because three BASE parameters were specified; the compile time checks do not look at the values of the parameters.

Let's add a parameter for density such that, if we have called the setter for density, we can get back a calculator that can tell us the mass of the pyramid as well as everything else. The code below is the complete example with those additions: the density value and its setDensity setter, the COUNT_DENSITY type value, and the CompleteSpecs2, Calc2 and specsOK2 definitions.
    object Pyramid {

      //A small collection of class types to define a state machine that counts
      abstract class COUNTER { type Count <: COUNTER }
      abstract class MANY extends COUNTER { type Count = MANY }
      abstract class TWO extends COUNTER { type Count = MANY }
      abstract class ZERO_OR_ONE extends COUNTER
      abstract class ONE extends ZERO_OR_ONE { type Count = TWO }
      abstract class ZERO extends ZERO_OR_ONE { type Count = ONE }

      //We require positive values for our calls
      class Positive(val d:Double) {
        if (d<=0) throw new IllegalArgumentException("non-positive value")
      }
      implicit def doubleToPositive(d:Double) = new Positive(d)
      implicit def intToPositive(n:Int) = new Positive(n)

      //The class that manages the state of our specification
      class Specs private[Pyramid]() {
        self:Specs =>

        //Caller must set exactly two out of three of these
        val length:Double = 0
        val width:Double = 0
        val area:Double = 0

        //Caller must set exactly one of these two heights
        val height:Double = 0   //vertical height to the tip
        val edge:Double = 0     //from base to tip along an edge

        //Optional value; if set, we can calculate mass
        val density:Double = 0

        //We maintain compiler-time state to count the two types of calls
        type TT <: {
          type COUNT_LENGTH <: COUNTER
          type COUNT_WIDTH <: COUNTER
          type COUNT_AREA <: COUNTER
          type COUNT_BASE <: COUNTER    // length, width or area
          type COUNT_HEIGHT <: COUNTER
          type COUNT_EDGE <: COUNTER
          type COUNT_VERT <: COUNTER    // height or edge
          type COUNT_DENSITY <: COUNTER
        }

        class SpecsWith(bb:Specs) extends Specs {
          override val length = bb.length
          override val width = bb.width
          override val area = bb.area
          override val height = bb.height
          override val edge = bb.edge
          override val density = bb.density
        }

        def setLength(d:Positive) = new SpecsWith(this) {
          override val length:Double = d.d
          type TT = self.TT {
            type COUNT_LENGTH = self.TT#COUNT_LENGTH#Count
            type COUNT_BASE = self.TT#COUNT_BASE#Count
          }
        }

        def setWidth(d:Positive) = new SpecsWith(this) {
          override val width:Double = d.d
          type TT = self.TT {
            type COUNT_WIDTH = self.TT#COUNT_WIDTH#Count
            type COUNT_BASE = self.TT#COUNT_BASE#Count
          }
        }

        def setArea(d:Positive) = new SpecsWith(this) {
          override val area:Double = d.d
          type TT = self.TT {
            type COUNT_AREA = self.TT#COUNT_AREA#Count
            type COUNT_BASE = self.TT#COUNT_BASE#Count
          }
        }

        def setHeight(d:Positive) = new SpecsWith(this) {
          override val height:Double = d.d
          type TT = self.TT {
            type COUNT_HEIGHT = self.TT#COUNT_HEIGHT#Count
            type COUNT_VERT = self.TT#COUNT_VERT#Count
          }
        }

        def setEdge(d:Positive) = new SpecsWith(this) {
          override val edge:Double = d.d
          type TT = self.TT {
            type COUNT_EDGE = self.TT#COUNT_EDGE#Count
            type COUNT_VERT = self.TT#COUNT_VERT#Count
          }
        }

        def setDensity(d:Positive) = new SpecsWith(this) {
          override val density:Double = d.d
          type TT = self.TT {
            type COUNT_DENSITY = self.TT#COUNT_DENSITY#Count
          }
        }
      }

      //Starting point: nothing is set
      def apply() = new Specs {
        type TT = {
          type COUNT_LENGTH = ZERO
          type COUNT_WIDTH = ZERO
          type COUNT_AREA = ZERO
          type COUNT_BASE = ZERO
          type COUNT_HEIGHT = ZERO
          type COUNT_EDGE = ZERO
          type COUNT_VERT = ZERO
          type COUNT_DENSITY = ZERO
        }
      }

      //Required ending point: two base measures, one height measure,
      //no single parameter more than once
      type CompleteSpecs = Specs {
        type TT <: {
          type COUNT_LENGTH <: ZERO_OR_ONE
          type COUNT_WIDTH <: ZERO_OR_ONE
          type COUNT_AREA <: ZERO_OR_ONE
          type COUNT_BASE = TWO
          type COUNT_HEIGHT <: ZERO_OR_ONE
          type COUNT_EDGE <: ZERO_OR_ONE
          type COUNT_VERT = ONE
        }
      }

      //Calc1 includes the first set of values that can be calculated
      class Calc1 private[Pyramid](spec:CompleteSpecs) {
        import java.lang.Math.sqrt

        //The three related base measures
        lazy val length = if (spec.length!=0) spec.length else spec.area/spec.width
        lazy val width = if (spec.width!=0) spec.width else spec.area/spec.length
        lazy val area = if (spec.area!=0) spec.area else spec.length*spec.width

        //The two related height measures
        lazy val height = if (spec.height!=0) spec.height else
          sqrt(spec.edge*spec.edge-length*length/4-width*width/4)
        lazy val edge = if (spec.edge!=0) spec.edge else
          sqrt(length*length/4+width*width/4+spec.height*spec.height)

        lazy val volume = length * width * height / 3
      }

      implicit def specsOK(spec:CompleteSpecs) = new {
        def build = new Calc1(spec)
      }

      //Second set of allowable computations
      type CompleteSpecs2 = Specs {
        type TT <: {
          type COUNT_LENGTH <: ZERO_OR_ONE
          type COUNT_WIDTH <: ZERO_OR_ONE
          type COUNT_AREA <: ZERO_OR_ONE
          type COUNT_BASE = TWO
          type COUNT_HEIGHT <: ZERO_OR_ONE
          type COUNT_EDGE <: ZERO_OR_ONE
          type COUNT_VERT = ONE
          type COUNT_DENSITY = ONE
        }
      }

      class Calc2 private[Pyramid](spec:CompleteSpecs2) extends Calc1(spec) {
        lazy val mass = spec.density * volume
      }

      implicit def specsOK2(spec:CompleteSpecs2) = new {
        def build2 = new Calc2(spec)
      }
    }
To get the mass, we have to call all of the appropriate setters, including density, then call build2 rather than build to get our calculator. That will give us a calculator that can give us the mass value as well as all the values in the calculator returned by build.
    val p = Pyramid().setHeight(2).setWidth(3).setArea(6).setDensity(2).build2
    p.volume  //returns 4
    p.mass    //returns 8
These examples fail:
    Pyramid().setHeight(2).setWidth(3).setArea(6).build2  //no density
    Pyramid().setHeight(2).setWidth(3).setArea(6).setDensity(2).setDensity(3).build2  //density specified twice
I intentionally left out one line when adding the density code, so the following code compiles:
    Pyramid().setHeight(2).setWidth(3).setArea(6).setDensity(2).setDensity(3).build
Since the build method returns a calculator that does not do anything with the density, this is perhaps not a problem. You can test your understanding of Scala types by figuring out what one line you need to add to the density example to make this last call fail to compile.

Wednesday, September 9, 2009

Type Safe Builder in Scala, Part 3

Another solution to implementing a type-safe builder in Scala.

In my previous two posts I presented a couple of different implementations of the type-safe builder pattern originally presented by Rafael de F. Ferreira. But something about them bothered me. Both of my implementations, both of Rafael's implementations, and gambistics' implementation written in response to my second attempt all share the same characteristic: every setter includes references to all of the state information (both type information and parameter values) being built up in the Builder.

I found this to be a very inelegant aspect of these solutions. Given N parameters with their corresponding setters, there is an O(N^2) maintenance problem: every time you add or remove a parameter, or make certain changes to existing parameters, such as changing name or type, you have to make that change in every setter method.

Below is an implementation without any source code interaction between the setters or parameters; each setter only deals with its own data. Adding, removing or changing any parameter and corresponding setter can be done without dealing with any of the other parameters, making this implementation O(N) for N parameters. As with my Part 2 implementation, this implementation handles both optional and required parameters, ensuring at compile time that required setters are called exactly once and optional setters are not called more than once.
    object Scotch {

      //A small collection of class types to define a state machine
      abstract class STATE { type TrueOnce <: STATE }
      abstract class NOT_MULTI extends STATE
      abstract class MULTI extends STATE { type TrueOnce = MULTI }
      abstract class TRUE extends NOT_MULTI { type TrueOnce = MULTI }
      abstract class FALSE extends NOT_MULTI { type TrueOnce = TRUE }

      sealed abstract class Preparation
      case object Neat extends Preparation
      case object OnTheRocks extends Preparation
      case object WithWater extends Preparation

      sealed abstract class Glass
      case object Short extends Glass
      case object Tall extends Glass
      case object Tulip extends Glass

      case class OrderOfScotch private[Scotch] (
        val brand:String,
        val mode:Preparation,
        val isDouble:Boolean,
        val glass:Option[Glass])

      class ScotchBuilder private[Scotch]() {
        self:ScotchBuilder =>

        val theBrand:Option[String] = None
        val theMode:Option[Preparation] = None
        val theDoubleStatus:Option[Boolean] = None
        val theGlass:Option[Glass] = None

        type TT <: {
          type HAS_BRAND <: STATE
          type HAS_MODE <: STATE
          type HAS_DOUBLE <: STATE
          type HAS_GLASS <: STATE
        }

        class ScotchBuilderWith(sb:ScotchBuilder) extends ScotchBuilder {
          override val theBrand = sb.theBrand
          override val theMode = sb.theMode
          override val theDoubleStatus = sb.theDoubleStatus
          override val theGlass = sb.theGlass
        }

        def withBrand(b:String) = new ScotchBuilderWith(this) {
          override val theBrand:Option[String] = Some(b)
          type TT = self.TT { type HAS_BRAND = self.TT#HAS_BRAND#TrueOnce }
        }

        def withMode(p:Preparation) = new ScotchBuilderWith(this) {
          override val theMode:Option[Preparation] = Some(p)
          type TT = self.TT { type HAS_MODE = self.TT#HAS_MODE#TrueOnce }
        }

        def isDouble(b:Boolean) = new ScotchBuilderWith(this) {
          override val theDoubleStatus:Option[Boolean] = Some(b)
          type TT = self.TT { type HAS_DOUBLE = self.TT#HAS_DOUBLE#TrueOnce }
        }

        def withGlass(g:Glass) = new ScotchBuilderWith(this) {
          override val theGlass:Option[Glass] = Some(g)
          type TT = self.TT { type HAS_GLASS = self.TT#HAS_GLASS#TrueOnce }
        }
      }

      //Starting point: nothing is set
      lazy val builder = new ScotchBuilder {
        type TT = {
          type HAS_BRAND = FALSE
          type HAS_MODE = FALSE
          type HAS_DOUBLE = FALSE
          type HAS_GLASS = FALSE
        }
      }

      //Required ending point: TRUE for required, NOT_MULTI for optional
      type CompleteBuilder = ScotchBuilder {
        type TT <: {
          type HAS_BRAND = TRUE
          type HAS_MODE = TRUE
          type HAS_DOUBLE = TRUE
          type HAS_GLASS <: NOT_MULTI
        }
      }

      implicit def enableBuild(builder:CompleteBuilder) = new {
        def build() = new OrderOfScotch(
          builder.theBrand.get,
          builder.theMode.get,
          builder.theDoubleStatus.get,
          builder.theGlass);
      }
    }
It actually took me quite a while to come up with this solution. I spent a lot of time trying different things that almost worked. Scala's type system is powerful, but debugging is a bear.

Monday, September 7, 2009

Type Safe Builder in Scala, Part 2

Here is a way to limit the number of calls to each setter method in a builder without using Church Numerals.

Update 2009-09-09: See also Part 3, which shows an O(N) implementation.
Copyright Note: The code in this post may not be covered by the LGPL.

The BuilderPattern code is a derivative of code posted by Rafael de F. Ferreira, so is covered by his copyright.

All code is used here for educational purposes under Fair Use.
A number of people commented in response to Rafael's original post that this type-safe builder approach (of explicitly using types to track desired behavior at compile-time) is much more complicated than a simple builder class in which the required parameters are constructor args to the builder, and the optional parameters are setters. Of course that's true, and with named parameters and default values coming in Scala 2.8, defining and using such builders can be even simpler. But, besides the fact that the type-safe approach can be used in more complex builders in which the constructor approach does not work well, the issue is not relevant, because the main point of this exercise is to see how Scala's type system can be used to do interesting things.
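For comparison, the simpler alternative those commenters had in mind might look something like this sketch, which assumes Scala 2.8 or later for named and default arguments. The object name SimpleOrder and the brand value are made up for illustration; the parameter names follow the OrderOfScotch class used in these posts.

```scala
object SimpleOrder {
  sealed abstract class Preparation
  case object Neat extends Preparation
  case object OnTheRocks extends Preparation
  case object WithWater extends Preparation

  sealed abstract class Glass
  case object Short extends Glass
  case object Tall extends Glass
  case object Tulip extends Glass

  // Required values have no default; the optional glass defaults to None
  case class OrderOfScotch(
    brand: String,
    mode: Preparation,
    isDouble: Boolean,
    glass: Option[Glass] = None)

  // Named arguments read much like a builder chain
  val order = OrderOfScotch(brand = "Talisker", mode = Neat, isDouble = true)

  // copy changes one value and keeps the rest
  val inTulip = order.copy(glass = Some(Tulip))
}
```

Here the compiler already rejects a missing required argument or an argument given twice, with no type-level machinery at all.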

In my previous blog post I showed how to use Church Numerals to limit (at compile time) calls to setters to no more than once per setter. Using Church Numerals was handy for me because I already had them, so it was easy to switch from booleans to integers to keep track of how many times a setter was called.

Keeping count of calls this way could be useful in some context, but in this case all I was doing was ensuring that each setter was called no more than once. Also, the recommendation I made in my closing paragraph to use multiple implicit conversion functions to deal with optional setters does not scale well when there are multiple optional setters.

Below is a simpler approach that ensures that required setters are called exactly once, that optional setters are called not more than once, that doesn't require Church Numerals, and that uses only a single implicit conversion method. As before, my changes from Rafael's original are in bold.
    object BuilderPattern {
      sealed abstract class Preparation
      case object Neat extends Preparation
      case object OnTheRocks extends Preparation
      case object WithWater extends Preparation

      sealed abstract class Glass
      case object Short extends Glass
      case object Tall extends Glass
      case object Tulip extends Glass

      case class OrderOfScotch private[BuilderPattern] (val brand:String,
        val mode:Preparation, val isDouble:Boolean, val glass:Option[Glass])

      abstract class STATE { type TrueOnce <: STATE }
      abstract class NOT_MULTI extends STATE
      abstract class MULTI extends STATE { type TrueOnce = MULTI }
      abstract class TRUE extends NOT_MULTI { type TrueOnce = MULTI }
      abstract class FALSE extends NOT_MULTI { type TrueOnce = TRUE }

      abstract class ScotchBuilder {
        self:ScotchBuilder =>

        protected[BuilderPattern] val theBrand:Option[String]
        protected[BuilderPattern] val theMode:Option[Preparation]
        protected[BuilderPattern] val theDoubleStatus:Option[Boolean]
        protected[BuilderPattern] val theGlass:Option[Glass]

        type HAS_BRAND <: STATE
        type HAS_MODE <: STATE
        type HAS_DOUBLE_STATUS <: STATE
        type HAS_GLASS <: STATE

        def withBrand(b:String) = new ScotchBuilder {
          protected[BuilderPattern] val theBrand:Option[String] = Some(b)
          protected[BuilderPattern] val theMode:Option[Preparation] = self.theMode
          protected[BuilderPattern] val theDoubleStatus:Option[Boolean] = self.theDoubleStatus
          protected[BuilderPattern] val theGlass:Option[Glass] = self.theGlass
          type HAS_BRAND = self.HAS_BRAND#TrueOnce
          type HAS_MODE = self.HAS_MODE
          type HAS_DOUBLE_STATUS = self.HAS_DOUBLE_STATUS
          type HAS_GLASS = self.HAS_GLASS
        }

        def withMode(p:Preparation) = new ScotchBuilder {
          protected[BuilderPattern] val theBrand:Option[String] = self.theBrand
          protected[BuilderPattern] val theMode:Option[Preparation] = Some(p)
          protected[BuilderPattern] val theDoubleStatus:Option[Boolean] = self.theDoubleStatus
          protected[BuilderPattern] val theGlass:Option[Glass] = self.theGlass
          type HAS_BRAND = self.HAS_BRAND
          type HAS_MODE = self.HAS_MODE#TrueOnce
          type HAS_DOUBLE_STATUS = self.HAS_DOUBLE_STATUS
          type HAS_GLASS = self.HAS_GLASS
        }

        def isDouble(b:Boolean) = new ScotchBuilder {
          protected[BuilderPattern] val theBrand:Option[String] = self.theBrand
          protected[BuilderPattern] val theMode:Option[Preparation] = self.theMode
          protected[BuilderPattern] val theDoubleStatus:Option[Boolean] = Some(b)
          protected[BuilderPattern] val theGlass:Option[Glass] = self.theGlass
          type HAS_BRAND = self.HAS_BRAND
          type HAS_MODE = self.HAS_MODE
          type HAS_DOUBLE_STATUS = self.HAS_DOUBLE_STATUS#TrueOnce
          type HAS_GLASS = self.HAS_GLASS
        }

        def withGlass(g:Glass) = new ScotchBuilder {
          protected[BuilderPattern] val theBrand:Option[String] = self.theBrand
          protected[BuilderPattern] val theMode:Option[Preparation] = self.theMode
          protected[BuilderPattern] val theDoubleStatus:Option[Boolean] = self.theDoubleStatus
          protected[BuilderPattern] val theGlass:Option[Glass] = Some(g)
          type HAS_BRAND = self.HAS_BRAND
          type HAS_MODE = self.HAS_MODE
          type HAS_DOUBLE_STATUS = self.HAS_DOUBLE_STATUS
          type HAS_GLASS = self.HAS_GLASS#TrueOnce
        }
      }

      type CompleteBuilder = ScotchBuilder {
        type HAS_BRAND = TRUE
        type HAS_MODE = TRUE
        type HAS_DOUBLE_STATUS = TRUE
        type HAS_GLASS <: NOT_MULTI
      }

      implicit def enableBuild(builder:CompleteBuilder) = new {
        def build() = new OrderOfScotch(builder.theBrand.get, builder.theMode.get,
          builder.theDoubleStatus.get, builder.theGlass);
      }

      def builder = new ScotchBuilder {
        protected[BuilderPattern] val theBrand:Option[String] = None
        protected[BuilderPattern] val theMode:Option[Preparation] = None
        protected[BuilderPattern] val theDoubleStatus:Option[Boolean] = None
        protected[BuilderPattern] val theGlass:Option[Glass] = None
        type HAS_BRAND = FALSE
        type HAS_MODE = FALSE
        type HAS_DOUBLE_STATUS = FALSE
        type HAS_GLASS = FALSE
      }
    }
As before, remember to import BuilderPattern._ to ensure that the implicit conversion method is in scope when using this pattern.

The changes are straightforward:
  1. I added the HAS_GLASS type to track the state of the number of calls to the withGlass method. It is coded in exactly the same way as the other state tracking types with the one exception of its value in the CompleteBuilder type. The CompleteBuilder type encodes which parameters are optional and which are required.
  2. Rather than having two states (just TRUE and FALSE), I am using a little class hierarchy with three states, representing no calls to a setter (FALSE), one call to a setter (TRUE), and more than one call to a setter (MULTI).
  3. Instead of just setting a state to TRUE when a setter is called, I am using a type member of the current state to implement a little state machine. The state machine transitions from FALSE to TRUE to MULTI and then stays in the MULTI state.
  4. The TRUE and FALSE states are in a separate subtree NOT_MULTI so that I can specify a value that includes either of those states, but not the MULTI state. I use this value in the CompleteBuilder type to specify that the HAS_GLASS value can be either TRUE or FALSE.
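
The transitions described in items 3 and 4 can be checked in isolation with a small sketch like the one below, which assumes Scala 2. The five state classes are the ones from the listing above; the Label class, the label method and the object name StateSketch are hypothetical helpers added only so that the result of each type-level step is visible at run time.

```scala
object StateSketch {
  // The state classes from the listing above
  abstract class STATE { type TrueOnce <: STATE }
  abstract class NOT_MULTI extends STATE
  abstract class MULTI extends STATE { type TrueOnce = MULTI }
  abstract class TRUE extends NOT_MULTI { type TrueOnce = MULTI }
  abstract class FALSE extends NOT_MULTI { type TrueOnce = TRUE }

  // Hypothetical helper: a run-time label attached to each state type
  final class Label[S <: STATE](val text: String)
  implicit val falseLabel: Label[FALSE] = new Label[FALSE]("no calls")
  implicit val trueLabel: Label[TRUE] = new Label[TRUE]("one call")
  implicit val multiLabel: Label[MULTI] = new Label[MULTI]("more than one call")
  def label[S <: STATE](implicit l: Label[S]): String = l.text

  // Each #TrueOnce is one setter call: FALSE, then TRUE, then MULTI, then MULTI
  val afterOne = label[FALSE#TrueOnce]
  val afterTwo = label[FALSE#TrueOnce#TrueOnce]
  val afterThree = label[FALSE#TrueOnce#TrueOnce#TrueOnce]

  // TRUE and FALSE both conform to NOT_MULTI
  implicitly[TRUE <:< NOT_MULTI]
  implicitly[FALSE <:< NOT_MULTI]
  // implicitly[MULTI <:< NOT_MULTI]   // does not compile
}
```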
Bottom line: this implementation provides compile-time checking that the required setters are called exactly once, that the optional setters are called at most once, and is much simpler than the Church Numerals implementation.