Thursday, 4 July 2013

A Benefit of Implicit Conversions: Avoiding Long Boilerplate in Wrappers

In OO practice we use wrapper classes a lot. So much in fact, that the Gang of Four identified several types of wrappers (differentiated more by purpose than structure): Adapters, Decorators, Facades, and Proxies. In this blog post I am going to concentrate on the Proxy design pattern, but the same technique of using implicit def to cut down code repetition can be applied for many different types of wrappers.

As an example, I'm going to take a data structure from my open-source quiz app Libanius. It's not necessary to go into the full data model of the app, but one of the central domain objects is called a WordMappingValueSet, which holds pairs of words and associated user data. This data is serialized and deserialized from a custom format in a simple file.

The problem is that the deserialization, which involves parsing, is rather slow for tens of thousands of WordMappingValueSet's, especially on Android, the target platform. The app performs this slow initialization on startup, but we don't want to keep the user waiting before starting the quiz. How can the delay be avoided? The solution is to wrap the WordMappingValueSet in a Lazy Load Proxy, which gives the app the illusion that it immediately possesses all these WordMappingValueSet objects when in fact all it has is a collection of proxies that each contain the serialized String for a WordMappingValueSet. The expensive work of deserializing the WordMappingValueSet is done by the proxy only when each object is really needed.

The question now is how to implement the Lazy Load Proxy. The classic object-oriented way
to do it, as documented by the Gang of Four book, is to create an interface to the WordMappingValueSet that contains all the same methods, and then a WordMappingValueSetLazyProxy implementation of that interface. The client code instantiates the proxy object but it makes calls on the interface as if it were calling the real object. In this way, the illusion is preserved so not much has to change in the client code, but "under the hood" what is happening is that the proxy is deserializing the String and then forwarding each call to the real object.

In the above diagram, the function fromCustomFormat() does the work of deserialization. In Java this would be a static method; in Scala it is placed within the singleton object WordMappingValueSet. Inside WordMappingValueSetLazyProxy, the serialized String is stored, and fromCustomFormat() is called the first time one of the business methods is called, before  the call is delegated to the method of the same name in WordMappingValueSet.

This kind of wrapper is very common in the Java world. But there is a problem. There are a lot of methods in WordMappingValueSet. These all need to be duplicated in IWordMappingValueSet and WordMappingValueSetLazyProxy. It violates the principle of DRY (Don't Repeat Yourself). Implementing this in Scala, I would go from about a hundred lines of code to two hundred lines. (In Java I would be going from two hundred lines to four hundred lines.) Furthermore, every time I need to add a method to WordMappingValueSet, I now need to do the same to the other two types, so the lightweight flexibility of the data structure has been lost.

In Scala, there is a better solution: use an implicit conversion. Firstly, as before, the WordMappingValueSet is wrapped inside a WordMappingValueSetLazyProxy:

case class WordMappingValueSetLazyProxy(valuesString: String) {  
  lazy val wmvs: WordMappingValueSet = WordMappingValueSet.fromCustomFormat(valuesString)

But that's all that's necessary for that class. We won't be defining the same methods all over again. Notice also that the lazy keyword, provided by Scala, means that fromCustomFormat() won't actually be called until some method is called on wmvs, i.e. until it is really needed.

But the real trick to make this work is to define the implicit conversion:

object WordMappingValueSetLazyProxy {
  implicit def proxy2wmvs(proxy: WordMappingValueSetLazyProxy): WordMappingValueSet = proxy.wmvs

This "converts" a WordMappingValueSetLazyProxy to a WordMappingValueSet. More specifically, it says to the compiler, "Whenever a method is called on WordMappingValueSetLazyProxy such that it doesn't seem to compile because no such method exists, try calling that method on  WordMappingValueSet instead and see if that works." So we get the benefits of adding all the methods of WordMappingValueSet to WordMappingValueSetLazyProxy without having to actually do it by hand as in Java.

There are other types of proxies, objects that give you the illusion that a resource has been acquired before it actually has. Perhaps the most common type is the network proxy. This is how EJBs are  implemented, and perhaps the most frequent criticism of EJBs, since the early days of 2.x, is the large amount of code redundancy: for each domain object, you need to write a plethora of interfaces surrounding it. A lot of tools and techniques have emerged over the years to deal with the redundancy: code generation, AOP byte code weaving, annotations, etc. Could implicit conversions have avoided this? Some believe that implicits are too powerful and scary, but considering how much time has been spent over the years on frameworks involving reams of XML to work around the limitations of Java, it seems there might be a case for allowing this kind of power to be built into the language. Scala has this.