Tuesday 23 September 2014

The QuizItemSource Typeclass

When talking about scaling an application, people are usually referring to performance: how to design the software so that limited hardware can support millions of users? But it is also important to be able to scale the feature set of an application. If users say they want a new button, you don't want to have to make extensive refactorings of your core model classes as is all too often the case in industry. The key to both of these related demands is decoupling.

While writing the Libanius quiz application, I found typeclasses to be very useful in decoupling behaviour from the model classes. Typeclasses (formally spelt "type classes") originated in Haskell, but can be used in Scala too, being a language that supports FP well. I'll describe one use of typeclasses in Libanius, an example that should be less "theoretical" and more extended than the ones that are usually given.

To start with, the core model in Libanius is a simple containment hierarchy as shown in the diagram. The Quiz data structure contains all the information necessary to provide a stream of quiz items to a single user. It is split by topic into QuizGroups, which, for performance reasons, are further partitioned into "memory levels" according to how well the user has learned the items. The flow of a typical request then is Quiz.findQuizItem() followed by QuizGroup.findQuizItem() calls, followed by QuizGroupMemoryLevel.findQuizItem() calls.

In practice the findQuizItem() signatures and functionality were similar but not identical. Rather than have them scattered throughout the model classes, it was desirable to put them together in a single place where the similarites could be observed and moulded, avoiding any unnecessary repetition. The same was true of other operations in the model classes. Pulling them out would also solve the problem of too much bloat in the model.

Furthermore, although the Quiz data structures are optimized for finding quiz items, I wanted to be able to extend the same action to other entities. A wide range of scenarios can be imagined:

  1. A Twitter feed of quotes might be seen as a mapping from speakers to quotations, which could be turned into a Who-Said-What quiz.
  2. A set of web pages with recipes could be read as a "What ingredient goes with what dish?" quiz.
  3. In a chat forum context, a person might be thinking of quiz questions and feeding them through.
  4. A clever algorithm might not simply be "finding" quiz items in a store, but dynamically generating them based on previous answers. 
  5. The Quiz might not be a Scala data structure, but a database. This could be a NoSQL in-memory database like Redis or leveldb. It could also be a traditional RDBMS like Postgres accessed via Slick.

What links all these cases together is that a findQuizItem() function is necessary but with a new implementation. So the immediate priority is to decouple the findQuizItem() action from the simple Quiz data structure. Notice that this action might be applied to structures that didn't even know they were supposed to be quizzes. We don't want to have to retroactively modify classes like TwitterFeed and ChatParticipant to make them "quiz aware".

If you know a lot of Java, some design patterns might occur to you here, like Adapter, Composite, or Visitor. In the Visitor pattern, in particular, you can separate out arbitary operations from the data structures and form a parallel hierarchy to run those operations. But the original data structures still have to have an accept() method defined. In Scala a more convenient alternative, graced with all the polymorphism you could ever need, is the typeclass construct. Typeclasses excel at separating out orthogonal concerns.

Firstly a trait is defined for the typeclass with the behaviour we wish to separate out. Let's begin by calling the main function produceQuizItem() rather than findQuizItem(), to abstract it more from the implementation:

trait QuizItemSource[A] {
  def produceQuizItem(component: A): Option[QuizItem]}
}

The component could be a Quiz, a QuizGroup, a QuizGroupMemoryLevel or anything else. Instead of calling produceQuizItem() on the component, the component is passed as a parameter to the operation.

Next, define some sample implementations (aka typeclass instances) for the entities (aka model components) that we already have:

object modelComponentsAsQuizItemSources {

  implicit object quizAsSource extends QuizItemSource[Quiz] {
    def produceQuizItem(quiz: Quiz): Option[QuizItem]} = 
      … // implementation calls methods on quiz
  }

  implicit object quizGroupAsSource extends QuizItemSource[QuizGroup] {
    def produceQuizItem(quizGroup: QuizGroup): Option[QuizItem]} = 
      … // implementation calls methods on quizGroup
  }

  implicit object quizGroupMemoryLevelAsSource extends QuizItemSource[QuizGroupMemoryLevel] {
    def produceQuizItem(quizGroupMemoryLevel: QuizGroupMemoryLevel): Option[QuizItem]} = 
      … // implementation calls methods on quizGroupMemoryLevel
  }
}

These sample implementations can be accessed in client code once it is imported in this way:

import modelComponentsAsQuizItemSources._

However, we don't want to call implementations directly, like quizGroupAsSource.produceQuizItem(myQuizGroup). Instead we want to just call produceQuizItem(myQuizGroup) and let the correct implementation be selected automatically: this is what polymorphism is about. To facilitate this, let's define a function in the middle for accessing the implementations (and this is really the final piece of the puzzle for typeclasses!):

object QuizItemSource {

  def produceQuizItem[A](component: A)(implicit qis: QuizItemSource[A]): Option[QuizItem] =
    qis.produceQuizItem(component)
}

If you're coming from other languages, it's the use of the implicit that is at first difficult to grasp, but this is really the Scala feature that makes typeclasses convenient to use, where they are not in, say, Java. Let me explain what is happening here:

Suppose in client code, the modelComponentsAsQuizItemSources has been imported as above. Now the quizGroupAsSource is implicitly in scope, among other implementations. If there is now a call to QuizItemSource.produceQuizItem(quizGroup), the compiler, seeing that quizGroup is of type QuizGroup will match qis to an implicit of type QuizItemSource[QuizGroup]: this is quizGroupAsSource, the correct implementation.

The great thing is that the correct QuizItemSource implementation will always be selected based on the type of entity passed to produceQuizItem, and also the context of the client, specifically the imports it has chosen. Also, we have achieved our original goal that whenever we realize we want to turn an entity into a quiz (TwitterFeed, ChatParticipant, etc.) we don't have to modify that class: instead, we add an instance of the typeclass to modelComponentsAsQuizItemSources and import it.

This approach is a break with OO, moving the emphasis from objects to actions. Instead of adding actions (operations) to objects, we are adding objects to an action generalization (typeclass). We can make that action as powerful as we like by continuing to add objects to it. Again, unlike in OO, those objects do not need to be custom-built to be aware of our action.

That is the main work done. There are two more advanced points to cover.

Firstly, it is possible to add some "syntactical sugar" to the definition of produceQuizItem() above. In this code

def produceQuizItem[A](component: A)(implicit qis: QuizItemSource[A]): Option[QuizItem] =
    qis.produceQuizItem(component)

… the qis variable may be viewed as clutter. We can push it into the background using context bound syntax:

def produceQuizItem(A : QuizItemSource](component: A): QuizItem =
  implicitly[QuizItemSource].produceQuizItem(component)

What happens is that the context bound A : QuizItemSource establishes an implicit parameter, which is then pulled out by implicitly. This syntax is a bit shorter, and looks shorter still if you do not need to call implicitly but are simply passing the implicit parameter down to another function.



Secondly, it is possible to have a typeclass with multiple type parameters. Context bound syntax is not convenient for this case, so, first, let's go back to the original syntax: 



def produceQuizItem[A](component: A)(implicit qis: QuizItemSource[A]): Option[QuizItem] =
    qis.produceQuizItem(component)

Suppose a Quiz were passed in, the quizAsSource instance will be selected. You will remember this looked like this:

implicit object quizAsSource extends QuizItemSource[Quiz] {
  def produceQuizItem(quiz: Quiz): Option[QuizItem]} = 
    … // implementation calls methods on quiz
}

However, I simplified a bit here. The original implementation did not in fact return simple quiz items in the case of a Quiz. Whenever the Quiz retrieved a quiz item from a QuizGroup, it would add some information. Specifically it would generate some "wrong choices" for the question and return that to the user to allow a multiple-choice presentation of the quiz item. So the QuizItem would be wrapped in a QuizItemViewWithChoices object. The signature had to be:

  def produceQuizItem(quiz: Quiz): Option[QuizItemViewWithChoices]} = … 

Now the problem is this does not conform to the typeclass definition. So let's massage the typeclass definition a bit by turning the return type into a new parameter, C:

trait QuizItemSource[A, C] {
  def produceQuizItem(component: A): Option[C]
}

Then the typeclass instance for Quiz will look as desired:

implicit object quizAsSource extends QuizItemSource[Quiz, QuizItemViewWithChoices] {
  def produceQuizItem(quiz: Quiz): Option[QuizItemViewWithChoices]} = 
    … // implementation calls methods on quiz
}

The intermediate function for accessing the typeclass becomes a bit more complex. Recall this was previously:

  def produceQuizItem[A](component: A)
      (implicit qis: QuizItemSource[A]): Option[QuizItem] =
    qis.produceQuizItem(component)

The second type parameter for the return type is added as C:

  def produceQuizItem[A, C](component: A)
      (implicit qis: QuizItemSourceBase[A, C], c: C => QuizItem): Option[C] =
    qis.produceQuizItem(component)

C is constrained so that the return value must be compatible with QuizItem. To make QuizItemViewWithChoices compatible with QuizItem, it is not necessary to make it a subclass with lots of boilerplate as in Java. Instead an implicit conversion is defined next to the QuizItemViewWithChoices class like this:

object QuizItemViewWithChoices {
  // Allow QuizItemViewWithChoices to stand in for QuizItem whenever necessary
  implicit def qiView2qi(quizItemView: QuizItemViewWithChoices): QuizItem =
    quizItemView.quizItem
}

Now it all works!

That ends the description of the QuizItemSource typeclass. For more examples, see the other typeclasses in the Libanius source code such as CustomFormat and ConstructWrongChoices, which cover other concerns orthogonal to the model, or view this excellent video tutorial by Dan Rosen.

Monday 3 March 2014

The Akka Event Bus on Android

Event buses are useful for keeping components in a system loosely coupled, and we generally see them in two areas: distributed systems, and UIs. This post is about the use of an event bus in a UI, specifically Android.

There exist event bus libraries for Android like Otto and greenrobot. There also exists an Event Bus API for Akka. What I haven't seen yet is the Akka Event Bus set up on Android. So, here it is!

First a word on the motivation. In the Libanius Android application, I recently separated part of a screen (in Android's parlance, an Activity) into its own separate class. A problem was that, in response to certain events, the new class needed to add data back to the data structure in the main class for the Activity. My initial approach was to define a closure that could do the work of adding the data, and pass the closure as an argument to the new class (the sub-Activity). However, this felt a bit ad hoc, so I decided it was time to introduce a general event-handling mechanism into Libanius Android. Having helped create an event bus for the GWT UI framework before, I thought it would be interesting to see if the same idea could be applied in Android.

The Libanius Event Bus just extends Akka's ActorEventBus. It is nearly as simple as this:

class LibaniusActorEventBus extends ActorEventBus with LookupClassification
    with AppDependencyAccess {

  type Event = EventBusMessageEvent
  type Classifier = String

  protected def classify(event: Event): Classifier = event.channel

  protected def publish(event: Event, subscriber: Subscriber) {
    subscriber ! event
  }
}

The diagram below shows what use we are actually making of the event bus:



The Android parts are OptionsScreen and DictionarySearch. In the UI, DictionarySearch is actually part of the OptionsScreen activity, but they communicate through the event bus. When the user clicks on a button from DictionarySearch to add a new word to the quiz, a NewQuizItemMessage is constructed. This event is just a wrapper for the quiz item and an identifying header:

case class NewQuizItemMessage(header: QuizGroupHeader, quizItem: QuizItem)
  extends EventBusMessage

The NewQuizItemMessage is published to the QUIZ_ITEM_CHANNEL by calling sendQuizItem() in LibaniusActorSystem. LibaniusActorSystem is a facade to LibaniusActorEventBus and other actors in the system not discussed here.

object LibaniusActorSystem extends AppDependencyAccess {
  val system: ActorSystem = ActorSystem("LibaniusActorSystem")

  val appActorEventBus = new LibaniusActorEventBus
  val QUIZ_ITEM_CHANNEL = "/data_objects/quiz_item"

  def sendQuizItem(qgh: QuizGroupHeader, quizItem: QuizItem) {    
    val newQuizItemMessage = NewQuizItemMessage(qgh, quizItem)
    appActorEventBus.publish(EventBusMessageEvent(QUIZ_ITEM_CHANNEL, newQuizItemMessage))
  }
}

That's the guts of it. There is some more code in OptionsScreen to create an actor to subscribe to the QUIZ_ITEM_CHANNEL, and actually add the QuizItem when it sees the event. For more detail, see the code in the Libanius Android project.

Friday 3 January 2014

Extending Play with AngularJS: Nested MVCs in a Webapp

I finally got a bit of spare time to tighten up the JavaScript in Libanius-Play, the web interface to my quiz application. For a long time, I was dissatisfied with the way data was being passed around from server to client and then back from client to server: there was too much repetition. The JavaScript library jQuery was of use to create AJAX callbacks, but still, any time there was a new parameter, it had to be added in too many places.

AngularJS is another JavaScript library. Like jQuery it is geared towards asynchronous communication, but it provides an MVC framework. The Play Framework is also MVC, so the result is an MVC-within-MVC architecture (see diagram). It seems to work quite well.



The AngularJS controller has a services layer to fetch data from the Scala back-end. Then it updates the page. The nice thing about it is that I don't have to update variables in the web page one by one: I can just bind variables in the web page to attributes in the model. This is achieved in the HTML using AngularJS tags like ng-model and the use of curly braces like this: 

    
    {{ quizData.prompt }}

There are a number of variables like this scattered through the HTML, but they can all be updated at once in a single line of JavaScript if the Play backend returns a JSON structure. (The data structure can be defined as a case class in Scala on the server, and will be dynamically mirrored as a JavaScript structure on the client-side without the need to map attributes manually.) Here is a simplified version of what happens in the JavaScript controller after the server has processed a user response: basically it returns a new quiz item. 

    
    services.processUserResponse($scope.quizData, response).then(function(freshQuizData) {
       $scope.quizData = freshQuizData.data // updates values in the web page automagically
       labels.setColors($scope.quizData)    // extra UI work using a custom JavaScript module called labels
    }

In the first line, services.processUserResponse is a function in the services.js file which sends user input to the server, and immediately returns a promise of new data. The promise has a handy then function which takes a callback argument, describing what happens when the data actually comes through, i.e. updating the view.

All in all, a good experience with AngularJS.