linguistics, computers, and puns

# A Scala Interview - Part 3

Tim Hammerquist July 08, 2018 #deepdive #generics #interview #scala

In the last post we finished implementing the `TopQueue` class and looked at how it was able to perform so much better than the plain sorting solution from Part 1.

This time we're going to drill down into Scala and turn our special purpose integer collection into an elegant, general purpose tool. We'll implement and briefly discuss a number of Scala features along the way.

I'll include snippets of our changes as we go. When we're done, I'll link to the finished (and tested) project.

## Traits

Recall that `TopQueue` has only three public methods:

• `add(elem: Int): Unit`
• `size: Int`
• `result: Seq[Int]`

Here are the methods we need to override to extend the Builder trait. In fact, `TopQueue` is such a good match that you'd almost think I planned it that way. (I didn't.)

• `+=(elem: A): Builder.this.type`
• `result: B`
• `clear: Unit`

The `result` method maps directly onto our own interface. `+=` is essentially the same as `add` with a return value. `clear` is the only missing element, but is easily added.

``````import collection.mutable.Builder

class TopQueue(val capacity: Int) extends Builder[Int, Seq[Int]] {

private[this] val queue = new PriorityQueue[Int]()(Ordering[Int].reverse)

override def +=(elem: Int): this.type = {
if (queue.size < capacity) {
queue.enqueue(elem)
} else if (elem > queue.head) {
val _ = queue.dequeue
queue.enqueue(elem)
}
this
}

def length: Int = queue.length
def size: Int = length

override def clear: Unit = queue.clear
override def result: Seq[Int] = queue.dequeueAll.reverse
}
``````
• In Line 1, we import the `Builder` type.
• In Line 3, we extend `Builder` and plug in the type we're building from (`Int`), and the type we're building to (`Seq[Int]`).
• In Line 7, we rename `add` to `+=`, and change the return type from `Unit` ("void") to `this.type`
• In Line 14, we explicitly return `this`.
• In Line 19, we add `clear` and delegate it directly to the `PriorityQueue` object, just like `length`.

After annotating each of our three methods with `override`, we can take full advantage of all the functionality derived from this trait.

``````scala> val nums = Seq.fill(20)(rng.nextInt)
nums: Seq[Int] = List(1472582494, -1205266267, 2110246338, -1899208084, -48495158, -1555380301, -930665556, -506692887, -1803449477, 335650907, 1588492614, -1883579611, 1106055165, -207075109, -874047448, 1074765327, -2058546192, -1073301285, 1441493014, -1317047274)

scala> val tq = new TopQueue(3)
tq: TopQueue = TopQueue@b1cc697

scala> tq += 1
res0: tq.type = TopQueue@b1cc697

scala> tq ++= Seq(2, 4, 3, 9)
res1: tq.type = TopQueue@b1cc697

scala> nums.foldLeft(tq)(_ += _)
res2: TopQueue = TopQueue@b1cc697

scala> tq.result
res3: Seq[Int] = Vector(2110246338, 1588492614, 1472582494)
``````

## Implicits

`TopQueue` is fine if you want to find the highest grade levels in your class, the authors who belong on the New York Times Best Sellers list, or which World Cup athletes spent the most time rolling on the ground.

But with just a little work, we can also use it to find the fastest times for the Tour de France, the lowest unemployment rates, or the politicians with the lowest approval rating.

Instead of telling PriorityQueue how we want to order our values, why don't we let our users tell us what they want? Using `implicit` values, we can turn our TopQueue into a MinQueue. First, here's how it will work:

``````scala> implicit val minOrd = Ordering[Int].reverse
minOrd: scala.math.Ordering[Int] = scala.math.Ordering\$\$anon\$4@44bbdd5c

scala> val tq = new TopQueue(3)
tq: TopQueue = TopQueue@536d0528

scala> nums.foldLeft(tq)(_ += _)
res4: TopQueue = TopQueue@536d0528

scala> tq.result
res5: Seq[Int] = Vector(-2130155797, -1828018352, -1822446556)
``````

Much like in our `TopQueue` class, we call `reverse` on an `Ordering[Int]` instance. We then assign the "reversed" instance to a variable named `minOrd` using the `implicit` keyword ... and never mention it again.

And yet TopQueue has still become a MinQueue, correctly returning the three smallest values passed to it. What is this magic?!

Let's look at the updated implementation:

``````class TopQueue(val capacity: Int)(implicit val ord: Ordering[Int])
extends Builder[Int, Seq[Int]] {

private[this] val queue = new PriorityQueue[Int]()(ord.reverse)

override def +=(elem: Int): this.type = {
if (queue.size < capacity) {
queue.enqueue(elem)
} else if (ord.gt(elem, queue.head)) {
val _ = queue.dequeue
queue.enqueue(elem)
}
this
}

...
}
``````

We've only made three changes here:

• In Line 1, we add an `implicit` parameter to the `TopQueue` constructor, specifying the name `ord` and the type `Ordering[Int]`.
• In Line 4, we call `reverse` on the implicitly passed `ord` variable when creating our `PriorityQueue`.
• In Line 9, we've changed the `>` operator to the `ord.gt()` method.

With these changes, all decision logic is derived from the `Ordering[Int]` object (that we never even passed... or did we?) If you haven't seen implicit variables before, there are probably a few questions you're asking about now.

First, how does Scala know to implicitly pass the `minOrd` variable to `TopQueue` when we didn't even mention it?!

In order for a variable to be implicitly passed, several things have to be in alignment:

• The method must be declared to accept an implicit parameter using the `implicit` keyword.
• The variable to be passed must be declared with the `implicit` keyword.
• And finally, the type of the implicit variable must match the type of the method's implicit parameter!

If the method parameter is not declared implicit, the method will expect the variable to be passed explicitly.

If there is no implicit variable with the correct type in scope when the method is called, the invocation will fail.

Okay, second question: isn't this implicit business just a little too clever and magical? Probably. But it's also very powerful, and allows the programmer to deterministically choose when to override default behavior --- hence our use of it in TopQueue.

Note that I'm not defending the language's use of implicits. I merely find myself in Rome, surrounded by Romans.

## Generics

We've factored the ordering preference out of our class entirely. We can track smallest or largest values at the user's whim. There's nothing innately numeric about any of our code anymore, so why limit ourselves to integers?

In fact, if we look at the implementation of `PriorityQueue`, we see that it doesn't care what we store in it either. It will happily store and sort any type of value, so long as there's an `Ordering[A]` implicit for it:

``````sealed class PriorityQueue[A](implicit val ord: Ordering[A])
``````

And if `PriorityQueue` can do it, so can we.

And because we've done such a good job of factoring out type-specific logic, turning `TopQueue` into a generic class is literally just changing `Int` to `A`.

``````class TopQueue[A](val capacity: Int)(implicit val ord: Ordering[A])
extends Builder[A, Seq[A]] {

private[this] val queue = new PriorityQueue[A]()(ord.reverse)

override def +=(elem: A): this.type = {
if (queue.size < capacity) {
queue.enqueue(elem)
} else if (ord.gt(elem, queue.head)) {
val _ = queue.dequeue
queue.enqueue(elem)
}
this
}

def length: Int = queue.length
def size: Int = length

override def clear: Unit = queue.clear
override def result: Seq[A] = queue.dequeueAll.reverse
}
``````

And now for a test spin:

``````scala> val iq = new TopQueue[Int](3)
scala> iq ++= Seq(1, 3, 5, 2, 4, 7, 8)
scala> iq.result
res0: Seq[Int] = Vector(8, 7, 5)

scala> val dq = new TopQueue[Double](3)
scala> dq ++= Seq(1.0, 2.4, 3.14159, 4.2, 3.0, 2.0, 5)
scala> dq.result
res1: Seq[Double] = Vector(5.0, 4.2, 3.14159)

scala> val sq = new TopQueue[String](3)
scala> sq ++= lorem.split(" ")
scala> sq.result
res2: Seq[String] = Vector(voluptate, veniam, velit)
``````

## Companion Objects

Now let's make our user's lives a little easier. Using a companion object, we can streamline the creation of `TopQueue` instances. Here's how it'll work.

``````scala> val tq = TopQueue[Int](3)
tq: TopQueue[Int] = TopQueue@4cceb3a1
``````

Well, that's got rid of the `new`. Feels a little more like Python, but no big deal. Let's move on.

``````scala> val nums = Seq.fill(20)(rng.nextInt(100))
nums: Seq[Int] = List(70, 34, 59, 48, 3, 20, 54, 0, 4, 66, 3, 59, 64, 66, 72, 8, 99, 13, 83, 72)

scala> TopQueue(3, nums).result
res0: Seq[Int] = Vector(99, 83, 72)

scala> TopQueue(3, lorem.split(" ")).result
res1: Seq[String] = Vector(voluptate, veniam, velit)
``````

With one method, we've disposed of the `new`, automated populating the queue with our data, and got rid of that pesky type annotation (`[Int]`, `[Double]`, `[String]`). This works because Scala is able to infer the type of `A` from the collection we pass (`Seq[A]`).

But let's go one step further. The following code generates 1,000 192-bit `BigInt` values, selects the five largest values, and displays them one per line:

``````scala> val bigNums = TopQueue(5, 1000) {
| BigInt(192, rng)
}
scala> bigNums.result.foreach(println)
6271062381133668254405510778307065078929875644850405394465
6267447238665173934555606099974970878732800791325289036101
6261977580545774650519866804001706195383900351836286095420
6247944717903870481800176157059032425225672699263315794305
6241830593043113801159745624306070018623673052833017718145
``````

We accomplish this by passing a code block to our object method as a parameter where it's evaluated 1,000 times, and each result is added to the queue.

Here's the code for our `TopQueue` companion object.

``````object TopQueue {
def apply[A](size: Int)(implicit ord: Ordering[A]): TopQueue[A] =
new TopQueue[A](size)

def apply[A](size: Int, coll: Iterable[A])(implicit ord: Ordering[A]): TopQueue[A] = {
val tq = new TopQueue[A](size)
coll.foreach(tq += _)
tq
}

def apply[A](size: Int, count: Int)(block: => A)(implicit ord: Ordering[A]): TopQueue[A] = {
val tq = new TopQueue[A](size)
for (_ <- 1 to count) tq += block;
tq
}
}
``````
• In Line 2, we add a simple wrapper around the constructor.
• In Line 5, we let the user pass in any `Iterable[A]` and iterate over it, adding each value to the queue.
• In Line 11, we see the syntax for passing a block (`=> A` --- essentially "any block expression that evaluates to an `A` value"). We evaluate the expression `count` times, pushing the result each time.

These are short, simple methods, but they go a long way to making our class more elegant and intuitive to use. Companion objects are frequently used as convenience constructors for Scala types, making `TopQueue` more intuitive for Scala programmers to use.

## Implicit Classes

Finally for some pure syntactic sugar. Implicit classes allow us to approximate what C# users know as extension methods.

Our target behavior is this:

``````scala> val nums = Seq.fill(1000)(rng.nextInt(1000))
nums: Seq[Int] = List(223, 213, 19, 968, 555, 714, 291, 638, 1, 476, 199, 957, 727, 206, 283...

scala> nums.top(3)
<console>:13: error: value top is not a member of Seq[Int]
nums.top(3)
^
scala> import topqueue.utils._
scala> nums.top(3)
res0: Seq[Int] = Vector(999, 996, 995)
``````

Let's take a look at this magical `utils` package:

``````package object utils {
implicit class IterableWithTop[A](val iter: Iterable[A]) {
def top(capacity: Int)(implicit ord: Ordering[A]): Seq[A] =
TopQueue(capacity, iter).result
}
}
``````

Implicit classes work much the same way as implicit variables. Where implicit variables are matched by type and passed whenever applicable, implicit classes are also matched by type and invoked when necessary to provide the requested method.

Consider the following.

``````val lorem = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
val words = lorem.split("\\W+")

import topqueue.utils._

val asciiAscending = Ordering[String].reverse
val alphaAscending = Ordering.by[String, (String, String)] {
(word) => (word.toLowerCase, word)
}.reverse

val first5ASCII = words.top(5)(asciiAscending)
val first5Alpha = words.top(5)(alphaAscending)
``````

You might remember above that I said implicit values had to be labeled `implicit`. This is only if we want Scala to pass them implicitly. As you can see here, we're passing them explicitly, so no `implicit` keyword is required.

This might seem a lot like default parameters, like you find in a lot of dynamic programming languages. But not quite. Recall that all these `Ordering` instances are being passed from the caller's context, whether they are the "default" values or custom overrides.

`Ordering` objects make good candidates for implicit values because Scala provides default values automatically. Another common implicit is the `ExecutionContext` used in the implementation of Scala's `Future` and `Promise`.

## The Complete Works

With Scala as our canvas, we've followed a simple interview question from rough idea to finished work. We sketched several concepts and trashed the impractical ones. We finished a highly effective design and had it proofed. Once we had executed (and tested) our work, we fleshed it out to a complete, user-friendly tool.

As promised, I've posted the finished product for you to browse, study, or experiment with.

Thank you for joining me!