Scala3 typeclassery with graphs

2021-09-02 8 minutes read

scala3 • graphs • algorithms

Before anything

I wrote this thing to see for myself if a typelevel style library for graphs makes any sense. It turns out I LIKE IT! but if you’re coming from Python, Java, Go or Kotlin you may will find it odd. In fact even typeclasses themselves are quite controversial, this comment from Don Syme (the creator of F#) caught my eye and it’s interesting to see that even in a functional and strongly typed language such as F# typeclasses are, to put it mildly, NOT WELCOMED!. On the other hand Rust and Swift use this design quite successfully without higher kinded types. Python now has support for protocols so you may be surprised, if this looks foreign, to find typeclasses adopted more and more. Supporting code is here.

Let’s take a stab

This is the gentle kind of graph, that fits in memory and doesn’t reach over to disk or over the network and doesn’t come home late at night smelling of booze.

trait Graph0[G[_, _]]:
  extension [V, E](g: G[V, E])
    def neighbours(v: V): Iterable[(V, E, V)]
    def vertices: Iterable[V]
    def triplets: Iterable[(V, E, V)]

    def addV(v: V): G[V, E]
    def edge(src: V, e: E, dst: V): G[V, E]

    def neighboursV(v: V): Iterable[V] // this is derived from neighbours

  def empty[V, E]: G[V, E]
// more and more stuff down here but you get the point

Legal advice needed

This is not a GUARANTEED 100% TYPECLASS™ since it doesn’t have any laws to govern it’s behaviour. Sure the function is called neighbours but that can return an endless list of triplets with null values, an empty Iterable or anything in between.

Friends, what are they good for?

Let’s write a simple law say called iCanHazFriend where if there is an edge between a and b then b is a neighbour of a. It’s a silly law but there you go. You can express this in terms of Graph0 as such and it can be used to validate anything that claims to be a Graph with at most 1 edge.

  def iCanHazFriend[V:Arbitrary, E:Arbitrary, G[_, _]: Graph0] =
    forAll { (triplet: Triplet[V, E]) =>
      // insert the triplet in the empty graph
      val g = Graph0[G].empty[V, E].addTriplets(Tuple.fromProductTyped(triplet))

      // call neighbours on the source and check the destination is in the returned list
      g.neighboursV(triplet.src)
        .find ( _ == triplet.dst )
        .isDefined
    }

Friends, what do they look like?

A graph that can guarantee one edge at a time is not the most useful kind. So let’s step-up from there and create a law that verifies we can build paths. we’ll call it friendOfAFriend and it will check that if we add a chain of nodes and edges we can reach all of them if we keep calling neighbours starting at the first node.

We first need a little function that just walks the graph by only looking at the first node in the neighbours list, since we’re expecting a path that will do. We’ll come back to walk a little later as it holds the key to traversing graphs but for now the simple approach works in our favour.


      // start at the first node and keep calling neighbours
      // only walk onto the first node of any neighbours list
      // since the graph is directed and we expect it to be a path
      // we should eventually reach the last node
    def walk[G[_, _]: Graph0, V, E](g: G[V, E])(start: V): LazyList[V] =
      g.neighbours(start).headOption match {
        case None => LazyList(start)
        case Some((node, _, next)) =>
          node #:: walk(g)(next)
      }

After inserting the path we start at the first node and KEEP WALKING™. We expect the result to be all the nodes we inserted.

  def friendOfAFriend[V: Arbitrary, E: Arbitrary, G[_, _]: Graph0] =
    forAll { (path: Path[V, E]) =>
      // insert the triplet in the empty graph
      val g = Graph0[G]
        .empty[V, E]
        .addTriplets(path.ts.map(Tuple.fromProductTyped(_)): _*)

      // call walk from the first node
      val actual = path.ts match
        case first :: _ => walk(g)(first.src).toVector
        case Nil        => Vector.empty

      val expected = g.vertices.toVector
      // we should have all the vertices in the end
      s"$actual != $expected" |: (actual == expected)
    }

There are more laws that you can write for this but I’m bored so we’ll move along.

To fold or not to fold?

Our Graph0 typeclass seems to be missing an easy way to traverse itself starting from a node. Our simple walk function won’t do if we have nodes pointing back at each other or more than 1 child as it will get stuck in a loop or return an incomplete list.

Furthermore there are a few ways to walk a graph the most popular being depth first and breadth first. They stems from the choice of how you want to proceed once neighbours returns a list of nodes. Do you return all the children nodes then proceed with their children breadth or you keep calling neighbours on the first node until you get an empty list then make your way back up depth.

In typelevel cats there is a typeclass called Foldable, unfortunately Graph doesn’t quite have what it takes to act like a foldable, it’s missing what List and Vector take for granted: the first item!. And even if it had a first item trying to reach all of the graph from just once place is not always possible in graphs, directed or undirected.

We’re also not going to call it Foldable, instead we’re going for the wildly imaginative GraphTraversal on account of not being a Foldable and for clarity purposes.

trait GraphTraversal[F[_]]:
  extension [A](f: F[A])
    def traversal[B](start:A, b: B)(acc: (B, A) => B): B

    def collect[C1](from: A, into: Factory[A, C1]): C1 =
      traversal(from, into.newBuilder)((b, a) => b += a).result

The obviously odd thing is that we seemed to have lost [V, E] along the way, and it’s because we’re only going to return a GraphTraversal for nodes and ignore edges. Sadly in the attention economy we live in some things had to be cut.

The less obvious benefit of a GraphTraversal is that we can use any Scala collection with it and obtain whatever instance we want just by calling collect(from = start, into = Vector) or collect(from = start, into = LazyList) this will prove useful if we want to traverse our graph eagerly or lazily.

Here’s how it looks:

  def dfs[E] = new GraphTraversal[[A] =>> G[A, E]] {
    extension [A](g: G[A, E])
      def traversal[B](start: A, b: B)(acc: (B, A) => B): B =
        @tailrec
        def loop(stack: List[A], seen: Set[A], b: B): B =
          stack match {
            case Nil => b
            case first :: rest => // (aka. stack.pop)
              val newB = acc(b, first)
              val newStack = g
                .neighboursV(first)
                .filterNot(seen)
                .foldLeft(rest) { (s, child) => child :: s }
              loop(newStack, seen + first, newB)
          }

        loop(List(start), Set(start), b)
  }

Honestly GraphTraversal could be a bit simpler and the type lambda [A] =>> G[A, E] is excessive but the title is Scala3 typeclassery with graphs not sensible approach to graphs so that’s what you get.

If you’ve never seen a dfs traversal before then go here. This looks more like the BFS loop if you scroll down the wikipedia page. That’s because most DFS textbook implementations use recursion but here we use a List as a stack to avoid recursion, even if we use a tail recursive loop for extra PURITY™.

Here is the story: There once was a loop with a stack of nodes, a working memory of all the nodes it had seen and an accumulator calling itself Mr b

        def loop(stack: List[A], seen: Set[A], b: B): B =

And every time it went around it checked if the node on top of the stack had children that it had never encountered before and if it did it promptly added all of them on top of the stack. Before anything tho it made sure the accumulator (Mr b) was informed.

    case first :: rest =>
        val newB = acc(b, first)
        val newStack = g
        .neighboursV(first)
        .filterNot(seen)
        .foldLeft(rest) { (s, child) => child :: s }

This went on until the stack was empty and Mr b was pleased with the result and out the door.

If you don’t like my storrytelling I dont blame you let’s move on on how we put this to good use. The friendOfAFriend law becomes free of the walk function and uses collect instead to lift every node we encounter along the traversal in a Vector

// instead of walk(g)(first.src).toVector we have
          Graph0[G].dfs[E].collect(g)(from = first.src, into = Vector)

Just one more thing (if you made it this far)

It turns out the traversal code for bfs and dfs is identical and we can get them both in one swoop if we abstract over the stack with .. you guessed it another typeclass. There is a structure that can act like a stack and a queue called Deque but we’re not going to call it that.

trait TraversalSupport[F[_]]:
  extension [A](f: F[A])
    def take: Option[(A, F[A])]
    def offer(a: A): F[A]
  def from[A](a: A): F[A]

Our lonley dfs function gets to be a specific case of graph traversal where the support data structure is a List while bfs follows the same pattern but uses a Queue.

  def dfs[E] = traversal[E, List]
  def bfs[E] = traversal[E, Queue]

  def traversal[E, F[_]: TraversalSupport] =
    new GraphTraversal[[A] =>> G[A, E]] {
      extension [A](g: G[A, E])
        def traversal[B](start: A, b: B)(acc: (B, A) => B): B =
          @tailrec
          def loop(stack: F[A], seen: Set[A], b: B): B =
            stack.take match {
              case None => b
              case Some((first, rest)) => // in other words stack.pop
                val newB = acc(b, first)
                val newStack = g
                  .neighboursV(first)
                  .filterNot(seen)
                  .foldLeft(rest) { (s, child) => s.offer(child) }
                loop(newStack, seen + first, newB)
            }

          loop(TraversalSupport[F].from(start), Set(start), b)
    }