Scala3 typeclassery with graphs
Before anything
I wrote this thing to see for myself if a typelevel style library for graphs makes any sense. It turns out I LIKE IT! but if you’re coming from Python, Java, Go or Kotlin you may will find it odd. In fact even typeclasses themselves are quite controversial, this comment from Don Syme (the creator of F#) caught my eye and it’s interesting to see that even in a functional and strongly typed language such as F# typeclasses are, to put it mildly, NOT WELCOMED!. On the other hand Rust and Swift use this design quite successfully without higher kinded types. Python now has support for protocols so you may be surprised, if this looks foreign, to find typeclasses adopted more and more. Supporting code is here.
Let’s take a stab
This is the gentle kind of graph, that fits in memory and doesn’t reach over to disk or over the network and doesn’t come home late at night smelling of booze.
trait Graph0[G[_, _]]:
extension [V, E](g: G[V, E])
def neighbours(v: V): Iterable[(V, E, V)]
def vertices: Iterable[V]
def triplets: Iterable[(V, E, V)]
def addV(v: V): G[V, E]
def edge(src: V, e: E, dst: V): G[V, E]
def neighboursV(v: V): Iterable[V] // this is derived from neighbours
def empty[V, E]: G[V, E]
// more and more stuff down here but you get the point
Legal advice needed
This is not a GUARANTEED 100% TYPECLASS™ since it doesn’t have any laws to govern it’s behaviour. Sure the function is called neighbours but that can return an endless list of triplets with null values, an empty Iterable or anything in between.
Friends, what are they good for?
Let’s write a simple law say called iCanHazFriend
where if there is an edge between a
and b
then b
is a neighbour of a
. It’s a silly law but there you go. You can express this in terms of Graph0
as such and it can be used to validate anything that claims to be a Graph with at most 1 edge.
def iCanHazFriend[V:Arbitrary, E:Arbitrary, G[_, _]: Graph0] =
forAll { (triplet: Triplet[V, E]) =>
// insert the triplet in the empty graph
val g = Graph0[G].empty[V, E].addTriplets(Tuple.fromProductTyped(triplet))
// call neighbours on the source and check the destination is in the returned list
g.neighboursV(triplet.src)
.find ( _ == triplet.dst )
.isDefined
}
Friends, what do they look like?
A graph that can guarantee one edge at a time is not the most useful kind. So let’s step-up from there and create a law that verifies we can build paths. we’ll call it friendOfAFriend
and it will check that if we add a chain of nodes and edges we can reach all of them if we keep calling neighbours
starting at the first node.
We first need a little function that just walks the graph by only looking at the first node in the neighbours
list, since we’re expecting a path that will do. We’ll come back to walk
a little later as it holds the key to traversing graphs but for now the simple approach works in our favour.
// start at the first node and keep calling neighbours
// only walk onto the first node of any neighbours list
// since the graph is directed and we expect it to be a path
// we should eventually reach the last node
def walk[G[_, _]: Graph0, V, E](g: G[V, E])(start: V): LazyList[V] =
g.neighbours(start).headOption match {
case None => LazyList(start)
case Some((node, _, next)) =>
node #:: walk(g)(next)
}
After inserting the path we start at the first node and KEEP WALKING™. We expect the result to be all the nodes we inserted.
def friendOfAFriend[V: Arbitrary, E: Arbitrary, G[_, _]: Graph0] =
forAll { (path: Path[V, E]) =>
// insert the triplet in the empty graph
val g = Graph0[G]
.empty[V, E]
.addTriplets(path.ts.map(Tuple.fromProductTyped(_)): _*)
// call walk from the first node
val actual = path.ts match
case first :: _ => walk(g)(first.src).toVector
case Nil => Vector.empty
val expected = g.vertices.toVector
// we should have all the vertices in the end
s"$actual != $expected" |: (actual == expected)
}
There are more laws that you can write for this but I’m bored so we’ll move along.
To fold or not to fold?
Our Graph0
typeclass seems to be missing an easy way to traverse itself starting from a node. Our simple walk
function won’t do if we have nodes pointing back at each other or more than 1 child as it will get stuck in a loop or return an incomplete list.
Furthermore there are a few ways to walk
a graph the most popular being depth first and breadth first. They stems from the choice of how you want to proceed once neighbours
returns a list of nodes.
Do you return all the children nodes then proceed with their children breadth or you keep calling neighbours
on the first node until you get an empty list then make your way back up depth.
In typelevel cats there is a typeclass called Foldable, unfortunately Graph
doesn’t quite have what it takes to act like a foldable, it’s missing what List and Vector take for granted: the first item!. And even if it had a first item trying to reach all of the graph from just once place is not always possible in graphs, directed or undirected.
We’re also not going to call it Foldable, instead we’re going for the wildly imaginative GraphTraversal
on account of not being a Foldable and for clarity purposes.
trait GraphTraversal[F[_]]:
extension [A](f: F[A])
def traversal[B](start:A, b: B)(acc: (B, A) => B): B
def collect[C1](from: A, into: Factory[A, C1]): C1 =
traversal(from, into.newBuilder)((b, a) => b += a).result
The obviously odd thing is that we seemed to have lost [V, E]
along the way, and it’s because we’re only going to return a GraphTraversal
for nodes and ignore edges. Sadly in the attention economy we live in some things had to be cut.
The less obvious benefit of a GraphTraversal
is that we can use any Scala collection with it and obtain whatever instance we want just by calling collect(from = start, into = Vector)
or collect(from = start, into = LazyList)
this will prove useful if we want to traverse our graph eagerly or lazily.
Here’s how it looks:
def dfs[E] = new GraphTraversal[[A] =>> G[A, E]] {
extension [A](g: G[A, E])
def traversal[B](start: A, b: B)(acc: (B, A) => B): B =
@tailrec
def loop(stack: List[A], seen: Set[A], b: B): B =
stack match {
case Nil => b
case first :: rest => // (aka. stack.pop)
val newB = acc(b, first)
val newStack = g
.neighboursV(first)
.filterNot(seen)
.foldLeft(rest) { (s, child) => child :: s }
loop(newStack, seen + first, newB)
}
loop(List(start), Set(start), b)
}
Honestly GraphTraversal
could be a bit simpler and the type lambda [A] =>> G[A, E]
is excessive but the title is Scala3 typeclassery with graphs not sensible approach to graphs so that’s what you get.
If you’ve never seen a dfs traversal before then go here. This looks more like the BFS loop if you scroll down the wikipedia page. That’s because most DFS textbook implementations use recursion but here we use a List
as a stack to avoid recursion, even if we use a tail recursive loop for extra PURITY™.
Here is the story:
There once was a loop with a stack of nodes, a working memory of all the nodes it had seen and an accumulator calling itself Mr b
def loop(stack: List[A], seen: Set[A], b: B): B =
And every time it went around it checked if the node on top of the stack had children that it had never encountered before and if it did it promptly added all of them on top of the stack. Before anything tho it made sure the accumulator (Mr b
) was informed.
case first :: rest =>
val newB = acc(b, first)
val newStack = g
.neighboursV(first)
.filterNot(seen)
.foldLeft(rest) { (s, child) => child :: s }
This went on until the stack was empty and Mr b
was pleased with the result and out the door.
If you don’t like my storrytelling I dont blame you let’s move on on how we put this to good use. The friendOfAFriend
law becomes free of the walk
function and uses collect instead to lift every node we encounter along the traversal in a Vector
// instead of walk(g)(first.src).toVector we have
Graph0[G].dfs[E].collect(g)(from = first.src, into = Vector)
Just one more thing (if you made it this far)
It turns out the traversal code for bfs and dfs is identical and we can get them both in one swoop if we abstract over the stack
with .. you guessed it another typeclass. There is a structure that can act like a stack and a queue called Deque but we’re not going to call it that.
trait TraversalSupport[F[_]]:
extension [A](f: F[A])
def take: Option[(A, F[A])]
def offer(a: A): F[A]
def from[A](a: A): F[A]
Our lonley dfs
function gets to be a specific case of graph traversal where the support data structure is a List
while bfs follows the same pattern but uses a Queue
.
def dfs[E] = traversal[E, List]
def bfs[E] = traversal[E, Queue]
def traversal[E, F[_]: TraversalSupport] =
new GraphTraversal[[A] =>> G[A, E]] {
extension [A](g: G[A, E])
def traversal[B](start: A, b: B)(acc: (B, A) => B): B =
@tailrec
def loop(stack: F[A], seen: Set[A], b: B): B =
stack.take match {
case None => b
case Some((first, rest)) => // in other words stack.pop
val newB = acc(b, first)
val newStack = g
.neighboursV(first)
.filterNot(seen)
.foldLeft(rest) { (s, child) => s.offer(child) }
loop(newStack, seen + first, newB)
}
loop(TraversalSupport[F].from(start), Set(start), b)
}