# Dependently typed DAGs

A colleague of mine recently needed to represent DAGs (directed acyclic graphs) in Coq, and asked around for ideas. Since Coq is not a nice language to program in, I decided to use Haskell instead. Something close to dependently typed programming is possible in Haskell thanks to GADTs. And other extensions will be helpful too,

```{-# LANGUAGE GADTs, TypeOperators, Rank2Types #-}
```

My idea is to represent a DAG as a list of nodes. Nodes have a list of children, where each child is a reference to an element later in the list.

For example, the DAG would be represented as

```[Node "a" [1,2,2,4], Node "b" [3,3], Node "c" [3,4], Node "d" [], Node "e" []]
```

## Data types

To make the above representation safe, we need to ensure two things:

• Each child-reference is greater than the index of the parent.
• Each child-reference refers to an actual node, so it must be smaller than the size of the list.

The first condition is easily satisfied, by making the reference relative to the current position and using natural numbers. So the representation would be

```[Node "a" [0,1,1,3], Node "b" [1,1], Node "c" [0,1], Node "d" [], Node "e" []]
```

For the second condition we need dependent types. In particular the type Fin n of numbers smaller than n.

```data Zero
data Succ n

data Fin n where
Fin0 :: Fin (Succ n)
FinS :: Fin n -> Fin (Succ n)
```

A node then holds a label of type a and a list of numbers less than n.

```data Node a n where
Node :: a -> [Fin n] -> Node a n
deriving (Eq,Show)
```

For the list of nodes we will use a dependently typed vector,

```data Vec f n where
Empty :: Vec f Zero
(:::) :: f n -> Vec f n -> Vec f (Succ n)
infixr 5 :::
```

A value of Vec f n is a list of the form [] or [x0::f 0] or [x__1::f 1, x__0::f 0] or [x2::f 2, x1::f 1, x0::f 0] etc., with a length equal to the parameter n. That is exactly what we need for DAGs:

```type DAG a = Vec (Node a)
```

## Instances

I would like to define Eq and Show instances for these datatypes. But the instance for Vec would look something like

```instance (forall m. Eq (f m)) => Eq (Vec f n)
```

which is not valid Haskell, even with extensions. The solution is to use another class, Eq1

```class Eq1 f where
eq1 :: f n -> f n -> Bool
```

Now we can define

```instance Eq1 f => Eq (Vec f n) where
Empty == Empty = True
(x ::: xs) == (y ::: ys) = eq1 x y && xs == ys
```

The boring instances for Node and Fin are

```instance Eq a => Eq1 (Node a) where
eq1 = (==)
instance Eq1 Fin where
eq1 = (==)
instance Eq1 f => Eq1 (Vec f) where
eq1 = (==)

-- ghc can't derive this
instance Eq (Fin n) where
Fin0   == Fin0   = True
FinS i == FinS j = i == j
_      == _      = False
```

The same goes for Show

```class Show1 a where
showsPrec1 :: Int -> a n -> ShowS

-- instances ommitted, see source code
```

## Convert to tree

To show that these DAGs work, we can convert from a DAG to a tree by duplicating all nodes. The tree type is a simple rose tree, as those in Data.Tree:

```data Tree a = TNode a [Tree a]  deriving Show
```

To be able to make a dag into a tree, we need to know the root node. So we give the toTree a DAG n and an Fin n to indicate the root.

```-- Convert a DAG to a tree, using the given node index as root
toTree :: Fin n -> DAG a n -> Tree a
toTree Fin0 (Node x cs ::: ns) = TNode x [toTree c ns | c <- cs]
toTree (FinS i) (_ ::: ns) = toTree i ns -- drop the head until we reach the root
```

And for convenience, a function that assumes that the first node in the list is the root.

```toTree' :: DAG a (Succ n) -> Tree a
toTree' = toTree Fin0
```

Here is the example from above

```example = Node "a" [Fin0,FinS Fin0,FinS Fin0,FinS (FinS (FinS (Fin0)))]
::: Node "b" [FinS Fin0,FinS Fin0]
::: Node "c" [Fin0,FinS Fin0]
::: Node "d" []
::: Node "e" []
::: Empty
```
```λ> toTree' example
TNode "a" [TNode "b" [TNode "d" [],TNode "d" []]
,TNode "c" [TNode "d" [],TNode "e" []]
,TNode "c" [TNode "d" [],TNode "e" []]
,TNode "e" []]
```

As an image: .

## Convert from a tree

More interesting is the conversion from a tree to a DAG, in such a way that we share identical nodes. For that we first of all need to be able to search a DAG to see if it already contains a particular node.

Let's do this a bit more generic, and define a search over any Vec f.

```findVec :: (Eq1 f, Pred1 f) => f n -> Vec f n -> Maybe (Fin n)
```

What is that Pred1 class? And why do we need it? When you have a value of type f n, and you want to compare it to the elements of a vector, you will quickly discover that these elements have different types, f m with m < n. So, we need to either convert the f n to the f m or vice-versa.

I'll go with the former, because that means the search can stop early. If a node refers to a child Fin0, that means it points to the first node in the DAG. So there is no point in looking if it is duplicated anywhere in vector, because other nodes can't possibly refer to earlier ones.

What the Pred1 class does is tell you: "if this item occurred one place later in the vector, what would it look like?". And if it can not occur in later places return Nothing:

```class Pred1 f where
pred1 :: f (Succ n) -> Maybe (f n)

instance Pred1 Fin where
pred1 Fin0 = Nothing
pred1 (FinS i) = Just i

instance Pred1 (Node a) where
pred1 (Node x cs) = Node x `fmap` mapM pred1 cs
```

Now the search becomes relatively straight forward:

```findVec x (y ::: ys) = case pred1 x of
Just x' | eq1 x' y  -> Just Fin0
| otherwise -> FinS `fmap` findVec x' ys
Nothing -> Nothing
findVec _ _ = Nothing
```

The nice thing about GADTs is that it becomes almost impossible to make mistakes, because the typechecker will complain if you do.

## Lifting boxes

When converting a Tree to a DAG, we do not know beforehand how many nodes that DAG is going to have. Therefore, we need to put the produced DAG into an existential box, that hides the parameter n.

That is fine for the end result, but it will not work when incrementally constructing a DAG. Suppose you wanted to add two nodes to a DAG. Adding the first node is fine, but then you need to ensure that the children of the second node are still there. In addition, the second node will need to be adjusted: all child references have to be incremented, to skip the first added node.

That adjusting is done with the the counterpart to Pred1, the Succ1 class

```class Succ1 f where
succ1 :: f n -> f (Succ n)

instance Succ1 Fin where
succ1 = FinS

instance Succ1 (Node a) where
succ1 (Node x cs) = Node x (map FinS cs)
```

Our box will come with the ability to 'lift' any succable value into it:

```data Box f n where
Box :: (forall g. Succ1 g => g n -> g m) -> f m -> Box f n
```

You can think of Box f n as a value of f m where m >= n. This allows turning any g n into a g m, which can be combined with the value in the box. Before we can see Box in action, we will first need some functors to store things:

```-- product functor
data (:*:) f g a = (:*:) { fst1 :: f a, snd1 :: g a }
-- functor composition
newtype (:.:) f g a = Comp { getComp :: f (g a) }
```

Now when adding a node we check if it is already in the DAG, and if so, return the index. If the node is not yet in the DAG, then add it. By adding the node the DAG becomes 1 larger, from a DAG n we get a DAG (Succ n). Therefore, we need one level of succ.

```consNode :: Eq a => Node a n -> DAG a n -> Box (Fin :*: DAG a) n
consNode n dag = case findVec n dag of
Just i  -> Box id    (i :*: dag)
Nothing -> Box succ1 (Fin0 :*: (n ::: dag))
```

Now the ugly part: converting an entire node.

```fromTree :: Eq a => Tree a -> DAG a n -> Box (Fin :*: DAG a) n
fromTree (TNode x cs) dag0
= case fromForest cs dag0 of
Box to1 (Comp cs1 :*: dag1) ->
case consNode (Node x cs1) dag1 of
Box to2 ans -> Box (to2 . to1) ans
```

And a forest, aka. a list of trees:

```fromForest :: Eq a => [Tree a] -> DAG a n -> Box (([] :.: Fin) :*: DAG a) n
fromForest [] dag = Box id \$ Comp [] :*: dag
fromForest (x:xs) dag0
= case fromForest xs dag0 of
Box to1 (Comp xs1 :*: dag1) ->
case fromTree x dag1 of
Box to2 (x2 :*: dag2) ->
Box (to2 . to1) (Comp (x2 : map to2 xs1) :*: dag2)
```

At the top level we start with an empty DAG, and ignore the index of the root (which will always be Fin0).

```fromTree' :: Eq a => Tree a -> Box (DAG a) Zero
fromTree' x
= case fromTree x Empty of
Box to1 (_ :*: dag1) ->
Box to1 dag1
```

To understand these functions, you should ignore the Box constructors, what you are left with is

```fromTreepseudo (TNode x cs) dag
= let (cs',dag') = fromForestpseudo cs dag
in constNode (Node x cs') dag
fromForestpseudo []     dag = dag
fromForestpseudo (x:xs) dag
= let (ns,dag') = fromForestpseudo xs
(n,dag'') = fromTreepseudo x dag'
in (n:ns,dag'')
```

Here is a test that shows that we are able to recover the sharing that was removed by toTree':

```λ> fromTree' (toTree' example)
Node "a" [0,1,1,3] ::: Node "b" [1,1] ::: Node "c" [0,1]
::: Node "d" [] ::: Node "e" [] ::: Empty
λ> fromTree' (toTree' example)
== Box (succ1 . succ1 . succ1 . succ1 . succ1) example
True
```

All this wrapping and unwrapping of Box is really ugly. It should also remind you of something. That something is a monad. And Box is indeed a monad, just not a normal Haskell one. Instead it is (surprise, surprise) a 'Monad1':

```class Monad1 m where
return1 :: f a -> m f a
(>>>=) :: m f a -> (forall b. f b -> m g b) -> m g a

return1 = Box id
Box l x >>>= f = case f x of
Box l' y -> Box (l' . l) y
```

Combine this with two utility functions:

```-- Lift a value y into a Box
boxLift :: Succ1 g => Box f n -> g n -> Box (f :*: g) n
boxLift (Box l x) y = Box l (x :*: l y)
```
```-- Apply one level of succ before putting things into a Box
boxSucc :: Box f (Succ n) -> Box f n
boxSucc (Box l x) = Box (l . succ1) x
```

And one more Succ1 instance:

```instance (Functor f, Succ1 g) => Succ1 (f :.: g) where
succ1 (Comp x) = Comp (fmap succ1 x)
```

Now we can write this slightly less ugly code

```fromTreem :: Eq a => Tree a -> DAG a n -> Box (Fin :*: DAG a) n
fromTreem (TNode x cs) dag0
= fromForestm cs dag0
>>>= \(Comp cs1 :*: dag1) ->
consNode (Node x cs1) dag1

fromForestm :: Eq a => [Tree a] -> DAG a n -> Box (([] :.: Fin) :*: DAG a) n
fromForestm [] dag
= return1 \$ Comp [] :*: dag
fromForestm (x:xs) dag0
= fromForestm xs dag0
>>>= \(xs1 :*: dag1) ->
fromTreem x dag1 `boxLift` xs1
>>>= \(x2 :*: dag2 :*: Comp xs2) ->
return1 \$ Comp (x2 : xs2) :*: dag2
```

This might be even nicer when we add in a state monad for the DAG, but I'll leave that for (maybe) another time.

## Bonus: alternative definition of Box

If you don't like existential boxes, then here is another way to define the Box monad.

```data Box' f n where
Box0 :: f n -> Box' f n
BoxS :: Box' f (Succ n) -> Box' f n

return1 = Box0
Box0 x >>>= y = y x
BoxS x >>>= y = BoxS (x >>>= y)

boxSucc' :: Box' f (Succ n) -> Box' f n
boxSucc' = BoxS

boxLift' :: Succ1 g => Box' f n -> g n -> Box' (f :*: g) n
boxLift' (Box0 x) y = Box0 (x :*: y)
boxLift' (BoxS x) y = BoxS (boxLift' x (succ1 y))
```

The two types are equivalent, as shown by

```equiv1 :: Box' f n -> Box f n
equiv1 (Box0 x) = return1 x
equiv1 (BoxS x) = boxSucc (equiv1 x)

equiv2 :: Box f n -> Box' f n
equiv2 (Box l x) = runUnBox' (l (UnBox' id)) (Box0 x)

newtype UnBox' f m n = UnBox' {runUnBox' :: Box' f n -> Box' f m}
instance Succ1 (UnBox' r f) where
succ1 f = UnBox' (runUnBox' f . BoxS)
``` Ben SinclairDate: 2012-03-20T00:09Z

Nice article, from memory Agda's Data.Graph library does something similar for finite graphs. A slight error - in the second and third code blocks there are two "d" nodes, the second node should be "e". Christophe PoucetDate: 2012-03-20T09:19Z

Could you explain why Fin is numbers smaller than n?

I read that class, probably incorrectly as the axioms For Fin (succ n) is true If Fin n is true then Fin (succ n) is true. seems like all numbers bigger than n.

What am I missing? Twan van LaarhovenDate: 2012-03-20T09:32Z
I read that class, probably incorrectly as the axioms For Fin (succ n) is true If Fin n is true then Fin (succ n) is true. seems like all numbers bigger than n.

You shouldn't read it as an implication. If you make the n type parameter explicit, you get:

```-- for all natural numbers n, Fin0 is a 'number less than (n+1)'
Fin0 :: n -> Fin (Succ n)
-- if x is a 'number less than n', then FinS x is a 'number less than (n+1)'
FinS :: n -> Fin n -> Fin (Succ n)
``` Christophe PoucetDate: 2012-03-20T10:02Z

I see what you are saying, I admit that I am still missing your intuition. I don't see how you go from the type-definition to your verbal explanation.

Trying to clarify this more mechanically:

If I understand correctly, for type 'Fin n': Fin0 says: This is a data-constructor with as type 'Fin (Succ n)'. How does this ever terminate? FinS says: I am a function wrapping my own type and making a type 'Fin (Succ n)'.

Is this correct? How does that get to your intuition? How does 'Zero' ever enter the story? Twan van LaarhovenDate: 2012-03-20T12:41Z
If I understand correctly, for type 'Fin n': Fin0 says: This is a data-constructor with as type 'Fin (Succ n)'. How does this ever terminate? FinS says: I am a function wrapping my own type and making a type 'Fin (Succ n)'.

Termination doesn't become an issue, because it is up to the user of the type what n is in Fin (Succ n). Just like a regular type, say [a] is not a problem. It can be instantiated to [Int], but also to [[[[Int]]]].

Together the two constructors mean that a value of type Fin m can either be Fin0, but only if m ~ Succ n; or it can look like FinS x, but only if x :: Fin n and m ~ Succ n. If you have trouble understanding how this makes Fin n the type of numbers less than n, try to work out for yourself what the possible values of Fin (Succ (Succ Zero)) are.

Is this correct? How does that get to your intuition? How does 'Zero' ever enter the story?

It doesn't really, except in the sense that Fin0 can be of type Fin0 :: Succ Zero or Fin0 :: Succ (Succ Zero), etc. Unfortunately in Haskell there is no kind system to prevent you from writing Fin0 :: Succ Banana. applicativeDate: 2012-03-20T16:01Z

I fiddled about transposing this to use the new 'kind promotion' in ghc-7.4, so you just use `data Nat = Zero | Succ Nat` This excludes `Fin0 :: Fin (Succ Banana)` but permits `Fin0 :: Fin ('Succ 'Zero)` (using an apostrophe to mark the 'promotion' of constructors into the type system). It only took a few changes before everything compiled. I didn't get to the point of reasoning whether it would suggest a different approach to `fromTree` and `toTree` and so on. applicativeDate: 2012-03-20T16:02Z

Forgot to paste the link: http://hpaste.org/65615 Gabor GreifDate: 2012-03-21T04:34Z

End of last year I started implementing a graph unification algorithm with underlying Hamana-style typed search trees that can represent sharing and cycles. Added the mildly interesting feature that any non-pointer node represents a fresh {symbol, variable}. The code is here: http://code.google.com/p/omega/source/browse/mosaic/Unify.lhs Jorge AdrianoDate: 2012-03-21T18:44Z
Unfortunately in Haskell there is no kind system to prevent you from writing Fin0 :: Succ Banana.

You can restrict it using Type Classes.

```data Zero
data Succ n

class    NAT n    where
instance NAT Zero where
instance NAT n => NAT (Succ n) where

data Fin n where
Fin0 :: NAT n => Fin (Succ n)
FinS :: NAT n => Fin n -> Fin (Succ n)
``` applicativeDate: 2012-03-21T19:41Z

Jorge, you can do with with the kind system of ghc-7.4 as I show in my simple-minded paste, http://hpaste.org/65615 -- with the addition of data Banana = Banana

```*Main> let a = Fin0 :: Fin ('Succ 'Zero)
*Main> :t a
a :: Fin (Succ 'Zero)
*Main> let a = Fin0 :: Fin ('Succ 'Banana)

<interactive>:11:28:
Kind mis-match
The first argument of `Succ' should have kind `Nat',
but `Banana' has kind `Banana'
In an expression type signature: Fin (Succ Banana)
In the expression: Fin0 :: Fin (Succ Banana)
In an equation for `a': a = Fin0 :: Fin (Succ Banana)
``` Jorge AdrianoDate: 2012-03-21T23:15Z

Yes applicative. That was quite interesting, I didn't know about kind promotion. But I thought I'd present my solution in any case, since it relies on a much more well established extension.

Although I have to be honest and say that, as a first impression of kind polymorphism, it seems a bit confusing to see value constructors as types and types as kinds, on what is not supposed to be a language with dependent types. But I'll have to use it a bit before making up my mind.