Saturday 20 August 2016

On type level programming in Haskell: Storing constraints

In the previous article, I described the difference between types and constraints in the Haskell type system, with the aiming of being able to create a value container, similar to the standard Dynamic type except that we can use it with a type class (which is a constraint not a type) without needing to extract the value itself (which we can only do if we know its type). Essentially, we need a way to take a constraint and store it (or, to use the term commonly used for this operation, reify it). But in order to understand how we can work with Haskell type constraints more directly, we first need to examine what they actually are at a lower level.

Type classes and dictionaries

As we discovered in the last article, a type class is a constraint on a type, requiring it to implement a certain set of functions. These functions' types are adapted to the arguments of the type class, for instance when we provide an instance of the Eq type class for a new type MyType, the (==) operator's type changes from Eq a => a -> a -> Bool* to the much simpler MyType -> MyType -> Bool. But the question that remains is how we find those functions when we need to invoke them. Consider this code:

equalYesOrNo :: Eq a => a -> a -> String
equalYesOrNo first second = case first == second of
                               True -> "Yes"
                               False -> "No"

This is a polymorphic function that will work with any argument type as long as it has an implementation of the Eq type class. But each of those types has its own implementation of (==), so how do we know which one to call? One possible answer is to create a new version of the function for each type that it will be used with, but in a language with recursive type constructors, this isn't really an option, as we can't necessarily know what types will exist in the program in advance; consider for example the use of a type data Nat=Z|S Nat which (with the -XDataKinds extension) can be used to build arbitrarily large types in a program like this:

{-# LANGUAGE DataKinds, ScopedTypeVariables, KindSignatures #-}

import Data.Proxy

data Nat = Z | S Nat

class NatToInt (nat :: Nat) where
    natToInt :: proxy nat -> Int

instance NatToInt Z where
    natToInt _ = 0
instance NatToInt n => NatToInt (S n) where
    natToInt _ = 1 + natToInt (Proxy :: Proxy n)

f :: forall (nat :: Nat) . NatToInt nat => Proxy nat -> IO Int
f p = readLn >>= \ x -> if x == 1 then f (Proxy :: Proxy (S nat))
                                  else return $ natToInt p

So we need a way of parameterising the type of a value when passing it to a polymorphic function so that the function can select the right instance when calling its class functions. The way Haskell does this is to use a structure called a dictionary that contains references to the appropriate functions; one such dictionary is passed for each class constraint on a function's types as an extra parameter to the function.

Dictionaries get passed around and stored whenever a polymorphic type value is passed around or stored. So you can put one into a data structure like this (using GADT syntax, -XGADTs):

data ShowBox where
  ShowBox :: Show a => a -> ShowBox

You can then use pattern matching on the structure to implicitly import the dictionary into a context:

showShowBox :: ShowBox -> String
showShowBox (ShowBox a) = show a

Haskell still doesn't know the type of the value 'a' in that last function, but it doesn't need to because it has the type's dictionary for its Show instance stored away quietly in the ShowBox value.

Note that this fails:

data EqBox where
  EqBox :: Eq a => a -> EqBox
boxesEqual :: EqBox -> EqBox -> Bool
boxesEqual (EqBox a) (EqBox b) = a == b

The reason for this is that Haskell doesn't know whether or not the values 'a' and 'b' have the same type, so can't send either of them to the dictionary defined for the other.

So, by putting a constraint on a structure, we can store the constraint for later use. But there's one more thing that we'd like to be able to do: define a structure that can store any constraint.

Using ConstraintKinds

The ConstraintKinds extension allows us to do this. Using ConstraintKinds, we can turn a constraint into a type (sort-of, at least: it's a type that is not allowed to be applied to values, but it can be tracked by the Haskell type system, stored in type variables, and so on). Specifically, a type class constraint like Show becomes a type constructor of kind * -> Constraint, which shows that it is applied to a type that is applicable to values, i.e. kind '*', to give a resulting type 'Constraint'; anything that has type 'Constraint' can go into a context, i.e. on the left of '=>' in a type signature. So we can now extend our Box ADT to accept any type class constraint:

data AnyBox (c :: * -> Constraint) where
  AnyBox :: c a => a -> AnyBox

This now allows us to arbitrarily store and use any type class constraint we want. The 'showBox' example could now become:

showAnyBox :: AnyBox Show -> String
showAnyBox (AnyBox a) = show a

Now there's only one thing left to consider before we can replicate Dynamic with the ability to store a type class constraint for the value: how do we get the value back out of a Dynamic if the Haskell compiler forgets about the type when it gets stored? We'll look at that next time.

No comments:

Post a Comment