generics-eot tutorial¶
This tutorial is meant to be read alongside with the haddock comments in Generics.Eot. Its source is a compiled haskell file, so we have to get some language pragmas and imports out of the way first:
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MonoLocalBinds #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE UndecidableInstances #-}
module Generics.Eot.Tutorial where
import Data.Char
import Data.List
import Data.Typeable
import Generics.Eot
generics-eot allows roughly three different kinds of operations:
- Accessing meta information about ADTs (
datatypefor names,ProxyandEotfor field types). Example: Generation of database schemas for ADTs. - Deconstructing values generically (
toEot). Example: Serialization to a binary format. - Constructing values of an ADT generically (
fromEot). Example: Deserialization from a binary format.
Sometimes only one of the three forms is used but often multiple have to be
combined. For example serialization to JSON usually requires both datatype
and toEot.
1st Example: Meta Information Without Types: Field Names¶
This simple function extracts the names of all field selectors and returns them as a list:
namesOfFields :: HasEot a => Proxy a -> [String]
namesOfFields proxy =
nub $
concatMap (fieldNames . fields) $
constructors $ datatype proxy
where
fieldNames :: Fields -> [String]
fieldNames fields = case fields of
Selectors names -> names
_ -> []
And here’s proof that it works (using doctest):
data A = A1 {
foo :: String,
bar :: Int
}
| A2 {
bar :: Int,
baz :: Bool
}
deriving (Generic, Show)
-- $ >>> namesOfFields (Proxy :: Proxy A)
-- ["foo","bar","baz"]
The Generic instance: Don’t forget!!!¶
To be able to use generic functions that are written with generics-eot you
need to derive an instance for GHC.Generics.Generic (using DeriveGeneric)
for your ADTs. This will automatically give you an instance for HasEot.
When the instance for GHC.Generics.Generic is missing the type error messages
are unfortunately very confusing and unhelpful. They go something like this:
Couldn't match type ‘GHC.Generics.Rep WithoutGeneric’
with ‘GHC.Generics.D1 c f’
The type variables ‘c’, ‘f’ are ambiguous
In the expression: namesOfFields (Proxy :: Proxy WithoutGeneric)
So don’t forget: you need a Generic instance.
Eot: Isomorphic representations¶
Part of the type class HasEot is the type-level function Eot that maps ADTs
to isomorphic types. These isomorphic types are always a combination of
Eithers, tuples and the uninhabited type Void. For example this type:
data B = B1 Int | B2 String Bool | B3
deriving (Generic)
would be mapped to:
Either (Int, ()) (Either (String, (Bool, ())) (Either () Void))
Tip: Here’s how you can execute the type-level function Eot in ghci:
-- $ >>> :kind! Eot B
-- Eot B :: *
-- = Either (Int, ()) (Either ([Char], (Bool, ())) (Either () Void))
For the exact rules of this mapping see here: Eot.
If we have an ADT a then we can convert values of type a to this isomorphic
representation Eot a with toEot and we can convert in the other direction
with fromEot. Generic functions always operate on these isomorphic
representations and then convert from or to the real ADTs with fromEot and
toEot.
These generic isomorphic types are referred to as “eot” – short for
“Eithers of tuples”.
2nd Example: Deconstructing Values: Serialization¶
We start by writing a function that operates on the eot representations. The eot representations follow simple patterns and always look similar, but they don’t look exactly the same for different ADTs. For this reason we have to use a type class:
class EotSerialize eot where
eotSerialize :: Int -- ^ The number of the constructor being passed in
-> eot -- ^ The eot representation
-> [Int] -- ^ A simple serialization format
Now we need to write instances for the types that occur in eot types. Usually these are:
Either this next:- If as eot value we get
Left thisit means that the original value was constructed with the constructor that corresponds tothis. In this case we put the number of the constructor into the output and continue with serializing the fields of typethis. - If we get
Right restit means that one of the following constructors was the one that the original value was built with. We continue by increasing the constructor counter and serializing the value of typerest.
Note that this results in
EotSerializeclass constraints for boththisandrest. If we write the correct instances for all eot types these constraints should always be fulfilled.- If as eot value we get
instance (EotSerialize this, EotSerialize next) =>
EotSerialize (Either this next) where
eotSerialize n (Left fields) = n : eotSerialize n fields
eotSerialize n (Right next) = eotSerialize (succ n) next
Void: We need this instance to make the compiler happy, but it’ll never be used. If you look at the type you can also see that: an argument of typeVoidcannot be constructed. (Often the functionData.Void.absurdcomes in handy for implementing these cases.generics-eotre-exports both the typeData.Void.VoidandData.Void.absurdfor convenience.)
instance EotSerialize Void where
eotSerialize _n void = absurd void
(x, xs): Right-nested 2-tuples are used to encode all the fields for one specific constructor. Soxis the current field andxsare the remaining fields. To serialize this we serializex(usingserialize) and also write the length of the resulting list into the output. This will allow deserialization.Note: We could use
EotSerializeto serialize the fields. But that would be a bit untrue to the spirit, since the fields are not eot types. Apart from that we might want to encode a field of e.g. typeEither a bdifferently than the eot typeEither a b. So we use a very similar but distinct type class calledSerialize.The value of type
xscontains the remaining fields and will be encoded recursively witheotSerialize:
instance (Serialize x, EotSerialize xs) => EotSerialize (x, xs) where
eotSerialize n (x, xs) =
let xInts = serialize x
in length xInts : xInts ++ eotSerialize n xs
(): Finally we need an instance for the unit type that marks the end of the fields encoded in 2-tuples. Since()doesn’t carry any information, we can encode it as the empty list:
instance EotSerialize () where
eotSerialize _ () = []
This is the class Serialize:
class Serialize a where
serialize :: a -> [Int]
We give serialize a default implementation, but please ignore that for now.
It’ll be explained later in the section about
DefaultSignatures:
default serialize :: (HasEot a, EotSerialize (Eot a)) => a -> [Int]
serialize = genericSerialize
Serialize is used to serialize every field of the used ADTs, so we need
instances for all of them:
instance Serialize Int where
serialize i = [i]
instance Serialize String where
serialize = map ord
instance Serialize Bool where
serialize True = [1]
serialize False = [0]
instance Serialize () where
serialize () = []
To tie everything together we provide a function genericSerialize that
converts a value of some ADT into an eot value using toEot and then uses
eotSerialize to convert that eot value into a list of Ints.
genericSerialize :: (HasEot a, EotSerialize (Eot a)) => a -> [Int]
genericSerialize = eotSerialize 0 . toEot
And it works too:
-- $ >>> genericSerialize (A1 "foo" 42)
-- [0,3,102,111,111,1,42]
-- >>> genericSerialize (A2 23 True)
-- [1,1,23,1,1]
3rd Example: Constructing Values: Deserialization¶
Deserialization works very similarly. It differs in that the functions turn
lists of Ints into eot values.
Here’s the EotDeserialize class with instances for:
Either this nextVoid(x, xs)()
class EotDeserialize eot where
eotDeserialize :: [Int] -> eot
instance (EotDeserialize this, EotDeserialize next) =>
EotDeserialize (Either this next) where
eotDeserialize (0 : r) = Left $ eotDeserialize r
eotDeserialize (n : r) = Right $ eotDeserialize (pred n : r)
eotDeserialize [] = error "invalid input"
instance EotDeserialize Void where
eotDeserialize _ = error "invalid input"
instance (Deserialize x, EotDeserialize xs) =>
EotDeserialize (x, xs) where
eotDeserialize (len : r) =
let (this, rest) = splitAt len r
in (deserialize this, eotDeserialize rest)
eotDeserialize [] = error "invalid input"
instance EotDeserialize () where
eotDeserialize [] = ()
eotDeserialize (_ : _) = error "invalid input"
And here’s the Deserialize class plus all instances to deserialize the
fields:
class Deserialize a where
deserialize :: [Int] -> a
instance Deserialize Int where
deserialize [n] = n
deserialize _ = error "invalid input"
instance Deserialize String where
deserialize = map chr
instance Deserialize () where
deserialize [] = ()
deserialize (_ : _) = error "invalid input"
instance Deserialize Bool where
deserialize [0] = False
deserialize [1] = True
deserialize _ = error "invalid input"
And here’s genericDeserialize to tie it together. It uses
eotDeserialize to convert a list of Ints into an eot value and then
fromEot to construct a value of the wanted ADT.
genericDeserialize :: (HasEot a, EotDeserialize (Eot a)) => [Int] -> a
genericDeserialize = fromEot . eotDeserialize
Here you can see it in action:
-- $ >>> genericDeserialize [0,3,102,111,111,1,42] :: A
-- A1 {foo = "foo", bar = 42}
-- >>> genericDeserialize [1,1,23,1,1] :: A
-- A2 {bar = 23, baz = True}
And it is the inverse of genericSerialize:
-- $ >>> (genericDeserialize $ genericSerialize $ A1 "foo" 42) :: A
-- A1 {foo = "foo", bar = 42}
4th Example: Meta Information with types: generating SQL schemas¶
Accessing meta information including the types works very
similarly to deconstructing or constructing values. It uses the same
structure of type classes and instances for the eot-types. The difference is:
since we don’t want actual values of our ADT as input or output we operate on
Proxys of our eot-types.
As an example we’re going to implement a function that generates SQL statements that create tables that our ADTs would fit into. To be able to use nice names for the table and columns we’re going to traverse the type-less meta information (see 1st Example) at the same time.
(Note that the generated SQL statements are targeted at a fictional
database implementation that magically understands Haskell types like
Int and String, or rather [Char].)
Again we start off by writing a class that operates on the eot-types. Besides
the eot-type the class has an additional parameter, meta, that will be
instantiated by the corresponding types used for untyped meta information.
class EotCreateTableStatement meta eot where
eotCreateTableStatement :: meta -> Proxy eot -> [String]
Our first instance is for the complete datatype. eot is instantiated to
Either fields Void. Note that this instance only works for ADTs with
exactly one constructor as we don’t support types with multiple constructors.
meta is instantiated to Datatype which is the type for meta information
for ADTs.
instance EotCreateTableStatement [String] fields =>
EotCreateTableStatement Datatype (Either fields Void) where
eotCreateTableStatement datatype Proxy = case datatype of
Datatype name [Constructor _ (Selectors fields)] ->
"CREATE TABLE " :
name :
" COLUMNS " :
"(" :
intercalate ", " (eotCreateTableStatement fields (Proxy :: Proxy fields)) :
");" :
[]
Datatype _ [Constructor name (NoSelectors _)] ->
error ("constructor " ++ name ++ " has no selectors, this is not supported")
Datatype name _ ->
error ("type " ++ name ++ " must have exactly one constructor")
The second instance is responsible for creating the parts of the SQL
statements that declare the columns. As such it has to traverse the fields
of our ADT. eot is instantiated to the usual (x, xs). meta is
instantiated to [String], representing the field names. The name of the
field type is obtained using typeRep, therefore we need a Typeable x
constraint.
instance (Typeable x, EotCreateTableStatement [String] xs) =>
EotCreateTableStatement [String] (x, xs) where
eotCreateTableStatement (field : fields) Proxy =
(field ++ " " ++ show (typeRep (Proxy :: Proxy x))) :
eotCreateTableStatement fields (Proxy :: Proxy xs)
eotCreateTableStatement [] Proxy = error "impossible"
The last instance is for (). It’s needed as the base case for
traversing the fields and as such returns just an empty list.
instance EotCreateTableStatement [String] () where
eotCreateTableStatement [] Proxy = []
eotCreateTableStatement (_ : _) Proxy = error "impossible"
createTableStatement ties everything together. It obtains the meta
information through datatype passing a Proxy for a. And it creates a
Proxy for the eot-type Proxy :: Proxy (Eot a). Then it calls
eotCreateTableStatement and just concats the resulting snippets.
createTableStatement :: forall a . (HasEot a, EotCreateTableStatement Datatype (Eot a)) =>
Proxy a -> String
createTableStatement proxy =
concat $ eotCreateTableStatement (datatype proxy) (Proxy :: Proxy (Eot a))
As an example, we’re going to use Person:
data Person
= Person {
name :: String,
age :: Int
}
deriving (Generic)
And here’s the created SQL statement:
-- $ >>> putStrLn $ createTableStatement (Proxy :: Proxy Person)
-- CREATE TABLE Person COLUMNS (name [Char], age Int);
If we try to use an ADT with multiple constructors, we get a type error due to a missing instance:
-- $ >>> putStrLn $ createTableStatement (Proxy :: Proxy A)
-- <BLANKLINE>
-- ...
-- • No instance for (EotCreateTableStatement
-- Datatype
-- (Either ([Char], (Int, ())) (Either (Int, (Bool, ())) Void)))
-- arising from a use of ‘createTableStatement’
-- ...
If we try to use it with an ADT with a single constructor but no selectors, we get a runtime error:
data NoSelectors
= NotSupported Int Bool
deriving (Generic)
-- $ >>> putStrLn $ createTableStatement (Proxy :: Proxy NoSelectors)
-- *** Exception: constructor NotSupported has no selectors, this is not supported
-- ...
DefaultSignatures¶
There is a GHC language extension called DefaultSignatures. In itself it has little to do with generic programming, but it makes a good companion.
How DefaultSignatures work¶
Imagine you have a type class called ToString which allows to convert
values to Strings:
class ToString a where
toString :: a -> String
You can write instances manually, but you might be tempted to give the
following default implementation for toString:
toString = show
The idea is that then you can just write down an empty ToString instance:
instance ToString A
and you get to use toString on values of type A for free, because A has
a Show instance.
But that default implementation doesn’t work, because in the class declaration
we don’t have an instance for Show a. ghc says:
Could not deduce (Show a) arising from a use of ‘show’
from the context (ToString a)
One solution would be to make ToString a subclass of Show, but then we
cannot implement ToString instances manually anymore for types that don’t
have a Show instance. DefaultSignatures provide a better solution. The
extension allows you to further narrow down the type for your default
implementation for class methods:
class ToString2 a where
toString2 :: a -> String
default toString2 :: Show a => a -> String
toString2 = show
Then writing down empty instances works for types that have a Show instance:
instance ToString2 Int
-- $ >>> toString2 (42 :: Int)
-- "42"
Note: if you write down an empty ToString2 instances for a type that
does not have a Show instance, the error message looks like this:
No instance for (Show NoShow)
This might be confusing especially since haddock docs don’t list the default
signatures or implementations and users of the class might be wondering why
Show comes into play at all.
How to use DefaultSignatures for generic programming¶
DefaultSignatures are especially handy when doing generic programming.
Remember the type class Serialize from the
second example? In that
example we used it to serialize the fields of our ADTs in the generic
serialization through genericSerialize and EotSerialize. We just assumed
that we would have a manual implementation for all field types. But we also
gave it a default implementation for serialize in terms of
genericSerialize:
default serialize :: (HasEot a, EotSerialize (Eot a)) => a -> [Int]
serialize = genericSerialize
Note that the default implementation has the same class constraints as
genericSerialize.
Now we can write empty instances for custom ADTs:
data C
= C1 Int String
deriving (Generic)
instance Serialize C
You could say that by giving this empty instance we give our blessing to
use genericSerialize for this type, but we don’t have to actually implement
anything. And it works:
-- $ >>> serialize (C1 42 "yay!")
-- [0,1,42,4,121,97,121,33]
Important is that we still have the option to implement instances manually
by overwriting the default implementation. This is needed for basic types
like Int and Char that don’t have useful generic representations. But it
also allows us to overwrite instances for ADTs manually. For example you may
want a certain type to be serialized in a special way that deviates from the
generic implementation or you may implement an instance manually for
performance gain.