How Long is your List?

Posted on January 15, 2022 by Jack Kelly

Tags: haskell, coding

Quite a few of my favourite Haskell idioms involve the Foldable instance for Maybe:

-- This is reimplemented all over the place as `whenJust`.
-- Pass our `a` to the function, if we have one,
-- and ignore its result; return `pure ()` otherwise.
for_ :: (Foldable t, Applicative f) => t a -> (a -> f b) -> f ()
for_ @Maybe :: Applicative f => Maybe a -> (a -> f b) -> f ()

-- Equivalent to `Data.Maybe.fromMaybe mempty`:
-- Return the `m` if we have one; otherwise, return `mempty`.
fold :: (Foldable t, Monoid m) => t m -> m
fold @Maybe :: Monoid m => Maybe m -> m

-- Equivalent to `maybe mempty`:
-- Pass our `a` to our function, if we have one;
-- otherwise, return `mempty`.
foldMap :: (Foldable t, Monoid m) => (a -> m) -> t a -> m
foldMap @Maybe :: Monoid m => (a -> m) -> Maybe a -> m

Some of these confuse people more than I think they should, so this post aims to help with that. Instead of looking at Maybe a as “just-a-or-nothing”, the key is to become comfortable with Maybe as “list of zero or one elements”. We’ll also go looking for other types which can be seen as “lists of some number of elements”.

In Haskell, it’s often not practical or ergonomic to track exact lengths of lists at the type level. Let’s instead reflect on some ancient wisdom, and think about lists that have at {least,most} {zero,one,infinitiely many} elements. There are six sensible cases, and most of them exist in base:

Proxy is a list of exactly zero elements,
Maybe is a list of exactly zero or one elements,
[] (list) is a list of at least zero and at most infinity elements,
Identity is a list of exactly one element,
NonEmpty is a list of at least one and at most infinity elements, and
Infinite streams are lists of infinitely elements.

Historical aside: the “zero/one/infinity” principle

The “zero/one/infinity” principle comes from Dutch computer pioneer Professor Willem van der Poel, and is preserved on the C2 Wiki via the Jargon File:

Allow none of foo, one of foo, or any number of foo. … The logic behind this rule is that there are often situations where it makes clear sense to allow one of something instead of none. However, if one decides to go further and allow N (for N > 1), then why not N+1? And if N+1, then why not N+2, and so on? Once above 1, there’s no excuse not to allow any N; hence, infinity.

Having paid respects to our forefathers, let us now examine each data type in turn, in an order that lets us use our existing intuitions about lists.

The Lists

1 ≤ length ≤ ∞: `NonEmpty`

The NonEmpty type is defined in Data.List.NonEmpty:

data NonEmpty a = a :| [a]

This has been in base since base-4.9.0.0 (GHC 8.0.1). Knowing that your list has at least one element gives you a lot of power:

A whole slew of functions which may fail on [] are much safer on NonEmpty: head, tail, minimum, maximum, foldr1, foldl1, &c.
All of the Foldable operations can work over Semigroup instead of Monoid, as witnessed by the Data.Semigroup.Foldable.Foldable1 class in the semigroupoids package:
```
foldMap1 :: (Foldable1 t, Semigroup g) => (a -> g) -> t a -> g
```
Similarly, functions from Data.Foldable that needed Applicative or Alternative have weakened versions that only need Apply or Alt, respectively.
You get a Comonad instance, if you’re into that sort of thing,

length = ∞: infinite streams

It’s not in base, but it’s easy to write a type representing an infinite stream of as, which is the same as [] with the “nil” case removed:

data Stream a = Stream a (Stream a)

Since you have at least one element, you get a lot of the same things as NonEmpty: safe head/tail/…, instances for Foldable1 and Comonad, &c.
You have to be careful that the semigroups/monoids you use when folding are lazy enough to terminate.
For take :: Integer -> Stream a -> [a], you gain the property that length . take n = n, which is cute.

0 ≤ length ≤ 1: `Maybe`

data Maybe a = Nothing | Just a

Maybe a is conventionally taught as a better answer to NULL — “either you have the thing, or you don’t” — but it’s also valid to consider Maybe a a list of exactly zero or one as.

This is a really useful perspective to stick in your mind, especially when writing do-expressions: it shows that “do I have this thing? if so, do x” and “do a thing with each element of the collection (a for-each loop)” are in fact the same concept.

length = 1: `Identity`

The Identity type is defined in Data.Functor.Identity:

newtype Identity a = Identity { runIdentity :: a }

This has been in base since base-4.8.0.0 (GHC 7.10.1). While it’s not a very exciting type on its own, it’s useful as a “add no special structure” option when you need to provide a “thing” of kind Type -> Type.

length = 0: `Proxy`

The Proxy type is defined in Data.Proxy:

data Proxy a = Proxy

This has been in base since base-4.7.0.0 (GHC 7.8.1). It was first introduced as a safe way to pin polymorphic functions to specific types. For example, servant-server uses it in the serve function to select the type of the API being served:

serve :: HasServer api '[] => Proxy api -> Server api -> Application

Proxy has (trivial) instances for Foldable, Traversable, and all kinds of other typeclasses, which makes it useful as a “do nothing” option if you need to provide a “thing” of kind Type -> Type.

Polymorphism over `Foldable`s

Once you’re used to thinking of all these types as lists, you can parameterise your structures over some Foldable and get some useful results. Here is a recent example from my work on amazonka, the Haskell AWS SDK:

To make a request to AWS, you need to provide an environment, which almost always contains credentials used to sign requests. (It’s possible to exchange a web identity token for temporary credentials, which involves an unsigned request.) We seek a solution with the following properties:

Type-level information about whether or not we have credentials,
Library users should statically know whether their Env has credentials or not,
Library users should statically know whether a function requires credentials or is indifferent to their presence, and
Not too many type system extensions.

A Maybe Auth inside Env would satisfy none of the first three properties. The solution currently in amazonka looks something like this:

Parameterise the Env by some withAuth :: Type -> Type, and set up type aliases:

data Env' withAuth = Env {
  auth :: withAuth Auth
  -- other fields omitted
}

type Env = Env' Identity
type EnvNoAuth = Env' Proxy

Return Env from functions which guarantee the presence of credentials, and EnvNoAuth for functions which lack them. This gives us property (2).
In function arguments, specify the environment we want as follows:
- accept Env where we require credentials,
- Env' withAuth where we are indifferent to their presence, or
- Foldable withAuth => Env' withAuth where we want to branch on whether or not they’re available.
This gives us property (3).

If we don’t know the type of withAuth, we can use Foldable to give us the “first” Auth, if one exists:

-- Essentially `headMay :: [a] -> Maybe a`, but written as a
-- `foldr` from the `Foldable` typeclass.
envAuthMaybe :: Foldable withAuth => Env' withAuth -> Maybe Auth
envAuthMaybe = foldr (const . Just) Nothing . auth

If you squint, you might be able to see that instead of storing a Maybe inside the Env structure, we’ve done so at the type level. Instead of the value constructors Just and Nothing, we have the Foldables Identity and Proxy.

Exercise: Write a pair of functions which witness the isomorphism between Maybe a and Sum Proxy Identity a. (Sum comes from Data.Functor.Sum).

Solution

{-# LANGUAGE LambdaCase #-}

import Data.Functor.Identity (Identity (..))
import Data.Functor.Sum (Sum (..))
import Data.Proxy (Proxy (..))

from :: Maybe a -> Sum Proxy Identity a
from = maybe (InL Proxy) (InR . Identity)

to :: Sum Proxy Identity a -> Maybe a
to = \case
  InL Proxy -> Nothing
  InR (Identity a) -> Just a

Conclusions

For nearly any sensible combination of “list with at {least,most} {zero,one,infinitely many} elements”, there exists a type in base whose structure ensures those guarantees.
If you internalise this idea for Maybe in particular, you’ll see that many ad-hoc “handle the Nothing” operations can be replaced with functions that work on any Foldable.
By parameterising fields in your data types over some Foldable f, you can offer changing guarantees about what values are available when, without needing type-level programming.

How Long is your List?

The Lists

1 ≤ length ≤ ∞: NonEmpty

length = ∞: infinite streams

0 ≤ length ≤ 1: Maybe

length = 1: Identity