Andrue's Lab

Notes on Chapter 3

Two core principles of functional programming are that we don't update variables and we don't modify mutable data structures. This raises a few questions: what data structures can we use under these constraints? How do we define them in TypeScript? And how do we operate on them?

This chapter is all about answering these questions with the concept of functional data structures. We'll learn about how to define data types, how to use pattern matching to discern data types, and how to write generalized pure functions.

Functional Data Structures

What exactly is a pure functional data structure? Well, it's exactly what it sounds like: data and structure. In fact, it's the same data and structure you'll find in other languages; the only difference is in the implementation.

Okay, great, but what exactly is different? Well, the implementation of a data structure is typically some definition of the structure, such as a type, class, or struct, and some definition of the means of constructing, changing, and extracting the data, such as functions, methods, or procedures.

With pure functional data structures, we take the approach of using types to define the structure of the data and pure functions to construct, change, and extract the data.

This leads to the very desirable trait of immutability. Every data structure remains constant. Much like how numbers don't change when you add them, now our complex data structures will remain the same when we use functions on them! Imagine a complex data structure such as a tree or a graph in which calling insert leaves the original untouched, returning a new fresh copy [1]. Think of how easy it would be to reason about the code, how easy it would be to debug, how easy it would be to serialize, or to create a command pattern! How easy it would be to save a history of the structure to the disk! Time-travel debugging! Immutable data structures are a powerful tool.

[1]: A copy is not typically made in a literal sense in the underlying generated machine code, as that would be very inefficient. Don't worry about machine representations; let the compiler do its job. This might be a challenge for you if you've got experience in other languages with operational semantics that translate to "doing things" like "returning values" and "calling functions," but keep in mind that pure functional programming is about describing your algorithms and data structures as compositions of pure functions. Imagine yourself like Neo from The Matrix: this is your "try to realize the truth...there is no spoon" moment.

A long-winded aside on mainstream appeal

At this point you may be asking yourself "Why hasn't this been done before?" or "Why isn't this the default mode of programming?" or "Why isn't my favorite social media personality using this?" and while there are many approaches to take to answer these questions, I'd like to specifically tackle the social aspect.

The truth is, each of us makes decisions based entirely on our own personal experience, preferences, and circumstances and unfortunately we don't know what we don't know. This is especially true for things (like functional programming) that are abstract and require training, education, or context to understand. As you've probably seen at this point, functional programming is one of the least accessible paradigms to newcomers due to the up-front cost of acquiring the foundational knowledge to be able to apply it.

For example, this means that, as you're developing in your programming career, even though you're likely to be introduced to the concepts behind functional programming, you're unlikely to have the context to understand and apply them. The value of modeling your code as a composition of pure functions, enabling things like referential transparency and the substitution model (and the qualities they bring such as reusability, modularity, testability, maintainability, etc.), isn't necessarily apparent until you've already invested significant time and energy into learning more common paradigms. The psychology of the sunk cost fallacy, increasing responsibility as one ages, having less time to dedicate to learning, and numerous other circumstances both personal and societal make it very unlikely that we'll ever have the time or desire to learn a new programming paradigm. After all, if the paradigm you've got ain't broke, don't fix it, right?

At the end of the day, programming is about configuring a machine's behavior to solve a problem, and it's entirely possible (and certainly requires less up-front investment) to do that by providing it a list of increasingly detailed instructions. It's much faster to get started programming imperatively, but the skill ceiling is much lower. All you really need to get started is a reference for the commands the machine can take, no abstract math required! Once you gain some experience, you start running the simulation in your head, predicting what the machine will do, and you're able to move even faster. Unfortunately, this is also where you hit the ceiling: our short-term memory can only hold somewhere around seven things at once, and that one fact puts a low cap on the maximum efficacy of many programming paradigms.

Functional programming, however, is not limited by short-term memory in the same way. Thanks to immutability and referential transparency, you don't need to simulate the state of the machine or track the values of variables over time, and every part of a functional program can be understood in isolation. In that sense, functional programming is a real-life zero-cost abstraction: all of our brain's processing power goes toward the solution, and none of it goes toward simulating the machine. It's the ultimate realization of the promise of high-level languages.

A ubiquitous data structure made functional

Getting back to functional data structures, we're going to take a look at one of the most ubiquitous data structures and see how it can be implemented in a purely functional way. We're going to implement a singly-linked list, but we're going to have to take a bit of a detour first to learn about a functional programming technique called pattern matching.

If you're coming from a more procedural or imperative programming language, you can think of pattern matching as a way of "branching" based on the structure of a data type. A more functional way to think of it is as the declaration of a logical implication.

Aside: One benefit of programming in a functional style (or any high-level paradigm really) is that we sufficiently abstract ourselves from lower-level implementation details such that the compiler is able to generate efficient code for us. We want to be able to write code that describes the algorithmic solution to our computational problem rather than concerning ourselves with specifics of how or where the code will run.

An example of this abstraction in TypeScript would be using the array prototype functions (map, filter, reduce, etc) to describe how to transform an array as opposed to using mutable variables and for-loops. Language devs can improve the implementation of the array prototype functions and we reap the benefits without requiring any changes to our code because these are pure, declarative functions that describe a computation e.g. "this array is that array with each element multiplied by 2" or "this array is that array with all elements greater than 10 removed" or "this array is the sum of all elements in that array."
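For instance, here's a minimal sketch of those three descriptions (the variable names are illustrative):

const doubled = [1, 2, 3].map((x) => x * 2);            // => [2, 4, 6]   "each element multiplied by 2"
const filtered = [4, 12, 7, 20].filter((x) => x <= 10); // => [4, 7]      "elements greater than 10 removed"
const total = [1, 2, 3].reduce((acc, x) => acc + x, 0); // => 6           "the sum of all elements"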

If we were to implement these algorithms ourselves with loops and mutation, there would be no opportunity for those transparent algorithmic optimizations, because such code describes the how of the computation rather than the what. For the most part, loops and mutation are already as fast as they're going to get, and mutation rules out (direct) parallelization in a way that pure functions don't.

Pattern Matching in TypeScript

Pattern matching is a way of branching based on the structure of a data type. In its simplest form, this requires two things: (1) branching and (2) a way to discern the structure of a data type.

We'll start with the simplest case: a pattern that matches any structure. For this we'll create a Case type whose function accepts a parameter identifying the "anything" pattern. This allows a case to indicate that we should match anything in that position in the data structure:

type Case = (
  anythingPattern: any
) => [typeof anythingPattern, (value: any) => any];

const match = (...cases: Case[]) => (value: any): any => {
  if (cases.length === 0) return null;
  const anythingPattern = {};
  const [firstCase, ...restCases] = cases;
  const [firstPattern, firstFunc] = firstCase(anythingPattern);
  if (firstPattern === anythingPattern) return firstFunc(value);
  return match(...restCases)(value);
};

const multiFunc = match((_) => [_, (value) => typeof value]);

console.log(
  multiFunc("hello!"), // => "string"
  multiFunc(3), // => "number"
  multiFunc(true), // => "boolean"
  multiFunc({ message: "FP is cool" }) // => "object"
);

Next we'll want to be able to match against a specific value. This is pretty simple, we'll just add a check for equality:

const match = (...cases: Case[]) => (value: any): any => {
  if (cases.length === 0) return null;
  const anythingPattern = {};
  const [firstCase, ...restCases] = cases;
  const [firstPattern, firstFunc] = firstCase(anythingPattern);
  if (firstPattern === anythingPattern) return firstFunc(value);
  if (firstPattern === value) return firstFunc(value); // <== Added check for equality
  return match(...restCases)(value);
};
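For example, a quick sketch (greet is just an illustrative name) that matches a specific string and falls back to the "anything" pattern:

const greet = match(
  (_) => ["hello", () => "world"],
  (_) => [_, () => "unknown"]
);

console.log(greet("hello"));   // => "world"
console.log(greet("goodbye")); // => "unknown"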

We're getting somewhere now, but this is still basically just a fancy switch statement. Matching a specific value is a good start, but we want to be able to match against complex data structures. In TypeScript, these data structures typically come in the form of arbitrarily nested objects and / or arrays.

Let's continue toward this goal by implementing matching against objects. To start, we should refactor a bit. We'll extract the check for a matching pattern into its own function:

const matchPattern = (anythingPattern: any) => (pattern: any, value: any) => {
  // Any
  if (pattern === anythingPattern) return true;
  // Exact Value
  if (pattern === value) return true;
  // Not equal + Not an object = Not a match
  if (typeof pattern !== "object") return false;
  if (typeof value !== "object") return false;
  // Null pattern or value is never a match
  if (pattern === null) return false;
  if (value === null) return false;
  // We're not handling arrays yet
  if (Array.isArray(pattern)) return false;
  // Objects
  const matchKeys = (keys: string[]): boolean => {
    if (keys.length === 0) return true;
    const [firstKey, ...restKeys] = keys;
    if (!(firstKey in value)) return false;
    if (!matchPattern(anythingPattern)(pattern[firstKey], value[firstKey]))
      return false;
    return matchKeys(restKeys);
  };
  return matchKeys(Object.keys(pattern));
};

Next we just need to update the match function and we should be good to go:

const match = (...cases: Case[]) => (value: any): any => {
  if (cases.length === 0) return null;
  const anythingPattern = {};
  const [firstCase, ...restCases] = cases;
  const [firstPattern, firstFunc] = firstCase(anythingPattern);
  if (matchPattern(anythingPattern)(firstPattern, value))
    // ^== Use our new function to determine a pattern match
    return firstFunc(value);
  else return match(...restCases)(value);
};
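Here's a small illustrative sketch (the shapes are made up for this example) showing object patterns in action; note that a pattern only constrains the keys it mentions:

const describe = match(
  (_) => [{ type: "circle", radius: _ }, () => "a circle"],
  (_) => [{ type: "square" }, () => "a square"],
  (_) => [_, () => "something else"]
);

console.log(describe({ type: "circle", radius: 3 })); // => "a circle"
console.log(describe({ type: "square", side: 2 }));   // => "a square"
console.log(describe(42));                            // => "something else"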

Perfect! Now we're really cooking with fire. We can match arbitrary values, specific values, and any object-shaped value. We're almost there, but we still need to be able to match against arrays.

Let's update our matchPattern function to check for matches with arrays. The implementation will look very similar to the check for an object pattern because an object is basically an array of key/value pairs.

const matchPattern = (anythingPattern: any) => (pattern: any, value: any) => {
  ...
  // Arrays
  const matchItems = (patternItems: any[], valueItems: any[]): boolean => {
    if (!Array.isArray(valueItems)) return false;
    if (valueItems.length < patternItems.length) return false;
    if (patternItems.length === 0) return true;
    const [firstPatternItem, ...restPatternItems] = patternItems;
    const [firstValueItem, ...restValueItems] = valueItems;
    if (!matchPattern(anythingPattern)(firstPatternItem, firstValueItem))
      return false;
    return matchItems(restPatternItems, restValueItems);
  };
  if (Array.isArray(pattern)) return matchItems(pattern, value);
  ...
};

And because we abstracted the check for matching a pattern, we don't need to make any changes to our match function. Now THIS is pattern matching. We can match arbitrary values, specific values, any (arbitrarily nested) object-shaped value, and any (arbitrarily nested) array-shaped value. We're done! We've implemented pattern matching in TypeScript.
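As a quick illustration (the classify name and data are made up for this sketch): array patterns match positionally from the front, and the value is allowed to be longer than the pattern:

const classify = match(
  (_) => [[1, _, 3], () => "starts with 1, then anything, then 3"],
  (_) => [[{ id: _ }], () => "a list of objects with ids"],
  (_) => [_, () => "something else"]
);

console.log(classify([1, 2, 3]));              // => "starts with 1, then anything, then 3"
console.log(classify([{ id: 7 }, { id: 8 }])); // => "a list of objects with ids"
console.log(classify("nope"));                 // => "something else"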

Now let's take a look at how we can use this to implement functions on a singly-linked list in a purely functional way. Let's start with the definition of the singly-linked list data structure:

type List<T> = {
  head: T;
  tail: List<T>;
};

Notice that in the same way that a function can be polymorphic or "taking many forms", a data type can also be polymorphic via generic type parameters. In our example, the data type List<T> could be a list of numbers, strings, booleans, etc, or no list at all!

const nil: List<never> = {
  head: undefined as never,
  tail: undefined as never,
};

The no list at all case is represented in TypeScript by the type List<never>. TypeScript's never type is what is known in type theory as a "bottom type" because it is a subtype of every other type. In other words, a never is a T but a T is not a never, and never itself cannot be constructed. This is why we had to cast head and tail in our nil value.

In our nil value we don't actually need a value in the head or tail properties (hence undefined) because the properties won't actually be accessed. The point of our nil value is that it acts as a link between the type system and the runtime system so we can branch appropriately (by comparing referential equality) in our functional algorithms while still maintaining type-level correctness.

Next we'll declare a data constructor cons (short for construct) to construct a list of an arbitrary type:

const cons = <T>(head: T, tail: List<T>): List<T> => ({
  head,
  tail,
});

For example:

const numberList: List<number> = cons(1, nil); // (1 -> nil)
const stringList: List<string> = cons("hello", cons("world", nil)); // ("hello" -> ("world" -> nil))
const booleanList: List<boolean> = nil; // nil

Notice that our empty list nil is able to be assigned to a List<boolean> because never is a subtype of all other types.

Now let's put our pattern matching implementation to use by implementing a few list functions. First we'll implement a function which takes the sum of a list of numbers:

const sum = match(
  (_) => [nil, () => 0],
  (_) => [cons(_, _), (x) => x.head + sum(x.tail)]
);

As one might expect, the sum function states that the sum of an empty list (nil) is zero, and the sum of a non-empty list is the first element (x.head) plus the sum of the remaining elements (sum(x.tail)).
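To make the recursion concrete, here's a quick trace of sum on a two-element list:

// sum(cons(1, cons(2, nil)))
//   = 1 + sum(cons(2, nil))
//   = 1 + (2 + sum(nil))
//   = 1 + (2 + 0)
//   = 3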

Pretty neat! Let's see what a definition of the product of a list of numbers looks like:

const product = match(
  (_) => [nil, () => 1],
  (_) => [cons(0, _), () => 0],
  (_) => [cons(_, _), (x) => x.head * product(x.tail)]
);

Likewise, the product function states that the product of an empty list is one, the product of a list starting with zero is zero, and the product of any other nonempty list is the first element multiplied by the product of the remaining elements.

Note that these are recursive definitions, just like the definition of the data type itself. Most computing tasks can be conceptualized in this way; it's only a matter of understanding the general principles and learning how to apply them.

These definitions are perfectly workable, but there's one more thing we can do to make pattern matching a little bit easier to work with and reduce duplicated code. You'll notice how in our result expressions we still have to access x.head and x.tail to use the values of the lists. This is undesirable because it means in general we need to know the implementation details of the structure we're matching against in order to make effective use of our implementation of pattern matching. As a fix, we will implement what I call "pattern marking", which is a way to give a name to parts of your pattern and reference those parts in the result expression.

With it, we can do something like the following and avoid the need to know implementation details like the specific names of the properties of a data structure:

const sum = match(
  (_) => [nil, () => 0],
  (_) => [cons(_`x`, _`xs`), ({ x, xs }) => x + sum(xs)]
);

In this example we create a mark "x" and a mark "xs" and pass them into the list constructor, so that inside the result expression we receive the head and the tail of the list (which is where those marks ended up, given how our list constructor places its arguments).

Let's take a look at how we can implement this in TypeScript. We'll start with a simple function that gets the value at a given path:

const getValueAtPath = (path: string[] | null, obj: any): any => {
  if (path === null) return null;
  if (path.length === 0) return obj;
  if (typeof obj !== "object") return null;
  const [head, ...tail] = path;
  return getValueAtPath(tail, obj[head]);
};
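For example (the values here are just illustrative):

console.log(getValueAtPath(["user", "name"], { user: { name: "Ada" } })); // => "Ada"
console.log(getValueAtPath(null, { user: { name: "Ada" } }));             // => null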

In abstract terms you can think of this as a basic graph traversal for a graph whose edges are identified by strings: it simply follows the given edges and returns the value of the node at the end. We will use this in our pattern matching implementation in conjunction with the following function to extract the values we've marked:

const pathTo = (value: any, obj: any): string[] | null => {
  if (value === obj) return [];
  if (typeof obj !== "object") return null;
  if (obj === null) return null;
  if (Array.isArray(obj)) {
    const pathToInArray = (items: any[], index: number): string[] | null => {
      if (items.length === 0) return null;
      const [head, ...tail] = items;
      const path = pathTo(value, head);
      if (path) return [String(index)].concat(path);
      return pathToInArray(tail, index + 1);
    };
    return pathToInArray(obj, 0);
  }
  const pathToInObject = (keys: string[]): string[] | null => {
    if (keys.length === 0) return null;
    const [head, ...tail] = keys;
    const path = pathTo(value, obj[head]);
    if (path) return [head].concat(path);
    return pathToInObject(tail);
  };
  return pathToInObject(Object.keys(obj));
};

Again, this can be thought of in abstract terms as a graph algorithm: a depth-first search returning the list of edges (identified by strings) leading to the target. This function provides a way to look up the path to a given mark, so we can use that path to extract values from the value passed into the match function.
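A quick illustration (the target object is made up for this sketch):

const target = { name: "mark" };

console.log(pathTo(target, { head: 1, tail: { head: target, tail: null } })); // => ["tail", "head"]
console.log(pathTo(target, { head: 1, tail: null }));                          // => null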

Finally, we'll implement our "marking" function which allows us to create marks and populate an object with values from a search object at keys given by a marked object:

const markStructure = (obj: any) => {
  const marks: any[] = [];
  const createMark = (name: string) => {
    const mark = { name };
    marks.push(mark);
    return mark;
  };
  const lookupMarks = (markedObject: any) => {
    const getMarkValues = (items: any[], result: any): any => {
      if (items.length === 0) return result;
      const [head, ...tail] = items;
      const pathToMark = pathTo(head, markedObject);
      const valueAtMark = getValueAtPath(pathToMark, obj);
      return getMarkValues(tail, { ...result, [head.name]: valueAtMark });
    };
    return getMarkValues(marks, {});
  };
  return [createMark, marks, lookupMarks] as const;
};
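To get a feel for how these pieces fit together before we wire them into match, here's a small illustrative exercise (the variable names are just for this sketch):

const someList = cons(1, cons(2, nil));
const [createMark, marks, lookupMarks] = markStructure(someList);

const markedPattern = cons(createMark("h"), createMark("t"));

console.log(marks.length);                // => 2
console.log(lookupMarks(markedPattern));  // => { h: 1, t: (the list (2 -> nil)) }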

Aside: To be effective with functional programming, it will help to be familiar with concepts like closure and recursion.

Now all we need to do is update our matching implementation to support marks. We'll treat the createMark function as our anythingPattern to get the same syntax for matching "anything" as before. Because the original implementation was using an object and a referential equality check, this should behave exactly the same way. We also need to treat all of the marks as anything patterns to avoid false negatives. We'll pass in the list of marks and use includes() instead of passing in the anythingPattern:

const matchPattern = (marks: any[] = []) => (pattern: any, value: any) => {
  // Any
  if (marks.includes(pattern)) return true;
  ... // nothing else changed in this function
}

Lastly, we'll need to update match to initialize the createMark and lookupMarks functions by calling markStructure with the value it receives, and then to pass createMark and the marks to the newly-updated matchPattern function. When it finds a match, we'll want it to transfer the marked values into the result object and pass it to the result expression:

const match = (...cases: Case[]) => (value: any): any => {
  if (cases.length === 0) return null;
  const [firstCase, ...restCases] = cases;
  const [createMark, marks, lookupMarks] = markStructure(value); // <== init createMark() and lookupMarks()
  const [firstPattern, firstFunc] = firstCase(createMark);
  if (matchPattern([createMark, ...marks])(firstPattern, value))
    // ^== pass marks into updated matchPattern()
    if (marks.length > 0)
      // found a match, pass mark values into result expression
      return firstFunc(lookupMarks(firstPattern));
    else return firstFunc(value);
  else return match(...restCases)(value);
};

There you have it, pattern matching (+ marking) in TypeScript!

Let's see what our updated sum and product definitions look like using this new call structure:

const sum = match(
  (_) => [nil, () => 0],
  (_) => [cons(_`x`, _`xs`), ({ x, xs }) => x + sum(xs)]
);

const product = match(
  (_) => [nil, () => 1],
  (_) => [cons(0, _), () => 0],
  (_) => [cons(_`x`, _`xs`), ({ x, xs }) => x * product(xs)]
);

Excellent! Exactly what we wanted. Let's create a little variadic utility for building lists, so we don't have to write nested cons(cons(cons())), and then see some examples of sum and product in action:

const apply = (...list: any[]): List<any> => {
  if (list.length === 0) return nil;
  const [head, ...tail] = list;
  return cons(head, apply(...tail));
};

And then:

const list1 = apply(1, 2, 3, 4); // => (1 -> (2 -> (3 -> (4 -> nil))))
const result1 = sum(list1); // => (1 + 2 + 3 + 4) = 10

const list2 = apply(2, 2, 2); // => (2 -> (2 -> (2 -> nil)))
const result2 = product(list2); // => (2 * 2 * 2) = 8

Perfect. Let's look at a few more examples of pattern matching:

match((_) => [_, () => 42])(apply(1, 2, 3)); // => 42

Here we're using the "anything pattern" to match any expression and return the constant value 42.

match((_) => [cons(_`h`, _), ({ h }) => h])(apply(1, 2, 3)); // => 1

Here we're using a data constructor pattern in conjunction with marking to capture a subexpression of the target (the list (1 -> (2 -> (3 -> nil)))).

match((_) => [cons(_, _`t`), ({ t }) => t])(apply(1, 2, 3)); // => (2 -> (3 -> nil))

Here again we use a data constructor pattern with marking to capture a subexpression of the target, namely the 'tail' of the list.

match((_) => [nil, () => 42])(apply(1, 2, 3)); // => null

This time we get null as the result because there was no match.

Data Sharing in Functional Data Structures

You might be wondering: if the data is immutable, how do we write functions that, for example, add or remove elements from a list? It's pretty straightforward actually: when adding an element to a list, all we need to do is create a new list with the element attached to the front, for example:

const el = 1;
const existingList = apply(2, 3, 4);
const withEl = cons(el, existingList); // => (1 -> (2 -> (3 -> (4 -> nil))))

Since our lists are immutable, we don't actually need to copy existingList; we just reuse it. This is called data sharing. Sharing immutable data often lets us implement functions more efficiently - we can always return immutable data structures without having to worry about subsequent code modifying our data. There's no need to build defensive copies into our algorithms (e.g. to avoid modification or corruption) because our data structures are immutable!
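Because nothing is copied, we can even verify the sharing with a plain referential equality check (reusing withEl and existingList from above):

console.log(withEl.tail === existingList); // => true, the tail IS the existing list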

Defensive copying can become a problem in large mutable programs. Each component of the program (e.g. procedures, classes) that the mutable data passes through needs to make its own copy of the data because other components can make modifications. Immutable data, on the other hand, is always safe to share and does not require defensive copying. On a sufficiently large scale, functional programming can be much more efficient than approaches that rely on side effects due to data sharing and reduced computation.

In the same way as adding an item to a list, we can remove an element from the front of a list by simply returning its tail. There's no actual destruction or removal going on:

const list = apply(1, 2, 3, 4); // => (1 -> (2 -> (3 -> (4 -> nil))))
const listWithout1 = list.tail; // => (2 -> (3 -> (4 -> nil)))

The term for this property of immutable data structures is persistence. We say that functional data structures are persistent, meaning that existing references remain valid and unchanged when we perform operations on the data structure.

The Efficiency of Data Sharing

A surprising example of data sharing is the following function that adds all the elements of one list to the end of another list:

const append = (list1, list2) =>
  match(
    (_) => [nil, () => list2],
    (_) => [cons(_`h`, _`t`), ({ h, t }) => cons(h, append(t, list2))]
  )(list1);

const l1 = apply(1, 2); // => (1 -> (2 -> nil))
const l2 = apply(3, 4); // => (3 -> (4 -> nil))
const result = append(l1, l2); // => (1 -> (2 -> (3 -> (4 -> nil))))

As you can see, the append function takes two lists and appends the elements of the second one to the first. The interesting part here is that the algorithm only creates as many new cons cells as there are elements in list1! When it matches nil in list1, it simply refers to list2 as the tail of the resulting list. Pretty neat.
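We can verify the sharing here too (reusing l2 and result from above):

console.log(result.tail.tail === l2); // => true, the last two cells of result are l2 itself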

If we were to implement this function with arrays, we'd have to copy every element of both arrays into the resulting array. In this case, the immutable linked list is more efficient in both time and space.

With a singly-linked list, any time we want to replace a tail somewhere in the list, even if it's the very last one, we have to rebuild every cons cell that precedes it. Writing efficient purely functional data structures is all about finding ways to exploit data sharing. In TypeScript, you can generally tell whether data is being shared or copied by whether you're holding a reference or a value: our list implementation above holds a reference to the tail, so the tail is shared, while the head holds the value itself (at least for primitives like numbers) and is copied.
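To see the flip side, here's a sketch (not from the chapter; the name init is just illustrative) of a function that drops the last element of a list. Because the change happens at the far end, every preceding cons cell has to be rebuilt:

const init = <A>(list: List<A>): List<A> =>
  match(
    (_) => [nil, () => nil],
    (_) => [cons(_`h`, nil), () => nil],
    (_) => [cons(_`h`, _`t`), ({ h, t }) => cons(h, init(t))]
  )(list);

const initExample = init(apply(1, 2, 3)); // => (1 -> (2 -> nil)), with two brand new cons cells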

It's entirely possible to create efficient implementations of sequences, vectors, trees, graphs, and so on in a purely functional way but we aren't going to cover these here.

Recursion Over Lists and Generalizing to Higher-Order Functions

Let's take another look at the definition of our sum and product functions. If we simplify the product implementation just a bit to avoid the short-circuiting logic of checking for zero, we get two very similar definitions:

const sum = match(
  (_) => [nil, () => 0],
  (_) => [cons(_`x`, _`xs`), ({ x, xs }) => x + sum(xs)]
);

const product = match(
  (_) => [nil, () => 1],
  (_) => [cons(_`x`, _`xs`), ({ x, xs }) => x * product(xs)]
);

The only differences here are the value to return in case the list is empty, and the operation applied to the head and tail of the lists. Any time we see duplication like this we can generalize it away by pulling out subexpressions into function arguments. We clearly need to parameterize two things: (1) the result given when matching an empty list and (2) the function to apply to combine the head and tail of the list.

Let's take a look at how that works:

const foldRight = <A, B>(
  list: List<A>,
  emptyResult: B,
  foldFunc: (a: A, b: B) => B
): B =>
  match(
    (_) => [nil, () => emptyResult],
    (_) => [
      cons(_`x`, _`xs`),
      ({ x, xs }) => foldFunc(x, foldRight(xs, emptyResult, foldFunc)),
    ]
  )(list);

Normally our match function implicitly takes the list we're operating on as its argument; with foldRight we need to accept it explicitly and pass it through.


Why "foldRight"?

It's an analogy; something that allows a transfer of understanding from one domain to another.

Let's break it down into the two components of the name, "fold" and "right" and examine why we call it that:

"Fold"

"Fold" is an analogy to the folding of paper, it invokes the idea of the shape and orientation of paper and the motion of adding a crease and layering it upon itself. Consider how paper starts flat and when folded, becomes stacked. Also consider how the folding of paper causes a delineation between the two sides, one side leaving its original position to be stacked upon the other.

Now take this knowledge and imagine what it might mean to "fold" a list. There is already a clear delineation to our List structure: head and tail, so clearly the fold will be made there, but what is the analogous component in our list domain to the stacking of the paper after the fold? Well, with paper, once you complete a fold, there cease to be two identifiable parts. The crease disappears and you're left with a single piece of stacked paper. Clearly, this stacking effect must be analogous to the "stacking" effect of the list-fold operation; in our example case, "stacking" numbers is simply adding them together, creating a "taller" "stacked" number. So in simple terms, folding the paper is analogous to adding the numbers. The important part is understanding how this analogy can be applied to any operation, not just addition.

"Right"

Hopefully it's clear what the "fold" part of the analogy means now, but what about "right"? Why are we saying the fold happens "right"? Isn't that the opposite of what's happening? Doesn't the head of the list exist to the left while the tail of the list exists to the right? Doesn't it make more sense to fold the dangling value than to fold the rest of the list? Doesn't our fold function only work on single items?

You're correct. When we say "fold right" it's not an instruction of which direction to "bring the paper for the fold", it's actually a description of how the fold works. We're not "folding to the right", we're "folding starting from the (far) right."

You see, it is true that we start at the beginning of the list with the first item, i.e. the head, but the problem is that we don't have the rest of the list in foldable terms; all we have is another list: tail. How do we solve this? Well, taking a look at our definition of foldRight, the answer is clear: to complete the fold, we have to first fold the list on the right! If you follow the execution of the function, you'll see that we actually "pause" folding each head because we immediately go into a recursive call to get the value of the folded tail. While it may look like we're folding from the left, in fact we're starting the fold at the (far) right and bringing each of those folds leftward into the second argument slot. This is why we call it "fold right": the first folds happen on the right. It's a bit counter-intuitive, but it makes sense once you realize what's happening. The key is understanding the order of evaluation and recognizing the recursion.
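If it helps, here's a quick sketch of the evaluation order when summing the list (1 -> (2 -> (3 -> nil))) with foldRight; the innermost (rightmost) application is the first one that can actually complete:

// foldRight((1 -> (2 -> (3 -> nil))), 0, (a, b) => a + b)
//   = 1 + foldRight((2 -> (3 -> nil)), 0, (a, b) => a + b)
//   = 1 + (2 + foldRight((3 -> nil), 0, (a, b) => a + b))
//   = 1 + (2 + (3 + foldRight(nil, 0, (a, b) => a + b)))
//   = 1 + (2 + (3 + 0))
//   = 6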


Now that we have that generalized function, let's take a look at what our new implementations of sum and product look like using it:

const sum = (list: List<number>) => foldRight(list, 0, (a, b) => a + b);
const product = (list: List<number>) => foldRight(list, 1, (a, b) => a * b);

Beautiful. Let's implement a few more list functions.

A function that takes the top n items:

const take = <A>(list: List<A>, n: number): List<A> =>
  match(
    (_) => [nil, () => nil],
    (_) => [
      cons(_`h`, _`t`),
      ({ h, t }) => (n === 0 ? nil : cons(h, take(t, n - 1))),
    ]
  )(list);

A function that returns the longest prefix of items that pass the predicate function:

const takeWhile = <A>(list: List<A>, f: (a: A) => boolean): List<A> =>
  match(
    (_) => [nil, () => nil],
    (_) => [
      cons(_`h`, _`t`),
      ({ h, t }) => (f(h) ? cons(h, takeWhile(t, f)) : nil),
    ]
  )(list);

A function that returns true if all elements of the list pass the predicate function:

const forall = <A>(list: List<A>, f: (a: A) => boolean) =>
  foldRight(list, true, (a, b) => f(a) && b);

A function that returns true if any element of the list passes the predicate function:

const exists = <A>(list: List<A>, f: (a: A) => boolean) =>
  foldRight(list, false, (a, b) => f(a) || b);

A function like foldRight except it returns the list of partial results, rather than just the final value:

const scanRight = <A, B>(
  list: List<A>,
  defaultValue: B,
  f: (a: A, b: B) => B
): List<B> =>
  match(
    (_) => [nil, () => cons(defaultValue, nil)],
    (_) => [
      cons(_`x`, _`xs`),
      ({ x, xs }) =>
        cons(
          foldRight(cons(x, xs), defaultValue, f),
          scanRight(xs, defaultValue, f)
        ),
    ]
  )(list);
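A few illustrative calls, assuming the definitions above (the variable names are just for demonstration):

const nums = apply(1, 2, 3, 4);

const firstTwo = take(nums, 2);                    // => (1 -> (2 -> nil))
const smallPrefix = takeWhile(nums, (a) => a < 3); // => (1 -> (2 -> nil))
const allPositive = forall(nums, (a) => a > 0);    // => true
const anyBig = exists(nums, (a) => a > 3);         // => true
const runningSums = scanRight(nums, 0, (a, b) => a + b);
// => (10 -> (9 -> (7 -> (4 -> (0 -> nil)))))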

Loss of efficiency in naive list functions

One of the problems with List is that despite being able to express operations and algorithms in very simple, elegant terms, the resulting implementation isn't always efficient. For example, when reusing functions (to avoid duplicating the same patterns over and over) we might end up making multiple passes over the same input.

Algebraic Data Types

List is just one example of what's called an Algebraic Data Type. Algebraic data types or ADTs are just data types defined by one or more data constructors, each of which may contain zero or more arguments. We say that the data type is the sum of its data constructors, and each data constructor is the product of its arguments, hence the name algebraic data type.

Here are these parts in our List example:

// A data type:
type List<T> = {
  head: T;
  tail: List<T>;
};

// Constructors of that data type:
const nil = (): List<never> => ({
  head: undefined as never,
  tail: undefined as never,
});
const cons = <T,>(head: T, tail: List<T>): List<T> => ({ head, tail });

// Arguments: (head: T, tail: List<T>)

Notice the absence of the class keyword. This is intentional. In functional programming, when we say constructor we are not referring to a language feature, we are referring to the ability of a function to construct data types.

Pairs and tuples of other arities are also algebraic data types under the definition we've given, and they can be used as data in pattern matching. This carries over naturally to TypeScript, where tuples are just arrays at runtime.
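For instance, here's a quick sketch (the magnitude name is just illustrative) that pattern matches a pair to compute the distance of a 2D point from the origin:

const magnitude = match(
  (_) => [[_`x`, _`y`], ({ x, y }) => Math.sqrt(x * x + y * y)]
);

console.log(magnitude([3, 4])); // => 5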

Trees

We can define other data structures as algebraic data types. Let's see what an example of a binary tree would look like:

interface Tree<A> {}

interface Leaf<A> extends Tree<A> {
  value: A;
}

interface Branch<A> extends Tree<A> {
  left: Tree<A>;
  right: Tree<A>;
}

Nice! Although, if we take another look at our definition of an algebraic data type, we'll notice we can simplify this example: "a data type which is just the sum (or union) of its data constructors, each of which is the product of its arguments."

Here's the simplified example:

type Tree<A> = Leaf<A> | Branch<A>;
type Leaf<A> = A;
type Branch<A> = [Tree<A>, Tree<A>];

Perfect. As an exercise, let's try to write a few functions for this data type!

A function size which counts the number of nodes (leaves and branches) in a tree:

const size = match(
  (_) => [
    [_`left`, _`right`],
    ({ left, right }) => 1 + size(left) + size(right),
  ],
  (_) => [_, () => 1]
);

const sizeExampleTree1: Tree<number> = 123;
const sizeExampleTree2: Tree<number> = [123, 123];
const sizeExampleTree3: Tree<number> = [[123, 123], 123];

const sizeExampleResult1 = size(sizeExampleTree1); // Root node: 1
const sizeExampleResult2 = size(sizeExampleTree2); // Root node + Branches: 3
const sizeExampleResult3 = size(sizeExampleTree3); // Root Node + Branches + Sub-Branches: 5

A function maximum which returns the maximum value in a Tree<number>:

const maximum = match(
  (_) => [
    [_`left`, _`right`],
    ({ left, right }) => Math.max(maximum(left), maximum(right)),
  ],
  (_) => [_`value`, ({ value }) => value]
);

const maxExampleTree1: Tree<number> = 1;
const maxExampleTree2: Tree<number> = [1, 2];
const maxExampleTree3: Tree<number> = [[4, [8, 2]], 1];

const maxExampleResult1 = maximum(maxExampleTree1); // => Max: 1
const maxExampleResult2 = maximum(maxExampleTree2); // => Max: 2
const maxExampleResult3 = maximum(maxExampleTree3); // => Max: 8

A function depth which returns the maximum path length from the root of the tree to any leaf:

const depth = match(
  (_) => [
    [_`left`, _`right`],
    ({ left, right }) => 1 + Math.max(depth(left), depth(right)),
  ],
  (_) => [_, () => 1]
);

const depthExampleTree1: Tree<number> = 1;
const depthExampleTree2: Tree<number> = [1, 1];
const depthExampleTree3: Tree<number> = [[1, 1], 1];

const depthExampleResult1 = depth(depthExampleTree1); // => Root node: 1
const depthExampleResult2 = depth(depthExampleTree2); // => Root node + branch layer: 2
const depthExampleResult3 = depth(depthExampleTree3); // => Root node + branch layer + branch layer: 3

A function map analogous to the function of the same name on List, modifying each element in the tree with a given function:

const map = <A, B>(tree: Tree<A>, f: (t: A) => B): Tree<B> =>
  match(
    (_) => [
      [_`left`, _`right`],
      ({ left, right }) => [map(left, f), map(right, f)],
    ],
    (_) => [_`value`, ({ value }) => f(value)]
  )(tree);

const addOne = (a: number) => a + 1;

const mapExampleTree1: Tree<number> = 1;
const mapExampleTree2: Tree<number> = [1, 2];
const mapExampleTree3: Tree<number> = [[2, 3], 1];

const mapExampleResult1 = map(mapExampleTree1, addOne); // => 2
const mapExampleResult2 = map(mapExampleTree2, addOne); // => [2, 3]
const mapExampleResult3 = map(mapExampleTree3, addOne); // => [[3, 4], 2]

And finally, a function fold that generalizes over all these functions:

const fold = <A, B>(
  fBranch: (left: B, right: B) => B,
  fLeaf: (a: A) => B
) => {
  // recurse into both subtrees so fBranch receives already-folded results
  const go = (tree: Tree<A>): B =>
    match(
      (_) => [
        [_`left`, _`right`],
        ({ left, right }) => fBranch(go(left), go(right)),
      ],
      (_) => [_`value`, ({ value }) => fLeaf(value)]
    )(tree);
  return go;
};
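As a quick check that fold really does generalize the earlier functions, here are size and maximum re-derived from it (the ViaFold names are just for illustration):

const sizeViaFold = fold<number, number>((left, right) => 1 + left + right, () => 1);
const maximumViaFold = fold<number, number>((left, right) => Math.max(left, right), (value) => value);

const foldExampleTree: Tree<number> = [[4, [8, 2]], 1];

console.log(sizeViaFold(foldExampleTree));    // => 7
console.log(maximumViaFold(foldExampleTree)); // => 8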

Well, that's all I've got for you on Algebraic Data Types in TypeScript. In the next chapter we'll look into how to handle errors without exceptions!