Mary Rose Cook

A practical introduction to functional programming

Many functional programming articles teach abstract functional techniques. That is, composition, pipelining, higher order functions. This one is different. It shows examples of imperative, unfunctional code that people write every day and translates these examples to a functional style.

The first section of the article takes short, data transforming loops and translates them into functional maps and reduces. The second section takes longer loops, breaks them up into units and makes each unit functional. The third section takes a loop that is a long series of successive data transformations and decomposes it into a functional pipeline.

The examples are in Python, because many people find Python easy to read. A number of the examples eschew pythonicity in order to demonstrate functional techniques common to many languages: map, reduce, pipeline. All of the examples are in Python 2.

A guide rope

When people talk about functional programming, they mention a dizzying number of “functional” characteristics. They mention immutable data1, first class functions2 and tail call optimisation3. These are language features that aid functional programming. They mention mapping, reducing, pipelining, recursing, currying4 and the use of higher order functions. These are programming techniques used to write functional code. They mention parallelization5, lazy evaluation6 and determinism7. These are advantageous properties of functional programs.

Ignore all that. Functional code is characterised by one thing: the absence of side effects. It doesn’t rely on data outside the current function, and it doesn’t change data that exists outside the current function. Every other “functional” thing can be derived from this property. Use it as a guide rope as you learn.

This is an unfunctional function:

a = 0
def increment():
    global a
    a += 1

This is a functional function:

def increment(a):
    return a + 1

Don’t iterate over lists. Use map and reduce.

Map

Map takes a function and a collection of items. It makes a new, empty collection, runs the function on each item in the original collection and inserts each return value into the new collection. It returns the new collection.

This is a simple map that takes a list of names and returns a list of the lengths of those names:

name_lengths = map(len, ["Mary", "Isla", "Sam"])

print name_lengths
# => [4, 4, 3]

This is a map that squares every number in the passed collection:

squares = map(lambda x: x * x, [0, 1, 2, 3, 4])

print squares
# => [0, 1, 4, 9, 16]

This map doesn’t take a named function. It takes an anonymous, inlined function defined with lambda. The parameters of the lambda are defined to the left of the colon. The function body is defined to the right of the colon. The result of running the function body is (implicitly) returned.

The unfunctional code below takes a list of real names and replaces them with randomly assigned code names.

import random

names = ['Mary', 'Isla', 'Sam']
code_names = ['Mr. Pink', 'Mr. Orange', 'Mr. Blonde']

for i in range(len(names)):
    names[i] = random.choice(code_names)

print names
# => ['Mr. Blonde', 'Mr. Blonde', 'Mr. Blonde']

(As you can see, this algorithm can potentially assign the same secret code name to multiple secret agents. Hopefully, this won’t be a source of confusion during the secret mission.)

This can be rewritten as a map:

import random

names = ['Mary', 'Isla', 'Sam']

secret_names = map(lambda x: random.choice(['Mr. Pink',
                                            'Mr. Orange',
                                            'Mr. Blonde']),
                   names)

Exercise 1. Try rewriting the code below as a map. It takes a list of real names and replaces them with code names produced using a more robust strategy.

names = ['Mary', 'Isla', 'Sam']

for i in range(len(names)):
    names[i] = hash(names[i])

print names
# => [6306819796133686941, 8135353348168144921, -1228887169324443034]

(Hopefully, the secret agents will have good memories and won’t forget each other’s secret code names during the secret mission.)

My solution:

names = ['Mary', 'Isla', 'Sam']

secret_names = map(hash, names)

Reduce

Reduce takes a function and a collection of items. It returns a value that is created by combining the items.

This is a simple reduce. It returns the sum of all the items in the collection.

sum = reduce(lambda a, x: a + x, [0, 1, 2, 3, 4])

print sum
# => 10

x is the current item being iterated over. a is the accumulator. It is the value returned by the execution of the lambda on the previous item. reduce() walks through the items. For each one, it runs the lambda on the current a and x and returns the result as the a of the next iteration.

What is a in the first iteration? There is no previous iteration result for it to pass along. reduce() uses the first item in the collection for a in the first iteration and starts iterating at the second item. That is, the first x is the second item.

This code counts how often the word 'Sam' appears in a list of strings:

sentences = ['Mary read a story to Sam and Isla.',
             'Isla cuddled Sam.',
             'Sam chortled.']

sam_count = 0
for sentence in sentences:
    sam_count += sentence.count('Sam')

print sam_count
# => 3

This is the same code written as a reduce:

sentences = ['Mary read a story to Sam and Isla.',
             'Isla cuddled Sam.',
             'Sam chortled.']

sam_count = reduce(lambda a, x: a + x.count('Sam'),
                   sentences,
                   0)

How does this code come up with its initial a? The starting point for the number of incidences of 'Sam' cannot be 'Mary read a story to Sam and Isla.' The initial accumulator is specified with the third argument to reduce(). This allows the use of a value of a different type from the items in the collection.

Why are map and reduce better?

First, they are often one-liners.

Second, the important parts of the iteration - the collection, the operation and the return value - are always in the same places in every map and reduce.

Third, the code in a loop may affect variables defined before it or code that runs after it. By convention, maps and reduces are functional.

Fourth, map and reduce are elemental operations. Every time a person reads a for loop, they have to work through the logic line by line. There are few structural regularities they can use to create a scaffolding on which to hang their understanding of the code. In contrast, map and reduce are at once building blocks that can be combined into complex algorithms, and elements that the code reader can instantly understand and abstract in their mind. “Ah, this code is transforming each item in this collection. It’s throwing some of the transformations away. It’s combining the remainder into a single output.”

Fifth, map and reduce have many friends that provide useful, tweaked versions of their basic behaviour. For example: filter, all, any and find.

Exercise 2. Try rewriting the code below using map, reduce and filter. Filter takes a function and a collection. It returns a collection of every item for which the function returned True.

people = [{'name': 'Mary', 'height': 160},
          {'name': 'Isla', 'height': 80},
          {'name': 'Sam'}]

height_total = 0
height_count = 0
for person in people:
    if 'height' in person:
        height_total += person['height']
        height_count += 1

if height_count > 0:
    average_height = height_total / height_count

    print average_height
    # => 120

If this seems tricky, try not thinking about the operations on the data. Think of the states the data will go through, from the list of people dictionaries to the average height. Don’t try and bundle multiple transformations together. Put each on a separate line and assign the result to a descriptively-named variable. Once the code works, condense it.

My solution:

people = [{'name': 'Mary', 'height': 160},
          {'name': 'Isla', 'height': 80},
          {'name': 'Sam'}]

heights = map(lambda x: x['height'],
              filter(lambda x: 'height' in x, people))

if len(heights) > 0:
    from operator import add
    average_height = reduce(add, heights) / len(heights)

Write declaratively, not imperatively

The program below runs a race between three cars. At each time step, each car may move forwards or it may stall. At each time step, the program prints out the paths of the cars so far. After five time steps, the race is over.

This is some sample output:

-
--
--

--
--
---

---
--
---

----
---
----

----
----
-----

This is the program:

from random import random

time = 5
car_positions = [1, 1, 1]

while time:
    # decrease time
    time -= 1

    print ''
    for i in range(len(car_positions)):
        # move car
        if random() > 0.3:
            car_positions[i] += 1

        # draw car
        print '-' * car_positions[i]

The code is written imperatively. A functional version would be declarative. It would describe what to do, rather than how to do it.

Use functions

A program can be made more declarative by bundling pieces of the code into functions.

from random import random

def move_cars():
    for i, _ in enumerate(car_positions):
        if random() > 0.3:
            car_positions[i] += 1

def draw_car(car_position):
    print '-' * car_position

def run_step_of_race():
    global time
    time -= 1
    move_cars()

def draw():
    print ''
    for car_position in car_positions:
        draw_car(car_position)

time = 5
car_positions = [1, 1, 1]

while time:
    run_step_of_race()
    draw()

To understand this program, the reader just reads the main loop. “If there is time left, run a step of the race and draw. Check the time again.” If the reader wants to understand more about what it means to run a step of the race, or draw, they can read the code in those functions.

There are no comments any more. The code describes itself.

Splitting code into functions is a great, low brain power way to make code more readable.

This technique uses functions, but it uses them as sub-routines. They parcel up code. The code is not functional in the sense of the guide rope. The functions in the code use state that was not passed as arguments. They affect the code around them by changing external variables, rather than by returning values. To check what a function really does, the reader must read each line carefully. If they find an external variable, they must find its origin. They must see what other functions change that variable.

Remove state

This is a functional version of the car race code:

from random import random

def move_cars(car_positions):
    return map(lambda x: x + 1 if random() > 0.3 else x,
               car_positions)

def output_car(car_position):
    return '-' * car_position

def run_step_of_race(state):
    return {'time': state['time'] - 1,
            'car_positions': move_cars(state['car_positions'])}

def draw(state):
    print ''
    print '\n'.join(map(output_car, state['car_positions']))

def race(state):
    draw(state)
    if state['time']:
        race(run_step_of_race(state))

race({'time': 5,
      'car_positions': [1, 1, 1]})

The code is still split into functions, but the functions are functional. There are three signs of this. First, there are no longer any shared variables. time and car_positions get passed straight into race(). Second, functions take parameters. Third, no variables are instantiated inside functions. All data changes are done with return values. race() recurses3 with the result of run_step_of_race(). Each time a step generates a new state, it is passed immediately into the next step.

Now, here are two functions, zero() and one():

def zero(s):
    if s[0] == "0":
        return s[1:]

def one(s):
    if s[0] == "1":
        return s[1:]

zero() takes a string, s. If the first character is '0', it returns the rest of the string. If it is not, it returns None, the default return value of Python functions. one() does the same, but for a first character of '1'.

Imagine a function called rule_sequence(). It takes a string and a list of rule functions of the form of zero() and one(). It calls the first rule on the string. Unless None is returned, it takes the return value and calls the second rule on it. Unless None is returned, it takes the return value and calls the third rule on it. And so forth. If any rule returns None, rule_sequence() stops and returns None. Otherwise, it returns the return value of the final rule.

This is some sample input and output:

print rule_sequence('0101', [zero, one, zero])
# => 1

print rule_sequence('0101', [zero, zero])
# => None

This is the imperative version of rule_sequence():

def rule_sequence(s, rules):
    for rule in rules:
        s = rule(s)
        if s == None:
            break

    return s

Exercise 3. The code above uses a loop to do its work. Make it more declarative by rewriting it as a recursion.

My solution:

def rule_sequence(s, rules):
    if s == None or not rules:
        return s
    else:
        return rule_sequence(rules[0](s), rules[1:])

Use pipelines

In the previous section, some imperative loops were rewritten as recursions that called out to auxiliary functions. In this section, a different type of imperative loop will be rewritten using a technique called a pipeline.

The loop below performs transformations on dictionaries that hold the name, incorrect country of origin and active status of some bands.

bands = [{'name': 'sunset rubdown', 'country': 'UK', 'active': False},
         {'name': 'women', 'country': 'Germany', 'active': False},
         {'name': 'a silver mt. zion', 'country': 'Spain', 'active': True}]

def format_bands(bands):
    for band in bands:
        band['country'] = 'Canada'
        band['name'] = band['name'].replace('.', '')
        band['name'] = band['name'].title()

format_bands(bands)

print bands
# => [{'name': 'Sunset Rubdown', 'active': False, 'country': 'Canada'},
#     {'name': 'Women', 'active': False, 'country': 'Canada' },
#     {'name': 'A Silver Mt Zion', 'active': True, 'country': 'Canada'}]

Worries are stirred by the name of the function. “format” is very vague. Upon closer inspection of the code, these worries begin to claw. Three things happen in the same loop. The 'country' key gets set to 'Canada'. Punctuation is removed from the band name. The band name gets capitalized. It is hard to tell what the code is intended to do and hard to tell if it does what it appears to do. The code is hard to reuse, hard to test and hard to parallelize.

Compare it with this:

print pipeline_each(bands, [set_canada_as_country,
                            strip_punctuation_from_name,
                            capitalize_names])

This code is easy to understand. It gives the impression that the auxiliary functions are functional because they seem to be chained together. The output from the previous one comprises the input to the next. If they are functional, they are easy to verify. They are also easy to reuse, easy to test and easy to parallelize.

The job of pipeline_each() is to pass the bands, one at a time, to a transformation function, like set_canada_as_country(). After the function has been applied to all the bands, pipeline_each() bundles up the transformed bands. Then, it passes each one to the next function.

Let’s look at the transformation functions.

def assoc(_d, key, value):
    from copy import deepcopy
    d = deepcopy(_d)
    d[key] = value
    return d

def set_canada_as_country(band):
    return assoc(band, 'country', "Canada")

def strip_punctuation_from_name(band):
    return assoc(band, 'name', band['name'].replace('.', ''))

def capitalize_names(band):
    return assoc(band, 'name', band['name'].title())

Each one associates a key on a band with a new value. There is no easy way to do this without mutating the original band. assoc() solves this problem by using deepcopy() to produce a copy of the passed dictionary. Each transformation function makes its modification to the copy and returns that copy.

Everything seems fine. Band dictionary originals are protected from mutation when a key is associated with a new value. But there are two other potential mutations in the code above. In strip_punctuation_from_name(), the unpunctuated name is generated by calling replace() on the original name. In capitalize_names(), the capitalized name is generated by calling title() on the original name. If replace() and title() are not functional, strip_punctuation_from_name() and capitalize_names() are not functional.

Fortunately, replace() and title() do not mutate the strings they operate on. This is because strings are immutable in Python. When, for example, replace() operates on a band name string, the original band name is copied and replace() is called on the copy. Phew.

This contrast between the mutability of strings and dictionaries in Python illustrates the appeal of languages like Clojure. The programmer need never think about whether they are mutating data. They aren’t.

Exercise 4. Try and write the pipeline_each function. Think about the order of operations. The bands in the array are passed, one band at a time, to the first transformation function. The bands in the resulting array are passed, one band at a time, to the second transformation function. And so forth.

My solution:

def pipeline_each(data, fns):
    return reduce(lambda a, x: map(x, a),
                  fns,
                  data)

All three transformation functions boil down to making a change to a particular field on the passed band. call() can be used to abstract that. It takes a function to apply and the key of the value to apply it to.

set_canada_as_country = call(lambda x: 'Canada', 'country')
strip_punctuation_from_name = call(lambda x: x.replace('.', ''), 'name')
capitalize_names = call(str.title, 'name')

print pipeline_each(bands, [set_canada_as_country,
                            strip_punctuation_from_name,
                            capitalize_names])

Or, if we are willing to sacrifice readability for conciseness, just:

print pipeline_each(bands, [call(lambda x: 'Canada', 'country'),
                            call(lambda x: x.replace('.', ''), 'name'),
                            call(str.title, 'name')])

The code for call():

def assoc(_d, key, value):
    from copy import deepcopy
    d = deepcopy(_d)
    d[key] = value
    return d

def call(fn, key):
    def apply_fn(record):
        return assoc(record, key, fn(record.get(key)))
    return apply_fn

There is a lot going on here. Let’s take it piece by piece.

One. call() is a higher order function. A higher order function takes a function as an argument, or returns a function. Or, like call(), it does both.

Two. apply_fn() looks very similar to the three transformation functions. It takes a record (a band). It looks up the value at record[key]. It calls fn on that value. It assigns the result back to a copy of the record. It returns the copy.

Three. call() does not do any actual work. apply_fn(), when called, will do the work. In the example of using pipeline_each() above, one instance of apply_fn() will set 'country' to 'Canada' on a passed band. Another instance will capitalize the name of a passed band.

Four. When an apply_fn() instance is run, fn and key will not be in scope. They are neither arguments to apply_fn(), nor locals inside it. But they will still be accessible. When a function is defined, it saves references to the variables it closes over: those that were defined in a scope outside the function and that are used inside the function. When the function is run and its code references a variable, Python looks up the variable in the locals and in the arguments. If it doesn’t find it there, it looks in the saved references to closed over variables. This is where it will find fn and key.

Five. There is no mention of bands in the call() code. That is because call() could be used to generate pipeline functions for any program, regardless of topic. Functional programming is partly about building up a library of generic, reusable, composable functions.

Good job. Closures, higher order functions and variable scope all covered in the space of a few paragraphs. Have a nice glass of lemonade.

There is one more piece of band processing to do. That is to remove everything but the name and country. extract_name_and_country() can pull that information out:

def extract_name_and_country(band):
    plucked_band = {}
    plucked_band['name'] = band['name']
    plucked_band['country'] = band['country']
    return plucked_band

print pipeline_each(bands, [call(lambda x: 'Canada', 'country'),
                            call(lambda x: x.replace('.', ''), 'name'),
                            call(str.title, 'name'),
                            extract_name_and_country])

# => [{'name': 'Sunset Rubdown', 'country': 'Canada'},
#     {'name': 'Women', 'country': 'Canada'},
#     {'name': 'A Silver Mt Zion', 'country': 'Canada'}]

extract_name_and_country() could have been written as a generic function called pluck(). pluck() would be used like this:

print pipeline_each(bands, [call(lambda x: 'Canada', 'country'),
                            call(lambda x: x.replace('.', ''), 'name'),
                            call(str.title, 'name'),
                            pluck(['name', 'country'])])

Exercise 5. pluck() takes a list of keys to extract from each record. Try and write it. It will need to be a higher order function.

My solution:

def pluck(keys):
    def pluck_fn(record):
        return reduce(lambda a, x: assoc(a, x, record[x]),
                      keys,
                      {})
    return pluck_fn

What now?

Functional code co-exists very well with code written in other styles. The transformations in this article can be applied to any code base in any language. Try applying them to your own code.

Think of Mary, Isla and Sam. Turn iterations of lists into maps and reduces.

Think of the race. Break code into functions. Make those functions functional. Turn a loop that repeats a process into a recursion.

Think of the bands. Turn a sequence of operations into a pipeline.

1 An immutable piece of data is one that cannot be changed. Some languages, like Clojure, make all values immutable by default. Any “mutating” operations copy the value, change it and pass back the changed copy. This eliminates bugs that arise from a programmer’s incomplete model of the possible states their program may enter.

2 Languages that support first class functions allow functions to be treated like any other value. This means they can be created, passed to functions, returned from functions and stored inside data structures.

3 Tail call optimisation is a programming language feature. Each time a function recurses, a new stack frame is created. A stack frame is used to store the arguments and local values for the current function invocation. If a function recurses a large number of times, it is possible for the interpreter or compiler to run out of memory. Languages with tail call optimisation reuse the same stack frame for their entire sequence of recursive calls. Languages like Python that do not have tail call optimisation generally limit the number of times a function may recurse to some number in the thousands. In the case of the race() function, there are only five time steps, so it is safe.

4 Currying means decomposing a function that takes multiple arguments into a function that takes the first argument and returns a function that takes the next argument, and so forth for all the arguments.

5 Parallelization means running the same code concurrently without synchronization. These concurrent processes are often run on multiple processors.

6 Lazy evaluation is a compiler technique that avoids running code until the result is needed.

7 A process is deterministic if repetitions yield the same result every time.

 

Little Lisp interpreter

Little Lisp is an interpreter that supports function invocation, lambdas, lets, ifs, numbers, strings, a few library functions, and lists. I wrote it for a lightning talk at the Recurse Center to show how easy it is to write an interpreter. The code is 116 lines of JavaScript. I will explain how it works.

First, let’s learn some Lisp.

An atom, the simplest Lisp form:

1

Another atom. This time, a string:

"a"

An empty list:

()

A list containing an atom:

(1)

A list containing two atoms:

(1 2)

A list containing an atom and another list:

(1 (2))

A function invocation. This comprises a list where the first element is the function and the rest of the elements are the arguments. first takes one argument, (1 2), and returns 1.

(first (1 2))
  => 1

A lambda. That is: a function definition. The lambda takes a parameter, x, and just returns it.

(lambda (x)
  x)

A lambda invocation. This comprises a list where the first element is a lambda and the rest of the elements are the arguments. The lambda takes one argument, "Lisp", and returns it.

((lambda (x)
   x)
  "Lisp")

  => "Lisp"

Writing a Lisp interpreter is really easy.

The code for Little Lisp has two parts: the parser and the interpreter.

The parser.

Parsing has two phases: tokenizing and parenthesizing.

tokenize() takes a string of Lisp code, puts spaces around every parenthesis and splits on whitespace. For example, it takes something like ((lambda (x) x) "Lisp"), transforms it into ` ( ( lambda ( x ) x ) “Lisp” ) ` and transforms that into ['(', '(', 'lambda', '(', 'x', ')', 'x', ')', '"Lisp"', ')'].

var tokenize = function(input) {
  return input.replace(/\(/g, ' ( ')
              .replace(/\)/g, ' ) ')
              .trim()
              .split(/\s+/);
};

parenthesize() takes the tokens produced by tokenize() and produces a nested array that mimics the structure of the Lisp code. Each atom in the nested array is labelled as an identifier or a literal. For example, ['(', '(', 'lambda', '(', 'x', ')', 'x', ')', '"Lisp"', ')'] is transformed into:

[[{ type: 'identifier', value: 'lambda' }, [{ type: 'identifier', value: 'x' }],
  { type: 'identifier', value: 'x' }],
  { type: 'literal', value: 'Lisp' }]

parenthesize() goes through the tokens, one by one. If the current token is an opening parenthesis, it starts building a new array. If the current token is an atom, it labels it with its type and appends it to the current array. If the current token is a closing parenthesis, it stops building the current array and continues building the enclosing array.

var parenthesize = function(input, list) {
  if (list === undefined) {
    return parenthesize(input, []);
  } else {
    var token = input.shift();
    if (token === undefined) {
      return list.pop();
    } else if (token === "(") {
      list.push(parenthesize(input, []));
      return parenthesize(input, list);
    } else if (token === ")") {
      return list;
    } else {
      return parenthesize(input, list.concat(categorize(token)));
    }
  }
};

When parenthesize() is first called, the input parameter contains the array of tokens returned by tokenize(). For example:

['(', '(', 'lambda', '(', 'x', ')', 'x', ')', '"Lisp"', ')']

When parenthesize() is first called, the list parameter is undefined. Lines 2-3 run and parenthesize() recurses with list set to an empty array.

In the recursion, line 5 runs and removes the first opening parenthesis from input. Line 9 starts a new, empty list by recursing with a new, empty array.

In the recursion, line 5 runs and removes another opening parenthesis from input. Line 9 starts another new, empty list by recursing with another new, empty array.

In the recursion, input is ['lambda', '(', 'x', ')', 'x', ')', '"Lisp"', ')']. Line 14 runs with token set to lambda. It calls categorize() and passes lambda as the input argument. Line 7 of categorize() runs and returns an object with type set to identifier and value set to lambda.

var categorize = function(input) {
  if (!isNaN(parseFloat(input))) {
    return { type:'literal', value: parseFloat(input) };
  } else if (input[0] === '"' && input.slice(-1) === '"') {
    return { type:'literal', value: input.slice(1, -1) };
  } else {
    return { type:'identifier', value: input };
  }
};

Line 14 of parenthesize() appends to list the object returned by categorize() and recurses with the rest of the input and list.

var parenthesize = function(input, list) {
  if (list === undefined) {
    return parenthesize(input, []);
  } else {
    var token = input.shift();
    if (token === undefined) {
      return list.pop();
    } else if (token === "(") {
      list.push(parenthesize(input, []));
      return parenthesize(input, list);
    } else if (token === ")") {
      return list;
    } else {
      return parenthesize(input, list.concat(categorize(token)));
    }
  }
};

In the recursion, the next token is a parenthesis. Line 9 of parenthesize() starts a new, empty list by recursing with an empty array. In the recursion, input is ['x', ')', 'x', ')', '"Lisp"', ')']. Line 14 runs with token set to x. It makes a new object with a value of x and a type of identifier. It appends this object to list and recurses.

In the recursion, the next token is a closing parenthesis. Line 12 runs and returns the completed list: [{ type: 'identifier', value: 'x' }].

parenthesize() continues recursing until it has processed all of the input tokens. It returns the nested array of typed atoms.

parse() is the successive application of tokenize() and parenthesize():

var parse = function(input) {
  return parenthesize(tokenize(input));
};

Given a starting input of ((lambda (x) x) "Lisp"), the final output of the parser is:

[[{ type: 'identifier', value: 'lambda' }, [{ type: 'identifier', value: 'x' }],
  { type: 'identifier', value: 'x' }],
  { type: 'literal', value: 'Lisp' }]

The interpreter.

After parsing is complete, interpreting begins.

interpret() receives the output of parse() and executes it. Given the output from the parsing example above, interpret() would construct a lambda and invoke it with the argument "Lisp". The lambda invocation would return "Lisp", which would be the output of the whole program.

As well as the input to execute, interpret() receives an execution context. This is the place where variables and their values are stored. When a piece of Lisp code is executed by interpret(), the execution context contains the variables that are accessible to that code.

These variables are stored in a hierarchy. Variables in the current scope are at the bottom of the hierarchy. Variables in the enclosing scope are in the level above. Variables in the scope enclosing the enclosing scope are in the level above that. And so on. For example, in the following code:

((lambda (a)
  ((lambda (b)
      (b a))
    "b"))
 "a")

On line 3, the execution context has two active scopes. The inner lambda forms the current scope. The outer lambda forms an enclosing scope. The current scope has b bound to "b". The enclosing scope has a bound to "a". When line 3 runs, the interpreter tries to look up b in the context. It checks the current scope, finds b and returns its value. Still on line 3, the interpreter tries to look up a. It checks the current scope and does not find a, so it tries the enclosing scope. There, it finds a and returns its value.

In Little Lisp, the execution context is modeled with an object made by calling the Context constructor. This takes scope, an object that contains variables and their values in the current scope. And it takes parent. If parent is undefined, the scope is the top, or global, or outermost one.

var Context = function(scope, parent) {
  this.scope = scope;
  this.parent = parent;

  this.get = function(identifier) {
    if (identifier in this.scope) {
      return this.scope[identifier];
    } else if (this.parent !== undefined) {
      return this.parent.get(identifier);
    }
  };
};

We have seen how ((lambda (x) x) "Lisp") gets parsed. Let us see how the parsed code gets executed.

var interpret = function(input, context) {
  if (context === undefined) {
    return interpret(input, new Context(library));
  } else if (input instanceof Array) {
    return interpretList(input, context);
  } else if (input.type === "identifier") {
    return context.get(input.value);
  } else {
    return input.value;
  }
};

The first time interpret() is called, context is undefined. Lines 2-3 are run to make an execution context.

When the initial context is instantiated, the constructor function takes the library object. This contains the functions built in to the language: first, rest and print. These functions are written in JavaScript.

interpret() recurses with the original input and the new context.

input contains the full example output from the parsing section:

[[{ type: 'identifier', value: 'lambda' }, [{ type: 'identifier', value: 'x' }],
  { type: 'identifier', value: 'x' }],
  { type: 'literal', value: 'Lisp' }]

Because input is an array and context is defined, lines 4-5 are run and interpretList() is called.

var interpretList = function(input, context) {
  if (input.length > 0 && input[0].value in special) {
    return special[input[0].value](input, context);
  } else {
    var list = input.map(function(x) { return interpret(x, context); });
    if (list[0] instanceof Function) {
      return list[0].apply(undefined, list.slice(1));
    } else {
      return list;
   }
  }
};

In interpretList(), line 5 maps over the input array and calls interpret() on each element. When interpret() is called on the lambda definition, interpretList() gets called again. This time, the input argument to interpretList() is:

[{ type: 'identifier', value: 'lambda' }, [{ type: 'identifier', value: 'x' }],
 { type: 'identifier', value: 'x' }]

Line 3 of interpretList() gets called, because lambda, the first element in the array, is a special form. special.lambda() is called to create the lambda function.

var special = {
  lambda: function(input, context) {
    return function() {
      var lambdaArguments = arguments;
      var lambdaScope = input[1].reduce(function(acc, x, i) {
        acc[x.value] = lambdaArguments[i];
        return acc;
      }, {});

      return interpret(input[2], new Context(lambdaScope, context));
    };
  }
};

special.lambda() takes the part of the input that defines the lambda. It returns a function that, when invoked, invokes the lambda on some arguments.

Line 3 begins the definition of the lambda invocation function. Line 4 stores the arguments passed to the lambda invocation. Line 5 starts creating a new scope for the lambda’s invocation. It reduces over the part of the input that defines the parameters of the lambda: [{ type: 'identifier', value: 'x' }]. It adds a key/value pair to the lambda scope for each lambda parameter in input and argument passed to the lambda. Line 10 invokes the lambda by calling interpret() on the lambda body: { type: 'identifier', value: 'x' }. It passes in the lambda context that contains the lambda’s scope and the parent context.

The lambda is now represented by the function returned by special.lambda().

interpretList() continues mapping over the input array by calling interpret() on the second element of the list: the "Lisp" string.

var interpret = function(input, context) {
  if (context === undefined) {
    return interpret(input, new Context(library));
  } else if (input instanceof Array) {
    return interpretList(input, context);
  } else if (input.type === "identifier") {
    return context.get(input.value);
  } else {
    return input.value;
  }
};

This runs line 9 of interpret() which just returns the value attribute of the literal object: 'Lisp'. The map operation on line 5 of interpretList() is complete. list is:

 [function(args) { /* code to invoke lambda */ },
 'Lisp']

Line 6 of interpretList() runs and finds that the first element of list is a JavaScript function. This means that the list is an invocation. Line 7 runs and invokes the lambda, passing the rest of list as arguments.

var interpretList = function(input, context) {
  if (input.length > 0 && input[0].value in special) {
    return special[input[0].value](input, context);
  } else {
    var list = input.map(function(x) { return interpret(x, context); });
    if (list[0] instanceof Function) {
      return list[0].apply(undefined, list.slice(1));
    } else {
      return list;
    }
  }
};

In the lambda invocation function, line 8 calls interpret() on the lambda body, { type: 'identifier', value: 'x' }.

function() {
  var lambdaArguments = arguments;
  var lambdaScope = input[1].reduce(function(acc, x, i) {
    acc[x.value] = lambdaArguments[i];
    return acc;
  }, {});

  return interpret(input[2], new Context(lambdaScope, context));
};

Line 6 of interpret() finds that input is an identifier atom. Line 7 looks up the identifier, x, in context and returns 'Lisp'.

var interpret = function(input, context) {
  if (context === undefined) {
    return interpret(input, new Context(library));
  } else if (input instanceof Array) {
    return interpretList(input, context);
  } else if (input.type === "identifier") {
    return context.get(input.value);
  } else {
    return input.value;
  }
};

'Lisp' is returned by the lambda invocation function, which is returned by interpretList(), which is returned by interpret(), and that’s it.

Go to the GitHub repository to see all the code. And look at lis.py, the dazzlingly simple Scheme interpreter that Peter Norvig wrote in Python.

 

Coquette

I’ve just released Coquette, a micro framework for JavaScript games. It handles collision detection, the game update loop, keyboard input and canvas rendering. I wrote it in a panic when, two hours before the deadline of April’s Ludum Dare, I discovered that entries cannot use closed source library code.

 

A pop quiz purporting to be about scope in JavaScript that is actually a polemic about why modules are a good idea

This is the blog post version of a lightning talk I gave at the Recurse Center last week.

I will ask you some questions. For each question, there will be some JavaScript files loaded into an HTML page. In these JavaScript files, there will be some console.log() invocations. Your job is to predict what happens after each invocation.

For this first question, there is one JavaScript file, a.js, loaded into this HTML page:

<!doctype html>
<html>
  <head>
    <script src="a.js"></script>
  </head>
</html>

One

// a.js
var blah = "blah";
console.log(blah); // Option 1: ReferenceError: blah is not defined
                   // Option 2: print 'blah' to the console

blah will be printed. The variable, blah, is in the same scope as the console.log() invocation.

For this rest of the questions, there will be two JavaScript files, a.js and b.js, loaded into this HTML page:

<!doctype html>
<html>
  <head>
    <script src="a.js"></script>
    <script src="b.js"></script>
  </head>
</html>

Two

// a.js
blah = "blah";

// b.js
console.log(blah); // Option 1: ReferenceError: blah is not defined
                   // Option 2: print 'blah' to the console

blah will be printed. It doesn’t matter that blah was declared in a different file. All files in JavaScript are loaded into a shared scope.

Three

// a.js
var blah = "blah";

// b.js
console.log(blah); // Option 1: ReferenceError: blah is not defined
                   // Option 2: print 'blah' to the console

blah will be printed. It doesn’t matter that the var keyword was used to declare blah. The declaration was in a shared scope, and that scope is global, so blah becomes a global variable.

Four

// a.js
var blahFn = function() {
  return "blah";
};

// b.js
console.log(blahFn()); // Option 1: ReferenceError: blah is not defined
                       // Option 2: print 'blah' to the console

blah will be printed. All the logic shown so far applies to functions.

Five

// a.js
function blahFn() {
  var blah = "blah";
};

// b.js
blahFn();
console.log(blah); // Option 1: ReferenceError: blah is not defined
                   // Option 2: print 'blah' to the console

There will be a reference error. The blah variable is created inside a function, in the function’s own, private scope. It is not available in the global scope.

Six

// a.js
function blahFn() {
  blah = "blah";
};

// b.js
blahFn();
console.log(blah); // Option 1: ReferenceError: blah is not defined
                   // Option 2: print 'blah' to the console

blah will be printed. With the var keyword omitted, the blah variable goes into the global scope.

Seven

// a.js
;(function() {
  function blahFn() {
    return "blah";
  };

  console.log(blahFn()); // Option 1: ReferenceError: blahFn is not defined
                         // Option 2: print 'blah' to the console
})();

a.js is loaded. The anonymous function at the top is invoked, creating our first JavaScript module. The module instantiates blahFn. The console.log() invocation prints blah because the code inside a module has a shared scope.

// b.js
console.log(blahFn()); // Option 1: ReferenceError: blahFn is not defined
                       // Option 2: print 'blah' to the console

When console.log() is invoked in b.js, a ReferenceError is thrown. The variables inside the module in a.js are locked away. The global scope is not polluted.

Eight

// a.js
;(function(exports) {
  var Blah = {};

  Blah.blahFn = function() {
    return "blah";
  };

  exports.Blah = Blah;
})(this);

// b.js
console.log(Blah.blahFn()); // Option 1: ReferenceError: Blah is not defined
                            // Option 2: print 'blah' to the console

When the console.log() is invoked, blah is printed. The module has made itself available as an object in the global scope. It has also made available one of its variables on this object.

To do this, it made an empty object, Blah, and attached blahFn to it as an attribute. The anonymous function enclosing the module was passed this. Because the anonymous function was invoked in the global context, this is the global object, also referenced with window. The module attached Blah to the exports/global/window object, making it available to the console.log() invocation.

Nine

// a.js
(function woo() {
  return function() {
    console.log("woo")
  };
})

// b.js
(function() {
  console.log("boo")
})();

What is printed? woo or boo?

It’s boo. The code in a.js makes a function called woo and does nothing with it.

In questions seven and eight, both modules were preceded by a semicolon. Why is this?

Often, a module is put in its own JavaScript file. Often, JavaScript files that are part of a website are joined together. This is part of a process to reduce download time. If you join the a.js and b.js together and then execute that, woo will be printed. The operation of the boo module was changed by the random code preceding it in the a.js file. Because the random code in a.js had no semicolon at the end, the parentheses around the boo module invoked the woo function. This returned the anonymous function inside the woo function which was then invoked by the parentheses sitting at the end of b.js.

Modules protect themselves from these sorts of nightmares with a leading semicolon.

 

The stabbing in Shadow of the Colossus

You finally realise how to get off the ground and onto the legs of the colossus. Maybe you need to shoot arrows into the soles of its feet so it kneels in pain. Maybe you need to hide under an archway, wait for it to stoop down low and make a leap for its beard.

You enact your plan, haul yourself up and begin climbing the body of the colossus. As you go, you might occasionally find part-time horizontal surfaces - armour, shackles - upon which you can rest and regain your strength. Or, you might plan your route so you can shimmy up as fast as possible, avoiding the limbs that sometimes flail and force you to stop climbing and just cling on.

You reach the top. The shoulders of a biped. The back of a quadruped. This moment is magic. You stagger across this great roiling mass that feels like moving rock. You cling on as it rears up. You creep over it, whispering, “stay calm”, praying it will not buck.

Being at the top is relatively safe. Though you must cling on when the colossus bridles, there is always a reprieve when you can stand and regain your strength. Your aim is now to reach one of the glowing tattoos on the colossus’s body. To get to a tattoo, you must leave your perch and climb to the crown of a swaying head, or a flank that can only be reached from above. These forays are a commitment. Once you set out, you have a ration of strength that cannot be replenished until you return. If your strength runs out during your expedition, you will lose your grip, fall to the ground and possibly die.

You reach a tattoo. To kill the colossus, you must repeatedly stab it here. You raise your sword. The longer you hold it aloft, the more powerful the stab will be. The more powerful the stab is, the fewer stabs you will need.

This piece of the game is particularly beautifully designed. If the colossus bucks while your sword is raised, you must lower your sword and cling on. This means you have wasted grip strength. Having your sword raised is a tense balancing act where you want to wait as long as possible, but no longer, before you strike. There is an elegant mapping between the sword and your controller. You press and release X to raise your sword. You press X again to stab. A more intuitive system would have you press and hold X to keep the sword raised. But the way the controls actually work is better. Stabbing the tattoo is such a cathartic, primal screaming moment, to attach it to the release of a button would diminish its power. The way it is, you jam your finger back down on X like you really are astride the colossus’s neck, like it is about to throw you, like it is the only thing stopping you from winning back your girlfriend.

 

Abendland

A documentary by Nikolaus Geyrhalter. Notionally, it shows a series of things that happen at night: monitoring of borders, conventions of EU sub-committees, care for the elderly and prematurely born, parties, news broadcasting, pornographic broadcasting, food production, policing, counseling for the suicidal. Reviewers have noted different themes. Surveillance. Protection. The service industry. But these notes tell you more about the commentator than the theme of the film.

My own conclusions on the theme changed as I watched: being on or off stage, activities that happen at night, people who are working surrounded by people who are not, the services people provide to others.

By the end, I had arrived at: jobs that must be done at night. But distilling it down to that is a trivialisation. It is more worthwhile to think, “There are jobs that can’t wait until morning”, and take that as a jumping off point. Some general questions arise. What do these jobs entail? Why can’t they wait? Who does them? How? Then, more specific questions arise. Why does the night nurse clean the hand rails in the care home as well as attending to the patients? How does one talk to a stranger who is suicidal? Why are parcels sorted separately from letters?

In the film, there are many moments of distance and alienation. But, there are moments of human connection, too - a nurse tenderly feeding an old woman from a beaker, two people taking turns to shower after having sex, the film’s poster showing a network of street lights strung through the darkness of Europe.

Unsleeping people connected in the night.

 

The Mars Volta

Here, according to Last.fm, are the bands I have listened to the most over the last seven years. I have put in bold the bands that feel like a part of my identity. When I think about these bands, I get a warm glow. When I talk about them, I say “I fucking love”.

Dear and the Headlights
Sunset Rubdown
Bob Dylan
Des Ark
Women
The Paper Chase
The Mars Volta
Animal Collective
Meshuggah
The Blood Brothers
Converge
The Acorn
Xiu Xiu
Joanna Newsom
Beach House
Arcade Fire
Hurray for the Riff Raff
A Silver Mt. Zion

These identity bands have some common traits.

One. I love at least two of their records. Dear and the Headlights: both of their albums. Sunset Rubdown: all of their albums. Women: both of their albums. Des Ark: all of their live acoustic records. The Paper Chase: Now You Are One of Us and God Bless Your Black Heart. A Silver Mt. Zion is the exception, here. Horses in the Sky is one of my top ten records, ever. The rest of their albums probably aren’t even in my top fifty.

Two. I admire the aesthetic they achieve. Dear and the Headlights. The most beautiful melodies by anyone, ever. Bob Dylan. The deeper you go into his lyrics, the more you find. Women. The Beach Boys as a noise rock band. The Paper Chase. Find the most beautiful melody, then destroy it with how hard you feel it. Des Ark. Equate the art with the artist. Aimée Argote puts no distance between her manifestation in real life and her manifestation in her art. She makes music that, when heard, might be detrimental to her actual life.

Three. I find pieces of their music very meaningful. It is hard for me to cite examples, because music is so far from words. Not only is it impossible for me to explain why I love Bob Dylan’s line, “Let me dance beneath the diamond sky with one hand waving free”, but it’s precisely because of its wordless meaning that I like it, and it was Dylan who taught me that alogical emotion is worthy of trust.

Four. I admire (my conception of) the song-writer as a person. Sunset Rubdown’s Spencer Krug has a solo project, Moonface. The Pitchfork review of the Moonface record, Organ Music Not Vibraphone Like I’d Hoped, said Sunset Rubdown’s Dragonslayer is “A personal record about the toll that the worlds inside someone’s head takes on his relationships.” Des Ark. Aimée Argote sings about being a broken, high-functioning person. She sings about immovable, incidental, unpolitical queerness. Bob Dylan. He constantly moves on from his previous successes. As seen in the documentaries, Dont Look Back, and No Direction Home: he lives this life on the road half in love with Joan Baez, half in love with his own head.

But, most of the entries in that list are unbolded. These others are the bands that I simply listen to a lot. And I have listened to De-Loused in the Comatorium and Frances the Mute an awful lot.

I can’t identify with The Mars Volta’s music. Cedric Bixler-Zavala’s lyrics mean nothing. He chooses words that are one, impressionist remove from their actual meaning, rather than assembling a rich, internally consistent network from a few versatile base operators. Omar Rodríguez-López’s approach to arrangement is to chuck everything in. By saying everything, you say nothing.

I really like the Tony Scott film, Enemy of the State. It is a thriller that is brilliantly thrilling. In the same way, The Mars Volta’s music cuts in on a primal level and makes me want to sing and dance, which is kind of what music is supposed to be all about.

 

Spelunky, stories and trying to relate to other people

In Spelunky, you go through a learning experience that is like the learning experience you go through in the real world. First, you become familiar with the behaviour of the elements of the game. The arc described by a projectile. The beat walked by a cave man. The activation perimeter of the spikes in a totem pole. Second, you learn strategies for overcoming these elements. Land your jump on top of that totem pole. Strike a cave man three times with your boomerang to kill him. Third, you learn back-up strategies for the moments when your primary strategies won’t work. Activate the totem pole spikes by passing near them. Just as they are withdrawing, jump, hang onto the edge of the pole and haul yourself up. Jump on the cave man’s head to stun him. Throw him somewhere else. Fourth, you learn to combine the behaviours of elements. Whip a cave man until he sees stars. Throw him at the foot of a totem pole. Watch him get spiked to death.

But Spelunky is better than the real world, because it is simpler. There are fewer active elements, and no redundant elements. This makes the rules of the world easier to understand. It makes it easier to concoct a plan. It makes it easier to understand why the plan went wrong.

In Spelunky, you are always having fun. What type of fun? You are always on the edge of your seat because you lose everything when you die. You are never frustrated because the levels are randomly generated, so you need never replay any part of the game. You are always in an intellectually stimulating situation. Each level gives you a new combination of obstacles. And you can make the game as hard as you like. You can go for the riskily placed bars of gold, or the valuable artifacts that set off traps when you pick them up, or the items that you are too poor to buy and that must be stolen.

In Spelunky, you must balance risk and reward. Do you jump into that pit to try and grab those bars of gold, knowing that you might get unlucky, miss the jump out and get eaten by that man trap? This decision has multiple dimensions. The amount of gold you have determines your score, which might be important to you. But it also determines what you can buy. What you can buy affects your options for dealing with situations. You can buy more ropes and bombs. You can buy items with specific uses like ice guns, springs for your shoes, parachutes, and sticky gloves. Having extra things makes you more adaptable.

And what you can buy affects concerns outside your current life. Each death resets everything, besides the experience and skills you (not your avatar) have acquired on this go-round. But, very occasionally, there will be an artifact of your life that survives you. There are four worlds. Surviving the first leads you to the second. Surviving the second leads you to the third. Usually, death means starting right back at the beginning of the first world. But, if you make it through a world bearing particular gifts - a sizeable amount of gold, a bomb, a rope, it depends - someone called the Tunnel Man will dig you a permanent short cut to the beginning of that world. And purchase is one of the ways you can get hold of these gifts.

I have spent most of my game time this Christmas trying to make it to the end of world two with a shotgun for the Tunnel Man. To do this, I must battle through the jungle, at some point lay my hands on a shotgun, then make it to the end of the world alive.

Purchasing a shotgun is one of the options at my disposal. But shotguns are expensive and rarely in stock. In practice, I find myself making my way through the jungle, hoping to stumble across one of the two beings who carry shotguns: the shop keeper and the man guarding treasure. As I go, I try and accrue gold, bombs, extra health and some sort of projectile weapon.

If I see a mark, I do one of a range of colossally stupid things that will, almost certainly, end in my death. The least terrible plan is to use bombs to blast my way down into the shop or treasure trove, then keep throwing bombs in, hoping to kill the target. If I don’t have enough bombs, and I have a boomerang, I might try and stun the keeper/hoarder and then kill him with his own shotgun. If I have no boomerang, I will resort to doing the stunning with a lobbed rock, skull or damsel in distress.

The day before yesterday, I managed to get a shotgun. Afterwards, I proceeded very carefully. I made a beeline for each level exit. I collected no gold. I stayed back and shotgunned enemies.

I reached a row of vines hanging above spikes (instant death). These vines separated me from the world exit and the Tunnel Man. I drew a preparatory breath (in my actual lungs, not my explorer avatar’s lungs - breathing is not modelled in Spelunky) and thought, “Now, Mary, be careful. You’re not very good at these vines. You sometimes forget you have to hold up on the controller to cling on.” I leapt out over the pit, forgot to hold up and died.

I’ve just told you a gaming story, a monologue that tries to connect a series of disparate events into a causal, meaningful narrative. People attempt to turn the events in their own lives into narratives in the same way. We try and find causality in a world far too complex for our tiny brains to understand. I say, “I keep on moving from city to city, from country to country, because I can’t stand easiness.” We try to find meaning in causality. I say, “I can’t stand easiness because I’m searching for something that I feel is missing.”

Connecting events together into a narrative to find causality takes us a step from alarming randomness into delicious patterns. Using narrative devices - revelation, symmetry, ellipsis - to create meaning takes us another reassuring step into even more delicious, higher level patterns.

The Pitchfork review of Moonface’s record, Organ Music Not Vibraphone Like I’d Hoped, said Sunset Rubdown’s record, Dragonslayer, is “A personal record about the toll that the worlds inside someone’s head take on his relationships.”

Spencer Krug, the songwriter for Moonface and Sunset Rubdown, expresses patterns in his art that he cannot adequately communicate to the people in his life. This type of communication failure can also occur when reacting to art by another. I’ve tried so many times to explain to people what I like about Sunset Rubdown, how the corridors of the basement of the House of Leaves make me feel, what I see in Al Pacino’s eyes when Frank Serpico realises that his girlfriend is about to leave. But I can’t make myself understood.

With games, it’s even worse. I have some hope that I could make the people I care for understand why I like Hoop Dreams, because they know how documentaries work. But the structure and tropes of games are a mystery to my beloveds. We do not even have a common ground from which to start. So the things I learn from games, the feelings I get, the experiences I have, are locked up inside me, inexpressible.