SICP exercise 2.06
From Drewiki
Problem
In case representing pairs as procedures wasn't mind-boggling enough, consider that, in a language that can manipulate procedures, we can get by without numbers (at least insofar as non-negative integers are concerned) by implementing 0 and the operation of adding 1 as
(define zero (lambda (f) (lambda (x) x))) (define (add-1 n) (lambda (f) (lambda (x) (f ((n f) x)))))
This representation is known as Church numerals, after its inventor, Alonzo Church, the logician who invented the λ calculus.
Define one and two directly (not in terms of zero and add-1). (Hint: Use substitution to evaluate (add-1 zero)). Give a direct definition of the addition procedure + (not in terms of repeated application of add-1).
Solution
Start by looking closely at the two given definitions. zero is a procedure of one argument named f. When zero is applied to an argument, it returns another procedure of one argument named x. That procedure simply returns the value of its argument, i.e., the value of x.
From the definition of add-1, it's clear that the argument f in the definition of zero represents a procedure of one argument, since that's how it's used in the body of add-1. So zero is a procedure which takes a procedure argument and applies it to another argument zero times; and add-1 returns a procedure which applies f once to the composition of n and f. Therefore, (add-1 zero) should return a procedure which applies its argument f exactly once to another argument x.
Let's use the hint provided in the text and evaluate (add-1 zero) using substitution... but first, for clarity, let's rename the arguments in zero:
(define zero (lambda (g) (lambda (y) y))) (add-1 zero) (add-1 (lambda (g) (lambda (y) y))) (lambda (f) (lambda (x) (f (((lambda (g) (lambda (y) y)) f) x))))
We can further simplify this lambda expression by evaluating the sub-expression buried inside it:
(((lambda (g) (lambda (y) y)) f) x)
Start by evaluating its inner-most combination:
((lambda (g) (lambda (y) y)) f)
This is in fact the same expression as (zero f), which evaluates to
(lambda (y) y)
Substituting this lambda expression back into the sub-expression, we get:
((lambda (y) y) x)
This combination simplifies to just x.
Now, substituting the result of this evaluation back into the full expression we get:
(lambda (f) (lambda (x) (f x)))
This procedure looks a lot like the definition of zero, except that this one applies f exactly once to the argument x. So this is our definition of one:
(define one (lambda (f) (lambda (x) (f x))))
And it follows that two is:
(define two (lambda (f) (lambda (x) (f (f x)))))
We can verify the definition of two through evaluation by substitution of the expression (add-1 (add-1 zero)). We already know the value of (add-1 zero): it's one, so we can skip ahead to this expression:
(add-1 (lambda (f) (lambda (x) (f x))))
Once again, let's use the trick of renaming the bound variables in this expression, f->g and x->y. Then continue evaluating by substitution:
(add-1 (lambda (g) (lambda (y) (g y)))) (lambda (f) (lambda (x) (f (((lambda (g) (lambda (y) (g y))) f) x))))
And again, starting with the inner-most combinations and working outwards:
((lambda (g) (lambda (y) (g y))) f) (lambda (y) (f y))
And then substituing this expression back into the larger combination:
((lambda (y) (f y)) x) (f x)
Finally, substitute this result back into the full expression:
(lambda (f) (lambda (x) (f (f x))))
which is exactly what we predicted as the value of two.
Now let's tackle the direct definition of +. We'll start with that powerful strategy of synthesis, wishful thinking. Let's pretend we have a procedure + that will add two Church numerals. Here are the first and last steps, respectively, of the evaluation by substitution of (+ two two):
(+ (lambda (f) (lambda (x) (f (f x)))) (lambda (f) (lambda (x) (f (f x))))) (lambda (f) (lambda (x) (f (f (f (f x))))))
The resulting lambda expression is the Church numeral representation of four. It looks exactly like the Church numeral representation of 2 except that x in the body of the lambda expression is replaced by (f (f x)), which is the body of the lambda expression of the Church numeral representation of two. Our implementation of + must transform (+ two two) into this expression.
Since the result of the application of + to two arguments will be a Church numeral, the result will always have the form
(lambda (f) (lambda (x) (...)))
Let's try using the same form as procedure add-1, except we'll replace the expression (n f) with (b f), and apply a to that result:
(define (+ a b) (lambda (f) (lambda (x) (a ((b f) x)))))
Note the similarity to add-1:
(define (add-1 n) (lambda (f) (lambda (x) (f ((n f) x)))))
Now let's test this definition by substitution:
(+ two two) (+ (lambda (f) (lambda (x) (f (f x)))) (lambda (f) (lambda (x) (f (f x)))))
Before continuing, let's rename the bound variables for clarity:
(+ (lambda (g) (lambda (y) (g (g y)))) (lambda (h) (lambda (z) (h (h z))))) (lambda (f) (lambda (x) ((lambda (g) (lambda (y) (g (g y)))) (((lambda (h) (lambda (z) (h (h z)))) f) x))))
Now simplify the inner combinations, starting with the right-most:
((lambda (h) (lambda (z) (h (h z)))) f) (lambda (z) (f (f z))) ((lambda (z) (f (f z))) x) (f (f x))
This result looks promising: it has reduced the 2nd argument b to the inner-most combination of our representation of two. Continue now with the next combination in the full expression:
((lambda (g) (lambda (y) (g (g y)))) (f (f x)))
Unfortunately, we can see right away that this expression is going to fail. We'll be substituting (f (f x)) for g in the lambda expression, when we really want it substituted for y. We want plain-old f substituted for g. But we can fix this problem by creating a new lambda expression that does what we want, again via wishful thinking. Let's start with g. To substitute f for g, we apply the (lambda (g) ...) part of the expression shown above to f:
((lambda (g) (lambda (y) (g (g y)))) f) (lambda (y) (f (f y)))
Now we can apply this expression to the value (f (f x)), substituting that value for y in the expression above, to get exactly what we want:
((lambda (y) (f (f y))) (f (f x))) (f (f (f (f x))))
So it appears that the part of our definition for + which applies b to f and the result of that application to x is correct; i.e., this part of the definition
((b f) x)
is good. But we should not apply argument a directly to the result of that combination. First we have to apply a to f, just as we do for b, i.e.:
((a f) ((b f) x))
which makes sense in an intuitive way.
Here's the new definition of +:
(define (+ a b) (lambda (f) (lambda (x) ((a f) ((b f) x)))))
Let's evaluate by substitution using our test case again (using the renamed bound variables for each instance of two):
(+ two two) (+ (lambda (g) (lambda (y) (g (g y)))) (lambda (h) (lambda (z) (h (h z))))) (lambda (f) (lambda (x) (((lambda (g) (lambda (y) (g (g y)))) f) (((lambda (h) (lambda (z) (h (h z)))) f) x))))
Simplifying the inner combinations, right to left, starting with the application of argument b to </tt>f:
((lambda (h) (lambda (z) (h (h z)))) f) (lambda (z) (f (f z)))
Then this value applied to x:
((lambda (z) (f (f z))) x) (f (f x))
Now the application of argument a to f:
((lambda (g) (lambda (y) (g (g y)))) f) (lambda (y) (f (f y)))
And finally the application of this value to the result of evaluating ((b f) x):
((lambda (y) (f (f y))) (f (f x))) (f (f (f (f x))))
And finally the full expression:
(lambda (f) (lambda (x) (f (f (f (f x))))))
It works!
Now that we've crafted the proper definition of +, we can reason about why it works. Here's the definition again:
(define (+ a b) (lambda (f) (lambda (x) ((a f) ((b f) x)))))
For comparision, here's add-1 again:
(define (add-1 n) (lambda (f) (lambda (x) (f ((n f) x)))))
The definitions are similar. The primary difference is that in +, we're applying the result of the combination (a f), instead of just f. Note that when a in the expression (+ a b) is one, it's the same as (add-1 b).
In the definition of (+ a b), the ((b f) x) combination "strips off" the (lambda (h) (lambda (z) ...) part of the definition of b, making it a suitable argument for the procedure returned by evaluating (a f). In other words,
((a f) ...)
has the same form as
((b f) x)
only with b replaced with a and x replaced with the "stripped" Church numeral representation of b. Then the application of the result of (a f) to that representation strips off the (lambda (g) (lambda (y) ...) part of that numeral, leaving just the composition of a and b using procedure f and argument x. The final (lambda (f) (lambda (x) ...) in the body of the definition of + wraps that representation so that it can be used in subsequent Church numeral operations.

