And now for something completely different

Posted in Uncategorized by Beta on the July 29th, 2008

With this cite of Monthy Python, I will move into another family of calculi. What’s wrong with our good ol’ λ-calculus? Nearly nothing, except if you want to implement it in a real PC.

In one of the firsts posts I’ve mentioned the α-congruence. The idea is that terms that only differs in the name of the variables are “the same”. For instance, take

T1 = (λx. x)
T2 = (λy. y)

We said T1 ≡ T2 to say that T1 and T2 are in the same class of equivalence modulo α. In other word, they are the same.

Suppose that we want to store a set of terms with its results. Something like a dictionary of terms used as a cache. Suppose that in that dictionary we have

key: (λx. x) (λx. x), value: (λx. x)

That is, given the identity term applied to himself, the result will be the identity term.

Suppose now that we are computing the result of the term

T = (λy. y) (λy. y) z

It will be a good idea to use the information in the cache, in order to make less computations. But the term in the cache differs from the term T in that the variables aren’t the same. So we need to rename them in order to compare them.

A guy named Nicolaas Govert de Bruijn invented a way to avoid this renaming of variables. The idea is quite simple, but yet hard to grasp. Instead of naming the variables, enumerate them by the depth of that variable in the term (respect to the λ operator that bind it). So, for instance, the identity term will be

(λ1)

That is, the 1 refers to the the first λ, looking from the 1 point of view. Another example: (λx.λy. x) will be

(λ(λ2))

Another example: (λx.λy. x y) z will be

(λ(λ2 1)) 1

Note that the free variable z is called 1 in this calculus. Why? Because it can be bounded to an invisible outer λ:

z.(λx.λy. x y) z)

Exercise: In the following term, which index (that’s the new name for variables) corresponds to which lambda? I will omit some parenthesis.

(λλλ1 2 3 5)

Answer (don’t look before thinking!):

(λλλ1 2 3 5)

Another exercise:

λλ(1 2 3) 1

Answer:

(λλ1 2 3) 1

Note that the gray numbers refers to the same variable in a lambda not shown (i.e. a free variable).

Formalizing, the terms in λDB (that’s the name of λ-de Bruijn) are:

T ::= n | λT | T T’

where n is a natural number.

We now how to write λDB terms. Now we need to make them calculate something. Let’s start by the β-reduction:

(λa) b → a{1 ← b}

That’s “if I have a function, with the body a, and I execute it with the parameter b, I have to replace every occurrence of 1 in a by b“. Why 1? Because that’s our variable for the λ we are taking off.

Let’s try to define a{1 ← b} by a colored example.

(λλ1 2 3) 1 2 → (λ1 2 3){1 ← 1} 2

Suppose that we said:

(λa){1 ← b} = (λa{1 ← b})

we are making a big mistake: (λ1 2 3){1 ← 1} = λ(1 2 3){1 ← 1}

Now we are changing the red 1, instead of the yellow 2! In order to avoid that, we have to increment the variable we want to replace each time we cross a λ:

(λa){i ← b} = (λa{i+1 ← b})

Let’s continue with our example:

(λ1 2 3){1 ← 1} 2 = λ(1 2 3){2 ← 1} 2

Now we have to distribute the replace function:

λ(1 2 3){2 ← 1} 2 = λ(1{2 ← 1} 2{2 ← 1} 3{2 ← 1}) 2

which gives us another rule:

(a b){i ← c} = a{i ← c} b{i ← c}

Now we reach the variables, so we have to make the replacement. Suppose that we just put the 1 when we found a 2:

λ(1{2 ← 1} 2{2 ← 1} 3{2 ← 1}) 2 = λ(1 1 3) 2

Something does not looks right. Take a look at the grey numbers. We know that both refers to the same variable. But why they are different? The two of them must to be changed in order to points to the first free variable. First, let’s take a look at the 1. Remember that we have to cross a lambda, so now the free variable to what it refers is one lambda away. So instead of just put the term we are replacing, we have to update it to increase the variables for all the lambda we crossed. For that, I will introduce a new operator, called “U” (Update). Having fixed the 1, let’s fix the 3. The three right now points to the second free variable, because it has lost one lambda. So, we have to subtract one in each free variable. So, the rules for replacing a variable are:

i{i ←b} = U(0, i, b)

i{j ←b} = i-1   if j > i

i{j ←b} = i   if j < i

In our example:

λ(1{2 ← 1} 2{2 ← 1} 3{2 ← 1}) 2 = λ(1 U(0,2,1) 2) 2

The U operator is simple. The first parameters count the lambda it crosses, in order to take into account the bounded variables (i.e. the variables that don’t need updating).

U(i, j, (a b)) = U(i, j, a) U(i, j, b)
U(i, j, (λa)) = λU(i+1, j, a)
U(i, j, n) = n + j - 1   if n > i
U(i, j, n) = n   if n <= i

Continuing with our example:

λ(1 U(0,2,1) 2) 2 = λ(1 2 2) 2

Now we can execute the beta reduction:

λ(1 2 2) 2 → (1 2 2){1 2} = 1{1 2} 2{1 2} 2{1 2} = U(0, 1, 2) 1 1 = 2 1 1

Note that the expression U(0, 1, x) = x for all term x.

Something I’m not going to show is that there is a bijection between the lambda calculus classic and this calculus. By bijection I mean that they computes the same things in the same way. If we define a translation between terms T (suppose that if t is a lambda term, then T(t) is a lambda DB term, an viceversa).

So, given a term t either in lambda or lambda DB,

t → … → t’

implies

T(t) → … → T(t’)

Well, that’s it for lambda DB. Until next post!

How to create “recursive” functions in λ-calculus

Posted in Uncategorized by Beta on the June 13th, 2008

As you probably already now, recursive function are those that call them self. For instance, you can define the factorial function as:

fact(0) = 1
fact(n) = n*fact(n-1)

This doesn’t look like something you can say with the small syntaxis of the λ-calculus. That is right, but I will prove that you can say something a bit different -but with the same meaning. By doing so, I will prove that it’s possible to create recursive functions in this calculus.

Let’s try to transform the factorial function into λ-calculus sintax.
First of all, we have to get rid of the different cases. That’s easy:

fact(n) = if (n = 0) then 1 else n*fact(n-1)

I invite the reader to read the post “Arithmetics with λ-calculus”. Because in λ-calculus there is no if, nor booleans, we have to create them. I will use the same technique for the numbers:

true = (λx y. x)
false = (λx y. y)

if = (λx.x)

true is the function that given two parameters, returns the first one. false is similar to true, but it returns the second parameter. That sounds much like the if and else part of the if structure, respectively. That’s why if is just the identity function. So
if condition s t → condition s t
and if condition → … → true
then true s t → s

(and the oposite for false)

Just for fun, let’s make some examples:
and = (λx y. if x y false)
or = (λx y. if x true y)
not = (λx y z. if x z y)

Why does not receive more parameters than and??? Because for and we need just to return the correct boolean, and the boolean will make the “if” thing. But for the not, we’ve to change the order of the parameters in the “if”.

So know we’ve got the booleans. Now, let’s define a useful function for our factorial:

iszero = (λx. x alwaysfalse true)
alwaysfalse = (λx.false)

Remember that a number n is the n-ary application of a function in a parameter. In this case, if the number x is greater than 0, then it will apply x times the function that always returns false. So

iszero 1 → … → false

and

iszero 0 → … → true

So we have almost everything for our factorial function. We have the functions mult, iszero, pred, if.

fact = (λn. if (iszero n) 1 (mult n (fact (pred n))))

Now, as I said before, when we use the name iszero, we are meaning the function written above, with all the parenthesis, λ’s, etc. So, we still have something to change in our fact expression. Supose that we change the fact label in the body of the fact definition… it will be changed by another expression with a fact inside it, so we change it again… And we continue until we hang ourselves, or until we got killed by bore.

As the bride says: “how do we solve this dilema”. And, as Bill answers “Well, It just happens I’ve got the solution!”.

If we find a function (let’s call it Fix) like this:

Fix f → … → f (Fix f)

then we are done. Let’s change the fact function so it also take a function as a parameter:

fact = (λf n. if (iszero n) 1 (mult n (f (pred n))))

So, (for simplicity, I will write fact instead of the term above)

Fix fact 3 → … → fact (Fix fact) 3 = (λf n. if (iszero n) 1 (mult n (f (pred n)))) (Fix fact) 3

putting the (Fix fact) parameter inside the term:

→ (λn. if (iszero n) 1 (mult n ((Fix fact) (pred n)))) 3

putting the 3 parameter inside the term:

→ if (iszero 3) 1 (mult 3 ((Fix fact) (pred 3))))

iszero 3 is false, and pred 3 is 2, so

→ mult 3 ((Fix fact) 2))

And, again Fix fact is fact (Fix fact)

→ mult 3 (fact (Fix fact) 2))

We now how the story ends: →…→ mult 3 (mult 2 (mult 1 1))) → … → 8

No, just checking if you’re still with me. The number is, of course, 6.
Again, if you don’t understand anything, try writting it down in a paper, and follow the ideas.

You can see the Fix function as something that feeds the original function (fact) with the next step to execute. Fix exists:

Fix = (λf. (λx. f (x x)) (λx. f (x x)))

Fix g = (λf. (λx. f (x x)) (λx. f (x x))) g → (λx. g (x x)) (λx. g (x x))

→ g ((λx. g (x x)) (λx. g (x x))))

and that exactly:

g (Fix g)

That’s it for pure λ-calculus. The next step is to see different calculus that will address implementation problems that arises when trying to make a λ interpreter. ‘Till next post!

A little summary of what we’ve learned

Posted in Uncategorized by Beta on the May 16th, 2008

This new post is intended to summarize all what is written in this page.

First of all, an errata from the previous post: I encourage the reader to define other functions, like the predecesor of a number… Well, I can’t remember if I planned to be that evil, or if I was just too drunk or what, but the fact is that that was an open question for many years. So if you made it (without cheating), you can say out-loud “I’m a Master in lambda calculus!!!”. But nobody will care, to be honest.

The λ-calculus formalism

I’ve learned how to write lambda in the posts! I will present the λ-calculus in a more formal way. Let V be a (countable) set of variables. The terms T in the calculus are defined in a semantic way as:

T :== x | (T T’) | (λx.T)

(A term is a variable, or an application, or an abstraction. Recall to the 3rd post.)

The β-reduction is the rule

(λx.T) R → T[x/R]

where the substitution T[x/R] is defined as:

  1. x[x/R] = R
  2. y[x/R] = y
  3. (T T’)[x/R] = T[x/R] T’[x/R]
  4. (λy.T)[x/R] = (λy.T[x/R])

The idea is quite simple: we iterate through the terms until we find a variable. If the variable is what we are looking for, then we replace the variable for the new term we want in that place. This works the same as the find/replace function of any text processor. A little tricky thing is that we are assuming that the variable names are different in each abstraction. This is like what we’ve done in the previous post by coloring the variables in order to distinguish them. If not additional rules are needed, and a condition is required in rule 4 (do not pay too much attention to this):

4. (λy.T)[x/R] = (λy.T[x/R]) if y does not appear in R
5. (λx.T)[x/R] = (λx.T)
6. (λy.T)[x/R] = (λz.T[y/z][x/R]) if y does appears in R and z does not appear in any of the terms.

Rule 6 makes a name change to avoid a variable capture (we don’t want the y variable in R to be “captured” by the new λ).

Examples

I was being criticized for the lack of examples. So here are a few. Maybe I add some others later. Again, I highly recommend to follow them in paper.

1) A function (let’s call it ‘K’) that given two parameters, returns the first one:

K = (λx.(λy.x))

K u v = (λx.(λy.x)) u v → (λy.x)[x/u] v = (λy.x[x/u]) v = (λy.u) v → u[y/v] = u

Note: I’ve to explain the different use of ‘=’, and the difference with ‘→’ in this example. The first ‘=’ in the definition of K, is a way of saying “whenever I say ‘K’, I’m meaning the term (λx.(λy.x)))”. So in ‘K u v = …’ I’m just using the definition of K. Then we have the → for the β-reduction. Then we have ‘=’ for the substitution. Why ‘=’? Because it is an operation that have to take the first term to the subtituted term “in one step”. I mean, I can write (λy.x)[x/u] = (λy.u). It is an atomic operation.

2) A function (let’s call it ‘S’) that given three parameters, distributes the third parameters into the other two. That is, “f(x,y,z) = x(z) y(z)

S = (λx.(λy.(λz. (x z) (y z))))

S u v w = (λx.(λy.(λz. (x z) (y z)))) u v w → (λy.(λz. (x z) (y z)))[x/u] v w = … = (λy.(λz. (u z) (y z))) v w → (λz. (u z) (y z))[y/v] w = … = (λz. (u z) (v z)) w → ((u z) (v z))[z/w] = … = (u w) (v w)

Arithmetic with Lambda Calculus

Posted in Uncategorized by Beta on the May 2nd, 2008

One simple way to make arithmetic expressions in lambda calculus, is just to extend the calculus with the arithmetic rules:

(λx.(λy. x+y)) 2 3 → (λy. 2+y) 3 → 2+3 = 5

But that’s pretty easy for us, and (as you problably have already noticed) we like to complicate what’s easy. So let’s try to make the sum with the base lambda calculus. Why? Because it’s fun!

First of all, a bit of notation: let’s say that we can write the expression (λx.(λy. T)) (i.e. the function that takes a parameter and returns the function that takes another parameter and returns the generalized term T) as λx y. T

Secondly, I highly recomend you to follow the examples in paper, trying to see why one step lead to another.

In the beginning there was a Beta reduction… But He (Alonso Church) says “Let there be the numerals”, and the numerals appears:

0: (λf x.x)

1: (λf x. f x)

2: (λf x. f (f x))

N: (λf x.f (… f (f x))

This is just a way of doing this. You can say “for me, the 0 is (λx.x), the 1 is (λx y. y), …”. Or “for me, all this stuff is pure crap. Got a life, you punk!”. I choose this way for one simple reason: it is in the wikipedia, and it looks good and easy to use.

So the numerals are functions. Why? What’s the meaning of that? Let’s try to answer. The 1 is saying “given two parameters, we apply the first one to the second one”. And the N says: “given two parameters, we apply N times the first one to the second one”. So it’s looks like a for loop expression in a normal programming language. In order to answer to the first question, let’s show how to sum:

SUM: (λn1 n2 f x. n1 f (n2 f x))

Note: because all the numerals takes two parameters, it is expected that the Sum function returns some numeral that also takes two numbers. That’s why we receive the two numerals (n1 and n2) and the two parameters every numeral needs (f and x). Let’s make a simple sum:

SUM 2 1: (λn1 n2 f x. n1 f (n2 f x)) (λf x. f (f x)) (λf x. f x) → we replace n1 by the first numeral: (λn2 f x. (λf x. f (f x)) f (n2 f x)) (λf x. f x) → we replace n2 by the second numeral → (λf x. (λf x. f (f x)) f ((λf x. f x) f x))

This is a mess of f and x, so let’s try to make a difference between them. I will put gay colors in it.

f x. (λf x. f (f x)) f ((λf x. f x) f x))

We can reduce the orange numeral with the fucsia variables:

f x. (λf x. f (f x)) f ((λf x. f x) f x)) → (λf x. (λf x. f (f x)) f ((λx. f x) x)) → (λf x. (λf x. f (f x)) f (f x))

Now we can reduce the green numeral with the fucsia f (we have to change f by f), and then we can replace the green x by the fucsia (f x) expression:

f x. (λf x. f (f x)) f (f x)) → (λf x. (λx. f (f x)) (f x)) → (λf x. f (f (f x)))

And now everything is girly fucsia! And what does this expression means? The numeral 3 of course!

If you want to play more with this arithmetic way of computations, you can try to define the other functions (multipication, predecesor, etc). Or you can take a look at

http://en.wikipedia.org/wiki/Lambda_calculus

Lambda Calculus

Posted in Uncategorized by Beta on the March 28th, 2008

From the Wikipedia: “calculus can refer to any method or system of calculation”, and that’s not an exception for the Lambda Calculus. First of all, lets define the terms of this calculus:

A Lambda Term is something with the form

  • x where x is in V a set of variables.
  • (u v) where u and v are terms. We use “v is applied to u” in this case.
  • (λx.u) where x is in V and u is a term. We use “an abstraction of x in u” in this case.

Here is one example: (λx.(λy.z) x)

But we cannot compute anything with just terms, so lets show the only rule use to transform these terms into a computation. This rule is quite famous (yeah, I know… famous in the Geeks Hall of Fame…), it is the “Beta reduction”:

(λx.u) v → u[x/v]

In order to avoid critics, my nick “Beta” doesn’t came from this rule, ok? I will not call my dog “Albert” nor stuff like that!

And now, let explain the meaning of u[x/v]. Given terms u, v, and a variable x, u[x/v] means “replace each free occurrence of x in u with v“. By free occurrence of a variable x in a term u we mean “each time x does not appear inside a term of the form (λx. …)“.

It is not as hard as it looks. Some examples:

  • x[x/y] = y
  • (x y)[z/(λf.f)] = (x y)
  • (x y)[x/(λf.f)] = ((λf.f) y)

And now some complete examples of lambda terms with beta reductions:

  • x does not reduces.
  • (λx.x) does not reduces also.
  • (λx.x) y reduce to x[x/y], which turns to be equal to y. As you can see, we are applying the term y to the function (λx.x). This function is called “identity”, because computes the identity function f(x) = x.
  • (λx.x x) (λx.x x) reduces to (λx.x x)[x/(λx.x x)] = (λx.x x) (λx.x x) that reduces to (λx.x x)[x/(λx.x x)] = (λx.x x) (λx.x x) and so on…

Another important concept is the alpha conversion. Alpha conversion means that we will treat all the terms that differs only in the name of the bounded variables (variables that are not free) as the same term. I.e. (λx.x) and (λy.y) will be the same term for us.

In the next post, I will show how to use this calculus to compute the arithmetics functions, so don’t miss it!

Basis of Term Rewriting Systems

Posted in Uncategorized by Beta on the March 15th, 2008

A Term Rewriting System (TRS) is a very powerful yet simple math tool used, for instance, for the creation of computational models. Given a set of function symbols S, and an (uncountable) set of variables V, I will define rewriting rule by example.

Suppose we have the “constant” function (function without parameters) 0 (representing a 0 in the natural numbers), the unary function S (as the successor of a number), and the binary function + (as the add operation). Let x and y be in V. These are rewriting rules to compute the add of two numbers:

0 + x → x
S(x) + y → x + S(y)

The arrow can be seen as a “rewrite as” relation. For example, S(S(0)) + S(0) rewrites by the second rule to S(0) + S(S(0)), then to 0 + S(S(S(0))), to achieve the final computation with the first rule: S(S(S(0))). And this is exacty what we expected: 2+1 = 3. Yeah, I know that it doesn’t look like a very big science discovery, but we will soon see that with more complex rules we are going to be able to compute anything.

My Master in Computer Science’s Thesis

Posted in Uncategorized by Beta on the March 13th, 2008

This blog will contain all the news about my progress in my master thesis, entirely sponsored by Manas Technology Solutions. The idea will be to translate a lambda calculus with explicit substitutions with named variables to a calculus à la de Bruijn. If you have no idea what I’m talking about, don’t be afraid, wait for more posts to come!