The last time Hackerfall tried to access this page, it returned a not found error. A cached version of the page is below, or clickhereto continue anyway

Lisp in Your Language

Im a fan of Lisp programming languages, but theres an incredible conceptual elegance that struggles to materialise as readable elegance for many unfamiliar programmers. The underlying concepts are incredibly simple, but the learning curve can represent a disproportionate challenge.

Brief History

Lisp is a derivation of the phrase List Processing. The fundamental idea of the language is that you represent your ideas and constructs as data structures, rather than with structured syntax. Specifically you represent them as lists.

Constructs you may be used to seeing implemented with special syntax or keywords suddenly become similar to the example above above.

(if (= 5 5)
  (print "Sanity!")
  (print "Insanity!"))

if is just a special function that evaluates a condition, if that condition is found to be true, it evaluates the second argument otherwise it evaluates the third argument.

These functions are often known as special forms. Core bits of syntax are often implemented as special forms, but theres nothing particularly special about them. You can implement them yourself using macros. Clojurelike many Lispsimplements many of the core constructs with macros.

Weve been writing code to manipulate data for a long time now. When your code is also data, you can write code to manipulate code just as easily.

The essence of this wonder isnt Clojure though. Its not Racket or Scheme either. These are all just different incarnations of the code-as-data idea. These languages certainly arent the only ones with functions and lists!

What if we could write code-as-data in our language of choice?

An Experiment

Theres a Lisp hidden in many popular programming languages, although it may take a bit of work to uncover it. You may have to do things you wont be proud of, but if you can think of a programming language with lists and higher-order functions, then it will be there. Take Javascript, for example.

What is stopping us from simply translating the syntax from the above example into Javascript?

Nothing, except it doesnt do much. It returns an array that contains a function and a string. Just the way Lisp wants. But our Javascript runtimes arent expecting us to be writing code this way. If it was possible to ask them to try and execute all arrays as though they were functions, there would be chaos.

Were going to have to do a little bit of work to make this happen. Lets define an eval function which will interpret an expression.

function eval(expression) {
  // the first item is the function
  var fn = expression[0];

  // the remaining items are the arguments
  var args = expression.slice(1);

  // call the function with these arguments
  return fn.apply(null, args);
}

And to see it in action:

eval([alert, 'Hello world!']);
// alerts 'Hello world!'

Thats it, weve implemented a (very minimal) Lisp. We can try out some other built-in functions too. From now on, the call to eval will be omitted from examples for brevity.

[parseInt, '4.41'] // 4
[isNaN, 103]       // false
[btoa, 42]         // 'NDI='

Theres a good reason why our eval function wont work if you try it with console.log or document.write, so stick to alert for now.

Expressions All The Way Down

From here on, well refer to the lists in our code as expressions. This helps distinguish them from list data structures. What happens when we try and evaluate an expression that already contains another expression?

[alert, [prompt, "What is your name?"]]

We get an alert that tries to alert the inner expression as though it was an array. We need to make our eval function understand that if it finds an expression as an argument, it should evaluate it as code, not data.

function eval(expression) {
  // the first item is the function
  var fn = expression[0];

  // the remaining items are the arguments
  var args = expression
    .slice(1)
    .map(function(arg) {
      // if this argument is an expression, treat it as code
      if(arg instanceof Array) {
        return eval(arg);
      } else {
        return arg;
      }
    });

  // call the function with these arguments
  return fn.apply(null, args);
}

Now weve got some recursion in the mix, were getting somewhere. This function will evaluate every array it finds, no matter how deep into the structure.

[alert, [prompt, "What is your name?"]]

Syntax & Names

So far, so good, but how would we do Maths?

Like it or not, this is definitely going to give you a syntax error.

One of the genuine benefits of picking a language that already understands Lisp is that the simplicity of the syntax leaves an abundance of characters to use as identifier names. For instance, in Clojure + is just the name of a function that happens to be responsible for adding numbers.

When we want to borrow these transcendental concepts in our syntax heavy languages, we have to do some extra work.

function add(a, b) {
  return a + b;
}

[add, 5, 5] // 10

This is elegant for sure, but theres scope for more mischief here. Try this instead.

['+', 5, 5] // Error: '+' is not a function

Lets define some native functions.

var native = {
  '+': function(a, b) {
    return a + b;
  },
  '-': function(a, b) {
    return a - b;
  }
};

[native['+'], 5, 5] // 10

This ends up feeling verbose, but some tweaks can alleviate it. Pass your native object to eval as a second argument.

function eval(expression, native) {
  // the first item is the function or it's name
  var fnName = expression[0];

  // resolve the function from native if necessary
  var fn = typeof fnName === 'string' ? native[fnName] : fnName;

  // the remaining items are the arguments
  var args = expression
    .slice(1)
    .map(function(arg) {
      // if this argument is an expression, treat it as code
      if(arg instanceof Array) {
        return eval(arg, native);
      } else {
        return arg;
      }
    });

  // call the function with these arguments
  return fn.apply(null, args);
}

['+', 5, 5] // 10

Hopefully, youre wondering why this doesnt feel like the zen of simplicity that is associated with Lisps. And youre right. Its not. But if you wanted simple, then you should ask yourself what on earth are you doing reading about implementing a makeshift lisp in an already confused programming language?

This is a sandbox for us to do unreasonable things in. Missing out on these kinds of hacks would be a wasted opportunity. Go ahead and implement +, -, *, /, = and any other operators you think might be useful as native functions. Well use them later on.

Variables

A language without variables would be difficult, so well implement them.

function def(name, value) {
  window[name] = value;
}

[def, a, 5]

Our def function takes a variable name and a value to assign to it, then it binds it onto the window objectwhich is the global scope in Javascript. However, theres a real elephant in the expression. We arent responsible for resolving the values of variables within the expression. The Javascript implementation is going to do that for us.

It will try to resolve the value of a. We havent declared it, so it will throw an error. Or even worse, if we have declared it, but not initialised it, well end up with undefined as our name argument. Of course Javascript has an excellent way of dealing with this. Coerce undefined to a string, then use it as a key all the same (oh, Javascript).

Ah well. The obvious solution is to pass the name as a string instead.

[def, 'a', 5]
[alert, ['+', a, a]]

Great, except it still doesnt work. The second expression is evaluated by the runtime before we get a chance to interpret the first. How did we solve this last time? Use strings instead.

Scope

[def, 'a', 5]
[alert, ['+', 'a', 'a']]

Now we have to try and resolve every string argument as a variable. Were also going to have do the same with functions, so that we can use variables as the first item in lists.

Lets bite the bullet and introduce a simple scope, then have all strings refer to values within it. If a string doesnt refer to a value, then well just use its raw value.

Instead of accepting the native object as a second argument, accept a scope object instead. This way, we can pass our native object in as the root scope object and nothing will break.

function eval(rawExpr, scope) {

  // if the expression isn't a list, just return it
  if(!(rawExpr instanceof Array)) {
    return rawExpr;
  }

  // use existing local scope or create a new one
  scope = scope || {};

  // resolve all our new string names from our scope
  var expression = rawExpr.map(function(symbol) {
    if(symbol in scope) {
      return scope[symbol];
    } else {
      return symbol;
    }
  });

  // the first item is the function
  var fn = expression[0];

  // the remaining items are the arguments
  var args = expression
    .slice(1)
    .map(function(arg) {
      // if this argument is an expression, treat it as code
      if(arg instanceof Array) {
        return eval(arg, scope);
      } else {
        return arg;
      }
    });

  // call the function with these arguments
  // and expose scope as this
  return fn.apply(scope, args);
}

We used the first argument of .apply to expose the scope as this to each of our functions. Well define a new, native version of def to show this in action (excuse the pun).

var native = {
  def: function(name, value) {
    return this[name] = value;
  },
  print: console.log.bind(console)
};

We can also add a print method, just in case you were fed up of using alert. Lets test that out.

['print', ['def', 'a', 5]]

It may not be the most beautiful code youve ever seen, but it works.

Special Forms

Weve got evaluable expressions, but we dont have any way to control them. Theres no sense of a conditional statement, a function, or even a way to execute multiple expressions at once.

Our eval function currently tries to interpret every expression it sees. Well have to denote that some functions are special forms that will handle the evaluation of their own arguments.

function SpecialForm(fn) {
  fn.__isSpecialForm__ = true;
  return fn;
}

Then well tweak the eval function, to prevent it from evaluating expressions that are arguments to a special form.

// ...
// the first item is the function
var fn = expression[0];

// the remaining items are the arguments
var args = expression
  .slice(1)
  .map(function(arg) {
    // don't evaluate the expression if it is a special form!
    if(arg instanceof Array && (!fn.__isSpecialForm__) {
      return eval(arg, scope);
    } else {
      return arg;
    }
  });
// ...

Do

Lets test out our new special forms and implement do. It evaluates all of its arguments, which allows us to evaluate multiple expressions in series.

In traditional Lisp:

(do
  (print "Hello")
  (print "World!"))

Well add it as a new native function.

var native = {
  'do': SpecialForm(function() {
    var exprs = [].slice.call(arguments);
    return exprs.reduce(function(_, expr) {
      return eval(expr, this);
    }.bind(this), null);
  }
};

We can also do a nice trick with reduce to make sure that the value of the last expression is returned.

Lets translate the example above to our new syntax and watch it run.

['do',
  ['print', 'Hello'],
  ['print', 'World!']]

// Hello
// World!

If/Else

What good is a programming language without conditionals? The next challenge is implementing if statements. Howeverwith our new special formsit should be trivial.

var native = {
  if: SpecialForm(function(condition, success, failure) {
    var passed = eval(condition, native, this);
    return eval(passed ? success : failure, native, this);
  }
};

Thats it. if/else in 3 lines of code.

['if', ['=', 3, 3],
  ['print', 'true'],
  ['print', 'false']]

// true

If this is your first time implementing a Lisp, this should be a special moment. You have implemented conditional control flow as data.

Functions

Functions are the last hurdle between here and having a language that can actually do things. However, its quite a hurdle.

Heres what they look like in more conventional Lisps.

(def shout
  (fn [name planet]
    (print planet name)))

This is actually an anonymous function being bound to a local variable with def. We already have an implementation of def so all we need now is an implementation for fn.

Lets break down the arguments to fn.

The first one is an list of arguments and the second one is the expression (or function body).

var native = {
  fn: SpecialForm(function(defArgs, expr) {
    return function() {
      var callArgs = arguments;

      // create a distinct new scope
      var childScope = Object.create(this);

      // use the arguments definition to bind each call arg
      // to the appropriate scope variable.
      defArgs.forEach(function(argName, index) {
        childScope[argName] = callArgs[index];
      });

      // evalue the function body
      return eval(expr, code, childScope);
    }
  })
};

There it is. Dynamic binding into a lexical scope. Can we just take a moment to agree that prototypal inheritance rocks, too?

['do',
  ['def', 'shout',
    ['fn', ['planet', 'greeting'],
      ['print', 'greeting', 'planet']]],
  ['shout', 'hello', 'world']]

// hello world

This could definitely be less verbose, so we can take a hint from some other Lisps and create defn too.

var native = {
  defn: SpecialForm(function(name, args, expr) {
    var fn = native.fn.call(this, args, expr);
    return native.def.call(this, name, fn);
  })
};

We simply tie together our existing implementation of def with fn.

['do',
  ['defn', 'shout', ['planet', 'greeting'],
    ['print', 'greeting', 'planet']],
  ['shout', 'hello', 'world']]

// hello world

Much better.

Once a language has functions, the sky is the limit.

["defn", "fib", ["n"],
  ["if", [">", "n", 1],
    ["+",
      ["fib", ["-", "n", 1]],
      ["fib", ["-", "n", 2]]],
    1]]

No self-respecting functional programming demo comes without a horribly inefficient demo of a non-memoized recursive Fibonnaci implementation. This one is no exception.

["print", ["fib", 8]]
// 34

Considerations

You might have noticed that our code is completely JSON compliant. We use primitives and lists. This means we can actually use JSON as source files for our language.

What? You mean we can embed a language with first class functions inside JSON? Yeah, we can.

Our language is still very short on the ground in terms of a standard library. We havent really considered data structures, namespaces, exceptions, debugging or macros either.

Conclusion

Im putting together an implementation of this Javascript Lisp, along with a REPL and a growing set of native functions on Github. Feel free to use it as a reference. Its important to remember is that this is a toya sandbox for learning. Its not meant to be taken seriously and it certainly shouldnt be used in any real systems. Its inefficient and insecure.

Heres a short video of the REPL in action.

More than anything else, implementing a programming languageno matter how small or strangeis a great way to learn about the language you implement it in. Language design in general is a fairly eye-opening experience and hopefully this has also helped open your eyes to the simple, yet powerful nature of Lisps.

Ill revisit this language again in the future, to talk through the process of implementing macros, then well move as much native code as possible inside the language.

Now open your editor and do this again in another language, then tweet me when youre done!

Continue reading on danthedev.com