The last time Hackerfall tried to access this page, it returned a not found error. A cached version of the page is below, or click here to continue anyway

A Little Code Here and There: The evolution of a programming language over four years.

Four years ago, on a Sunday, I had the idea of creating a programming language. I wanted something statically-typed, but also interpreted. I had tinkered parsers and lexers previously, but had little experience with an ast or an emitter or a vm. "I'll figure it out as I go along".

Today, I'd like to share what I've done in the four years since I started. I've decided to break things down into half-year chunks. Each chunk will cover some major decisions/events, both good and bad. Finally, I'll wrap up with some thoughts on the whole thing.

In the beginning (0.0 - 0.5)

The core of the interpreter is established. The lexer is made, which tokenizes input and makes it available to the parser. The symtab provides a means of creating new vars, and soon, scoping of vars. There are no user-defined classes yet: Classes are instead hardcoded in. The last interpreter I worked on didn't have an ast, and I didn't know how to actually create an ast. So I had to look online for a tutorial, and that's where I got the initial structure of the ast. The emitter takes the ast and writes debugging code. I also created the basic structure of the vm and debug as well. Because I want everything to be fast, I decide to hand roll everything. No flex, yacc, or bison. I did know those were available, however.

The vm is 'address-based'. The bytecode is an array of 64-bit integers, which are actually addresses to vars in the symtab or intermediate storages created by the emitter. This had the advantage of being quick and easy to model as all opcodes would just target addresses. However, this also had problems that I would discover later: By using addresses, code was very, very large. Worse, all intermediate values were in a single pool, but shared between different calls. This is a problem that I didn't contend with until next year.

'show' is a key part of Lily. If you pass it a native function (a function defined within Lily), it tells the instructions that make it up. This was a really neat tool for me, because it made it possible to determine if a bug was the result of the emitter writing bad instructions, or if it was the result of the vm reading the instructions wrong. However, at this time it was something locked up within the interpreter's internals. It was still useful for me in building the interpreter though.

The symtab and lexer have no concept of multiple files. At this point, I was just glad to get everything working.

Around three months in, I decided to convert the entire interpreter to use local structures instead of global ones. This makes it possible to create a new instance of the interpreter and fire it off. This was the first big rewrite of the core, and also one of the most time-consuming. Every function in the core had to be rewritten to support that.

The raiser is created. This provides a means of raising an error throughout the interpreter. However, it's called the error object at this time.

There is some leakage of responsibilities at this point. The ast pool is responsible for determining that a call has the right number of arguments. The parser also has a function called 'determine_ast_type' that attempts to walk through an ast and determine what type it will come up with. Emitter has this info, but I couldn't yet figure out a way to have the ast pool store a string without actually allocating it per-field.

C's double type is provided as number, and the string is called str. It is also possible for any value to be nil (unset), but there is no way of checking that. All of these things are bad.

More of the basic stuff (0.5 - 1.0)

A class called object now works. This type can be thought of as a box that can have any value assigned to it. No, other values do not derive from object, making the name a poor choice.

A callable thing that is defined within Lily code is called a method. If it comes from C, it's called a function. Distinguishing between the two is a bad idea, something that will be uncovered later.

'if' support finally works. Lily's syntax is unique in this regard. You can use

if x: <expr>

and it works for a single expression. One of Lily's interesting syntax decisions is made here: If a curly brace appears after a block, then all entries of that block support multiple expressions. Example:

if a: {     expr1     expr2 else:     expr3     expr4 }

The meaning is, I feel, clear, but without having to put curly braces before and after the


. One might also notice the influence of Python in the


after conditions. Early on, my greatest influence was Python and the Python repl. Python's dis.dis was invaluable for determining if I had done correct evaluation of ast nodes. was created. This became -the- test that I would manually run after each change, unless I didn't. Also, became THE one test that I focused on. Sometimes I forgot to run it manually.

The first real class method, string.concat is created.

It's getting there... (1.0 - 1.5)

I experience a great deal of problems with merging asts. I did not write an ast dumper like I wrote an opcode dumper. By the time I realized both how to do so, and the usefulness of it, the ast pool became a solid piece of code and it wasn't needed.

printfmt is added. This is a simple fprintf-like function which I ended up using somewhat. printfmt was important, because it served as a test case for things being converted to class object when they need to be.

Both methods and functions need to provide a return type in their signature. A declaration might look like this:

method do_nothing(integer a, integer b):nil { }

If you wanted to pass a method as an argument, things got rather...interesting.

method take_method(method m(integer, integer):nil):nil { ... }

Remember how code is composed of addresses and temporary values are addresses too? It turns out that there's a few problems with that. One major one is that when calling a method, the caller needs to figure out what storages need to be saved. If they don't save all of them, then the target call may result in the caller's intermediate values being changed. This is, naturally, really really bad. I spent a great deal of time fixing this sort of issue.

Some time in October, 'aft' is introduced. Lily's allocation strategy when an allocation fails is to free up any values inside of the local function, and tell the raiser that there is no memory. aft exists to make sure this absolutely certain (It's short for allocation fail tester). It does this by running the interpreter with a script, but only allowing one malloc/realloc. Then again, with two. Again, with three. It continues this until the script exits successfully. A good side-effect of this is that a lot of Lily's core structure was still easily changed. I feel that aft made me more defensive (ex: Creating a new var automatically adds it), and thus improved Lily's later core by ensuring that code would never ever leak memory.

lily_error, the error object used to raise errors is renamed into lily_raiser. It retains that name to this day.

I mentioned that string's concat method was introduced earlier, but not how it was loaded. The interpreter assigns an identity to each class. Builtin classes have a defined order. The symtab is responsible for loading up the concat method, and does it by converting integer ID's into classes. String's concat is roughly defined as "two arguments, {STRING_ID, STRING_ID, STRING_ID}". This is then used to build the appropriate function signature (a function that takes two strings and returns a string).

Let's make some terrible decisions (1.5 - 2.0)

Lists are finally added to the interpreter. Combined with the 'object' type, it is now possible to create a value which circularly references itself inside of the interpreter.

Consider "a list of integers" (


). This type is given the class of list. However, that does not fully describe it. It is a list of integers, wherein only integer is allowed to be put inside. I created the signature, which works for both functions and lists. A signature is composed of a class, a list of sub-signatures, and a count of the sub-signatures. Some classes, such as integer, do not have subtypes. As such, their signature can be shared (only one signature describes an integer). For functions (and methods), the first subtype is the return type, and is NULL if the call does not return anything. Similarly, a call has NULL at position 1 in the sub-signature array if it does not take any arguments.

As a bonus, it's possible for two complex signatures to be made which describe the same thing. A whole function internal to the interpreter had to be made to fix this.

Signatures are refcounted. A few times, I had problems with forgetting to ref/deref a signature when using it. A lot of bugs were caused by this.

"This is where you made a garbage collector, right?" Nope. Lily works based off of refcounts. I assumed that, in the event that something was assigned that was circular, I could put a flag on the 'object' that was the culprit and offset the references. Not only was this a stupid idea, it cost me a lot of time both in adding new cases and un-breaking different things.

A lily value is composed of three things: The signature, the raw value (which is a union), and the flags. As a means of saving memory, lists would only contain the flags and the element's (with the signature coming from the parent lists' value). This had the nice side-effect of making subscript get and set A COMPLETE NIGHTMARE.

Refine a couple things (2.0 - 2.5 years)

Signatures being refcounted is beyond stupid. Store them in a linked list, and never free them until the interpreter is being torn down. is created. This is the precursor to the current, and has the purpose of running all tests and ensuring that they work. There are a handful of tests that should fail. There are essentially two passing tests:, and started to grow around this time, eventually reaching about 1K lines of code. This was, in retrospect, a very bad decision because it made pinpointing where the error occured very difficult.

The builtin command to show stuff is finally rewritten to be nicer-looking. It is now finally available for use within Lily code, instead of being a special command hidden within the interpreter.

Old: [0] assign [var #4] [literal #2] New: |____ [0] assign (at line 3) | ====> (integer) var a (from line 3). | <---- (integer) literal (10)

I made dumb decisions. It's time to fix that. (2.5 - 3.0)

The first commit within this window is absolutely massive. It is a complete rewrite of all of Lily's internals so that everything is now register-based. Every method now counts how many values it needs (temporary ones, as well as the vars), and the vm requests that many for the call. Opcodes now target locations that are registers, instead of global addresses. When entering a method, the full stack of living values is adjusted such that stack[0] will point to register #0 of a method.

The symtab is now put in charge of making sure that any signature created is unique. If something wants to make a signature describing 'a list of integers', it pokes symtab to make sure there isn't something that describes that already. This means that signature equality can be tested using good old pointer equality (==) instead of a complicated function.

The hash class is added, using siphash24 for the collision detection. Since the value of a hash can be an object, this should call for extending circle_buster (the horrible 'not a garbage collector' that I was using), except...

I finally put a garbage collector into the language. The symtab is now responsible for tagging every signature that is made with information about if it's possibly circular or not. This decision is easy: If it contains type object, it is considered possibly circular. If it contains something tagged as possibly circular, then it's possibly circular. The vm uses this information to make a gc marker for only the things that need it (so a list of integers would not get a tag, but a list of object would, for example). A new class is expected to provide a marker for doing gc passes, a destroy function (for destroying the entire contents of the value), and a deref function for destroying the contents (but not the actual value itself, to prevent invalid reads if there's a circular reference).

How Lily stores values is rewritten. Each element of a list holds a signature, a raw value, and flags. This is huge, because I could finally create 'lily_assign_value' as a means of 'put this into that'. That was previously impossible under the old design.

Previously, parser would try to determine what class that a method applied against. To do this, it had to walk the ast without actually evaluating things. This was very bug-ridden. This finally got an axe. I did so by creating a string pool. The parser inserts a name 'ex: concat' into the pool, and gets an index. It then creates a tree with that index. The emitter then acquires the name from the pool at the given index. This ensures that various trees won't have strings of varying sizes allocated in them. Trying to reduce memory is one of the bigger parts of Lily, which is why I don't do that.

list::append is created. It is the first function that uses generic types. It is initially described as

list::append(list[T], T value)

. This works...but only because generic functions assume that the first parameter holds all type information. User-defined generic methods are not possible yet.

Packages are introduced into the language, but...they're values. As such, some precautions are taken in the parser to make sure that a value representing a package is not assigned to type object. Because packages are values but not really values, this is awkward, and very limited. The first package made is 'sys', which contains argv as a list of strings.

I decided to add autotools as a means of compiling Lily, now that Lily had three things using the core (mod_lily, the standalone lily runner, and aft).

Feature explosion (3.0 - 3.5)

I posted to /r/programming to ask for feedback about the language. Many people have issues with my naming of classes. C's double is a number, strings are str, methods, methods, and nility. Considering the complaints that I COULD have gotten (ur language sux), I consider this a win. A few people mention the use of an option-like type like what Haskell has as a means of removing nility. I begin to take a more serious eye toward the language at this point, and integrate their feedback. The vm is in charge of determining if a function is defined within Lily code (native) or not (foreign). My goal from here is to make it possible to have an option-like type, but it will be a steep road.

From here on: number => double, str => string, method + function => function, object => any.

Typecasts are now made so they are postfix:


, instead of their former

@(type: value)

. This is the beginning of thinking of things being more chainable and functional, rather than thinking more imperatively.

Dynamic loading (dynaload) starts to shine a bit here. Each class provides a table that is a linked list of entries of 'name, &cfunc, type info'. As an example, the interpreter doesn't initially load string::concat. The emitter contains all type info, and thus can determine if there is an access of the concat method. When there is, the emitter will call the parser to load up the class method. The reason this is deferred is that creating a method isn't cheap: A var has to be created to describe it, a type has to be made, and that type is likely to not be unique. I seem to recall a savings of about 500 bytes, but that's pretty big if there are lots of builtin things that you're not using. Also, this same action happens if the parser sees string::concat.

I finally got an apache module working, and found a way to provide post, get, httpmethod, and env for Lily to use. Unlike with PHP, these are provided in a server package.

A bunch of builtin library functions were added around this time.

The syntax for describing a function was changed too. Examples:

# This does not return a value. function f(integer a, integer b) { ... } # This must return an integer function g( => integer) { ... } # This does not take arguments or return anything function h() { ... }

Generics are improved. The new syntax uses letters starting from A, and enforces that order. I theorize that this is around the time I discovered Scala (which I delight in calling my gateway drug to functional programming). At this point, I began to lean more toward Scala as a reference instead of Python's repl.

Exceptions are finally catchable and raisable, but not inheritable. A first stab is made at user-defined classes too, but there's no inheritance, sadly. One of the interesting things about user-defined classes is that they take from Scala:

class Point(integer x, integer y) {     integer @x = x     integer @y = y } that the body of the class is the constructor of the class. No need to declare members, then initialize them in a constructor function later. The body of a class becomes a function called ::new, because I didn't want to create a 'new' keyword. The idea comes from seeing a Ruby library do it.

User-defined functions finally get access to generics, but it's really, really unstable at first. A great deal of effort is spent on discovering fun segfaults that occur with generics.

Builtin class methods and the like are defined using text now, instead of raw ids. This means that string::concat's internal definition moved from something like '2 args, {STRING_ID, STRING_ID, STRING_ID' to being 'function append(string, string => string)'. This means that all functions that are builtin are now dynaloaded: The interpreter only sets up the structures to grab them through Lily code when they're first referenced, no sooner.

It is now possible to create enum classes like what you would see in Rust. This means that, finally, it is possible to do

enum class Option[A] { Some(A), None }

as a means of creating a type that can either be None or have a value. As with Rust, it is not possible to have a bare variant (variants are always put into enum class values before being assigned to anything).

I finally spent some time unifying all of my tests. was created, and is responsible for running all passing and failing tests. Previously, failing tests were run by a separate script which understood 'run this, and this string describes what the output should be', and passing tests run by Now, tests can specify what their result should be. As the name suggests, is run before each commit, and thus traps a lot of errors before they hit.

Refine the internals and my vision (3.5 - 4.0)

Match is now possible for enum classes. However, when decomposing values, it is not possible to skip values using, say, an underscore like Scala does.

The type system starts to take center stage in the language. As such, I eventually extract the type system information that was hardcoded into the emitter and abstract it off into the new 'type system'. This module now sits alongside the others (ast pool, vm, emitter, symtab, lexer), a testament to how important that types are to the language.

Lambdas are added to the language, using a

{|<vars>| <expr> }

. Sadly, lambdas can only contain a single expression. One thing I am extremely proud of here, is that I am able to have lambdas infer types. I am currently planning to have lambdas be able to contain a full multi-line statement, but have not yet gotten to it.

The keyword


is now used to create new functions, and signatures are now called types. This makes 'type inference' sound more sensible.

Single inheritance is added, but there is no such thing as private, protected, or public.

Another Reddit introduction is launched. One person makes a good point about types. As such, the syntax of how Lily declares things is changed so that it is now

var v: <type> = value

, with the

: <type>

being optional. Function arguments are done like this too now. This puts type inference, and the type system center stage: The type is specified only when it needs to be, and is otherwise inferred from the right side. Let inference do the work for you. The reddit thread is also the reason that comments go from

### ... ###

to their current

#[ ... ]#


Custom allocator support was added, then removed. This annoyed me too much to stay around, as it required passing around an allocator function a lot. Similarly, Ruby's symbols were added, then removed, as I don't feel they "fit" well with the language. The out of memory behavior for the language is now changed: Allocations simply abort if there is no memory available. This simplifies a lot of stuff in the interpreter, and allows getting rid of aft.

The bytestring class is added. This class holds stuff that can have \0 values inside of it, or invalid utf-8. The string class is thus assumed to always have valid utf-8, or simply ascii-only codepoints inside.

import is finally added. It works largely how Python's import does, except there's no

from x import *


import x.y

syntax. The first is intentional, the second is unfortunate. One may note that Lily has two modes: Tagged, and untagged. An imported file always runs in untagged mode. This allows files loaded by either mode to be shared between the two. This was difficult to implement, because it broke a lot of assumptions that the interpreter used to have (namely, that there was only one file to parse, and one set of vars/classes). The package 'type' is subsequently removed. It should be noted that, while certain runners may provide modules, they are not implicitly loaded. This is so that if I take a web-facing script, I can run it with the standard lily runner by providing a server.lly file that will be loaded instead.

autotools was removed, with cmake taking over. I disliked autotools.

A file class is added, along with builtin globals stdin, stdout, and stderr. It only has a handful of class methods.

Dynaload is extended and improved upon. The builtin exception classes that derive from Exception are dynaloaded. stdio, stdout, and stderr are dynaloaded too. This allows the interpreter to boot in about 10K of memory, a great deal less than other interpreted languages.

The language now understands closures, but they're probably fragile. Closures work by creating a copy of a function, but with a closure data entry attached to it. Accesses that would target upvalues instead target the closure data at a specific index. Regular functions and class methods can both be closures.

Last, but not least, the interpreter understands basic default arguments (optional arguments/optargs).

Lessons learned

I feel like I've learned a few things from the overall journey of creating this language:

* I knew when I created 'circle_buster', the 'offset the refcounts instead of making a garbage collector', I was doing the wrong thing. "It's only temporary", I told myself. However, instead of that being true, I instead let it grow and become entangled into the vm. This cost me a lot of time and a lot of headaches. It's called technical debt, and for good reason. You either pay up front, or ignore it and pay up later.

* I knew I wanted import when I first made the interpreter, but it was so far off that I couldn't have a full picture of everything that I would want. The other pieces had to fall in place first (notably, a register-based vm).

* Sometimes the reason that nobody is doing something is that it's a really, really bad idea (address-based vm).

* Learning different languages changed what I wanted Lily to become. In the beginning, Python was used as a resource for determining if, say, expressions were written correctly. Later, it became Scala and Ruby. A sign of this shows in how the body of a class is the constructor (idea from Scala), and the class syntax (use of @ to denote a member).

* It wasn't until around time to write closures that I started using a debugger (gdb). Before then, debugging was 'throw in a bunch of fprintfs, recompile, retry'. Had I fully realized the usefulness of the debugger, I would have saved myself a great deal of time. Similarly, had I written an ast debugger in the beginning, I would have saved myself a great deal of headache in figuring out why various things were crashing.

Finally, for those curious,

the source code of Lily is available here.

Continue reading on