The last time Hackerfall tried to access this page, it returned a not found error. A cached version of the page is below, or clickhereto continue anyway

Visualizing Rust's type-system

In a week, I'm giving a talk at the Montral Rust Language Meetup, called "Rust after borrowing" and, as I'm searching for my words, I thought that putting them down on a screen could ease the process.

The talk is going to be about Rust; more specifically, about medium-to-advanced type-system business, such as dynamic dispatch, phantom types and higher-order functions. While doing so, I'd like to share some methods to visualize how the different parts of Rust's type-system fit together.

And I thought that I would tackle that last part right now.

Type hierarchy, Venn-style

The most flagrant example of Rust's minimalism is perhaps its type-system. It traded classical OOP for a more functional approach based solely around the concept of traits. It is through traits that we define the interface of our data structures, overload operators, marks concurrent types, and handle dynamic dispatch. One concept to rule them all!

I assume that you already know what traits are, and so I won't linger too much on this topic. However, I would like to ask you: what does the colon mean? What are you really saying when you write MyTrait: Add + Copy? Well, let us first consider the set of all possible values. In this context let x: i32 means \(x \in \mathbf i32\).

This is trivial, yet things get interesting when you start to think about the properties of traits in this model. The obvious approach would be to treat traits as subsets of the collection of possible values. Within this representation, traits start to behave in intuitive and powerful ways:

In this context, we have two traits, A and B, where each defines a certain specification. When one is asking for a type to implement two traits, what we really mean is that the values of said type exist in the intersection between both traits. In other words, T: A + B really means \(T \subset A \cap B\).

What about traits that depend on other traits? Same thing! C: A + B really means that every type implementing \(C\) must live within the intersection \(A \cap B\); that is, \(C \subset A \cap B\). The two statements are equivalent. Graphically, it is expressed by \(C\) being contained by the intersection:

You should start to see the pattern. A colon means subset of, while a plus sign means intersection of.

According to this model, if I declare \(T \subset C\), it is implied that \(T \subset A \cap B\). Therefore, the compiler should force us to implement A and B, which is exactly what it does. Whether you view trait dependencies as forming a tree or a Venn diagram, the result is the same.

The approach using sets, however, is much simpler to think about and it is possible because of the way types work. Rust's adoption of the principle of composition over inheritance means that types are pretty much independent. As such, you could say that they represent the smallest possible subset allowed by the language; a type can contain elements (values), but it cannot contain other subsets (traits or tangible types).

You should note the "tangible": this is because generic variants are different types in this mental framework, like they are in compiled code. That way, they can be in different trait subsets through conditional impl. This also means that, given \(sum : \mathbf{Vec\ i32} \rightarrow \mathbf{i32}\), you cannot map a Vec<String>: the two sets are necessarily disjoint.

We also consider that whatever methods are associated to a specific type make up their own implicit trait; types without any method map to marker traits similar to Sync. The main difference is that type traits typically implement the Sized trait, which we generally omit because, well, it contains most types. For example (where MyType does not implement Hash):

This works because, somewhere in the standard library, there is:

impl<T: Hash> Hash for Option<T>

In this picture, String, i32 and Option<i32> have access to Hash::* while Option<MyType> and Option<i32> have access to Option::*. You can notice that, because Option<i32> has access to both, it lives within the intersection of Hash and Option.

From this example, you could infer that generics allow to automatically generate types on-demand and that an impl MyType block is really just an anonymous trait which is automatically implemented for all instances of MyType.

Now, this diagram thing might make the type-system look even more daunting than it already did. However, when compared to Ruby's mental model, for example, we can see that Rust's is much simpler to think about. For instance, all types are created equal; it makes no sense to talk about a parent or a child type, and we do not have to concern ourselves with constructors, inheritance, and all of these other concepts which we've become so accustomed to. The language just forces you to be super-duper explicit about everything.

Before moving on, you should probably wait for everything to sink in, because we're going to level up and make our model even more abstract as we explore static and dynamic dispatch and conditional impl.

Extending the model

What about static and dynamic dispatch? Well, let's tackle static dispatch first. Say you define:

fn update_all<T: HasUpdate>(xs: &mut [T]) {
    for x in xs {
        x.update();
    }
}

In our model, different generic instances are associated with different maps. That is, like in mathematics, update_all::<Ork> is different from update_all::<Gnome>: they are defined over completely distinct domains. Trait-wise, the first implements Fn(&mut [Ork]) and, the second, Fn(&mut [Gnome]).

Simple, isn't it? Now, handling dynamic dispatch is a bit more convoluted, but bare with me!

Remember when I told you that we omitted the Sized trait? Well, now, we're going to have to take it into account. As we go along, we're going to blur more and more the line between types and traits, data and processes.

First, the deeper you go in the trait hierarchy, the more precise you get, and types are the deepest that you can go (values are not considered). They allow you to access the physical data directly. But what if you didn't actually need to be this precise; what if you wanted to talk abstractly about data, that is only through the traits that it implements?

In this abstract model, types are just traits that provide accessor methods with a special syntax to manipulate the attributes. As such, when you declare a function...

... you really mean that it accepts an argument x which is an element of i32: an element which implements i32. Types do not exist at this level of abstraction. Only set-like traits and element-like values exist.

Types implement the Sized trait, a property which may not be possessed by normal traits, due to the fact that they do not carry any data. This explains the error message shown when attempting to define a function which accepts a trait instance as an argument:

<anon>:5:15: 5:17 error: the trait `core::marker::Sized` is not implemented for the type `HasUpdate` [E0277]
<anon>:5 fn update_all(xs: HasUpdate) {
                       ^~
<anon>:5:15: 5:17 note: `HasUpdate` does not have a constant size known at compile-time
<anon>:5 fn update_all(xs: HasUpdate) {

Notice how it complains about the type HasUpdate not implementing Sized. But... we passed in a trait, right? It turns out that, behind the scenes, Rust actually generates trait types, which have the same name as their corresponding traits (for convenience only, it's quite confusing at first), but do not implement Sized, so that a collection can store diverse values.

However, as users of these abstractions, we do not need to worry about such details. We gain much better insights by ditching types and thinking exclusively in terms of traits as subsets (and not the other way around, which is the compiler's job).

What you have to do, then, is to create a sized value which holds on to an unsized one. Another way to put it is: we must store unsized trait objects behind a pointer. One usually uses a Box or a reference, but any pointer-like object defined by the standard library should do. For example:

fn update_all(xs: &mut [Box<HasUpdate>]) {
    for x in xs {
        x.update();
    }
}

With this function defined, all you have to do is store your values behind boxes and let rustc figure out how to call the good update method for each object; how to dynamically dispatch the correct method at run-time. This is where the name of this trick comes from.

Now that we got working code, how can we visualize this new way to think about data and processes? Simply in the same way that we always did:

Now that we blurred the line between types and traits, we do not need to go as deep to describe values: HasUpdate is the type. In the same way that we can say that many values inhabit a given type, many (diverse) values inhabit a trait. All you need is a bit of indirection for each leap of abstraction.

Why doesn't Rust default to this system, then, if it is so powerful? Well, every indirection has a cost in terms of memory and performance. Rust is also designed for niches that require bare-metal manipulation of memory and greatly benefit from defaulting to the stack. As with everything which has a cost-penalty, Rust gives you the option to explicitly opt-into the feature, but will strive for the better performance (and safety) by default.

Conditional implementation

Newcomers to Rust often complain about the verbosity of Rust's impl statements and, if you only use them to implement traits for types, they're quite boring. I mean, their sole purpose in this context is to say that a type set is included inside of a trait set. That's it!

However, the place where the syntax really shines is when you start to talk about the ways in which traits are related to each other. You do that through conditional implementation, which allow you to implement a trait for all types that respect a specific condition. For example:

impl<T: Mob> Entity for T {
    // ...
}

This says: every T which is a subset of Mob is also a subset of Entity, and every Mob shares the same implementation for the Entity trait. Graphically, Mob is contained by Entity, because while every item in Mob implements Entity, the inverse isn't always true. This pattern is easily extended to intersections. For instance:

... means that for every \(T \in A \cap B,\ T \in C\). In other words, the intersection of A and B is a subset of C. In general, it can be easily read out loud:

Or, all in one sentence:

Let me implement the trait \(C\) for all \(T \subset A \cap B\).

When you get something more complicated, it becomes harder to say out loud, but the principle stays the same:

impl<R: Num, C: Num> Add<Matrix<R, C>> for Matrix<R, C> {
    type Output = Matrix<R, C>;
    fn add(self, rhs: Matrix<R, C>) -> Matrix<R, C> { ... }
}

For all \(R, C \subset Num\), let Matrix<R, C> be a subset of Add<Matrix<R, C>, Output=Matrix<R, C>> with the following implementation.

You could say that conditional implementations enforce high-level behavior (Entity) shared by many types, provided that they implement the required lower-level subtleties (Mob).

Conclusion

So, what is the big picture here? Well, traits and values are really all there is; the deeper you go, the more precise you get, until you reach types which are nothing more than subsets of Sized with a special syntax to manipulate the underlying memory and from where you cannot go deeper.

Because of these properties, it is possible to manipulate values as though they were traits: greater abstraction at the price of control and indirection.

Trait dependencies specify the subset in which a trait must exist, without specifying how to implement it, while conditional implementations (which are really just the trait version of generic functions) specify the subset in which a trait exists, forcing similar types to share a common behavior.

I'm still developing these ideas, so any insight that you might have is more than welcome. I believe that it is important to state the philosophy and the mathematical foundations of a language when we want it to evolve without it becoming the next bloated mess. It also helps to reason in a given language and write idiomatic code, whatever this means.

Plus I'm a big mathematics lover, and it makes me happy to connect (join? :P) the semantics of software development and set theory. Even if it probably doesn't change much in everyday programming.

Anyway! Hope you enjoyed.

Continue reading on jadpole.github.io