
Why Rust's ownership/borrowing is hard

Working with pure functions is simple: you pass arguments, you get a result, and no side effects happen. If, on the other hand, a function does have side effects, like mutating its arguments or global objects, it's harder to reason about. But we've got used to those too: if you see something like player.set_speed(5) you can be reasonably certain that it's going to mutate the player object in a predictable way (and maybe send some signals somewhere, too).

Rust's ownership/borrowing system is hard because it creates a whole new class of side effects.

Simple example

Consider this code:

let point = Point {x: 0, y: 0};
let result = is_origin(point);
println!("{}: {}", point, result);

Nothing in the experience of most programmers would prepare them for point suddenly ceasing to work after being passed to is_origin()! The compiler won't let you use it on the next line. This is the side effect I'm talking about: something has happened to the argument, but not the kind of thing you've seen in other languages.

Here it happens because point gets moved (instead of being copied) into the function, so the function becomes responsible for destroying it and the compiler prevents you from using it after that point. The way to fix it is either to pass the argument by reference or to teach the type how to copy itself (by implementing Copy). It makes total sense once you've learned about "move by default". But these things tend to jump out at you in a seemingly random fashion while you're doing some innocent refactoring or, say, adding logging.
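Both fixes can be sketched in a few lines. This is a minimal illustration rather than the original code: the Point fields are assumed to be i32, the two function names are made up to keep the variants apart, and the struct is printed with {:?} (Debug) because the original doesn't show a Display implementation.

```rust
// Deriving Copy (which requires Clone) "teaches" Point how to copy itself.
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// Fix 1: borrow the point instead of taking ownership of it.
fn is_origin_ref(p: &Point) -> bool {
    p.x == 0 && p.y == 0
}

// Fix 2: take the point by value. Because Point is Copy, the caller
// keeps a usable value after the call.
fn is_origin_copy(p: Point) -> bool {
    p.x == 0 && p.y == 0
}

fn main() {
    let point = Point { x: 0, y: 0 };
    let by_ref = is_origin_ref(&point);
    let by_copy = is_origin_copy(point);
    // `point` is still usable here in both cases.
    println!("{:?}: {} {}", point, by_ref, by_copy);
}
```

Passing by reference is usually the lighter change; deriving Copy only works when every field of the struct is itself Copy.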

Complicated example

Consider a parser that takes some bits of data from an underlying lexer and maintains some state:

struct Parser {
    lexer: Lexer,
    state: State,
}

impl Parser {

    fn consume_lexeme(&mut self) -> Lexeme {
        self.lexer.next()
    }

    pub fn next(&mut self) -> Event {

        let lexeme = self.consume_lexeme(); // read the next lexeme

        if lexeme == SPECIAL_VALUE {
            self.state = State::Closed      // update state of the parser
        }
    }
}

The seemingly unnecessary consume_lexeme() is just a convenience wrapper around a somewhat longer string of calls that I have in the actual code.

The lexer.next() returns a self-sufficient lexeme by copying data from the lexer's internal buffer. Now, we want to optimize it so lexemes would only hold references into that data and avoid copying. We change the method declaration to:

pub fn next<'a>(&'a mut self) -> Lexeme<'a>

The 'a thingy effectively says that the lifetime of a lexeme is now tied to the lifetime of the lexer reference on which we call .next(). It can't live all by itself but depends on data in the lexer's buffer. The 'a just spells it out explicitly here.

And now Parser::next() stops working:

error: cannot assign to `self.state` because it is borrowed [E0506]
       self.state = State::Closed
       ^~~~~~~~~~~~~~~~~~~~~~~~~~

note: borrow of `self.state` occurs here
       let lexeme = self.consume_lexeme();
       ^~~~

In plain English, Rust tells us that as long as we have lexeme available in this block of code, it won't let us change self.state, a different part of the parser. And this does not make any sense whatsoever!

The culprit here is the consume_lexeme() helper. Although it only actually needs self.lexer, its signature tells the compiler that it takes a reference to the entire parser (self). And because it's a mutable reference, the compiler won't let anyone else touch any part of the parser, lest they change the data that lexeme currently depends on.

So here we have this nasty side effect again: though we didn't change actual types in the function signature and the code is still sound and should work correctly, a different ownership dynamic suddenly doesn't let it compile anymore.

Even though I understood the problem in general it took me no less than two days until it all finally clicked and the fix became obvious.

Rusty fix

Changing consume_lexeme() to accept a reference to just the lexer instead of the whole parser has fixed the problem but the code looked a bit non-idiomatic, having changed from a dot-method notation into a plain function call:

let lexeme = consume_lexeme(&mut self.lexer); // want self.<something-something> instead

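Luckily Rust actually makes it possible to have it the right way, too. Since in Rust the definition of data fields (struct) is separate from the definition of methods (impl), I can define my own local methods for any struct in my crate, even if it's imported from a different module. (For a struct that comes from another crate, a plain impl block isn't allowed; getting the same effect there takes an extension trait.)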

use lexer::Lexer;

// My Lexer methods. Does *not* affect other uses of Lexer elsewhere.
impl Lexer {
    pub fn consume(&mut self) -> Lexeme { .. }
}

// ...

let lexeme = self.lexer.consume(); // works!

Neat :-)

Rust's borrow checker is a wonderful thing that forces you into designing code to be more robust. But as it is so unlike anything you're used to, it takes time to develop a certain knack to work with it efficiently.

Continue reading on softwaremaniacs.org