Hello, world! It's been a while since I've written something technical and of course, rather than sleeping at a reasonable time, I have recently found myself thinking about things I could probably write that might help some people improve their development skills. Since starting work with Stratum Security I have built a couple of systems from the ground up and done a lot of design, documentation, and refactoring. I decided that it might be really beneficial to write about all of these things in one place, to help reduce the strain of discovering some of these ideas by reading all kinds of different articles.
I have been wanting to share some of what I've learned in my years of programming and some of my more recent work in particular, but I don't think I really want to get into the details of how to code in my blog posts. I'd much rather be able to write shorter, more abstract articles that imbue the reader with some of the insight that comes from the experience of just building lots of stuff. This article is thus mostly targeted at what I'll call "intermediate-level" developers. That is, you're probably going to get the most out of this if you're already reasonably comfortable with the process of writing code, and you are starting to think a little bit more abstractly about how to write software that is going to be reliable in production, flexible to change, and understandable by your peers.
Much of what you will read here is biased by my personal preferences regarding things like programming paradigms, and by my refusal to blindly accept dogmatic processes or methodologies. Everything I have to say has been influenced heavily by industry best practices, popular methodologies, and techniques that are just known to work, but also very much by my own experience. You should, of course, be skeptical of my preferences, but fair warning: if you are some kind of Object-Oriented Programming zealot, you should seriously just leave now.
Assuming you have been programming for a little while by now, you are probably becoming, if not already keenly, aware that jumping straight from having an idea to writing code is a pretty bad way to do things.
The process above just doesn't really work in real life. At least not for producing the kind of code you should ever want to show a colleague you look up to. So what we're going to look at is essentially how to take that "something magical happens" part and disassemble it, rainbows, unicorns, and all, until we have a professional approach to figuring out what to build as well as how to build it, one that we can communicate to a team and that will reduce the number of headaches we give our future selves. I promise that what we're left with at the end is still going to be fun and interesting. Our goal isn't to boil software development down to a perfect science (I assure you we would fail at such an endeavor), but rather to identify some strategies for navigating through the messy haze.
A phrase you have probably heard a number of times by now in the context of problem solving is "top-down" or perhaps "divide and conquer". The latter name is a bit more expressive and, as it suggests, this is quite simply a strategy for solving problems that consists of the following steps:

1. Break the problem at hand into smaller sub-problems.
2. Repeat the process on each sub-problem until you reach problems small enough to solve directly.
3. Solve those small problems and combine the solutions to solve the original problem.
The example above shows how we might start to decompose our idea to build an application that will help potential animal adopters find their perfect pet at our animal shelter. The first thing we do is identify some of the big actions involved on behalf of users (future pet owners) and admins (the good folks running the shelter). From there, we decompose each step into some more technical requirements. We could keep going with this decomposition and eventually be left with a set of problems that we could probably start writing code to solve quite quickly.
Performing the top-down analysis we just did is probably something you should do with the major "stakeholders" of the system you're building. That's a fancy Agile term for the people who want you to build the thing in the first place, along with some people who can represent your future users. This, however, is probably not what you want to bring back to a team of developers to divvy up the work. In reality, none of your problems will decompose into such a clean hierarchy, and we're going to see that that's a good thing! What you want to do now is scan over all the requirements, problems, and potential solutions you've identified and come up with a list of, to use a highly technical buzzword, things you have to build.
In our pet-shelter example, we can imagine coming up with a list like the following:

- A database for managing animal, user, preference, and appointment information
- A search feature that matches available animals to adopters' preferences
- A scheduling feature for booking appointments to meet animals
- An administrative interface for managing animal profiles and appointments
These are some hefty, relatively distinct high-level technical problems that our application should address. We can imagine splitting this list up and assigning one or more items to a small team of developers. Now that we know roughly what we need to build, it's time to figure out how we are going to build it or, perhaps more precisely, how the final product should work.
"Box-and-arrow diagram" is another super technical term for the kind of diagram we created above. The idea here is that, instead of trying to decompose a problem into sub-problems, we try to represent the flow of data between components of our system. This is my favorite way to describe a system at a high level because it allows me to visualize the components and immediately start planning how to document and build each one.
Above is a sort of architectural overview of the application we have been designing. You can see the major subsystems that would have to be built and how they interact with each other and with users. Each gray box is a subsystem that could be implemented in isolation and then connected to other subsystems as they are built. More on that later.
The downside to these kinds of diagrams is that they are quite abstract, and sometimes you may do more harm than good by trying to split a system up into components like this. If you were building a simple REST API, for example, you probably wouldn't get much out of this approach.
The other method of abstract communication I want to describe is one that is often taught in software engineering courses, but may not appear in the required reading of a generic programming (self-)education. User stories are exactly what the name suggests: short, point-by-point stories about the steps a user takes to interact with our system in order to accomplish some goal. Usually you'll make some assumptions about the preconditions, for example, that the user is already logged in.
Here's an example of a user story describing the action of scheduling an appointment in our animal shelter example:

1. The user browses the list of animals available for adoption.
2. The user selects an animal they would like to meet.
3. The user picks an available date and time for a visit.
4. The user confirms the appointment.
5. The system records the appointment and displays a confirmation.
It's that simple! Of course, if you need to handle some conditional cases, you can create sublists under a step. User stories have the benefit of completely describing how an entire feature works or a requirement is met. It gives you something you can explicitly test for as you work to guarantee a feature is present. The downside here is that it can be a lot of work to write user stories for every critical piece of functionality, depending on how detailed you want to be. It can also be difficult to make sure you cover everything in such a way that you can start to meaningfully divide work.
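A user story like the scheduling one translates almost directly into an acceptance test. Here is a minimal sketch in Python of what that might look like; the `Shelter` class and its methods are hypothetical stand-ins for the real subsystems, kept in memory so the example is self-contained:

```python
import datetime

# Hypothetical in-memory stand-in for the scheduling subsystem.
class Shelter:
    def __init__(self):
        self.animals = {"bella": {"species": "dog", "available": True}}
        self.appointments = []

    def list_available_animals(self):
        return [name for name, info in self.animals.items() if info["available"]]

    def schedule_appointment(self, user, animal, when):
        if animal not in self.list_available_animals():
            raise ValueError("animal not available")
        self.appointments.append({"user": user, "animal": animal, "when": when})
        return len(self.appointments) - 1  # appointment id

# The user story, step by step, as a test. Precondition: the user is logged in.
def test_schedule_appointment():
    shelter = Shelter()
    # 1. The user browses the list of available animals.
    assert "bella" in shelter.list_available_animals()
    # 2. The user picks an animal and a visiting time.
    when = datetime.datetime(2024, 6, 1, 10, 0)
    # 3. The user schedules an appointment to meet the animal.
    appt_id = shelter.schedule_appointment("alice", "bella", when)
    # 4. The system confirms the appointment was recorded.
    assert shelter.appointments[appt_id]["animal"] == "bella"

test_schedule_appointment()
```

Each numbered step in the story becomes a line or assertion in the test, which is exactly what makes user stories something you can "explicitly test for".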
Now that you have gathered all of the requirements for your system and hopefully decomposed each of the problems you need to solve into small chunks of work, you are ready to start actually building your software. But hold on! Building software in the real world involves much more than hunkering down and writing code, much like building a house involves more than just laying bricks on top of each other!
When we look at box-and-arrow (or architectural) diagrams like the one we saw earlier, we often focus on the big shapes involved. We do this because they represent things: components, our database, a user, etc. Equally important, if not more so in some cases, are the arrows between the shapes! The arrows aren't just there for show. They tell us which components have to interact with which others, and often include, if not explicitly then implicitly, some information about how the two things interact. This is where documentation comes in.
Architectural overviews, API specifications, README files, and code comments are just a few of the kinds of documentation you might write while developing software. The central theme here is that documentation is used to communicate intent to other developers and to your future self. That is, documentation should explain why something works a particular way and how it works. Exactly how much a given piece of documentation should say depends on the context. You probably don't need to rewrite your user stories in comments in your code, and you probably shouldn't copy function signatures into your user stories. However, it may be perfectly appropriate to copy function signatures into an API specification.
Ideally, you should produce documentation before building something or writing tests. This doesn't mean you need to do all of the documentation up front, but you should document whatever you are preparing to build before actually building it.
While Test Driven Development and the details thereof are subjects of some skepticism, the developer community appears to have largely achieved consensus around the idea that code should be accompanied by tests. I would personally advocate for a tests-first style of development.
The reason I said that documentation serves as a contract for both your tests and your implementation is because documentation acts as a very convenient anchor for everything that follows it. One common complaint about TDD (and one that I agree with) is that it is often too difficult to know what to test for before implementing anything, as implementations tend to drive most of the design decisions at the level where the code actually lives. Your documentation should be detailed enough that you can easily write test code verifying that the implementation that follows satisfies the conditions and interface your documentation describes. With this established, you can feel very confident that your tests will in fact tell you whether your code runs and whether it does what your documentation says it does.
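To make the "documentation as contract" idea concrete, here is a small sketch in Python: the docstring is written first and states the contract, and the test checks exactly what the docstring promises rather than any implementation detail. The `match_score` function and its behavior are made up for illustration:

```python
def match_score(preferences, animal):
    """Return the number of preference key/value pairs the animal satisfies.

    `preferences` and `animal` are both dicts, e.g. {"species": "dog"}.
    Keys missing from `animal` never count as matches. Returns an int
    between 0 and len(preferences).
    """
    return sum(1 for key, want in preferences.items() if animal.get(key) == want)

# A test that verifies the documented contract, not the implementation details.
def test_match_score_contract():
    prefs = {"species": "dog", "size": "small"}
    assert match_score(prefs, {"species": "dog", "size": "small"}) == 2
    assert match_score(prefs, {"species": "cat", "size": "small"}) == 1
    assert match_score(prefs, {}) == 0  # missing keys never match

test_match_score_contract()
```

If the implementation later changes (say, to weighted scoring), the docstring changes first, the test follows, and the failing test guides the new code.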
Tests also provide another valuable utility to your codebase: a pivot. When it comes time to change something about your implementation, whether you are adding a new feature or changing the way something works, you should once again start by updating your documentation to describe the new behavior. Having done that, you can refactor or add test code to check that your code meets the new specification. Your implementation will now fail your tests, and that failure can be used as a guide for changing the implementation. You will be able to change your implementation without creating chaos.
Taking a top-down approach to solving problems is very effective, because we often know from the beginning where we want our work to end, so we can break our end-goal apart piece by piece until we have problems we can start solving today. A lot of people try to apply the top-down approach to development as well, particularly people who are especially fond of Object-Oriented Programming. As I said at the beginning of this post, I am pretty heavily biased against OOP, and I think the top-down style of development is less effective than the bottom-up approach. Top-down fails for development because it assumes you have some kind of perfect knowledge of what the thing you're building should look like before you even start building it. In my own experience and that of many other people, this is exceptionally rarely the case, even with a formal specification. When changes are introduced, or when new solutions to old problems are devised, it's very difficult to traverse up your object hierarchy and start inserting objects into the middle layers. Moreover, the kind of software one writes with an "abstractions-first, delegate the details until later" approach is often convoluted at best (ObjectFactoryFactories, anyone?).
A common argument against bottom-up development is that it is unstructured and disorganized, and that you can end up writing code that you might end up throwing away. In fact, it is exactly the opposite of unstructured and disorganized, and deleting code is a good thing.
In the bottom-up approach, you start by writing code to handle the most low-level operations your system has to deal with. In our animal shelter example, these would probably be the SQL queries for the database in which we will manage animal, user, preference, and appointment information. The next step is to build a set of abstractions, likely functions in whatever language you are using, that call on those low-level operations and do some cleaning and structuring of their input and output. From there you build higher-level abstractions, perhaps model classes, before building the API endpoints that call on the model methods you wrote.
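A rough sketch of those layers in Python might look like the following. The table schema, data, and function names are invented for illustration, and an in-memory SQLite database stands in for the shelter's real one:

```python
import sqlite3

# Layer 0: raw storage. An in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE animals (id INTEGER PRIMARY KEY, name TEXT, species TEXT, adopted INTEGER)"
)
conn.executemany(
    "INSERT INTO animals (name, species, adopted) VALUES (?, ?, ?)",
    [("Bella", "dog", 0), ("Momo", "cat", 0), ("Rex", "dog", 1)],
)

# Layer 1: a thin wrapper around the low-level SQL operation.
def query_animals_by_species(species):
    return conn.execute(
        "SELECT name, adopted FROM animals WHERE species = ?", (species,)
    ).fetchall()

# Layer 2: a more abstract function that cleans and structures layer-1 output.
def available_animals(species):
    return [name for name, adopted in query_animals_by_species(species) if not adopted]

print(available_animals("dog"))  # → ['Bella']
```

Layer 2 never touches SQL directly; it only composes layer 1. A model class or API endpoint would then sit one more layer up and call `available_animals`.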
The operation of joining lower-level operations together into higher-level operations is frequently referred to as composition. Just like how, in mathematics, you learned that two functions compose like

(f ∘ g)(x) = f(g(x))
composition in software works under essentially the same principle. Of course, since we're dealing with code, we often do extra work in between, but the core idea is the same. One of the most important things to note about this approach is that, while your goal should be to create only as many layers of abstraction as necessary, and to have each layer call only on functionality defined in the layer below, you have a lot more freedom to compose functionality however you see fit. It's a given that criss-crossing between layers will result in messier code; however, this approach, I think, almost always leaves you with a great deal of flexibility. When you want to make a change to something, you can very easily add helpful functionality to the layer below the one you're working at in order to build a suitably abstract solution.
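In code, the mathematical idea carries over almost literally. Here is a small Python sketch; the `compose` helper and the string-cleaning functions are invented for illustration:

```python
from functools import reduce

def compose(*funcs):
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)

# Small, single-purpose pieces...
def strip_whitespace(s):
    return s.strip()

def lowercase(s):
    return s.lower()

# ...composed into a higher-level operation, just like (f ∘ g)(x) = f(g(x)).
normalize_name = compose(lowercase, strip_whitespace)

print(normalize_name("  Bella "))  # → "bella"
```

The "extra work in between" the article mentions is whatever cleaning, validation, or error handling each small function does before handing its result up the chain.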
The bottom-up approach is also a lot easier to deal with when it comes to testing. While the top-down approach often requires using techniques like mocking to simulate low-level implementations at higher levels of abstraction, the bottom-up approach allows you to test and harden your low-level code before moving up to abstractions. This means you can effectively unit test code at each level of abstraction and use integration tests to check that your compositions function as expected. If you develop in a test-first fashion, or at least thoroughly test each layer that you build, you can be very confident that the next layer you build will only be as buggy as the code you use to compose lower-level functionality.
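Here is a compact sketch of that layered testing idea in Python. The record format and function names are hypothetical; the point is that the low-level layer is pinned down by unit tests before the composition built on top of it is integration-tested:

```python
# Layer 1: low-level parsing of a raw record (hypothetical "name,species" format).
def parse_record(raw):
    name, species = raw.split(",")
    return {"name": name.strip(), "species": species.strip()}

# Layer 2: a composition built on top of the hardened layer below.
def load_animals(raw_lines):
    return [parse_record(line) for line in raw_lines if line.strip()]

# Unit test: harden layer 1 on its own before building on it.
assert parse_record("Bella, dog") == {"name": "Bella", "species": "dog"}

# Integration test: check the composition, trusting the layer beneath it.
assert load_animals(["Bella, dog", "", "Momo, cat"]) == [
    {"name": "Bella", "species": "dog"},
    {"name": "Momo", "species": "cat"},
]
```

Because `parse_record` is already trusted, any failure in the second assertion points at `load_animals` itself, which is exactly the confidence the bottom-up approach buys you.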
The following points summarize everything I've written here into a process that you can use starting today in your next software project:

1. Decompose your idea top-down with stakeholders until you have a list of things you have to build.
2. Describe the system at a high level with box-and-arrow diagrams and user stories.
3. Document each component before building it, treating the documentation as a contract.
4. Write tests that verify your code does what your documentation says it does.
5. Build from the bottom up, composing well-tested low-level operations into higher-level abstractions.
It's my hope that this article has given you, as perhaps a more intermediate-level developer, some idea of how software engineering takes place in the real world. We've demystified the "magic" that we assumed had to happen before, but I hope you will agree that the creative and social processes we've discussed here will continue to prove challenging and satisfying.
Ultimately, what sets a high-quality software engineer apart from yet another "good coder" is the ability to communicate and plan with a team, to understand requirements and translate them into technical models, and of course the ability to adhere to a rigorous development process.
As always, if you'd like to chat with me or tell me what you thought of the article, please feel free to reach out to me on Twitter.