Five Ways To Code More Defensively

Five Ways To Code More Defensively

James Charlesworth

By James Charlesworth

2 March 20199 min read

Vectors by Vecteezy

Defensive programming is the art of designing your public APIs to be strong, reliable, and impossible to break - no matter what you throw at them.

So maybe don't do that. But follow these five tips and hopefully, you won't have to.

1. Never Trust Your Inputs

It is such a basic piece of advice that you have no doubt heard time and time again, but I still constantly see developers disobeying this one rule. If you read my blog post about An Enumerable is not a List you will have seen one example of where assuming an IEnumerable<T> input will produce the same results each time it is enumerated can lead to horrible errors. This is a classic example of making assumptions about your inputs without indicating those assumptions in your code. For example the following Javascript code assumes the input parameter is a number

function doSomething(someNumber)
{
console.log(someNumber * 20);
}

Whereas the next method does not make that assumption

function doSomething(someNumber)
{
if (isNaN(someNumber))
throw "Parameter is not a number!";
console.log(someNumber * 20);
}

We haven't changed the functionality, we have just added a friendly error that anybody calling your function without knowledge of its internals will be able to read. This way they will know the error is their fault, not the fault of your superbly written code.

2. Use A Type System

This is going to be quite controversial because there is a long standing debate about static/dynamic typing in programming languages. But I don't think anybody would disagree that having types helps a lot when trying to communicate what date your code accepts and what it can be expected to return. Think about how many different ways the following Javascript snippet could break

import { api } from 'somewhere';
function sendToApi(someNumber)
{
api.send({
url: 'http://my-api.com/numbers',
method: 'POST',
data: someNumber
});
}

What if we changed the data structure of that object we pass to api.send()? Let's say we change the url field to be called path. I think it's incredibly naive to assume developers, who are human beings, will be clinical enough to wade through the entire codebase updating all the places api.send(...) were called without making a single mistake. The chance of them missing even one reference is huge and, in my opinion, totally worth the extra effort involved in using a type system. The Typescript declaration for the above method might look like this

interface ApiSendOptions {
url: string;
method: string;
data: any
}
class Api {
void send(ApiSendOptions: options) {
...
}
}

This doesn't just apply to dynamically typed languages though. Pretty much every programming language except Javascript (and Ruby) has types, but for many of them the types are either inferred (Python, F#) or somewhat optional (C#, for example, offers the dynamic type which lets you duck-type to your heart's content and p*ss off all your teammates no end). So yes, if you have a type system to hand, use it and use it to its fullest.

3. Let Errors Bubble Up

That meme at the top of the post might be a joke, but in a way I'm tempted to agree with it. Take this C# code for example...

public static void Main()
{
var data = GetDataFromDatabase(id);
try
{
_service.Send(data);
}
catch (DbException ex)
{
_logger.Log($"There was an error sending data {id} to the service", ex);
}
}
public DataObject GetDataFromDatabase(int id)
{
try
{
return _context.Objects.Single(id);
}
catch (DbException ex)
{
_logger.Log($"There was an error getting the data {id}", ex);
return null;
}
}

This code is badly written because the exceptions are caught and handled in many places and not allowed to bubble up. Anyone calling the GetDataFromDatabase(id) method for an object that doesn't exist will simply get back null and. You will have no immediate way of knowing whether that null you got back was because the id didn't exist, or because the service crashed.

public static void Main()
{
var data = GetDataFromDatabase(999);
if (data == null)
{
// No idea if the data was null because 999 is an invalid ID, or if the database is unavailable.
}
}

This kind of pattern can lead people to treat your methods with suspicion when perhaps suspicion was not warranted. That leads to bugs.

A much more defensive way to code the above would be to simply let the exceptions bubble up and, like Kayode Ewumi is suggesting, wrapping the entire application in one try...catch

public static void Main()
{
try
{
var data = GetDataFromDatabase(id);
_service.Send(data);
}
catch (DbException ex)
{
_logger.Log(ex);
}
}
public DataObject GetDataFromDatabase(int id)
{
return _context.Objects.Single(id);
}

If you really want the exceptions wrapped in the more descriptive messages then great, you can just rethrow the exceptions like this

public static void Main()
{
try
{
var data = GetDataFromDatabase(id);
_service.Send(data);
}
catch (DbException ex)
{
_logger.Log(ex);
}
}
public DataObject GetDataFromDatabase(int id)
{
try
{
return _context.Objects.Single(id);
}
catch (DbException ex)
{
throw new Exception($"There was an error getting the data {id}, see inner exception", ex);
}
}

This pattern of wrapping and rethrowing errors is much more defensive because it improves the trust consumers of your code can place in the return values you give.

4. Restrict the size of the data you have to deal with

If you have an enormous domain model with many different classes all nested in among each other and you are passing objects of this behemoth all around your code then I'd politely suggest you may be heading down a deep hole of infuriatingly buggy code. The more data your functions are given that they don't need, the more likely there is to be something in that data that will cause a bug. Here's an example in Javascript

function processData(data) {
checkCity(data.users[0].address.city);
checkCity(data.users[1].address.city);
checkCity(data.users[3].address.city);
data.secondUserName = data.users[1].name;
}

You probably don't need me to tell you all the things that could go wrong here, even if we could guarantee all the properties above exist in our data by using TypeScript (see Tip 2), we still have a large unwieldy method that touches many different parts of the data object. Also, imagine how complicated the unit tests for this method will have to be! This function is desperate to be refactored

const getCityFromAddress = (address) => address.city;
const getCityFromUser = (user) => getCityFromAddress(user.address);
const getSecondUserName = (data) => {
if (data.users.length < 2)
throw "Cannot get second user name, no second user";
return data.users[1].name;
}
function* getUsersToCheck(data) {
yield data.users[0];
yield data.users[1];
yield data.users[2];
}
function processData(data) {
for (let user of getUsersToCheck(data)) {
checkCity(getCityFromUser(user));
}
data.secondUserName = getSecondUserName(data);
}

Sure, this is more code - but by refactoring this into smaller functions each dealing with a smaller part of the data object we have achieved two things. Firstly we have made the code much easier to unit test. Those functions are simple and it's easy to see what we need to put into them to check their behaviour (they are also all pure functions). Secondly, we have stopped the processData() function from having to know too much about the structure of the data object. The structure is dealt with in our granular functions to the processData() object now only has to deal with the procedural logic of what we want to do.

5. Eliminate invalid mutable states

Ideally you should try to avoid mutability in general, but sometimes it is unavoidable. Take this file loader class for example...

public class MyFileReader : IDisposable
{
private Stream _fileStream = null;
public void Open() =>_fileStream = File.OpenRead(@"C:\temp\myfile.txt");
public string ReadAll()
{
using (var reader = new StreamReader(_fileStream))
return reader.ReadToEnd();
}
public void Close() => _fileStream.Close();
public bool IsAtEnd => _fileStream.Position == _fileStream.Length;
public void Dispose()
{
_fileStream?.Close();
_fileStream?.Dispose();
}
}

This is basically a wrapper around a file stream but it does have mutable state. The stream can only be read once and it must be opened before it can be read. This, for example is an invalid operation

var reader = new MyFileReader();
var data = reader.ReadAll(); //throws NullReferenceException;

The success or failure of the ReadAll() method here is entirely dependent on the current state of the object. In other words, in order to call ReadAll() the object must have an open stream in its fileStream field. It relies on the calling code knowing this.

Objects written with mutable state like this are not very nice to deal with, and you should avoid them wherever you can - but sometimes it is necessary and I'd never tell someone to rule out giving objects state. So how can we save ourselves and write it defensively?

We should approach situations like this from three angles.

  1. Throw detailed exceptions if an object is asked to do something that its current state does not allow.
  2. Document the allowed operations where possible.
  3. Reduce the number of discrete states the object can be in

Here are all three of these approaches added to the MyFileReader class.

public class MyFileLoader : IDisposable
{
private Stream _fileStream = null;
public void Open()
{
if (_fileStream != null)
throw new InvalidOperationException("File stream is already open");
_fileStream = File.OpenRead(@"C:\temp\myfile.txt");
}
/// <summary>
/// Reads the file, as long as it has been opened with the <see cref="Open"/> method
/// </summary>
public string ReadAll()
{
if (_fileStream == null)
throw new InvalidOperationException("You must open the file stream before reading it");
using (var reader = new StreamReader(_fileStream))
return reader.ReadToEnd();
}
public void Close()
{
if (_fileStream == null)
throw new InvalidOperationException("File stream is not open");
_fileStream.Close();
_fileStream = null;
}
public bool IsAtEnd => _fileStream.Position == _fileStream.Length;
public bool IsOpen => _fileStream != null;
public void Dispose()
{
_fileStream?.Close();
_fileStream?.Dispose();
_fileStream = null;
}
}
  1. Each method now checks the state of the file stream and throws an exception if it is not in the correct state
  2. The comment on ReadAll() hints the developer calling this that they need to be aware of the state of the object
  3. By setting the _fileStream variable to null after closing it we have reduced the number of states we can be in. Previously the variable could be null, an open and read to read stream, or a stream that had been read to the end. Now it can only be null or a stream ready to read.

Conclusion

Defensive programming is hard and very subjective, so I don't expect you to agree with everything I have suggested here but I urge you to consider some of these approaches - just in case I ever have to use some public code you have written!