Weekly Swift articles, podcasts and tips by John Sundell.

Avoiding race conditions in Swift

Published on 04 Nov 2018
Basics article available: Grand Central Dispatch

A race condition is what happens when the expected completion order of a sequence of operations becomes unpredictable, causing our program logic to end up in an undefined state. For example, we might update the UI before its content has been fully loaded, or accidentally show a screen that's only meant for logged-in users before the user has been completely logged in.

Race conditions can often appear random at first, and can be really tricky to debug - since it's often hard (or even impossible) to come up with reliable reproduction steps for them. This week, let's take a look at a common scenario that can cause race conditions, possible ways to avoid them - and how we can make our code more robust and predictable in the process.

An unpredictable race

Let's start by taking a look at an example, in which we're building an AccessTokenService to enable us to easily retrieve an access token for performing some form of authenticated network request. Our service is initialized with an AccessTokenLoader, which performs the actual networking, while the service itself acts as the top-level API and deals with things like caching and token validation - looking like this:

class AccessTokenService {
    typealias Handler = (Result<AccessToken>) -> Void

    private let loader: AccessTokenLoader
    private var token: AccessToken?

    init(loader: AccessTokenLoader) {
        self.loader = loader
    }

    func retrieveToken(then handler: @escaping Handler) {
        // If we have a cached token that is still valid, simply
        // return that directly as the result.
        if let token = token, token.isValid {
            return handler(.value(token))
        }

        loader.load { [weak self] result in
            // Cache the loaded token, then pass the result
            // along to the given handler.
            self?.token = result.value
            handler(result)
        }
    }
}

The above class may look really straightforward, and if used in isolation - it is. However, if we look a bit closer at our implementation, we can see that if the retrieveToken method is called twice - and the second call happens before the loader has finished loading - we'll actually end up loading two access tokens. For some authentication servers, that can be a big problem, since often only one access token can be valid at any given time - making it very likely that we'll end up with a race condition when the second request ends up invalidating the result of the first one.
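To make the problem concrete, here's a sketch of that scenario - using hypothetical minimal stand-ins for AccessToken and AccessTokenLoader (their real definitions aren't shown in this article), and omitting the Result type for brevity:

```swift
import Foundation

// Hypothetical stand-ins for the article's types, just to
// demonstrate the duplicate-load problem in isolation.
struct AccessToken {
    var isValid: Bool { return true }
}

class AccessTokenLoader {
    private(set) var loadCount = 0
    private var completions = [(AccessToken) -> Void]()

    // Simulate an in-flight network request by storing the
    // completion handler instead of calling it right away.
    func load(then handler: @escaping (AccessToken) -> Void) {
        loadCount += 1
        completions.append(handler)
    }
}

// Simplified version of the service above, without Result,
// to keep the example self-contained.
class AccessTokenService {
    private let loader: AccessTokenLoader
    private var token: AccessToken?

    init(loader: AccessTokenLoader) {
        self.loader = loader
    }

    func retrieveToken(then handler: @escaping (AccessToken) -> Void) {
        if let token = token, token.isValid {
            return handler(token)
        }

        loader.load { [weak self] token in
            self?.token = token
            handler(token)
        }
    }
}

let loader = AccessTokenLoader()
let service = AccessTokenService(loader: loader)

// Two calls before the first load has had a chance to finish:
service.retrieveToken { _ in }
service.retrieveToken { _ in }

print(loader.loadCount) // 2 - two duplicate requests were started
```

Since no cached token exists yet when the second call comes in, both calls fall through to the loader - exactly the duplication described above.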

Enqueueing pending handlers

So how can we prevent that sort of race condition from occurring? The first thing we can do is to make sure that no duplicate requests are performed in parallel, and instead enqueue any handlers passed to retrieveToken while the loader is busy loading.

To do that, we'll start by adding a pendingHandlers array to our access token service - and every time retrieveToken is called, we'll append the passed handler to that array. Then, we'll make sure to only perform a single request at any given time by checking whether our array contains just a single element - and instead of calling the current handler directly once the loader has finished, we'll call a new private method named handle:

class AccessTokenService {
    typealias Handler = (Result<AccessToken>) -> Void

    private let loader: AccessTokenLoader
    private var token: AccessToken?
    // We'll keep track of all enqueued, pending handlers using
    // a simple array.
    private var pendingHandlers = [Handler]()

    func retrieveToken(then handler: @escaping Handler) {
        if let token = token, token.isValid {
            return handler(.value(token))
        }

        pendingHandlers.append(handler)

        // We'll only start loading if the current handler is
        // alone in the array after being inserted.
        guard pendingHandlers.count == 1 else {
            return
        }

        loader.load { [weak self] result in
            self?.handle(result)
        }
    }
}

The reason we've introduced a new handle method, rather than simply inlining our result handling logic in the load completion handler (like we did before), is that we now require slightly more complex logic and multiple references to self. Rather than having to do the classic guard let dance to turn our weak reference to self into a strong one, we simply call handle, which'll let us access all of our properties just like we normally would.
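For comparison, here's a sketch of what that classic dance looks like when the logic is inlined instead - using a hypothetical Example class rather than the service itself, with a synchronous stand-in for the asynchronous work:

```swift
import Foundation

// Hypothetical example of the "guard let dance": strengthening a
// weak self reference before accessing any properties.
class Example {
    private(set) var results = [Int]()

    // Stand-in for an asynchronous operation, completing
    // synchronously to keep the example simple.
    func loadValue(then compute: @escaping (Int) -> Void) {
        compute(42)
    }

    func run() {
        loadValue { [weak self] value in
            // Inlined logic forces us to strengthen self before
            // every batch of property accesses:
            guard let self = self else {
                return
            }

            self.results.append(value)
        }
    }
}

let example = Example()
example.run()
print(example.results) // [42]
```

Calling a private method through the weak reference instead, like handle above, sidesteps that boilerplate entirely.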

In our implementation of handle, we'll again cache the loaded token, and notify each pending handler that a result was loaded - like this:

private extension AccessTokenService {
    func handle(_ result: Result<AccessToken>) {
        token = result.value

        let handlers = pendingHandlers
        pendingHandlers = []
        handlers.forEach { $0(result) }
    }
}

We now have a guarantee that even if retrieveToken is called multiple times in sequence, only one token will end up being loaded - and all handlers will be notified in the correct order πŸ‘.
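As a quick sanity check of that guarantee, here's a sketch mirroring the enqueueing implementation - once again using hypothetical stand-ins for the loader and token, and omitting Result for brevity:

```swift
import Foundation

// Hypothetical stand-ins, mirroring the enqueueing implementation
// above, to keep the example self-contained.
struct AccessToken {
    var isValid: Bool { return true }
}

class AccessTokenLoader {
    private(set) var loadCount = 0
    private var completion: ((AccessToken) -> Void)?

    func load(then handler: @escaping (AccessToken) -> Void) {
        loadCount += 1
        completion = handler // simulate an in-flight request
    }

    func finishLoading() {
        completion?(AccessToken())
        completion = nil
    }
}

class AccessTokenService {
    typealias Handler = (AccessToken) -> Void

    private let loader: AccessTokenLoader
    private var token: AccessToken?
    private var pendingHandlers = [Handler]()

    init(loader: AccessTokenLoader) {
        self.loader = loader
    }

    func retrieveToken(then handler: @escaping Handler) {
        if let token = token, token.isValid {
            return handler(token)
        }

        pendingHandlers.append(handler)

        guard pendingHandlers.count == 1 else {
            return
        }

        loader.load { [weak self] token in
            self?.handle(token)
        }
    }

    private func handle(_ token: AccessToken) {
        self.token = token
        let handlers = pendingHandlers
        pendingHandlers = []
        handlers.forEach { $0(token) }
    }
}

let loader = AccessTokenLoader()
let service = AccessTokenService(loader: loader)
var callbackOrder = [Int]()

service.retrieveToken { _ in callbackOrder.append(1) }
service.retrieveToken { _ in callbackOrder.append(2) }
loader.finishLoading()

print(loader.loadCount) // 1 - only a single request was made
print(callbackOrder)    // [1, 2] - handlers notified in order
```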

Enqueueing asynchronous completion handlers, like we do above, can take us a long way when it comes to avoiding race conditions in code that deals with a single source of state - but we still have one major issue that we need to tackle - thread safety.

Thread safety

Few things can end up causing race conditions more than multi-threading, especially since much of the code that we write when building apps won't actually be thread-safe. Since UIKit can only be used from the main thread, it makes sense for much of our logic that operates close to the view layer to make the assumption that it will only get called from the main thread - but as soon as we step deeper into our core logic, that assumption might no longer hold true.

As long as our code is executing within the same thread, we can rely on the data that we read and write from our objects' properties to be correct. However, as soon as we introduce multi-threaded concurrency, two threads might end up reading or writing to the same property at the exact same time - resulting in one of the threads' data becoming immediately outdated.

For example, as long as our AccessTokenService from before is used within a single thread, the mechanism we put in place for dealing with race conditions by enqueuing pending completion handlers will work just fine - but if multiple threads end up using the same access token service, we might quickly end up in an undefined state, once our pendingHandlers array is concurrently mutated from multiple threads. Once again, we have a race condition on our hands.

While there are many ways to deal with multi-threading-based race conditions, one fairly straightforward way to do so on Apple's platforms is to use the power of Grand Central Dispatch - which lets us deal with threads using its much simpler queue-based abstractions.

Let's go back to our AccessTokenService, and make it thread-safe by using a dedicated DispatchQueue to sync up its internal state. We'll start by either accepting an injected queue in our service's initializer (to facilitate testing), or create a new one, then - once our retrieveToken method is called - we'll dispatch an asynchronous closure onto our queue in which we'll actually perform the token retrieval, making our class now look like this:

class AccessTokenService {
    typealias Handler = (Result<AccessToken>) -> Void

    private let loader: AccessTokenLoader
    private let queue: DispatchQueue
    private var token: AccessToken?
    private var pendingHandlers = [Handler]()

    init(loader: AccessTokenLoader,
         queue: DispatchQueue = .init(label: "AccessToken")) {
        self.loader = loader
        self.queue = queue
    }

    func retrieveToken(then handler: @escaping Handler) {
        queue.async { [weak self] in
            self?.performRetrieval(with: handler)
        }
    }
}

Just like before, we simply call a private method inside of our asynchronous closure, rather than having to add lots of self references. In our new performRetrieval method, we'll run the exact same logic as before - with the addition that we now wrap our call to handle in an asynchronous queue dispatch as well - to ensure complete thread safety:

private extension AccessTokenService {
    func performRetrieval(with handler: @escaping Handler) {
        if let token = token, token.isValid {
            return handler(.value(token))
        }

        pendingHandlers.append(handler)

        guard pendingHandlers.count == 1 else {
            return
        }

        loader.load { [weak self] result in
            // Whenever we are mutating our class' internal
            // state, we always dispatch onto our queue. That
            // way, we can be sure that no concurrent mutations
            // will occur.
            self?.queue.async {
                self?.handle(result)
            }
        }
    }
}

With the above in place, we can now use our AccessTokenService from any thread, and still be sure that our logic will remain predictable and all handlers will be called in the correct order πŸŽ‰.
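To convince ourselves of that, here's a sketch that hammers the thread-safe service from multiple threads at once - again using hypothetical stand-ins, with a loader that completes synchronously, and a DispatchGroup to wait for all handlers:

```swift
import Foundation

// Hypothetical stand-ins once more, to exercise the thread-safe
// implementation above under concurrent access.
struct AccessToken {
    var isValid: Bool { return true }
}

class AccessTokenLoader {
    private(set) var loadCount = 0

    // Completes immediately, to keep the example simple. The count
    // is only ever mutated on the service's serial queue.
    func load(then handler: @escaping (AccessToken) -> Void) {
        loadCount += 1
        handler(AccessToken())
    }
}

class AccessTokenService {
    typealias Handler = (AccessToken) -> Void

    private let loader: AccessTokenLoader
    private let queue: DispatchQueue
    private var token: AccessToken?
    private var pendingHandlers = [Handler]()

    init(loader: AccessTokenLoader,
         queue: DispatchQueue = .init(label: "AccessToken")) {
        self.loader = loader
        self.queue = queue
    }

    func retrieveToken(then handler: @escaping Handler) {
        queue.async { [weak self] in
            self?.performRetrieval(with: handler)
        }
    }

    private func performRetrieval(with handler: @escaping Handler) {
        if let token = token, token.isValid {
            return handler(token)
        }

        pendingHandlers.append(handler)

        guard pendingHandlers.count == 1 else {
            return
        }

        loader.load { [weak self] token in
            self?.queue.async {
                self?.handle(token)
            }
        }
    }

    private func handle(_ token: AccessToken) {
        self.token = token
        let handlers = pendingHandlers
        pendingHandlers = []
        handlers.forEach { $0(token) }
    }
}

let loader = AccessTokenLoader()
let service = AccessTokenService(loader: loader)
let group = DispatchGroup()

// Call the service from several threads at the same time:
DispatchQueue.concurrentPerform(iterations: 10) { _ in
    group.enter()
    service.retrieveToken { _ in
        group.leave()
    }
}

group.wait()
print(loader.loadCount) // 1 - still only a single request
```

Since every piece of internal state is only ever touched on the serial queue, it no longer matters which threads the calls come from.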

The good news is that we can also quite easily test that our threading and queueing logic works as it should, since we're injecting the DispatchQueue to use in our service's initializer. For more on how to write that kind of test, check out "Unit testing asynchronous Swift code".
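As a brief illustration of that technique - using a hypothetical Counter class rather than repeating the full service - injecting the queue lets a test flush it with an empty synchronous dispatch, which only returns once all previously enqueued work has run:

```swift
import Foundation

// Hypothetical queue-based class that accepts an injected queue,
// just like our service does.
class Counter {
    private let queue: DispatchQueue
    private(set) var value = 0

    init(queue: DispatchQueue) {
        self.queue = queue
    }

    func increment(then handler: @escaping (Int) -> Void) {
        queue.async {
            self.value += 1
            handler(self.value)
        }
    }
}

// In a test, we inject a queue that we control...
let queue = DispatchQueue(label: "Test")
let counter = Counter(queue: queue)
var observed: Int?

counter.increment { observed = $0 }

// ...then flush it before asserting, since the serial queue runs
// this empty closure only after the increment has completed.
queue.sync {}
print(observed ?? -1) // 1
```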

Conclusion

While there's likely no way to completely avoid race conditions, using techniques like queuing and the power of Grand Central Dispatch can let us write code that is much more predictable and less prone to race condition-based errors. Whenever we're writing some form of asynchronous code, it really helps to think twice about how that code will behave when called concurrently, and to put mechanisms in place to make sure all operations are performed (and completed) in a predictable order.

Does that mean that we should write all of our code in a completely thread-safe manner? I personally don't think so. While thread safety comes very much in handy in core services like the AccessTokenService we worked on in this article, chances are that much of our code will never be used outside of the main thread - so adding complete thread safety everywhere might often become an exercise in over-engineering.

As always, which techniques are appropriate depends a lot on what we're building - and just keeping thread safety in mind as we're designing APIs and their implementations can often take us a long way toward making our code much more robust.

What do you think? Did you have to solve any tricky race conditions lately, and what are your favorite techniques for avoiding them? Let me know - along with your questions, comments or feedback - on Twitter @johnsundell.

Thanks for reading! πŸš€