Articles, podcasts and news about Swift development, by John Sundell.

Querying collections in Swift

Published on 26 Apr 2020
Discover page available: The Standard Library

When working with collections, such as arrays or dictionaries, it’s very common to want to query them for information about the values that they contain. We might need to find the first element that matches a given set of criteria, validate all values according to a set of requirements, and so on.

In situations like these, it might initially seem like we always need to write purpose-built algorithms that perform each query using a classic for or while loop, and while that’s certainly a fine approach in some cases, it often turns out that writing such manual iterations can be quite unnecessary.

This week, let’s explore why that is — by taking a look at some of the Swift standard library’s built-in algorithms for querying collections, and how we can compose and combine them to form a nearly infinite number of different queries.

Searching for a match

Let’s start by taking a look at a very common type of situation, in which we want to query a collection in order to find the first element that matches a given predicate.

As an example, let’s say that we’re working on a movie recommendation app, and that when the user taps a button for showing the next movie within a given genre, we search a movieQueue array and display the first match — like this:

func showNextMovie(inGenre genre: Genre) {
    var match: Movie?

    for movie in movieQueue {
        if movie.genre == genre {
            // We've found our match, so we'll break the iteration
            match = movie
            break
        }
    }

    guard let movie = match else {
        return showNoMovieFoundView()
    }

    showMovieView(for: movie)
}

While the above code works, it could definitely be implemented in a more compact way, which in this case should also improve its readability.

For starters, let’s get rid of that local match variable, by moving our for loop into its own, dedicated function. Like we took a look at in “Pure functions in Swift”, structuring logic as functions that simply operate on an input/output basis can often improve our code — both in terms of how easy it is to read, as well as how testable our overall system becomes.

While our new function won’t be entirely “pure” (since it’ll still rely on the state of our movieQueue array), it’ll now simply return the first matching Movie that was found, rather than assigning it to a local variable:

func nextMovie(inGenre genre: Genre) -> Movie? {
    for movie in movieQueue {
        if movie.genre == genre {
            return movie
        }
    }

    return nil
}

With the above in place, we can now heavily simplify our showNextMovie function from before — as it now only needs to contain the logic that determines which view that should be shown for the current state of the app:

func showNextMovie(inGenre genre: Genre) {
    guard let movie = nextMovie(inGenre: genre) else {
        return showNoMovieFoundView()
    }

    showMovieView(for: movie)
}

However, using the power of the Swift standard library, we can simplify our code even further. If we take a closer look at our nextMovie function and the for loop that it contains, there’s really nothing about it that’s specific to movies — it simply iterates over an array and returns the first element that matches a given predicate.

To slightly twist the classic App Store slogan: “There’s a collection API for that”. In this case, we can simply replace our custom nextMovie function with a call to the built-in first(where:) method, which lets us pass a predicate to match against. Combine that with the Optional type’s map method — and we can implement our entire showNextMovie function like this instead:

func showNextMovie(inGenre genre: Genre) {
    let movie = movieQueue.first(where: { $0.genre == genre })
    movie.map(showMovieView) ?? showNoMovieFoundView()
}

Just like that, we’ve essentially boiled down a 10+ line algorithm to just two lines of code, by leveraging the standard library’s built-in APIs and algorithms. Pretty nice!

There’s also a last(where:) equivalent to the API used above, as well as firstIndex(where:) and lastIndex(where:) variants that return indexes rather than elements. However, the “last” variants are only available on collections that conform to the BidirectionalCollection protocol, such as Array.

Verifying requirements

Not all collection queries are about retrieving an element, though — sometimes we might just want to verify that a series of values meet a certain requirement.

For example, let’s now say that we’re working on an app for scheduling and managing events, and that the app includes a feature that lets each participant mark whether they’re ready for the event to begin. We’ve then extended our app’s Event model with a method that lets us easily check whether all participants have marked themselves as ready — which currently looks like this:

extension Event {
    func isReadyToBegin() -> Bool {
        for participant in participants {
            guard participant.isReady else {
                return false
            }
        }

        return true
    }
}

The reason the above API was implemented as a method, and not a computed property, is because its time complexity isn’t O(1). To learn more about that philosophy, check out “Computed properties in Swift”.

However, just like before, there’s a much easier way to implement the above — this time by leveraging the standard library’s allSatisfy API, which (like the name implies) enables us to check whether all of a collection’s element satisfy a given predicate. Combine that with the fact that key paths can now be passed as functions, and we can replace our entire implementation with just a single line of code — like this:

extension Event {
    func isReadyToBegin() -> Bool {
        participants.allSatisfy(\.isReady)
    }
}

With that change in place, the question is whether we still need to keep the above method at all, given that it now simply contains a call to another API. If we’re only looking to use it in a single place, we might as well just inline the call to allSatisfy directly at the call site — however, our above method does add some additional context, as its name makes it crystal clear that we’re checking whether an event is ready to begin, so it still might make sense to keep it.

Minimum and maximum values

Next, let’s take a look at another common type of query — computing minimum and maximum values from a series of elements. This time, let’s say that we’re building a game, and that we’d like to add a convenience API for computing the maximum score among an array of players for a given level. If we again start with a manually implemented algorithm, we might write something like this:

extension Level {
    func maximumPlayerScore() -> Int {
        var maximumScore = 0

        for player in players {
            maximumScore = max(maximumScore, player.score)
        }

        return maximumScore
    }
}

While the max function that we use above is (along with its min equivalent) incredibly common and can be found in a wide range of programming languages, Swift also includes a variant of it that can be called directly on a collection. All that we have to do to use it is to pass a closure that sorts the collection’s elements in ascending order — like this:

extension Level {
    func maximumPlayerScore() -> Int {
        // Note that we need to sort the collection in ascending
        // order, which is why we use < to perform our comparsion:
        let winningPlayer = players.max { $0.score < $1.score }
        return winningPlayer?.score ?? 0
    }
}

An alternative approach to the above would be to first sort the players array by each player’s score property, and then simply pick the first element. However, doing so would be worse in terms of time complexity, since min and max can both be executed in pure linear (O(n)) time, while sort has a time complexity of O(n log n).

Composing custom algorithms

So far in this article, we’ve mainly focused on replacing custom collection algorithms with standard library API calls, and while it’s certainly beneficial to do so whenever possible, that’s not always the case.

Let’s go back to our Event model from before, and take a look at another API that’s also implemented as a collection query. This one checks whether a set of users are all present within an current event’s list of participants, and currently looks like this:

extension Event {
    func participantsContain(_ users: Set<User>) -> Bool {
        for user in users {
            guard participants.contains(where: { $0.userID == user.id }) else {
                return false
            }
        }

        return true
    }
}

While in this case there’s no equivalent standard library API that we can simply replace the above code with, there’s still a way in which we could improve it.

The beauty of the algorithms that come built-in as part of the standard library is that they’re all very focused and narrow — which in turn means that they can often be composed in order to form higher-level logic. So we don’t need to necessarily find a single API that matches an algorithm as a whole, we simply need to find one match for each of its components.

Looking at our above algorithm, we’re already using the contains(where:) API to form our inner loop, while our outer iteration still uses a classic for loop. However, if we take a closer look, it turns out that the outer loop has the exact same shape as the one that we initially used within our isReadyToBegin method from before — in that it checks that all elements within the passed set of users satisfy a given requirement.

While we could now go back to our participantsContain method and refactor our code right there, let’s in this case implement our logic as a generic algorithm instead — by composing allSatsify and contains(where:) into a new method for the Sequence protocol (which all collections and other sequences conform to):

extension Sequence {
    func contains<T: Sequence>(
        _ values: T,
        matchedBy matcher: (Element, T.Element) -> Bool
    ) -> Bool {
        values.allSatisfy { value in
            contains(where: { matcher($0, value) })
        }
    }
}

With the above in place, we can now go back to our Event type and simply make its participantsContain method call our new API by passing its predicate as a closure — like this:

extension Event {
    func participantsContain(_ users: Set<User>) -> Bool {
        participants.contains(users, matchedBy: {
            $0.userID == $1.id
        })
    }
}

Choosing whether to implement a piece of logic as either a generic algorithm or something that’s specifically written for a given use case can occasionally be quite difficult. However, a rule of thumb to keep in mind is that whenever our logic is simply a composition of other generic algorithms — it probably makes sense for that new algorithm to be generic too.

Conclusion

The Swift standard library contains a number of incredibly useful algorithms, functions and other APIs — and making full use of them not only reduces the amount of code that we need to maintain ourselves, it also lets us build our logic on top of shared code that’s well-tested and deployed within every single Swift program on the planet.

There are of course many more Collection APIs than those that were highlighted in this article, so I both recommend checking out the articles covering slicing and transformations if you haven’t done so already, and to explore the Sequence and Collection protocols yourself — chances are high that you’ll find at least one way that one of your algorithms could be simplified.

Got questions, comments or feedback? You’re always more than welcome to contact me, either via Twitter or email.

Thanks for reading! 🚀