Articles, podcasts and news about Swift development, by John Sundell.

Slicing Swift collections

Published on 02 Feb 2020
Discover page available: The Standard Library

One of Swift’s goals is to be a truly general-purpose programming language — that gracefully scales from high-level tasks, like building UIs and scripting, to low-level systems programming. That’s quite an ambitious goal, and arguably one that’s yet to be fully achieved, but there are a number of aspects of Swift’s design that does give it very scalable characteristics.

One such aspect is how the standard library takes great care to make working with its built-in collections as efficient as possible — by minimizing the number of circumstances in which their elements need to be copied, mutated and moved.

However, like with most optimizations, fully utilizing those behaviors does also require us to write our code in specific ways — especially when accessing slices or subsets of a given collection. That’s exactly what we’ll take a look at this week.

A slice of a binary pie

In Swift, a slice is a special kind of collection that doesn’t actually store any elements of its own, but rather acts as a proxy (or view) to enable us access and work with a subset of another collection as a separate instance.

As a very simple example, let’s say that we have an array containing ten numbers, and that we wish to extract the first five numbers in order to work with them separately. That can be done using Range-based subscripting, like this:

let numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let firstFive = numbers[..<5]

At first glance, it might seem like firstFive would have the same type as numbers, which is Array<Int>, but it actually doesn’t. In fact, what we’ve done above is to create a slice, in this case of type ArraySlice<Int>.

Rather than copying the first five elements into a new Array instance, the standard library instead simply provided us with a view into that range of elements — which has significant performance benefits, especially when working with larger collections.

By not performing any copying or additional memory allocation for a collection’s elements, a slice can be created in constant (O(1)) time. Not only does that make creating a single slice faster, it also makes further slicing just as fast as if we performed it on the original collection — giving us a lot of freedom when it comes to how we use and compose Swift’s various slicing APIs in practice.

Prefixes and suffixes

Let’s start by taking a look at how we can use slicing to extract prefixes and suffixes from a collection. As an example, let’s say that we’re working on a todo app, which uses the following model to represent one of its todo lists:

struct TodoList {
    var name: String
    var items = [Item]()
    ...
}

Now let’s say that we’re building a feature that lets our users quickly view the top three items within any given list — for example within a “Today Extension” widget on iOS or macOS. To make that happen, we could use the same subscripting API as we used when slicing our above array of numbers, like this:

extension TodoList {
    var topItems: ArraySlice<Item> {
        items[..<3]
    }
}

However, while the above looks very elegant from a syntax point of view, it’s a quite dangerous implementation in this situation. Since we can’t know how many items that each TodoList will actually contain, our app might end up crashing when accessing the above property — since just like when retrieving a single element from an array, range-based subscripting also causes a crash if used with out-of-bounds indexes.

While we could of course add our own bounds-checking to our implementation, we don’t have to — since Array (and all other Collection implementations) has a prefix method that does just that:

extension TodoList {
    var topItems: ArraySlice<Item> {
        items.prefix(3)
    }
}

Now our new API will work as expected, even if a TodoList contains less than 3 items, and our implementation still has O(1) time complexity — which means that we can comfortably let it remain a computed property without the risk of causing accidental bottlenecks. Fantastic!

However, one thing that we have to keep in mind when working with slices is that they truly are a separate types compared to their original collections. That means that we can’t pass an ArraySlice instance to any API that accepts a standard Array, and vice versa — without first performing an explicit conversion. That may at first seem like an unnecessary inconvenience, but it’s actually really important — as it gives us complete control over when a slice should be separated (and its elements copied) from its original collection.

Let’s take a look at an example of doing just that, in which we’re using another “flavor” of the standard library’s prefix API (along with its suffix counterpart) to split a shipment of packages up into two separate shipments based on indexes. Since we don’t want our Shipment model to contain an ArraySlice, but rather a proper array of packages, we have to convert each of our two slices back into Array<Package> values — like this:

struct Shipment {
    var destination: Address
    var packages = [Package]()
    ...
}

extension Shipment {
    func split() -> (first: Shipment, second: Shipment) {
        guard packages.count > 1 else {
            return (self, Shipment(destination: destination))
        }

        let splitIndex = packages.count / 2

        return (
            Shipment(
                destination: destination,
                packages: Array(packages.prefix(upTo: splitIndex))
            ),
            Shipment(
                destination: destination,
                packages: Array(packages.suffix(from: splitIndex))
            )
        )
    }
}

The above calls to prefix and suffix are equivalent to the range-based subscripts packages[..<splitIndex] and packages[splitIndex...], so which one we’ll use is simply a matter of taste in this situation.

So far, we’ve been computing our prefixes and suffixes based on element counts and indexes, but we can also use completely custom logic when doing so as well. For example, here we’re calling prefix with a custom closure to determine which of a game’s top players that have scored more than 100,000 points:

let qualifiedPlayers = topPlayers.prefix { $0.score > 100_000 }

The above implementation assumes that the topPlayers array is in order according to each player’s score.

Dropping elements

An important aspect of both prefix and suffix is that neither of those APIs affect the original collection that they’re being called on, and instead return new instances for working with those subsets of elements. The same is also true for the drop family of APIs, with the only difference that they subtract a given prefix or suffix from a collection, rather than extracting it.

As an example, let’s say that we wanted to remove any numbers that appear at the beginning of a string, for example to prepare a string to be used as some form of normalized identifier. To do that, we could ask the string in question to drop all elements while the current element is a number — like this:

extension StringProtocol {
    func trimmingLeadingNumbers() -> SubSequence {
        drop(while: { $0.isNumber })
    }
}

Note how we extend StringProtocol above, and not String directly. That’s to enable our new API to be used on both standard strings, as well as the Substring type, which represents as slice of a String. To learn more about that, check out the Basics article about strings.

As mentioned earlier, one of the major benefits of Swift’s slice-based collection APIs is that they can be composed without causing any unnecessary copying. That’s really beneficial in situations like the one below, in which we’re composing our above trimmingLeadingNumbers method with a call to filter in order to normalize a username value:

func normalizeUsername(_ username: String) -> String {
    username.trimmingLeadingNumbers().filter {
        $0.isLetter || $0.isNumber
    }
}

Using the above approach, we can construct increasingly complex value transformations simply by chaining a series of operations together — all in a highly performant manner. Really cool.

Along those same lines, let’s take a look at how we could combine another drop variant — dropFirst — with prefix to easily add pagination support to any BidirectionalCollection (which includes types like Array, Range, and so on). By first calling dropFirst to remove all elements prior to where the current page begins, and then using prefix to extract a slice of the same size as our page size, we can implement our pagination extension like this:

extension BidirectionalCollection {
    func page(withIndex pageIndex: Int, size: Int) -> SubSequence {
        dropFirst(pageIndex * size).prefix(size)
    }
}

Going back to our TodoList type from before, we can then wrap the above API in a slightly more high-level abstraction, giving us a really nice pagination method that can be used to display any list of todo items in a page-by-page manner:

extension TodoList {
    func page(at index: Int) -> ArraySlice<Item> {
        items.page(withIndex: index, size: 25)
    }
}

At first, it might seem like a strange decision to return an ArraySlice from the above API, instead of converting the result into a proper Array. However, by doing so we’re following the same conventions as the standard library does — enabling the call site to decide how and when to convert each slice, which in turn enables us to perform additional chaining without performance penalties.

Splitting things up

Finally, let’s take a look at a third variant of collection slicing — splitting, which is a technique that’s incredibly commonly used when working with strings. Swift offers two main ways of splitting strings — the String type’s own split method, as well as Foundation’s components(separatedBy:) API (which was inherited from Objective-C’s NSString):

let lines = text.components(separatedBy: "\n")
let lines = text.split(separator: "\n")

While the above two calls may seem like they’re doing the exact same thing, they’re actually quite different when we start looking at the details. For starters, the first variant accepts a String to use as a separator, while the second accepts a single Character. The first also returns an array of strings, compared to the second, which returns an array of Substring values — which gives the second variant much better performance characteristics in scenarios that don’t require any copying.

Another important difference is that split gives us the option to limit the amount of splits that’ll occur, which can help us boost performance in situations when we want to extract only a limited number of substrings from a much larger string. For example, here we’re extracting the first five lines from a large body of text, by combining split with dropLast() (in order to remove the last component, which represents the remainder of the text):

let firstLines = text.split(
    separator: "\n",
    maxSplits: 5,
    omittingEmptySubsequences: true
).dropLast()

What’s really cool about split is that it’s not just a String API — just like prefix, suffix and the various drop methods we’ve taken a look at so far, it can be used with any collection. For example, here’s how we could use it to split an array of analytics events up into sessions, based on when the app moved to and from the background:

enum Event: Equatable {
    case appEnteredForeground
    case appEnteredBackground
    case impression(contentID: ContentID)
    case interaction(contentID: ContentID)
    ...
}

extension Array where Element == Event {
    func sessions() -> [ArraySlice<Event>] {
        split(omittingEmptySubsequences: true) {
            $0 == .appEnteredForeground ||
            $0 == .appEnteredBackground
        }
    }
}

The above is another example of when the Swift standard library’s protocol-oriented design really shines, as it gives us access to many different algorithms and pieces of functionality in a way that’s not directly tied to any concrete types. Not only can the APIs we’ve taken a look at in this article be used with any built-in collection, they can also be used with custom ones as well.

Conclusion

Swift’s focus on “algorithmic correctness” is something that, at times, may seem like an over-optimization — especially when dealing with smaller amounts of data typically found in simpler client/server-based iOS apps. However, this design does give Swift a lot more versatility, and can also often guide us to adopt better practices when it comes to how we handle our data.

By limiting the amount of implicit copying and memory allocation that occurs when working with collections, we ultimately gain more control over our core data structures — which in turn can help us boost the overall performance of the apps that we build, even though it might require a few conversions here and there.

What do you think? Do you use collection slices when writing Swift code, or do you mostly consider them an implementation detail of the standard library? Let me know — along with your questions, comments or feedback — either via Twitter or email.

Thanks for reading! 🚀