Articles, podcasts and news about Swift development, by John Sundell.

Managing self and cancellable references when using Combine

Published on 05 Feb 2021

Memory management is often especially tricky when dealing with asynchronous operations, as those tend to require us to keep certain objects in memory beyond the scope in which they were defined, while still making sure that those objects eventually get deallocated in a predictable way.

Although Apple’s Combine framework can make it somewhat simpler to manage such long-living references — as it encourages us to model our asynchronous code as pipelines, rather than a series of nested closures — there are still a number of potential memory management pitfalls that we have to constantly look out for.

In this article, let’s take a look at how some of those pitfalls can be avoided, specifically when it comes to self and Cancellable references.

A cancellable manages the lifetime of a subscription

Combine’s Cancellable protocol (which we typically interact with through its type-erased AnyCancellable wrapper) lets us control how long a given subscription should stay alive and active. As its name implies, a cancellable lets us cancel the subscription that it’s tied to, and an AnyCancellable also does so automatically as soon as it’s deallocated. That’s why almost all of Combine’s subscription APIs (like sink) return an AnyCancellable when called.
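To see that lifetime rule in isolation, here is a minimal sketch that uses a PassthroughSubject instead of a real asynchronous source. The valuesReceivedAroundRelease function is a hypothetical helper introduced only for this illustration:

```swift
import Combine

// Hypothetical helper for this sketch: records which values reach a
// subscriber before and after its AnyCancellable is released.
func valuesReceivedAroundRelease() -> [Int] {
    let subject = PassthroughSubject<Int, Never>()
    var cancellables = Set<AnyCancellable>()
    var received = [Int]()

    subject
        .sink { received.append($0) }
        .store(in: &cancellables)

    subject.send(1)           // Delivered: the token is still retained.
    cancellables.removeAll()  // Releases the token, which cancels the subscription.
    subject.send(2)           // Dropped: there is no active subscriber left.

    return received
}

print(valuesReceivedAroundRelease()) // [1]
```

Only the first value arrives, since releasing the last reference to the AnyCancellable tears down the subscription.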

For example, the following Clock type holds a strong reference to the AnyCancellable instance that it gets back from calling sink on a Timer publisher, which keeps that subscription active for as long as its Clock instance remains in memory — unless the cancellable is manually removed by setting its property to nil:

import Combine
import Foundation

class Clock: ObservableObject {
    @Published private(set) var time = Date().timeIntervalSince1970
    private var cancellable: AnyCancellable?

    func start() {
        cancellable = Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .sink { date in
            self.time = date.timeIntervalSince1970
        }
    }

    func stop() {
        cancellable = nil
    }
}

However, while the above implementation perfectly manages its AnyCancellable instance and the Timer subscription that it represents, it does have quite a major flaw in terms of memory management. Since we’re capturing self strongly within our sink closure, and since our cancellable (which is, in turn, owned by self) will keep that subscription alive for as long as it remains in memory, we’ll end up with a retain cycle — or in other words, a memory leak.
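One way to observe that kind of leak is to keep only a weak reference to an instance and check whether it survives its last owner. The following is a rough sketch under that assumption. LeakyCounter and probeLeak are hypothetical names, and a PassthroughSubject stands in for the Timer publisher so that the example runs synchronously:

```swift
import Combine

// Hypothetical stand-in for Clock: it owns its cancellable, and its sink
// closure captures self strongly, just like the implementation above.
final class LeakyCounter {
    private(set) var total = 0
    private var cancellable: AnyCancellable?

    func start(observing source: PassthroughSubject<Int, Never>) {
        cancellable = source.sink { value in
            self.total += value
        }
    }
}

// Hypothetical probe: reports whether a counter outlives its last owner.
func probeLeak(
    using source: PassthroughSubject<Int, Never>
) -> (stillAlive: Bool, total: Int) {
    weak var probe: LeakyCounter?

    do {
        let counter = LeakyCounter()
        counter.start(observing: source)
        probe = counter
    } // The only outside strong reference goes away here.

    // The leaked instance is still subscribed, so it keeps receiving values.
    source.send(3)

    return (probe != nil, probe?.total ?? 0)
}

let source = PassthroughSubject<Int, Never>()
let result = probeLeak(using: source)
print(result.stillAlive, result.total) // true 3
```

The instance stays in memory and keeps processing values even though nothing outside of the cycle references it anymore.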

Avoiding self-related memory leaks

An initial idea on how to fix that problem might be to instead use Combine’s assign operator (along with a quick Date-to-TimeInterval transformation using map) to be able to assign the result of our pipeline directly to our clock’s time property, like this:

class Clock: ObservableObject {
    ...

    func start() {
        cancellable = Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .map(\.timeIntervalSince1970)
        .assign(to: \.time, on: self)
    }

    ...
}

However, the above approach will still cause self to be retained, as the assign operator keeps a strong reference to each object that’s passed to it. Instead, with our current setup, we’ll have to resort to a good old-fashioned “weak self dance” in order to capture a weak reference to our enclosing Clock instance, which will break our retain cycle:

class Clock: ObservableObject {
    ...

    func start() {
        cancellable = Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .map(\.timeIntervalSince1970)
        .sink { [weak self] time in
            self?.time = time
        }
    }

    ...
}

With the above in place, each Clock instance can now be deallocated once it’s no longer referenced by any other object, which in turn will cause our AnyCancellable to be deallocated as well, and our Combine pipeline will be properly dissolved. Great!
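To compare the two approaches side by side, here is a small sketch that runs both through the same deallocation check. Counter and isDeallocated are hypothetical names made up for this example, and a PassthroughSubject again replaces the Timer publisher:

```swift
import Combine

// Hypothetical type offering both subscription styles discussed above.
final class Counter {
    var total = 0
    private var cancellable: AnyCancellable?

    // Variant 1: assign(to:on:) keeps a strong reference to self.
    func startAssigning(from source: PassthroughSubject<Int, Never>) {
        cancellable = source.assign(to: \.total, on: self)
    }

    // Variant 2: sink with a weak capture of self.
    func startWeakly(from source: PassthroughSubject<Int, Never>) {
        cancellable = source.sink { [weak self] value in
            self?.total = value
        }
    }
}

// Returns true if a Counter configured by `start` gets deallocated
// once its last outside reference goes away.
func isDeallocated(after start: (Counter) -> Void) -> Bool {
    weak var probe: Counter?

    do {
        let counter = Counter()
        start(counter)
        probe = counter
    }

    return probe == nil
}

let source = PassthroughSubject<Int, Never>()
print(isDeallocated(after: { $0.startAssigning(from: source) })) // false
print(isDeallocated(after: { $0.startWeakly(from: source) }))    // true
```

A check like this can also be turned into a unit test, which is one possible way to catch retain cycles before they ship.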

Assigning output values directly to a Published property

Another option that can be great to keep in mind is that (as of iOS 14 and macOS Big Sur) we can also connect a Combine pipeline directly to a published property. However, while doing so can be incredibly convenient in a number of different situations, that approach doesn’t give us an AnyCancellable back — meaning that we won’t have any means to cancel such a subscription.

In the case of our Clock type, we might still be able to use that approach — if we’re fine with removing our start and stop methods, and instead automatically start each clock upon initialization, since otherwise we might end up with duplicate subscriptions. If those are tradeoffs that we’re willing to accept, then we could change our implementation into this:

class Clock: ObservableObject {
    @Published private(set) var time = Date().timeIntervalSince1970

    init() {
        Timer.publish(
            every: 1,
            on: .main,
            in: .default
        )
        .autoconnect()
        .map(\.timeIntervalSince1970)
        .assign(to: &$time)
    }
}

When calling the above flavor of assign, we’re passing a direct reference to our Published property’s projected value, prefixed with an ampersand so that it’s passed as a mutable, in-place reference (since that version of assign declares its parameter using the inout keyword). To learn more about that pattern, check out the Basics article about value and reference types.

The beauty of the above approach is that Combine will now automatically manage our subscription based on the lifetime of our time property — meaning that we’re still avoiding any reference cycles while also significantly reducing the amount of bookkeeping code that we have to write ourselves. So for pipelines that are only configured once, and are directly tied to a Published property, using the above overload of the assign operator can often be a great choice.
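As a quick way to check that behavior, the sketch below connects a PassthroughSubject to a Published property and then verifies both that values arrive and that the owning object can still be deallocated. PublishedCounter and probePublishedAssign are hypothetical names, and the example assumes a deployment target of iOS 14 or macOS Big Sur or later:

```swift
import Combine

// Hypothetical type that wires its input straight into a Published property.
final class PublishedCounter: ObservableObject {
    @Published private(set) var total = 0

    init(source: PassthroughSubject<Int, Never>) {
        // No AnyCancellable is returned, so there is nothing to store.
        source.assign(to: &$total)
    }
}

// Hypothetical probe: returns the last observed value, and whether the
// counter was deallocated once its only owner went out of scope.
func probePublishedAssign(
    using source: PassthroughSubject<Int, Never>
) -> (lastValue: Int, deallocated: Bool) {
    weak var probe: PublishedCounter?
    var lastValue = 0

    do {
        let counter = PublishedCounter(source: source)
        probe = counter
        source.send(5)
        lastValue = counter.total
    }

    return (lastValue, probe == nil)
}

let source = PassthroughSubject<Int, Never>()
let result = probePublishedAssign(using: source)
print(result.lastValue, result.deallocated) // 5 true
```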

Weak property assignments

Next, let’s take a look at a slightly more complex example, in which we’ve implemented a ModelLoader that lets us load and decode a Decodable model from a given URL. By using a single cancellable property, our loader can automatically cancel any previous data loading pipeline when a new one is triggered — as any previously assigned AnyCancellable instance will be deallocated when that property’s value is replaced.
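That replacement behavior can be sketched without any networking. In the example below, LatestOnlyObserver is a hypothetical type that mimics the loader’s single cancellable property, with two PassthroughSubject instances playing the role of an old and a new request:

```swift
import Combine

// Hypothetical loader-like type: it keeps a single cancellable, so starting
// a new observation replaces (and thereby cancels) the previous one.
final class LatestOnlyObserver {
    private(set) var log = [String]()
    private var cancellable: AnyCancellable?

    func observe(_ source: PassthroughSubject<String, Never>, label: String) {
        cancellable = source.sink { [weak self] value in
            self?.log.append("\(label): \(value)")
        }
    }
}

let first = PassthroughSubject<String, Never>()
let second = PassthroughSubject<String, Never>()
let observer = LatestOnlyObserver()

observer.observe(first, label: "first")
observer.observe(second, label: "second") // Releases the first subscription.

first.send("a")  // Dropped: that subscription was cancelled.
second.send("b") // Delivered: this is the current subscription.

print(observer.log) // ["second: b"]
```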

Here’s what that ModelLoader type currently looks like:

class ModelLoader<Model: Decodable>: ObservableObject {
    enum State {
        case idle
        case loading
        case loaded(Model)
        case failed(Error)
    }

    @Published private(set) var state = State.idle

    private let url: URL
    private let session: URLSession
    private let decoder: JSONDecoder
    private var cancellable: AnyCancellable?

    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .sink { [weak self] state in
                self?.state = state
            }
    }
}

That automatic cancellation of old requests prevents us from simply connecting the output of our data loading pipeline to our Published property, since that version of assign doesn’t give us a cancellable that we could replace. However, if we wanted to avoid having to manually capture a weak reference to self every time that we use the above pattern (that is, loading a value and assigning it to a property), we could introduce the following Publisher extension, which adds a weak-capturing version of the standard assign operator that we took a look at earlier:

extension Publisher where Failure == Never {
    func weakAssign<T: AnyObject>(
        to keyPath: ReferenceWritableKeyPath<T, Output>,
        on object: T
    ) -> AnyCancellable {
        sink { [weak object] value in
            object?[keyPath: keyPath] = value
        }
    }
}

With the above in place, we can now simply call weakAssign whenever we want to assign the output of a given publisher to a property of an object that’s captured using a weak reference — like this:

class ModelLoader<Model: Decodable>: ObservableObject {
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}

Is that new weakAssign method purely syntactic sugar? Yes. But is it nicer than what we were using before? Also yes 🙂
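For a quick sanity check of that operator outside of ModelLoader, here is a self-contained sketch. The weakAssign extension is repeated from above so that the example compiles on its own, while TitleViewModel and probeWeakAssign are hypothetical names used only for illustration:

```swift
import Combine

// The weakAssign operator from above, repeated to keep this sketch self-contained.
extension Publisher where Failure == Never {
    func weakAssign<T: AnyObject>(
        to keyPath: ReferenceWritableKeyPath<T, Output>,
        on object: T
    ) -> AnyCancellable {
        sink { [weak object] value in
            object?[keyPath: keyPath] = value
        }
    }
}

// Hypothetical type that binds a publisher to one of its own properties.
final class TitleViewModel {
    var title = ""
    private var cancellable: AnyCancellable?

    func bind(to source: PassthroughSubject<String, Never>) {
        cancellable = source.weakAssign(to: \.title, on: self)
    }
}

// Hypothetical probe: returns the assigned title, and whether the view
// model was deallocated after its only owner went out of scope.
func probeWeakAssign(
    using source: PassthroughSubject<String, Never>
) -> (title: String, deallocated: Bool) {
    weak var probe: TitleViewModel?
    var title = ""

    do {
        let viewModel = TitleViewModel()
        viewModel.bind(to: source)
        probe = viewModel
        source.send("Hello")
        title = viewModel.title
    }

    return (title, probe == nil)
}

let source = PassthroughSubject<String, Never>()
let result = probeWeakAssign(using: source)
print(result.title, result.deallocated) // Hello true
```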

Capturing stored objects, rather than self

Another type of situation that’s quite commonly encountered when working with Combine is when we need to access a specific property within one of our operators, for example in order to perform a nested asynchronous call.

To illustrate, let’s say that we wanted to extend our ModelLoader by using a Database to automatically store each model that was loaded — an operation that also wraps those model instances using a generic Stored type (for example in order to add local metadata, such as an ID or model version). To be able to access that database instance within an operator like flatMap, we could once again capture a weak reference to self — like this:

class ModelLoader<Model: Decodable>: ObservableObject {
    enum State {
        case idle
        case loading
        case loaded(Stored<Model>)
        case failed(Error)
    }

    ...
    private let database: Database
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .flatMap {
                [weak self] model -> AnyPublisher<Stored<Model>, Error> in

                guard let database = self?.database else {
                    return Empty(completeImmediately: true)
                        .eraseToAnyPublisher()
                }

                return database.store(model)
            }
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}

We use the flatMap operator above because our database is also asynchronous, and returns another publisher that represents the current saving operation.

However, like the above example shows, it can sometimes be tricky to come up with a reasonable default value to return from an unwrapping guard statement placed within an operator like map or flatMap. Above we use Empty, which works, but it does add a substantial amount of extra verbosity to our otherwise quite elegant pipeline.

Thankfully, that problem is quite easy to fix (at least in this case). All that we have to do is to capture our database property directly, rather than capturing self. That way, we don’t have to deal with any optionals, and can now simply call our database’s store method within our flatMap closure — like this:

class ModelLoader<Model: Decodable>: ObservableObject {
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .flatMap { [database] model in
                database.store(model)
            }
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}

As an added bonus, we could even pass the Database method that we’re looking to call directly into flatMap in this case — since its signature perfectly matches the closure that flatMap expects within this context (and thanks to the fact that Swift supports first class functions):

class ModelLoader<Model: Decodable>: ObservableObject {
    ...

    func load() {
        state = .loading

        cancellable = session
            .dataTaskPublisher(for: url)
            .map(\.data)
            .decode(type: Model.self, decoder: decoder)
            .flatMap(database.store)
            .map(State.loaded)
            .catch { error in
                Just(.failed(error))
            }
            .receive(on: DispatchQueue.main)
            .weakAssign(to: \.state, on: self)
    }
}

So, when possible, it’s typically a good idea to avoid capturing self within our Combine operators, and to instead call other objects that can be stored and directly passed into our various operators as properties.
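The same idea can be tried out with a synchronous operator like map. In the sketch below, PriceFormatter, PriceViewModel, and probeCaptureList are all hypothetical names, and the formatter plays the role that the database played above:

```swift
import Combine

// Hypothetical dependency that a pipeline needs to call into.
final class PriceFormatter {
    func format(_ cents: Int) -> String {
        "\(cents) cents"
    }
}

final class PriceViewModel {
    private(set) var text = ""
    private let formatter: PriceFormatter
    private var cancellable: AnyCancellable?

    init(formatter: PriceFormatter) {
        self.formatter = formatter
    }

    func bind(to source: PassthroughSubject<Int, Never>) {
        cancellable = source
            // Only the formatter is captured here, not self, so there are
            // no optionals to unwrap and no reference back to the view model.
            .map { [formatter] cents in formatter.format(cents) }
            .sink { [weak self] text in self?.text = text }
    }
}

// Hypothetical probe: returns the formatted text, and whether the view
// model was deallocated after its only owner went out of scope.
func probeCaptureList(
    using source: PassthroughSubject<Int, Never>
) -> (text: String, deallocated: Bool) {
    weak var probe: PriceViewModel?
    var text = ""

    do {
        let viewModel = PriceViewModel(formatter: PriceFormatter())
        viewModel.bind(to: source)
        probe = viewModel
        source.send(250)
        text = viewModel.text
    }

    return (text, probe == nil)
}

let source = PassthroughSubject<Int, Never>()
let result = probeCaptureList(using: source)
print(result.text, result.deallocated) // 250 cents true
```

One thing to keep in mind is that the captured object is retained strongly by the closure for as long as the subscription is alive, which is fine as long as that object doesn’t itself own the subscription.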

Conclusion

While Combine offers many APIs and features that can help us make our asynchronous code easier to write and maintain, it still requires us to be careful when it comes to how we manage our references and their underlying memory. Capturing a strong reference to self in the wrong place can still often lead to a retain cycle, and if a Cancellable is not properly deallocated, then a subscription might stay active for longer than expected.

Hopefully this article has given you a few new tips and techniques that you can use to prevent memory-related issues when working with self and Cancellable references within your Combine-based projects, and if you have any questions, comments, or feedback, then feel free to reach out via either Twitter or email.

Thanks for reading!