Articles, podcasts and news about Swift development, by John Sundell.

Combining opaque return types with primary associated types

Published on 12 Nov 2022

Ever since Swift was first introduced, it’s been very common to need to use type erasure when working with generic protocols — ones that either reference Self within their requirements, or make use of associated types.

For example, in earlier versions of Swift, when using Apple’s Combine framework for reactive programming, every time we wanted to return a Publisher from a function or computed property, we had to first type-erase it by wrapping it within an AnyPublisher — like this:

import Combine
import Foundation

struct UserLoader {
    var urlSession = URLSession.shared
    var decoder = JSONDecoder()

    func loadUser(withID id: User.ID) -> AnyPublisher<User, Error> {
        urlSession
            .dataTaskPublisher(for: urlForLoadingUser(withID: id))
            .map(\.data)
            .decode(type: User.self, decoder: decoder)
            .eraseToAnyPublisher()
    }

    private func urlForLoadingUser(withID id: User.ID) -> URL {
        ...
    }
}

Type erasure had to be used in situations like that because simply declaring that our method returns something that conforms to the Publisher protocol wouldn’t give the compiler any information about what kind of output or errors the publisher emits.

Of course, an alternative to type erasure would be to declare the actual, concrete type that the above method returns. But when using frameworks that rely heavily on generics (such as Combine and SwiftUI), we very often end up with really complex nested types that would be very cumbersome to declare manually.
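To get a feel for how quickly such types grow, here’s a minimal sketch that uses the standard library’s lazy sequences as a stand-in for Combine (an assumption made purely so that the example runs without any framework imports). Chaining just two operations already produces a nested generic type:

```swift
// Stand-in for a Combine pipeline: two chained lazy operations.
// The `pipeline` name and the values are made up for illustration.
let pipeline = [1, 2, 3, 4].lazy
    .filter { $0 % 2 == 1 }   // keep the odd numbers
    .map { $0 * 10 }          // then multiply each one by ten

// Prints the concrete type that we would otherwise have to spell out
// by hand: a map sequence wrapping a filter sequence wrapping an array.
print(type(of: pipeline))

// The values themselves are easy to describe, even if the type isn't.
print(Array(pipeline)) // [10, 30]
```

Each additional operation wraps the previous type in another generic layer, which is the same effect that longer Combine operator chains have on a publisher’s type.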

This is a problem that was partially addressed in Swift 5.1, which introduced the some keyword and the concept of opaque return types. Those are very often used when building views using SwiftUI, as they let us leverage the compiler to infer which concrete View-conforming type is returned from a given view’s body:

struct ArticleView: View {
    var article: Article

    var body: some View {
        ScrollView {
            VStack(alignment: .leading) {
                Text(article.title).font(.title)
                Text(article.text)
            }
            .padding()
        }
    }
}

While the above way of using the some keyword works great in the context of SwiftUI, where we’re essentially just passing a given value into the framework itself (after all, we’re never expected to access the body property ourselves), it wouldn’t work that well when defining APIs for our own use.

For example, replacing the AnyPublisher return type with some Publisher (and removing the call to eraseToAnyPublisher) within our UserLoader from before would technically work in isolation, but would also make each call site unaware of what type of output our publisher produces, as we’d be dealing with a completely opaque Publisher type whose associated types are hidden from its callers:

struct UserLoader {
    ...

    func loadUser(withID id: User.ID) -> some Publisher {
        urlSession
            .dataTaskPublisher(for: urlForLoadingUser(withID: id))
            .map(\.data)
            .decode(type: User.self, decoder: decoder)
    }

    ...
}

UserLoader()
    .loadUser(withID: userID)
    .sink(receiveCompletion: { completion in
        ...
    }, receiveValue: { output in
        // We have no way of getting a compile-time guarantee
        // that the output argument here is in fact a User
        // value, so we'd have to use force-casting to turn
        // that argument into the right type:
        let user = output as! User
        ...
    })
    .store(in: &cancellables)

This is where Swift 5.7’s introduction of primary associated types comes in. If we take a look at the declaration of Combine’s Publisher protocol, we can see that it’s been updated to take advantage of this feature by declaring that its associated Output and Failure types are primary (by putting them in angle brackets right after the protocol’s name):

protocol Publisher<Output, Failure> {
    associatedtype Output
    associatedtype Failure: Error
    ...
}

That in turn enables us to use the some keyword in a brand new way: by declaring exactly which types our return value will use for each of the protocol’s primary associated types. So if we first update our UserLoader to use that new feature:

struct UserLoader {
    ...

    func loadUser(withID id: User.ID) -> some Publisher<User, Error> {
        urlSession
            .dataTaskPublisher(for: urlForLoadingUser(withID: id))
            .map(\.data)
            .decode(type: User.self, decoder: decoder)
    }

    ...
}

Then we’ll no longer be required to use force-casting at each call site — all while also avoiding any kind of manual type erasure, as the compiler will now retain full type safety all the way from our loadUser method to each of its call sites:

UserLoader()
    .loadUser(withID: userID)
    .sink(receiveCompletion: { completion in
        ...
    }, receiveValue: { user in
        // We're now getting a properly typed User
        // value passed into this closure.
        ...
    })
    .store(in: &cancellables)
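To experiment with the same idea without Combine, the standard library’s Sequence protocol (which declares Element as its primary associated type as of Swift 5.7) can stand in for Publisher. The following is just a sketch, and the evenNumbers function is a hypothetical example rather than anything from our UserLoader:

```swift
// The concrete type (a lazy filter over a range) stays hidden,
// but callers still know that the elements are Int values.
func evenNumbers(upTo limit: Int) -> some Sequence<Int> {
    (0...limit).lazy.filter { $0 % 2 == 0 }
}

var total = 0

for number in evenNumbers(upTo: 10) {
    // `number` is statically known to be an Int, so no casting is needed.
    total += number
}

print(total) // 30
```

Just like with some Publisher<User, Error>, the implementation is free to change which underlying type it returns without affecting any of its call sites.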

Of course, since primary associated types aren’t just a Combine-specific thing, but rather a proper Swift feature, we can also use the above pattern when working with our own generic protocols.

For example, let’s say that we’ve defined a Loadable protocol that lets us abstract different ways of loading a given value behind a single, unified interface (this time using Swift concurrency):

protocol Loadable<Value> {
    associatedtype Value

    func load() async throws -> Value
}

struct NetworkLoadable<Value: Decodable>: Loadable {
    var url: URL

    func load() async throws -> Value {
        // Load the value over the network
        ...
    }
}

struct DatabaseLoadable<Value: Identifiable>: Loadable {
    var id: Value.ID

    func load() async throws -> Value {
        // Load the value from the app's local database
        ...
    }
}

A big benefit of using a pattern like that is that it enables us to very neatly separate concerns, as each call site doesn’t have to be aware of exactly how a given value is loaded. We can simply return some Loadable from a given function, and thanks to our primary associated type, we get full type safety without having to reveal which underlying type is used to perform the actual loading:

func loadableForArticle(withID id: Article.ID) -> some Loadable<Article> {
    let url = urlForLoadingArticle(withID: id)
    return NetworkLoadable(url: url)
}

However, one important limitation of opaque return types is that the compiler requires all code paths within a scope that returns an opaque type to always return the exact same type. So, if we wanted to dynamically switch between two different Loadable implementations, then we’d get a compiler error if we tried to keep using the some keyword like we did above:

// Error: Function declares an opaque return type 'some Loadable<Article>',
// but the return statements in its body do not have matching underlying types.
func loadableForArticle(withID id: Article.ID) -> some Loadable<Article> {
    if useLocalData {
        return DatabaseLoadable(id: id)
    }

    let url = urlForLoadingArticle(withID: id)
    return NetworkLoadable(url: url)
}

One way to solve the above problem would be to use the good old-fashioned approach of introducing a type-erasing AnyLoadable type, which we could use to wrap both of our underlying Loadable instances. But at this point, that does arguably feel like a step backwards, since we’d have to write that type-erased wrapper manually. Or do we?
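For reference, a hand-written wrapper of that kind might look roughly like the following. This is only a sketch of one common closure-based way to implement type erasure; the ConstantLoadable type is a hypothetical stand-in that exists purely to make the example runnable:

```swift
protocol Loadable<Value> {
    associatedtype Value

    func load() async throws -> Value
}

// A manually written type-erased wrapper. It forgets the concrete
// type of the wrapped instance and only remembers how to load a value.
struct AnyLoadable<Value>: Loadable {
    private let performLoad: () async throws -> Value

    init<L: Loadable>(_ loadable: L) where L.Value == Value {
        performLoad = { try await loadable.load() }
    }

    func load() async throws -> Value {
        try await performLoad()
    }
}

// Hypothetical conforming type that simply returns a fixed value.
struct ConstantLoadable<Value>: Loadable {
    var value: Value

    func load() async throws -> Value { value }
}

let erased = AnyLoadable(ConstantLoadable(value: 42))
let loaded = try await erased.load()
print(loaded) // 42
```

That’s a fair amount of boilerplate for a protocol with a single requirement, and it would have to grow along with the protocol itself.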

It turns out that we can, in fact, keep leveraging the compiler even in these kinds of more dynamic situations — all that we have to do is replace the some keyword with Swift’s new any keyword, and the compiler will actually perform all of the required type erasure on our behalf:

func loadableForArticle(withID id: Article.ID) -> any Loadable<Article> {
    if useLocalData {
        return DatabaseLoadable(id: id)
    }

    let url = urlForLoadingArticle(withID: id)
    return NetworkLoadable(url: url)
}

Just like when using some in combination with primary associated types, using any retains full type safety, and still enables us to use all available Loadable APIs, and maintain complete awareness that the returned instance loads Article values. Neat!
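To try that out end to end, here’s a self-contained sketch. It makes a few assumptions for the sake of being runnable: the two Stub types are hypothetical replacements for NetworkLoadable and DatabaseLoadable, the Article model is made up, and useLocalData is passed in as a parameter rather than read from the surrounding scope:

```swift
protocol Loadable<Value> {
    associatedtype Value

    func load() async throws -> Value
}

// Hypothetical model, only used for this example.
struct Article: Identifiable, Equatable {
    var id: Int
    var title: String
}

// Hypothetical stand-ins that return canned data instead of
// talking to a network or a database.
struct StubNetworkLoadable: Loadable {
    var id: Article.ID

    func load() async throws -> Article {
        Article(id: id, title: "Loaded over the network")
    }
}

struct StubDatabaseLoadable: Loadable {
    var id: Article.ID

    func load() async throws -> Article {
        Article(id: id, title: "Loaded from the database")
    }
}

// The two branches return different concrete types, which `any` allows.
func loadableForArticle(withID id: Article.ID,
                        useLocalData: Bool) -> any Loadable<Article> {
    if useLocalData {
        return StubDatabaseLoadable(id: id)
    }

    return StubNetworkLoadable(id: id)
}

// Both results are statically typed as Article, without any casting.
let localArticle = try await loadableForArticle(withID: 1, useLocalData: true).load()
let remoteArticle = try await loadableForArticle(withID: 1, useLocalData: false).load()

print(localArticle.title)  // Loaded from the database
print(remoteArticle.title) // Loaded over the network
```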

It’s important to point out, though, that using the any keyword in the above way turns our method’s return value into a so-called existential, which does come with a certain performance overhead, and might also prevent us from using certain generic APIs. For example, if we were to use the any keyword within the earlier Combine-based example, then we’d be locked out of directly applying operators (like map or flatMap) to the returned publisher. So, when possible, it’s definitely preferable to use the some keyword instead.
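When an existential is unavoidable, one possible way to get back into generic code is to pass the value into a function that accepts some of the same protocol, since Swift 5.7 can open the existential at that point. Here’s a small sketch of that idea, using the standard library’s Sequence protocol rather than Combine, with hypothetical numbers and sum functions:

```swift
// The branches return different concrete types (a StrideThrough and
// an Array), so this function has to return an existential.
func numbers(ascending: Bool) -> any Sequence<Int> {
    if ascending {
        return stride(from: 1, through: 3, by: 1)
    }

    return [3, 2, 1]
}

// A generic function. Within its body, `values` has a single
// concrete (but unnamed) type, so generic APIs can be used on it.
func sum(of values: some Sequence<Int>) -> Int {
    values.reduce(0, +)
}

// The existential is opened when it's passed as the argument.
let total = sum(of: numbers(ascending: true))
print(total) // 6
```

That doesn’t remove the overhead of the existential itself, so it’s best seen as a workaround rather than a reason to pick any over some.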

I hope that you found this article useful. If you want to learn more about some and any, then check out my earlier article about those keywords, which focuses on how they can be used when declaring properties and parameter types. And if you have any questions, comments, or feedback, then feel free to reach out.

Thanks for reading!