Articles, podcasts and news about Swift development, by John Sundell.

Type-safe identifiers in Swift

Published on 13 May 2018

In most code bases, we need a way to uniquely identify certain values and objects. It can be when keeping track of models in a cache or database, or when performing a network request to fetch more data for a certain entity.

In "Identifying objects in Swift" we took a look at how objects can be identified using the built-in ObjectIdentifier type - and this week, we're going to construct similar identifier types for values, with an increasing degree of type safety.

Please note that, since this article was written, Swift has gained a built-in Identifiable protocol that can be used to mark certain types as being uniquely identifiable. That protocol is also used within SwiftUI to identify models that are bound to certain views, such as ForEach. However, the built-in Identifiable protocol does lack some of the strong type safety that this article's implementation provides.

A dedicated type

As a language, Swift puts a heavy focus on types and type safety. With features like generics, protocols and static typing, we're always encouraged (or forced even) to know the exact types of the objects and values that we're dealing with.

When it comes to identifiers, Foundation provides the dedicated UUID type - which enables us to easily add a unique identifier to any value type, like a User model:

struct User {
    let id: UUID
    let name: String
}

However, UUID is purpose-built to represent a 128-bit value with a given standardized string format (RFC 4122 to be specific), which is not always practical (or even possible) when identifiers need to be used across multiple different systems (such as the app's server, or other platforms, like Android).

Because of that, it's very common to instead just use plain strings as identifiers, like this:

struct User {
    let id: String
    let name: String
}

While strings are nice for things like text, they're not a very robust or safe solution for something like identifiers. Not every string is a valid identifier, and ideally we'd like to leverage the type system to prevent bugs that can occur when accidentally passing an incompatible string as an identifier.

Strings can also do many things that we don't really want our identifiers to be able to do (like having characters added or removed). Ideally, we'd like a more narrow, dedicated type (just like UUID) that we can use to model any identifier.

The good news is that defining such a type is very easy, and since Swift 4.1 gives us Hashable support for free, all we really have to do is declare an Identifier type that is backed by a string of any format - like this:

struct Identifier: Hashable {
    let string: String
}

We can now use our dedicated Identifier type in our model code, to make it crystal clear that a certain property is an identifier, not just any string:

struct User {
    let id: Identifier
    let name: String
}
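Since the compiler synthesizes Hashable (and with it Equatable) for Identifier, values of our new type can be compared and used as dictionary keys right away. Here's a minimal sketch of that - the cache dictionary and the sample ID strings are made up for illustration:

```swift
struct Identifier: Hashable {
    let string: String
}

struct User {
    let id: Identifier
    let name: String
}

// Hypothetical in-memory cache, keyed by our dedicated identifier type.
var cache = [Identifier: User]()

let john = User(id: Identifier(string: "user-1"), name: "John")
cache[john.id] = john

// Two identifiers that wrap the same string are equal and hash the same,
// so a freshly created identifier can be used to find the cached model.
let cachedName = cache[Identifier(string: "user-1")]?.name
let missingName = cache[Identifier(string: "user-2")]?.name
```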

The native feel

Our new Identifier type is pretty nice, but it doesn't feel as "native" to use as when we were using plain strings. For example, initializing an identifier now requires us to pass the underlying string as a parameter, like this:

let id = Identifier(string: "new-user")

Thankfully, this is something we can easily fix. Since the Swift standard library is so protocol oriented, making our own custom type feel right "at home" just requires us to conform to a few simple protocols.

First up, let's make it possible to create an identifier using a string literal. That way, we get the same convenience as when using plain strings, but with the added safety of using a dedicated type. To do that, all we have to do is conform to ExpressibleByStringLiteral and implement an additional initializer:

extension Identifier: ExpressibleByStringLiteral {
    init(stringLiteral value: String) {
        string = value
    }
}

We'd also like to make it simple to print an identifier, or include one in a string using interpolation. To make that happen, we'll add a conformance to CustomStringConvertible as well:

extension Identifier: CustomStringConvertible {
    var description: String {
        return string
    }
}

With the above in place, we can now easily create our dedicated Identifier type from a string literal, and turn it back into a string when printing:

let user = User(id: "new-user", name: "John")
print(user.id) // "new-user"
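One detail worth spelling out is that the literal conformance only kicks in for literals - an existing String value still has to be wrapped explicitly, which is exactly the safety we're after. A small sketch of the difference (the rawID constant and the second user are just example values):

```swift
struct Identifier: Hashable {
    let string: String
}

extension Identifier: ExpressibleByStringLiteral {
    init(stringLiteral value: String) {
        string = value
    }
}

extension Identifier: CustomStringConvertible {
    var description: String {
        return string
    }
}

struct User {
    let id: Identifier
    let name: String
}

// A string literal is turned into an Identifier by the compiler.
let user = User(id: "new-user", name: "John")

// A String *value* is not converted implicitly, so this would not compile:
// let other = User(id: rawID, name: "Jane")

// Instead, an existing string has to be wrapped explicitly.
let rawID: String = "another-user"
let other = User(id: Identifier(string: rawID), name: "Jane")

// String interpolation picks up our CustomStringConvertible conformance.
let message = "Loaded user \(user.id)"
```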

Another feature that would make our identifier type feel more native is to add coding support. With the introduction of Codable, we could simply have the compiler generate coding support for us, but that would require our data (for example JSON) to have the following format:

{
    "id": {
        "string": "49-12-90-21"
    },
    "name": "John"
}

That's not very nice, and it will again make compatibility with other systems and platforms a lot harder. Instead, let's write our own Codable implementation, using a single value container:

extension Identifier: Codable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        string = try container.decode(String.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(string)
    }
}

With the above in place, we can now also encode and decode an Identifier value using a single string, just like how we're able to initialize one using a string literal. Our dedicated type now has a very nice native feel to it, and can naturally be used in many different contexts. Pretty sweet! 🍭
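To see the flat format in action, here's a sketch that decodes a User from JSON and encodes it again. The roundTrip function is a made-up helper for this example, and sorted keys are only enabled to make the encoded output predictable:

```swift
import Foundation

struct Identifier: Hashable {
    let string: String
}

extension Identifier: Codable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        string = try container.decode(String.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(string)
    }
}

// The model itself can rely on compiler-generated Codable support.
struct User: Codable {
    let id: Identifier
    let name: String
}

let json = """
{
    "id": "49-12-90-21",
    "name": "John"
}
"""

// Hypothetical helper: decodes a user, then encodes it back into JSON.
func roundTrip(_ json: String) throws -> (user: User, encoded: String) {
    let user = try JSONDecoder().decode(User.self, from: Data(json.utf8))

    let encoder = JSONEncoder()
    encoder.outputFormatting = .sortedKeys
    let data = try encoder.encode(user)

    return (user, String(decoding: data, as: UTF8.self))
}

do {
    let result = try roundTrip(json)
    print(result.user.id)  // 49-12-90-21
    print(result.encoded)  // {"id":"49-12-90-21","name":"John"}
} catch {
    print("Coding failed: \(error)")
}
```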

Even more type safety

We've now prevented strings from being confused with identifiers, but we can still accidentally use an identifier value with an incompatible model type. For example, the compiler won't give us any kind of warning if we accidentally end up with code like this:

// Ideally, it shouldn't be possible to look up an 'Article' model
// using a 'User' identifier.
articleManager.article(withID: user.id)

So even though having a dedicated Identifier type is a huge step towards better type safety, we can still take things further by associating a given identifier with the type of value that it's representing. To do that, let's add a generic Value type to Identifier:

struct Identifier<Value>: Hashable {
    let string: String
}

We're now forced to specify what kind of identifier we're dealing with, making each identifier strongly associated only with its corresponding value, like this:

struct User {
    let id: Identifier<User>
    let name: String
}

If we now again try to look up an Article using a User identifier, we'll get a compiler error saying that we're trying to pass a value of type Identifier<User> to a function that accepts a parameter of type Identifier<Article>.
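Here's a sketch of how that plays out at the call site. Note that the generic Value type is never stored - it only exists to tell identifiers apart at compile time (a technique often referred to as a phantom type). The Article model and the ArticleManager implementation below are assumptions made for this example:

```swift
struct Identifier<Value>: Hashable {
    let string: String
}

struct User {
    let id: Identifier<User>
    let name: String
}

// Hypothetical model, mirroring 'User'.
struct Article {
    let id: Identifier<Article>
    let title: String
}

// Hypothetical manager type, backed by a simple dictionary.
struct ArticleManager {
    private var articles = [Identifier<Article>: Article]()

    mutating func add(_ article: Article) {
        articles[article.id] = article
    }

    func article(withID id: Identifier<Article>) -> Article? {
        return articles[id]
    }
}

var articleManager = ArticleManager()
let user = User(id: Identifier(string: "user-1"), name: "John")
let article = Article(id: Identifier(string: "article-1"), title: "Type-safe identifiers")
articleManager.add(article)

// Looking up an article using an article identifier works as expected.
let found = articleManager.article(withID: article.id)

// This no longer compiles, since 'Identifier<User>' and
// 'Identifier<Article>' are two different types:
// articleManager.article(withID: user.id)
```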

Generalizing our generic

Can we take it a step further? Totally! 😉

Not all identifiers are backed by strings, and in some situations we might need our identifier type to also support other backing values - like Int, or even UUID.

We could of course make that happen by introducing separate types (like StringIdentifier and IntIdentifier), but that'd require us to duplicate a bunch of code, and wouldn't really feel as "Swifty". Instead, let's introduce a protocol that our identifiable models and values can conform to:

protocol Identifiable {
    associatedtype RawIdentifier: Codable = String

    var id: Identifier<Self> { get }
}

As you can see above, our Identifiable protocol allows our various types the flexibility to declare a custom RawIdentifier type - while still keeping the convenience of having String as the default. We'll then require the generic Value type of Identifier to conform to Identifiable, and we're done:

struct Identifier<Value: Identifiable> {
    let rawValue: Value.RawIdentifier

    init(rawValue: Value.RawIdentifier) {
        self.rawValue = rawValue
    }
}

Since we chose to leave String as the default raw value, all of our existing models will work just as before. All we need to do is add a conformance to Identifiable:

struct User: Identifiable {
    let id: Identifier<User>
    let name: String
}
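The Codable constraint on RawIdentifier also means that the generic version of Identifier can keep the single value coding that we wrote earlier, no matter what type the raw value is. Here's a sketch of what that could look like - the decodeUser function and the sample JSON are made up for this example:

```swift
import Foundation

protocol Identifiable {
    associatedtype RawIdentifier: Codable = String

    var id: Identifier<Self> { get }
}

struct Identifier<Value: Identifiable> {
    let rawValue: Value.RawIdentifier

    init(rawValue: Value.RawIdentifier) {
        self.rawValue = rawValue
    }
}

// Every RawIdentifier is Codable, so this conformance needs no
// additional constraints.
extension Identifier: Codable {
    init(from decoder: Decoder) throws {
        let container = try decoder.singleValueContainer()
        rawValue = try container.decode(Value.RawIdentifier.self)
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.singleValueContainer()
        try container.encode(rawValue)
    }
}

struct User: Identifiable, Codable {
    let id: Identifier<User>
    let name: String
}

let json = """
{
    "id": "user-1",
    "name": "John"
}
"""

// Hypothetical helper that decodes a user from a JSON string.
func decodeUser(from json: String) throws -> User {
    return try JSONDecoder().decode(User.self, from: Data(json.utf8))
}

do {
    let user = try decodeUser(from: json)
    print(user.id.rawValue) // user-1
} catch {
    print("Decoding failed: \(error)")
}
```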

Thanks to Swift 4.1's conditional conformances feature, we can now easily support both Int and String literals as well. All we have to do is add a constraint on the value's RawIdentifier when conforming to each literal expression protocol, like this:

extension Identifier: ExpressibleByIntegerLiteral
          where Value.RawIdentifier == Int {
    typealias IntegerLiteralType = Int

    init(integerLiteral value: Int) {
        rawValue = value
    }
}
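The String counterpart takes a bit more code, since a conditional conformance doesn't imply conformance to the protocols that ExpressibleByStringLiteral inherits from - so those need to be declared as well. Here's one way it could be written, using String as the literal type for all three protocols (a choice made for this sketch):

```swift
protocol Identifiable {
    associatedtype RawIdentifier: Codable = String

    var id: Identifier<Self> { get }
}

struct Identifier<Value: Identifiable> {
    let rawValue: Value.RawIdentifier

    init(rawValue: Value.RawIdentifier) {
        self.rawValue = rawValue
    }
}

// ExpressibleByStringLiteral inherits from the two protocols below,
// so we conform to each of them using the same constraint.
extension Identifier: ExpressibleByUnicodeScalarLiteral
          where Value.RawIdentifier == String {
    typealias UnicodeScalarLiteralType = String

    init(unicodeScalarLiteral value: String) {
        rawValue = value
    }
}

extension Identifier: ExpressibleByExtendedGraphemeClusterLiteral
          where Value.RawIdentifier == String {
    typealias ExtendedGraphemeClusterLiteralType = String

    init(extendedGraphemeClusterLiteral value: String) {
        rawValue = value
    }
}

extension Identifier: ExpressibleByStringLiteral
          where Value.RawIdentifier == String {
    typealias StringLiteralType = String

    init(stringLiteral value: String) {
        rawValue = value
    }
}

struct User: Identifiable {
    let id: Identifier<User>
    let name: String
}

// String literals now work for all String-based identifiers.
let user = User(id: "new-user", name: "John")
```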

With the above in place, we are now free to use Int-based identifiers as well, using the exact same code, in a very type-safe way:

struct Group: Identifiable {
    typealias RawIdentifier = Int

    let id: Identifier<Group>
    let name: String
}
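The same technique can be used to bring back the Hashable support that our first, string-only Identifier had. The sketch below adds Equatable and Hashable whenever the raw value supports them (written out by hand here, to keep the example explicit), and then uses an Int literal to create a Group identifier. The groupNames dictionary is made up for this example:

```swift
protocol Identifiable {
    associatedtype RawIdentifier: Codable = String

    var id: Identifier<Self> { get }
}

struct Identifier<Value: Identifiable> {
    let rawValue: Value.RawIdentifier

    init(rawValue: Value.RawIdentifier) {
        self.rawValue = rawValue
    }
}

extension Identifier: ExpressibleByIntegerLiteral
          where Value.RawIdentifier == Int {
    typealias IntegerLiteralType = Int

    init(integerLiteral value: Int) {
        rawValue = value
    }
}

// Identifiers are equatable when their raw values are...
extension Identifier: Equatable where Value.RawIdentifier: Equatable {
    static func == (lhs: Identifier, rhs: Identifier) -> Bool {
        return lhs.rawValue == rhs.rawValue
    }
}

// ...and hashable when their raw values are.
extension Identifier: Hashable where Value.RawIdentifier: Hashable {
    func hash(into hasher: inout Hasher) {
        hasher.combine(rawValue)
    }
}

struct Group: Identifiable {
    typealias RawIdentifier = Int

    let id: Identifier<Group>
    let name: String
}

// An integer literal is converted into an 'Identifier<Group>'.
let admins = Group(id: 7, name: "Admins")

// Since Int is Hashable, group identifiers can be used as dictionary keys.
var groupNames = [Identifier<Group>: String]()
groupNames[admins.id] = admins.name

let lookupID: Identifier<Group> = 7
let name = groupNames[lookupID]
```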

Type safety everywhere

Having type-safe identifiers is not only useful when it comes to preventing developer mistakes, it also enables us to construct some really nice APIs when it comes to handling our models and values in different contexts. For example, now that we have the Identifiable protocol, we could make our database code type-safe as well:

protocol Database {
    func record<T: Identifiable>(withID id: Identifier<T>) -> T?
}
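To give an idea of what a conforming type could look like, here's a sketch of an in-memory implementation. It makes one assumption that isn't part of the protocol above: RawIdentifier is also required to be Hashable, so that records can be stored in dictionaries. InMemoryDatabase and its insert method are made up for this example:

```swift
protocol Identifiable {
    // Assumption for this sketch: raw identifiers are Hashable too.
    associatedtype RawIdentifier: Codable & Hashable = String

    var id: Identifier<Self> { get }
}

struct Identifier<Value: Identifiable>: Hashable {
    let rawValue: Value.RawIdentifier

    init(rawValue: Value.RawIdentifier) {
        self.rawValue = rawValue
    }
}

protocol Database {
    func record<T: Identifiable>(withID id: Identifier<T>) -> T?
}

// Hypothetical implementation that keeps all records in memory.
final class InMemoryDatabase: Database {
    // Records are grouped by model type, then keyed by raw identifier.
    private var storage = [ObjectIdentifier: [AnyHashable: Any]]()

    func insert<T: Identifiable>(_ record: T) {
        let key = AnyHashable(record.id.rawValue)
        storage[ObjectIdentifier(T.self), default: [:]][key] = record
    }

    func record<T: Identifiable>(withID id: Identifier<T>) -> T? {
        let key = AnyHashable(id.rawValue)
        return storage[ObjectIdentifier(T.self)]?[key] as? T
    }
}

struct User: Identifiable {
    let id: Identifier<User>
    let name: String
}

struct Group: Identifiable {
    typealias RawIdentifier = Int

    let id: Identifier<Group>
    let name: String
}

let database = InMemoryDatabase()
database.insert(User(id: Identifier(rawValue: "user-1"), name: "John"))
database.insert(Group(id: Identifier(rawValue: 1), name: "Admins"))

// The return type follows from the identifier that's passed in,
// so 'user' is a 'User?' and 'group' is a 'Group?'.
let user = database.record(withID: Identifier<User>(rawValue: "user-1"))
let group = database.record(withID: Identifier<Group>(rawValue: 1))
```

Since the model type is part of the storage key, a User and a Group that happen to share the same raw identifier won't collide.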

The same thing goes for our networking code, which can also be made generic with a constraint on Identifiable:

protocol ModelLoader {
    typealias Handler<T> = (Result<T, Error>) -> Void

    func loadModel<T: Identifiable>(withID id: Identifier<T>,
                                    then handler: @escaping Handler<T>)
}

We can now leverage the type system and the compiler in many more ways throughout our code base 🎉.

Conclusion

Using type-safe identifiers can be a great way to make our model code more robust and less prone to errors caused by accidentally using the wrong kind of identifier. It also enables us to leverage the compiler in interesting ways throughout our code base, in order to make handling models and values a lot simpler and safer.

What degree of type safety is appropriate for your code will of course vary a lot depending on your exact setup and requirements. Simply using a dedicated, non-generic Identifier type might be enough, and can both be easy to implement and be a big step towards more type-safe identifier handling. If you want or need even more type safety, using generics and protocols can also be a great option when taking things further.

What do you think? What kind of identifiers do you use, and what kind of type safety do you like your identifiers to have? Let me know - along with any questions, comments or feedback you might have - on Twitter @johnsundell.

Thanks for reading! 🚀