Articles, podcasts and news about Swift development, by John Sundell.

Building an enum-based analytics system in Swift

Published on 03 Dec 2017
Basics article available: Enums

Gathering some form of analytics from your users is super important when continuously building, iterating on and improving a product. Learning how your users use your app in real life situations can sometimes be really surprising and take its development in new directions or act as inspiration for new features.

While there are definitely ways to take it too far and be very creepy with analytics, there are also many ways to implement systems that inform you of how your product is actually used, while still respecting your users' privacy, data usage and overall experience.

However, implementing a solid analytics system that is also easy to use in code can be really difficult. This week, let's take a look at how such a system can be architected and implemented, based on one of my favorite Swift features - enums!

The requirements

Before starting to build any form of system it's always a good idea to write down a list of requirements in terms of how you want it to work and what you need it to do.

For our analytics system, we're going to have 4 goals:

Common approaches

One more thing before we dive into Xcode and start coding - let's have a look at some common approaches for implementing analytics in an app, to see what we can learn from them.

Singletons

By far the most common way to implement an analytics system (that I've seen in the apps I've worked on) is to use a singleton-based approach. As we saw in "Avoiding singletons in Swift", using singletons for analytics can be a totally valid approach (and very convenient), but it can also quickly make our app harder to test & maintain.

Logging analytics events should in most cases be considered part of the controller layer (or an equivalent logic layer if you're not using MVC), and when using a singleton it's very easy for analytics code to start leaking out into your model or view layers.

Strings

Another common way of setting up analytics in an app is to use strings for identifiers. This is super flexible and enables you to change what identifiers you use very quickly. However, it's also a common source of mistakes and bugs when it comes to analytics code. It's far too easy to change a string in one place but forget to update it in another. This again makes these kinds of systems harder to maintain over time, and can lead to lots of noise & invalid events - which makes data analysis a lot harder.
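To make that failure mode concrete, here's a minimal sketch. The logEvent function and the identifiers are hypothetical, made up for illustration and not taken from any real SDK:

```swift
// Hypothetical string-based logger, standing in for any stringly typed analytics API.
var loggedEvents = [String]()

func logEvent(_ identifier: String) {
    loggedEvents.append(identifier)
}

// Two call sites that are meant to report the same event.
logEvent("login_failed")
logEvent("login_faild") // Typo: compiles fine, but is reported as a separate event.

// The backend now sees two distinct identifiers instead of one.
let distinctIdentifiers = Set(loggedEvents)
print(distinctIdentifiers.count) // 2
```

The compiler has no way of knowing that the two strings were supposed to match, so the mistake only shows up later, when someone looks at the data.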

Third party SDKs

Finally, a very common solution is to use a third party service or SDK. This can in many ways be a great option that makes implementing analytics a lot faster and easier. Just be careful with what kind of SDKs you put in your app - and make sure to research what kind of data they gather from your users in order to respect their privacy.

So I'm not recommending against using a third party SDK, quite the opposite. However, when adding such an SDK to your app, I recommend always putting a layer between that code and your own code. Doing that will make testing a lot easier, and make it much more flexible to switch out any such solution in the future.
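As a rough sketch of what such a layer could look like: VendorSDK below is a made-up stand-in for a real vendor library, and the EventTracking protocol and VendorEventTracker names are assumptions chosen for this example only.

```swift
// Hypothetical third party SDK, standing in for a real vendor library.
final class VendorSDK {
    private(set) var trackedNames = [String]()

    func track(_ name: String, properties: [String : String]) {
        trackedNames.append(name)
    }
}

// The layer that the rest of the app talks to (hypothetical name).
protocol EventTracking {
    func track(event name: String, properties: [String : String])
}

// The only type in the app that knows about the SDK.
final class VendorEventTracker: EventTracking {
    private let sdk: VendorSDK

    init(sdk: VendorSDK) {
        self.sdk = sdk
    }

    func track(event name: String, properties: [String : String]) {
        sdk.track(name, properties: properties)
    }
}

let sdk = VendorSDK()
let tracker: EventTracking = VendorEventTracker(sdk: sdk)
tracker.track(event: "loginSucceeded", properties: [:])
```

Since call sites only depend on the protocol, swapping the SDK (or replacing it with a test double) should only require a new conforming type.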

Setting up the architecture

OK, we have our requirements down and we have done our research - let's get to work! ⚒

The way we're going to set up our analytics system is that we're going to start with an AnalyticsManager. This class will act as the top level API for logging events, and an instance of this class will be dependency injected into any view controller that wants to use our system.

But our AnalyticsManager won't actually do any logging. Instead it will use an AnalyticsEngine to send events to a backend. AnalyticsEngine will be a protocol that we can have multiple implementations of (for example one for testing, one for staging and one for production). It will also make it easier to switch out any third party SDK we might be using in the future.

Finally, we'll have an enum called AnalyticsEvent, which will contain all the events that our analytics system supports. We will use this setup instead of plain strings, both in order to have a compile time guarantee that our events are correct, and also to make refactors and other changes a lot easier in the future.

Let's get coding

Let's start from the ground up, and implement AnalyticsEvent first. We'll use an enum without a raw value, and implement a few events that we initially want to support:

enum AnalyticsEvent {
    case loginScreenViewed
    case loginAttempted
    case loginFailed(reason: LoginFailureReason)
    case loginSucceeded
    case messageListViewed
    case messageSelected(index: Int)
    case messageDeleted(index: Int, read: Bool)
}

Two notes about the code above:

Start your engine

Next, let's implement the AnalyticsEngine protocol:

// AnyObject is the current spelling of the class-only constraint (": class" is deprecated)
protocol AnalyticsEngine: AnyObject {
    func sendAnalyticsEvent(named name: String, metadata: [String : String])
}

Quite simple, but you may be a bit surprised when looking at the above code. Didn't I just say that we shouldn't use free form strings as identifiers? What about our newly implemented AnalyticsEvent enum? 🤔

While we want all top level calls to use AnalyticsEvent in a type safe way, we don't want the underlying engine to have to know about that type. This will give us much more flexibility to refactor things in the future, and we can guarantee a uniform serialization process by not leaving it up to each engine.

Engine implementations

The beauty of this setup is that it enables multiple implementations of the AnalyticsEngine protocol. For example, we can get started with a simple CloudKit based one:

import CloudKit

class CloudKitAnalyticsEngine: AnalyticsEngine {
    private let database: CKDatabase

    init(database: CKDatabase = CKContainer.default().publicCloudDatabase) {
        self.database = database
    }

    func sendAnalyticsEvent(named name: String, metadata: [String : String]) {
        let record = CKRecord(recordType: "AnalyticsEvent.\(name)")

        for (key, value) in metadata {
            record[key] = value as NSString
        }

        database.save(record) { _, _ in
            // We treat this as a fire-and-forget type operation
        }
    }
}

Or we could use more advanced solutions like sending data to our own backend database, or using third party SDKs like Mixpanel or Logmatic. We can also easily implement a mocked engine for testing, but more on that next week 😉.
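To show how little an engine needs, here's a sketch of one that just prints events, which could be handy in debug builds. ConsoleAnalyticsEngine and its output property are hypothetical names, and the protocol is redeclared only so that the example compiles on its own:

```swift
// Redeclared here so the sketch is self-contained.
protocol AnalyticsEngine: AnyObject {
    func sendAnalyticsEvent(named name: String, metadata: [String : String])
}

// Hypothetical engine that prints events instead of sending them anywhere.
final class ConsoleAnalyticsEngine: AnalyticsEngine {
    // Keeps the formatted lines around so they can be inspected.
    private(set) var output = [String]()

    func sendAnalyticsEvent(named name: String, metadata: [String : String]) {
        // Sort by key so the output is stable, since dictionary order is not.
        let details = metadata
            .sorted { $0.key < $1.key }
            .map { "\($0.key)=\($0.value)" }
            .joined(separator: ", ")

        let line = details.isEmpty ? name : "\(name) [\(details)]"
        output.append(line)
        print("Analytics:", line)
    }
}

let engine = ConsoleAnalyticsEngine()
engine.sendAnalyticsEvent(named: "messageDeleted",
                          metadata: ["index" : "3", "read" : "true"])
// Prints "Analytics: messageDeleted [index=3, read=true]"
```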

Serialization

Before we go ahead and finalize things by implementing AnalyticsManager, let's take a look at how we can serialize an AnalyticsEvent value to prepare it for consumption by an AnalyticsEngine.

There are two parts: the name of the event and its metadata. For the most part, the name is super easy to generate automatically, since we can use the standard library's String(describing:) API to have Swift generate a string for all cases without associated values. For cases with associated values, that string would also include the associated values themselves, so for those we'll manually return a name.

extension AnalyticsEvent {
    var name: String {
        switch self {
        case .loginScreenViewed, .loginAttempted,
             .loginSucceeded, .messageListViewed:
            return String(describing: self)
        case .loginFailed:
            return "loginFailed"
        case .messageSelected:
            return "messageSelected"
        case .messageDeleted:
            return "messageDeleted"
        }
    }
}
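To see why the two kinds of cases are treated differently, here's a small demonstration using a cut-down, hypothetical DemoEvent enum (not part of the system we're building):

```swift
// A cut-down, hypothetical enum used only for this demonstration.
enum DemoEvent {
    case loginAttempted
    case messageSelected(index: Int)
}

let plainName = String(describing: DemoEvent.loginAttempted)
let describedWithPayload = String(describing: DemoEvent.messageSelected(index: 2))

print(plainName)            // loginAttempted
print(describedWithPayload) // messageSelected(index: 2)

// One possible alternative to a manual switch (an assumption, not what the
// code above does): keep everything before the first opening parenthesis.
let derivedName = String(describedWithPayload.prefix(while: { $0 != "(" }))
print(derivedName)          // messageSelected
```

Trimming the description like this saves some typing, but it ties our event names to how Swift happens to print enum values, so the explicit switch is arguably the more predictable choice.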

For metadata, we're either going to have to manually convert a given enum value to a dictionary, or use an automatic encoder such as Wrap. Here's what a simple manual implementation could look like:

extension AnalyticsEvent {
    var metadata: [String : String] {
        switch self {
        case .loginScreenViewed, .loginAttempted,
             .loginSucceeded, .messageListViewed:
            return [:]
        case .loginFailed(let reason):
            return ["reason" : String(describing: reason)]
        case .messageSelected(let index):
            return ["index" : "\(index)"]
        case .messageDeleted(let index, let read):
            return ["index" : "\(index)", "read": "\(read)"]
        }
    }
}
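LoginFailureReason hasn't been defined anywhere so far, so here's a sketch with a hypothetical definition of it, to show what the metadata ends up looking like. The AnalyticsEvent copy below is cut down to two cases to keep things short:

```swift
// Hypothetical definition; the cases are assumptions made for this example.
enum LoginFailureReason {
    case invalidCredentials
    case networkUnavailable
}

// Cut-down copy of AnalyticsEvent, limited to the cases with associated values.
enum AnalyticsEvent {
    case loginFailed(reason: LoginFailureReason)
    case messageDeleted(index: Int, read: Bool)
}

extension AnalyticsEvent {
    var metadata: [String : String] {
        switch self {
        case .loginFailed(let reason):
            return ["reason" : String(describing: reason)]
        case .messageDeleted(let index, let read):
            return ["index" : "\(index)", "read" : "\(read)"]
        }
    }
}

let failure = AnalyticsEvent.loginFailed(reason: .networkUnavailable).metadata
let deletion = AnalyticsEvent.messageDeleted(index: 4, read: false).metadata

print(failure)  // ["reason": "networkUnavailable"]
```

Since the switch has no default case, adding a new event to the enum will produce a compiler error here until its metadata has been defined as well.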

The API

We're finally ready to put everything together and implement AnalyticsManager. The manager will take an object conforming to AnalyticsEngine in its initializer and provide an API that lets us log a given event, like this:

class AnalyticsManager {
    private let engine: AnalyticsEngine

    init(engine: AnalyticsEngine) {
        self.engine = engine
    }

    func log(_ event: AnalyticsEvent) {
        engine.sendAnalyticsEvent(named: event.name, metadata: event.metadata)
    }
}

Very simple, but that's all we really need! 🎉
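To check that the three parts fit together, here's a self-contained sketch of the whole pipeline. The types are condensed copies of the ones above, and InMemoryAnalyticsEngine is a hypothetical engine that just keeps events in an array so we can inspect them:

```swift
// Condensed copy of the event enum, limited to two cases.
enum AnalyticsEvent {
    case messageListViewed
    case messageSelected(index: Int)

    var name: String {
        switch self {
        case .messageListViewed: return "messageListViewed"
        case .messageSelected: return "messageSelected"
        }
    }

    var metadata: [String : String] {
        switch self {
        case .messageListViewed: return [:]
        case .messageSelected(let index): return ["index" : "\(index)"]
        }
    }
}

protocol AnalyticsEngine: AnyObject {
    func sendAnalyticsEvent(named name: String, metadata: [String : String])
}

final class AnalyticsManager {
    private let engine: AnalyticsEngine

    init(engine: AnalyticsEngine) {
        self.engine = engine
    }

    func log(_ event: AnalyticsEvent) {
        engine.sendAnalyticsEvent(named: event.name, metadata: event.metadata)
    }
}

// Hypothetical engine that stores events in memory instead of sending them.
final class InMemoryAnalyticsEngine: AnalyticsEngine {
    private(set) var sent = [(name: String, metadata: [String : String])]()

    func sendAnalyticsEvent(named name: String, metadata: [String : String]) {
        sent.append((name: name, metadata: metadata))
    }
}

let engine = InMemoryAnalyticsEngine()
let analytics = AnalyticsManager(engine: engine)

analytics.log(.messageListViewed)
analytics.log(.messageSelected(index: 1))

print(engine.sent.map { $0.name }) // ["messageListViewed", "messageSelected"]
```

Note that the call sites only ever deal with AnalyticsEvent values, while the engine only ever sees strings.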

Usage

The true test of any system is how easy its API is to use, and what the call site looks like. Let's give our analytics system a go by implementing it in a MessageListViewController:

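import UIKit

class MessageListViewController: UIViewController {
    private let messages: MessageCollection
    private let analytics: AnalyticsManager

    init(messages: MessageCollection, analytics: AnalyticsManager) {
        self.messages = messages
        self.analytics = analytics
        super.init(nibName: nil, bundle: nil)
    }

    // Required by UIViewController; this view controller is only created in code
    @available(*, unavailable)
    required init?(coder: NSCoder) {
        fatalError("init(coder:) has not been implemented")
    }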

    override func viewDidAppear(_ animated: Bool) {
        super.viewDidAppear(animated)
        analytics.log(.messageListViewed)
    }

    private func deleteMessage(at index: Int) {
        let message = messages.delete(at: index)
        analytics.log(.messageDeleted(index: index, read: message.read))
    }
}

As you can see above, we use classic initializer-based dependency injection to pass our AnalyticsManager into our view controller as part of its setup process. Here you could also use property-based dependency injection (if you are using Storyboards for example), or the factory based approach from "Dependency injection using factories in Swift".

How did we do?

So how did our final implementation score against our 4 goals:

Conclusion

Using three distinct parts, a manager, an engine and an event enum, we are now able to easily write predictable and flexible analytics code that is heavily compile time checked.

This approach is not only nice for analytics; the Manager + Engine combination can also be a great way to abstract things like hardware sensors, location services, etc. What's nice is that it lets you separate your own logic from the code that interacts with your underlying dependencies. That greatly increases the chances of your code standing the test of time and not having to be completely rewritten if your underlying dependencies change.

Next week, we're going to build on top of this solution to see how to add unit tests and UI tests to verify our analytics code. So make sure to either subscribe to this blog or follow @swiftbysundell on Twitter to get notified when the next post comes out (spoiler: it'll be next Sunday 😜).

What do you think? Is this an approach to implementing analytics that you've used before, or is it something you'll try out? Do you have any other tips on how to build an easy to use and predictable analytics system? Let me know, along with any questions, comments or feedback you have - on Twitter @johnsundell.

Thanks for reading! 🚀