Building a Networking Layer with Operations

03 Oct 2018 14 min read

Sometimes the tools that make our lives easier also make our architecture worse. When something is hard to do, we build structure around it, and that structure often leads us to make smarter decisions and be more thoughtful with what does what. When a new tool removes that difficulty, we remove the structure with it - and in doing so, lose the benefits we never even realised the structure gave us.

When NSURLConnection was the only networking option on iOS, its delegate-based approach required so much boilerplate that developers naturally pushed networking code out of their view controllers and into dedicated classes. The boilerplate was tedious, but it had an architectural side effect worth preserving - a distinct networking layer with clear responsibilities. When URLSession was released with its far more developer-friendly, closure-based interface, we jumped at the chance to remove all that boilerplate. But the ease of making network calls meant that we no longer had to keep networking at arm's length. Network requests, JSON parsing, and response handling started appearing directly in view controllers alongside their existing responsibilities 💥 - a quiet erosion of the single responsibility principle that made our codebases harder to understand and more brittle to change.

This post will explore how we can use Operation and OperationQueue to get back to an isolated networking layer.

Looking at What We Need to Build

Whenever I think of layers in software engineering, I always picture that great engineering marvel: the Hoover Dam.

A photo of the Hoover Dam showing different layers

A very visible, physical divider between two different domains - the reservoir before and the river after. For the water to continue to flow through the canyons, it must pass through the dam in a manner dictated by the dam itself - just like how data flows around a system based on an agreed contract.

Our networking layer will have three main responsibilities:

  1. Scheduling network requests
  2. Performing network requests
  3. Parsing the response from network requests

Class diagram showing how DataManager creates a NetworkingOperation and schedules it on an OperationQueue. An actor at the top requests data from DataManager, which then branches into two responsibilities: creating the NetworkingOperation and scheduling that operation on the OperationQueue

  • DataManager - responsible for abstracting away the operation creation and passing that operation to the OperationQueue.
  • NetworkingOperation - a concurrent operation that will configure and then make a network request for a particular endpoint. It will then process the response from that network request.
  • OperationQueue - the standard iOS operation queue that manages the scheduling and execution of operations.

I've used generic naming in the class diagram, but in the example, we will be working in concrete types, so DataManager will become QuestionsDataManager, and NetworkingOperation will become a combination of ConcurrentOperation and QuestionsFetchOperation.

Don't worry if that doesn't all make sense yet; we will look into each component in greater depth below.

Just like with the Hoover Dam, our networking layer will have a well-encapsulated outer edge that will transform data requests into distinct tasks, which will then be scheduled for execution. Each task will consist of the actual network request and its response parsing 🌹.

Before jumping into building each component, let's quickly recap what OperationQueue and Operation are.

OperationQueue is responsible for coordinating the execution of operations. Rather than executing work immediately, it schedules operations based on each operation's readiness, priority, dependencies, and available system resources.

Because OperationQueue maintains visibility over its operations, we can inspect, pause, or cancel them - for example, cancelling in-flight requests when a user logs out.

Under the hood, OperationQueue leverages GCD, allowing it to take advantage of multiple cores without the developer needing to manage threads directly. By default, it will execute as many operations in parallel as the device can reasonably support.
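Those scheduling behaviours can be sketched with the ready-made BlockOperation. This is a minimal, self-contained illustration - the queue name and concurrency limit are placeholders, not values from this post's example:

```swift
import Foundation

// A queue that will run at most two operations at the same time.
let queue = OperationQueue()
queue.name = "com.example.networking" // illustrative name
queue.maxConcurrentOperationCount = 2

// BlockOperation is a ready-made, non-concurrent Operation subclass.
let first = BlockOperation { print("first") }
let second = BlockOperation { print("second") }

// Dependencies feed into readiness: `second` will not become ready
// until `first` has finished.
second.addDependency(first)

queue.addOperations([first, second], waitUntilFinished: true)

// Cancelling everything in flight - e.g. when a user logs out - is a
// single call:
queue.cancelAllOperations()
```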

Operation is an abstract class which needs to be subclassed to undertake a specific task. An Operation typically runs on a separate thread from the one that created it. Each operation is controlled via an internal state machine; the possible states are:

  • Pending indicates that the operation has been added to the queue.
  • Ready indicates that the operation is good to go, and if there is space on the queue, this operation's task can be started.
  • Executing indicates that the operation is actually doing work at the moment.
  • Finished indicates that the operation has completed its task and should be removed from the queue.
  • Cancelled indicates that the operation has been cancelled and should stop its execution.

A typical operation's lifecycle will move through the following states:

Operation state diagram showing an operation's lifecycle going from Pending to Ready to Executing to Finished. It also shows how Pending, Ready and Executing can all move to Cancelled before eventually ending as Finished

It's important to note that cancelling an executing operation will not automatically stop that operation; instead, it is up to the individual operation to clean up after itself and transition into the Finished state.
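A minimal sketch of that cooperative contract, using a non-concurrent operation that polls `isCancelled` - the timings here are purely illustrative:

```swift
import Foundation

// A sketch of cooperative cancellation: cancel() only flips a flag, so
// the operation must notice `isCancelled` itself and stop working.
final class PollingOperation: Operation {
    private(set) var iterations = 0

    override func main() {
        while !isCancelled {
            iterations += 1
            Thread.sleep(forTimeInterval: 0.01)
        }
        // Returning from main() is what moves a non-concurrent operation
        // into the Finished state.
    }
}

let queue = OperationQueue()
let operation = PollingOperation()
queue.addOperation(operation)

Thread.sleep(forTimeInterval: 0.05)
operation.cancel() // without the isCancelled check, this loop would never end
queue.waitUntilAllOperationsAreFinished()
```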

Operations come in two flavours:

  • Non-Concurrent
  • Concurrent

Non-Concurrent operations perform all their work on the same thread, so that when the main method returns, the operation is moved into the Finished state. The queue is then notified of this and removes the operation from its active operation pool, freeing resources for the next operation.
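The non-concurrent flavour is the simpler of the two and can be sketched in a few lines - the operation and its task here are invented for illustration:

```swift
import Foundation

// A non-concurrent operation: all work happens inside main(), and the
// operation is moved to Finished automatically when main() returns.
final class UppercasingOperation: Operation {
    let input: String
    private(set) var output: String?

    init(input: String) {
        self.input = input
    }

    override func main() {
        guard !isCancelled else { return }
        output = input.uppercased()
    }
}

let operation = UppercasingOperation(input: "hello")
let queue = OperationQueue()
queue.addOperations([operation], waitUntilFinished: true)
print(operation.output ?? "") // prints "HELLO"
```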

Concurrent operations can perform some of their work on a different thread, so returning from the main method can no longer be used to move the operation into a Finished state. Instead, when we create a concurrent operation, we assume the responsibility for moving the operation between the Ready, Executing, and Finished states.

Unsurprisingly, given its name, our ConcurrentOperation will be a concurrent operation.

This post will gradually build up to a working example. But if you can't wait, then head on over to the completed example to see how things end up.

Building Our Concurrent Operation 🏗️

A networking operation is a specialised concurrent operation because when a URLSession makes a network request, it does so on a different thread from the thread that resumed that task. While each endpoint is unique, we can share some functionality between these different endpoints by writing a parent operation that all the endpoint-specific operations can inherit from:

// 1
class ConcurrentOperation<Value>: Operation {
    // 2
    private(set) var completionHandler: ((_ result: Result<Value, Error>) -> Void)?

    init(completionHandler: ((_ result: Result<Value, Error>) -> Void)?) {
        self.completionHandler = completionHandler

        super.init()
    }
}

Here's what we did above:

  1. ConcurrentOperation is a generic subclass of Operation - the Value type parameter lets each operation define what type of data it produces on completion.
  2. The completionHandler closure uses Result<Value, Error> to deliver either the operation's value or an error back to the caller. The Value success type is tied to the operation's generic parameter, so a ConcurrentOperation<Data> will produce a Result<Data, Error>, keeping things type-safe from creation through to completion. The handler is private(set) so that callers can read but not replace it after initialisation.

As mentioned, a concurrent operation takes responsibility for ensuring that its internal state is correct. This state is controlled by manipulating the isReady, isExecuting and isFinished properties. However, they are read-only, so they will need to be overridden before we can control them. When mapping state, enums work best:

class ConcurrentOperation<Value>: Operation {
    // 1
    private enum State {
        case ready
        case executing
        case finished
    }

    // 2
    private var state = State.ready

    // 3
    override var isReady: Bool {
        return super.isReady && state == .ready
    }

    // 4
    override var isExecuting: Bool {
        return state == .executing
    }

    // 5
    override var isFinished: Bool {
        return state == .finished
    }
}
  1. A private enum representing the three operation states that we will control to make ConcurrentOperation into a concurrent operation.
  2. The operation starts in the ready state.
  3. Maps isReady to combine our state value with the superclass isReady value - super.isReady is managed internally by Operation and tracks things like whether dependencies have finished. The operation is only considered ready when both are true, which keeps dependency management working for our concurrent operations just as it does for non-concurrent ones.
  4. Maps isExecuting to our internal state.
  5. Maps isFinished to our internal state. When this returns true, the OperationQueue knows to remove the operation from the queue.

OperationQueue uses KVO to know when its operations change state so that it can control the flow of operations. Let's add in KVO support:

class ConcurrentOperation<Value>: Operation {
    // Omitted other functionality

    // 1
    enum State: String {
        case ready = "isReady"
        case executing = "isExecuting"
        case finished = "isFinished"
    }

    var state = State.ready {
        // 2
        willSet {
            willChangeValue(forKey: state.rawValue)
            willChangeValue(forKey: newValue.rawValue)
        }

        // 3
        didSet {
            didChangeValue(forKey: oldValue.rawValue)
            didChangeValue(forKey: state.rawValue)
        }
    }
}
  1. Each enum case's raw value matches the corresponding Operation property name. This lets us use the raw value directly as the KVO key when notifying observers of state changes.
  2. Before the state changes, we notify KVO observers that both the new state's property and the current state's property are about to change. For example, transitioning from ready to executing tells observers that both isExecuting and isReady are about to change.
  3. After the state changes, we notify KVO observers that the transition is complete. OperationQueue relies on these KVO notifications to know when an operation has started, finished, or is ready to execute - without them, the queue won't respond to state changes.

With KVO support implemented, let's add in the lifecycle methods to move through the various states:

// 1
enum ConcurrentOperationError: Error, Equatable {
    case cancelled
}

class ConcurrentOperation<Value>: Operation {
    // Omitted other functionality

    // 2
    override func start() {
        // 3
        guard !isCancelled else {
            finish(result: .failure(ConcurrentOperationError.cancelled))
            return
        }

        // 4
        state = .executing

        // 5
        main()
    }

    // 6
    func finish(result: Result<Value, Error>) {
        guard !isFinished else {
            return
        }

        state = .finished

        // 7
        completionHandler?(result)
    }

    // 8
    override func cancel() {
        super.cancel()

        finish(result: .failure(ConcurrentOperationError.cancelled))
    }
}
  1. A custom error type for cancellation.
  2. We override start rather than main because this is a concurrent operation - we're taking responsibility for managing state transitions ourselves. Note that super.start() is intentionally not called: by overriding start, this operation assumes full control of maintaining its state.
  3. If the operation was cancelled before it started, we deliver a cancellation error to completionHandler and transition to finished.
  4. Transitions the operation into the executing state, triggering KVO notifications that tell the queue the operation is now active.
  5. Calls main where subclasses perform their actual work. Subclasses call finish(result:) when their work is done. main is the entry point for non-concurrent operations. By choosing main to be the entry point for our concurrent operation, the cognitive load on any future developer is reduced, as it allows them to transfer the expectation of how non-concurrent operations work to our concurrent operation implementation.
  6. finish(result:) is the single exit point for delivering a result. The guard !isFinished prevents double-finishing, which could happen if cancel and the operation's work complete at roughly the same time. It's essential that all operations eventually call this method. If you are experiencing odd behaviour where your queue seems to have jammed, and no operations are being processed, one of your operations is probably missing a finish call somewhere.
  7. The completion handler is called with the result.
  8. cancel calls super.cancel() to set the isCancelled flag, then immediately delivers a cancellation error via finish(result:). This ensures the operation always reaches the finished state as Apple's documentation requires - cancelled alone is not a valid end state.

In finish(result:), completionHandler is called on whatever thread the URLSessionDataTask completion closure runs on - most likely a background thread. We could pass in a callback queue that completionHandler will be dispatched onto to better control this behaviour. I haven't done so here to keep this example as simple as possible, but if you want that functionality, read Avoid Queue-Jumping for details on how to define the callback queue.

All that's left to do state-wise is to indicate that this is an asynchronous operation by overriding isAsynchronous:

class ConcurrentOperation<Value>: Operation {
    // Omitted other functionality

    override var isAsynchronous: Bool {
        return true
    }
}

As ConcurrentOperation subclasses are always expected to be executed via an OperationQueue, overriding isAsynchronous is not strictly needed, but we override it here to express intent clearly. isAsynchronous only matters if we call start directly on the operation without a queue. In that case, the caller is supposed to check isAsynchronous to decide whether to spin up a separate thread.

We've taken over the state management for ConcurrentOperation, but as it stands, ConcurrentOperation has multiple race conditions that could cause crashes or undefined behaviour if the mutable state is updated from different threads. Let's protect against that by using an NSRecursiveLock to ensure only one thread can mutate state at any given moment:

class ConcurrentOperation<Value>: Operation {
    // Omitted unchanged functionality

    // 1
    private let lock = NSRecursiveLock()

    // 2
    private var _state = State.ready {
        willSet {
            willChangeValue(forKey: _state.rawValue)
            willChangeValue(forKey: newValue.rawValue)
        }
        didSet {
            didChangeValue(forKey: oldValue.rawValue)
            didChangeValue(forKey: _state.rawValue)
        }
    }

    // 3
    private var state: State {
        get {
            lock.lock()
            defer { lock.unlock() }

            return _state
        }
        set {
            lock.lock()
            defer { lock.unlock() }

            _state = newValue
        }
    }

    // Omitted unchanged functionality

    // 4
    func finish(result: Result<Value, Error>) {
        lock.lock()
        defer { lock.unlock() }

        // Omitted unchanged functionality
    }
}
  1. The recursive lock that will be used to synchronise access to shared mutable state - state and completionHandler. It's recursive because KVO creates re-entrant calls: when the setter updates _state, KVO notifications fire synchronously, which causes OperationQueue to read isFinished/isExecuting/isReady, which calls the getter, which needs to acquire the same lock on the same thread. A non-recursive lock, or dispatch queue, would deadlock here.
  2. The backing storage _state separates the raw value from the thread-safe accessor. The KVO notifications in willSet and didSet remain on the backing property so they fire within the lock - ensuring observers always see a consistent state.
  3. The thread-safe accessor for state. Both the getter and setter acquire the lock before accessing _state. The defer ensures the lock is always released.
  4. finish(result:) acquires the lock to ensure that reading isFinished and updating state and completionHandler happens atomically.
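The re-entrancy described in point 1 is easy to demonstrate in isolation. This sketch shows the same thread acquiring an NSRecursiveLock twice, which is exactly what happens when KVO observers read state from inside the setter - the function names are invented for illustration:

```swift
import Foundation

// Why the lock must be recursive: the same thread can need to acquire it
// twice - once in the setter, then again when KVO observers read state
// through the getter. NSRecursiveLock allows this; NSLock would deadlock.
let lock = NSRecursiveLock()

func outer() -> String {
    lock.lock()
    defer { lock.unlock() }
    return inner() // re-enters the lock on the same thread
}

func inner() -> String {
    lock.lock()
    defer { lock.unlock() }
    return "no deadlock"
}

print(outer())
```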

See Threading Programming Guide for more details on thread safety.

Technically, using a lock will block the calling thread, which could be the main thread. I could have exposed this blocking, but I've decided that ConcurrentOperation should silently absorb that blocking behaviour, to keep its API simple. Due to this silent blocking, undertaking any significant work inside completionHandler should be avoided or moved onto a different thread.
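One way to keep completionHandler light is to immediately hand the expensive work off to another queue. This is a self-contained sketch of that pattern - the queue label and the work itself are illustrative:

```swift
import Foundation

// A sketch of keeping completionHandler light: hand any significant work
// off to another queue instead of doing it on the callback thread.
let processingQueue = DispatchQueue(label: "com.example.processing") // illustrative label
var summary = ""

let completionHandler: (Result<Data, Error>) -> Void = { result in
    processingQueue.async {
        // Expensive work happens here, not on the thread that called us.
        if case .success(let data) = result {
            summary = "received \(data.count) bytes"
        }
    }
}

completionHandler(.success(Data([1, 2, 3])))
processingQueue.sync {} // illustration only: drain the queue before reading
print(summary)
```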

With ConcurrentOperation now thread-safe, it's time to write some networking code.

Making a Network Request

StackOverflow has an excellent, open API that will be used below to build a networking operation. The networking operation will retrieve and parse the latest questions via the questions endpoint.

// 1
enum NetworkingError: Error, Equatable {
    case missingData
    case serialization
    case invalidStatusCode(Int)
}

// 2
class QuestionsFetchOperation: ConcurrentOperation<[Question]> {
    // 3
    private var task: URLSessionDataTask?

    // 4
    override func main() {
        // 5
        let url = URL(string: "https://api.stackexchange.com/2.3/questions?site=stackoverflow")!
        var urlRequest = URLRequest(url: url)
        urlRequest.httpMethod = "GET"

        // 6
        task = URLSession.shared.dataTask(with: urlRequest) { (data, response, error) in
            // 7
            if let error = error {
                self.finish(result: .failure(error))
                return
            }

            guard let httpResponse = response as? HTTPURLResponse else {
                self.finish(result: .failure(NetworkingError.missingData))
                return
            }

            guard (200...299).contains(httpResponse.statusCode) else {
                self.finish(result: .failure(NetworkingError.invalidStatusCode(httpResponse.statusCode)))
                return
            }

            guard let data = data, !data.isEmpty else {
                self.finish(result: .failure(NetworkingError.missingData))
                return
            }

            do {
                // 8
                let questionPage = try JSONDecoder().decode(QuestionPage.self,
                                                            from: data)

                self.finish(result: .success(questionPage.items))
            } catch {
                // 9
                self.finish(result: .failure(NetworkingError.serialization))
            }
        }

        task?.resume()
    }

    // 10
    override func cancel() {
        task?.cancel()
        super.cancel()
    }
}
  1. NetworkingError defines the error cases specific to our networking layer - missing data, serialisation failures, and invalid HTTP status codes.
  2. QuestionsFetchOperation subclasses ConcurrentOperation with a Value of [Question], meaning its completion handler will deliver a Result<[Question], Error>.
  3. The task property holds a reference to the in-flight URL session data task so that it can be cancelled later if needed.
  4. main() is the entry point that ConcurrentOperation calls once the operation begins executing. Note that we don't override the start() method that we created in ConcurrentOperation.
  5. We build the URL request targeting the Stack Exchange API's questions endpoint.
  6. A data task is created and kicked off with resume(). The completion closure captures self to call finish(result:) once the network call returns. The closure captures self strongly - this is safe because URLSessionDataTask releases its completion closure after execution, breaking the retain cycle.
  7. Before we attempt to decode, we walk through a series of guard checks - verifying there's no transport error, that we received an HTTP response, that the status code falls within the 2xx range, and that the response body isn't empty. Each failing guard finishes the operation with the appropriate error.
  8. On the happy path, we decode the response into a QuestionPage and finish with .success, passing along the array of Question items.
  9. If decoding fails, we finish with a .failure wrapping a serialisation error.
  10. cancel() cancels the in-flight data task before calling super.cancel(), which triggers ConcurrentOperation's own cancellation logic - finishing the operation with a .cancelled error.

QuestionPage is a model struct that will be populated with the questions endpoint JSON response. I won't show it in this post, but if you are interested, then head on over to the completed example.
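For a rough idea of the shape involved, here's a plausible minimal QuestionPage/Question pair. The real models live in the completed example, so treat the exact properties here as illustrative rather than definitive:

```swift
import Foundation

// A minimal, illustrative pair of Decodable models for the questions
// endpoint. Unknown JSON keys are simply ignored by JSONDecoder.
struct Question: Decodable {
    let title: String
}

struct QuestionPage: Decodable {
    let items: [Question]
}

let json = #"{"items": [{"title": "How do I exit Vim?"}]}"#
// Force-try for brevity in this sketch only.
let page = try! JSONDecoder().decode(QuestionPage.self, from: Data(json.utf8))
print(page.items.first?.title ?? "") // prints "How do I exit Vim?"
```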

And that's essentially what a networking operation looks like. Of course, each different network operation will be unique, but the general structure will be similar to QuestionsFetchOperation.

You might be thinking: "Why can't the network request be synchronous? We're already in a concurrent context."

The key thing is that OperationQueue has a finite number of threads it can use. It draws from the thread pool of GCD, and that pool has a practical ceiling - around 64 threads, though it varies. Each operation that's executing claims one of those threads.

With an async call, the timeline looks something like: the operation starts on a thread, kicks off the request, and the thread is returned to the pool while the networking subsystem handles the waiting. When the response arrives, a thread is picked up again to process it. The operation is "executing" the whole time, but it's only occupying a thread for the brief moments where it's actually doing CPU work.

With a sync call, the operation starts on a thread, and that thread just sits there, blocked, until the response comes back. If the server is slow or the connection is poor, that could be seconds. Multiply that across several operations running in parallel, and you're burning through the thread pool on operations that aren't actually doing anything. In the worst case, you could saturate the pool, and other work - not just your network requests, but anything else using GCD - starts queuing up waiting for a thread to free up.

So the async approach isn't just about being polite to the system. It's about not turning our operation queue into a thread-hoarding bottleneck when the network gets slow.
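The blocking cost of the sync approach can be sketched with a semaphore - a common (anti-)pattern for forcing async work to behave synchronously. The delay here stands in for a slow network response, and the timings are illustrative:

```swift
import Foundation

// An anti-pattern sketch: a semaphore turns an async wait into a blocked
// thread, which is exactly the cost described above.
func fetchSynchronously() -> String {
    let semaphore = DispatchSemaphore(value: 0)
    var response = ""

    // Stand-in for a network call completing on another thread.
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
        response = "done"
        semaphore.signal()
    }

    // The calling thread sits here, holding its slot in the thread pool,
    // doing no useful work until the "response" arrives.
    semaphore.wait()
    return response
}

print(fetchSynchronously())
```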

Scheduling Network Requests

So far, 2 of the 3 responsibilities of the networking layer shown above have been built:

  • Performing network requests
  • Parsing the response from network requests

Time to look at the final responsibility:

  • Scheduling network requests.

If we refer back to the class structure above, this responsibility is handled by the DataManager:

// 1
class QuestionsDataManager {

    // 2
    private let queue: OperationQueue

    init(queue: OperationQueue) {
        self.queue = queue
    }

    // 3
    func fetchQuestions(completionHandler: @escaping (_ result: Result<[Question], Error>) -> Void) {
        let operation = QuestionsFetchOperation(completionHandler: completionHandler)

        queue.addOperation(operation)
    }
}
  1. QuestionsDataManager acts as the entry point for the rest of the app to request data - it hides the fact that operations are being used under the hood.
  2. The OperationQueue is injected through the initialiser rather than created internally, giving the caller control over concurrency limits and making the class easier to test.
  3. fetchQuestions wraps the creation and scheduling of a QuestionsFetchOperation into a single method call. The caller just provides a completion handler, and the manager takes care of building the operation, passing that handler through, and adding it to the queue.

QuestionsDataManager is our dam. It's the boundary that the rest of the app interacts with - callers ask for questions and receive a result, with no awareness that operations, queues, or state machines exist behind it. That separation allows the networking layer to be configured however we choose without those implementation details leaking out. If we later decide to swap operations for a GCD solution, or add caching, or switch to a different API client, nothing outside the data manager needs to change.
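To show how callers experience that boundary, here's a self-contained stand-in where the network operation is replaced with a block delivering canned data, so the example runs offline - the boundary shape matches the post's QuestionsDataManager, but the internals are deliberately simplified:

```swift
import Foundation

// A stand-in for the post's QuestionsDataManager: the networking
// operation is replaced with canned data so this sketch is runnable,
// but the caller-facing API is the same.
struct Question { let title: String }

final class QuestionsDataManager {
    private let queue: OperationQueue

    init(queue: OperationQueue) {
        self.queue = queue
    }

    func fetchQuestions(completionHandler: @escaping (Result<[Question], Error>) -> Void) {
        queue.addOperation {
            completionHandler(.success([Question(title: "How do I exit Vim?")]))
        }
    }
}

let queue = OperationQueue()
let manager = QuestionsDataManager(queue: queue)
var titles: [String] = []

// The caller asks for questions and receives a result - no operations,
// queues, or state machines in sight.
manager.fetchQuestions { result in
    if case .success(let questions) = result {
        titles = questions.map { $0.title }
    }
}

queue.waitUntilAllOperationsAreFinished()
print(titles)
```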

And with that, all three responsibilities are now in place! 🪇

Looking at What We Have

In some ways, we got lucky that NSURLConnection was hard to use. That difficulty forced us to build structure around our networking code, and that structure led us to make smarter decisions about where functionality lived.

When URLSession removed the difficulty, the structure went with it.

Building a networking layer around Operation isn't about making things harder again. It's about restoring the boundaries that keep responsibilities clear.

To see the complete working example, visit the repository and clone the project.

Note that the code in the repository is unit tested, so it isn't 100% the same as the code snippets shown in this post. The changes mainly involve a greater use of protocols to allow for test-doubles to be injected into the various types. While not hugely different, I didn't want to present the increased complexity necessary for unit testing to get in the way of the core message in this post.

What do you think? Let me know by getting in touch on Mastodon or Bluesky.