Swift multithreading & concurrency

Concurrency is the notion of multiple things happening at the same time, which is generally achieved either via time-slicing, or truly in parallel if multiple CPU cores are available to the host operating system.

What is it?

The clock speed of a CPU long determined how much work it could do in a given amount of time. As CPU designs became more compact, heat and other physical constraints made higher clock rates impractical. As a result, chip manufacturers began adding more processor cores to each chip in order to boost overall performance. By increasing the number of cores, a single chip could execute more CPU instructions each cycle without increasing its clock speed, size, or thermal output.

How can we take advantage of these extra cores?

Multithreading is a feature of the host operating system that allows for the creation and use of any number of threads. Its major goal is to allow two or more parts of a program to execute at the same time in order to take advantage of all available CPU time. A common misconception is that multithreading requires a multi-core processor, but this is not the case; single-core CPUs are fully capable of handling multiple threads.

Concurrency is the ability to handle many tasks at once. For example, in a chatroom you might hold many conversations, interleaving (context-switching) between them, but never genuinely chatting with two people at the same instant. It is the illusion of many things happening at once while, in reality, the tasks are being switched between very quickly; in the parallelism model, by contrast, multiple jobs really are running at the same time. Both execution models use multithreading: numerous threads working together toward a common goal.

In simple terms, multithreading is a generalized technique for introducing a combination of concurrency and parallelism into our program.

Thread

At any given time, a modern multitasking operating system has hundreds of programs (or processes) active. However, because the majority of these are system daemons or background processes with a small memory footprint, what is really needed is a way for individual applications to take advantage of the extra cores. Within a single application (process), many threads (sub-processes) can operate on shared memory. Our goal is to be able to control and manipulate these threads to our benefit.

Introducing concurrency to an app requires the creation of one or more threads. Threads are low-level constructs that need to be managed manually, which adds enormous levels of complexity and risk without any guarantees of improved performance.
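
For context, this is roughly what manual thread management looks like using Foundation's Thread class; a minimal sketch:

import Foundation

// Creating and starting a thread by hand - its lifetime, priority,
// and any synchronization are entirely our responsibility.
let worker = Thread {
    print("Running on a manually created thread")
}
worker.start()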

Swift takes an asynchronous approach to solving the concurrency problem of managing threads.

Asynchronous functions are common in most programming environments, and are often used to initiate tasks that might take a long time, like reading a file from disk or downloading a file from the web. When invoked, an asynchronous function executes some work behind the scenes to start a background task, but returns immediately, regardless of how long the original task might take to complete.

A core technology that Swift provides for starting tasks asynchronously is Grand Central Dispatch or GCD for short, also known as the Dispatch framework.

Dispatch framework - Grand Central Dispatch (GCD)

Dispatch, also known as Grand Central Dispatch (GCD), abstracts away thread management code and moves it down to the system level, exposing a light API to define tasks and execute them on an appropriate dispatch queue. It takes care of all thread management and scheduling, providing a holistic approach to task management and execution, while also providing better efficiency than traditional threads.

The Grand Central Dispatch (GCD), or Dispatch framework, is based on the underlying thread-pool design pattern. This means the system spawns a fixed number of threads - based on factors like the number of CPU cores - that are always available, waiting for tasks to be executed concurrently.

Creating threads at run time is expensive, so GCD organizes tasks into specific queues, and the tasks waiting in these queues are later executed on an appropriate, available thread from the pool. This approach leads to great performance and low execution latency. We can say that the Dispatch framework is a very fast and efficient concurrency framework designed for modern multi-core hardware.

Components of Grand Central Dispatch (GCD), or Dispatch framework:

Grand Central Dispatch - FIFO Queues

  • Main DispatchQueue - runs on the Main Thread - Serial
      1. Serial Queue - Tasks
  • Global DispatchQueue - runs on the Thread Pool - Concurrent
      1. High Priority Queue - Tasks
      2. Default Priority Queue - Tasks
      3. Low Priority Queue - Tasks
      4. Background Priority Queue - Tasks
  • Private (Custom) DispatchQueue - runs on the Thread Pool - Attributed Serial or Concurrent
      1. Private Serial Queue - Tasks
      2. Private Concurrent Queue - Tasks

Dispatch Queues

GCD organizes tasks into queues, much like the queues at the ticket window of a railway station or a theater. On every dispatch queue, tasks are started in the same order as we add them - FIFO: the first task in the line is executed first - but note that the order of completion is not guaranteed. Completion depends on how long each task takes, so if we add two tasks to a concurrent queue, a slow one first and a fast one later, the fast one can finish before the slower one, as the sketch after the list below demonstrates.

There are two types of dispatch queues:

  • Serial Queues: execute one task at a time; these queues can be utilized to synchronize access to a specific resource. They are just like the queue at the ticket window of a railway station or a theater. We should use this type of queue when the order of tasks matters.
  • Concurrent Queues: can execute one or more tasks in parallel at the same time. This is like a 100-meter sprint, where all the runners start the race together but there is no guarantee of who will finish first. We should use this type of queue when the order of tasks does not matter (see the sketch below).
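
A minimal sketch of the difference, using two private queues (the labels are ours for illustration); only the serial queue guarantees the relative order of the prints:

import Foundation

let serialDemo = DispatchQueue(label: "io.github.iharishsuthar.queues.demo-serial")
let concurrentDemo = DispatchQueue(label: "io.github.iharishsuthar.queues.demo-concurrent", attributes: .concurrent)

// Serial: tasks start and finish strictly one after another.
serialDemo.async {
    sleep(1) // slow task
    print("serial: slow task done")
}
serialDemo.async {
    print("serial: fast task done") // always prints second
}

// Concurrent: tasks start in FIFO order but overlap,
// so the fast task can finish before the slow one.
concurrentDemo.async {
    sleep(1) // slow task
    print("concurrent: slow task done")
}
concurrentDemo.async {
    print("concurrent: fast task done") // usually prints first
}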

Main, Global and Private(Custom) Queues

  • Main Queue - a serial queue; every task on the main queue runs on the main thread.
  • Global Queues - system-provided concurrent queues shared across the operating system. There are exactly four of them, organized by high, default, and low priority, plus an I/O-throttled background queue.
  • Private (Custom) Queues - queues created by the user. Custom queues are always mapped onto one of the global queues by specifying a Quality of Service (QoS) property, as the sketch below shows. In most cases, if we want to run tasks in parallel, it is recommended to use one of the global concurrent queues.
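
A minimal sketch of a custom queue created with an explicit QoS (the label is ours for illustration); the system schedules its work at the priority of the matching global queue:

import Dispatch

// A private serial queue whose work is scheduled at utility priority.
let ioQueue = DispatchQueue(label: "io.github.iharishsuthar.queues.io", qos: .utility)

ioQueue.async {
    print("Runs with utility QoS")
}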

Custom Queues by Quality of Service

  • .userInteractive (UI updates) - serial main queue
  • .userInitiated (async UI related tasks) - high priority global queue
  • .default - default priority global queue
  • .utility - low priority global queue
  • .background - global background queue
  • .unspecified (lowest) - low priority global queue

How to create a Queue?

import Dispatch

DispatchQueue.main                          // the serial main queue
DispatchQueue.global(qos: .userInteractive) // global concurrent queues, by QoS
DispatchQueue.global(qos: .userInitiated)
DispatchQueue.global(qos: .default)
DispatchQueue.global(qos: .utility)
DispatchQueue.global(qos: .background)
DispatchQueue.global(qos: .unspecified)
DispatchQueue(label: "io.github.iharishsuthar.queues.serial")                              // private serial queue
DispatchQueue(label: "io.github.iharishsuthar.queues.concurrent", attributes: .concurrent) // private concurrent queue

Using dispatch queues, performing a task on a background queue and then updating the UI on the main queue after the work completes is fairly simple.

DispatchQueue.global(qos: .background).async {
    // do background job here
    DispatchQueue.main.async {
        // update ui here
    }
}

How to make sync and async calls on Queues?

A sync call is essentially an async call with a semaphore that waits for the return value. A sync call blocks the caller, whereas an async call returns immediately.

let glblQueue = DispatchQueue.global()

let greet = glblQueue.sync {
    return "Hello"
}

glblQueue.async {
    print("\(greet) world")
}

How to delay execution using Queue?

DispatchQueue.main.asyncAfter(deadline: .now() + .seconds(3)) {
    // Code here will be executed after 3 seconds have passed
}

How to use DispatchWorkItem?

In iOS 8 Apple introduced DispatchWorkItem, which encapsulates a block of work that can be performed. A work item can be dispatched onto a DispatchQueue and within a DispatchGroup. Using DispatchWorkItem we can cancel a task - either before it runs, or cooperatively via its isCancelled flag - and a work item can also notify a queue when its task is completed.

It is like working in a team: there is a team leader (the DispatchQueue in GCD) and there are team members (DispatchWorkItems in GCD) who get specific tasks to finish, and once a task is finished, that team member (DispatchWorkItem) notifies the team leader (DispatchQueue). There can be instances where the team leader (DispatchQueue) asks a specific team member (DispatchWorkItem) to cancel its current task because it no longer needs to be finished; for such cases there is a cancel() method on DispatchWorkItem. There is also an isCancelled property on DispatchWorkItem with which we can check whether the task has been cancelled.

var workItem: DispatchWorkItem!
workItem = DispatchWorkItem {
    for i in 1..<9 {
        // Cooperative cancellation: check the flag on each iteration.
        guard workItem.isCancelled == false else {
            print("Cancelled")
            break
        }
        sleep(1)
        print(String(i))
    }
}

workItem.notify(queue: .main) {
    print("Done")
}

DispatchQueue.global().asyncAfter(deadline: .now() + .seconds(3)) {
    workItem.cancel()
}

DispatchQueue.main.async(execute: workItem)

In the above Swift code the work item is defined to execute a loop from 1 to 8 with a 1-second delay per iteration, and if the work item is cancelled the loop breaks. (Note that the work item is declared as an implicitly unwrapped var so the closure can refer to itself for the isCancelled check.) After defining the work item we subscribe to the notify method, which gets called once the workItem is finished, either by completing its task or by being cancelled. We also use .asyncAfter(deadline:) to cancel the workItem after 3 seconds. Finally, we execute the work item on the main queue asynchronously; we could instead have used the .perform() method to run the workItem synchronously on the current thread, rather than .async(execute:).

Prior to iOS 8 it was not possible to cancel a task submitted to a DispatchQueue; with the introduction of DispatchWorkItem, Apple provided a cancel method for exactly this. Developers therefore sometimes have the misconception that a task in a DispatchQueue cannot be cancelled at all.
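
For completeness, the .perform() alternative mentioned above looks like this; a minimal sketch:

import Dispatch

let quickItem = DispatchWorkItem {
    print("Performed synchronously on the current thread")
}

// Unlike async(execute:), perform() runs the work item immediately,
// blocking the current thread until it finishes.
quickItem.perform()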

How to use DispatchGroup to perform concurrent tasks?

Let's say we need to perform multiple network calls in order to construct the data required by a view controller. This is where DispatchGroup can help us: all of our long-running background tasks can be executed concurrently, and when everything is ready we receive a notification. We just have to be careful to use thread-safe data structures, for example by always modifying arrays on the same thread.

func fetchInfo(delay: UInt32, completion: () -> Void) {
    sleep(delay)
    completion()
}

let workGroup = DispatchGroup()

workGroup.enter()
fetchInfo(delay: 1) {
    print("1")
    workGroup.leave()
}

workGroup.enter()
fetchInfo(delay: 2) {
    print("2")
    workGroup.leave()
}

workGroup.enter()
fetchInfo(delay: 3) {
    print("3")
    workGroup.leave()
}

workGroup.notify(queue: .main) {
    print("Done")
}

In the above Swift code, the enter function indicates to the DispatchGroup that a task has started, and the leave function indicates that it has completed the given task. Note that we always have to balance the enter and leave calls on the DispatchGroup.

The above example can also be written as below, using a for loop to keep it short and simple:

func fetchInfo(delay: UInt32, completion: () -> Void) {
    sleep(delay)
    completion()
}

let workGroup = DispatchGroup()

for i in 1...3 {
    workGroup.enter()
    fetchInfo(delay: UInt32(i)) {
        print("\(i)")
        workGroup.leave()
    }
}

workGroup.notify(queue: .main) {
    print("Done")
}

When there are closures available for async tasks, why use DispatchGroup?

The answer here is simple: concurrency. Let's try to understand it with a simple example. Suppose there are 3 network calls we need to make, we need to append the result of each network call to an array, and once all 3 network calls have finished we need to print the contents of the array.

Using Closure

func fetchInfo(delay seconds: Int, completion: @escaping (String) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: (.now() + .seconds(seconds))) {
        completion("Needed \(seconds) seconds")
    }
}

let closureStartDt = Date()
var timesheet: [String] = [String]()
fetchInfo(delay: 1) { infoX in
    timesheet.append(infoX)
    fetchInfo(delay: 2) { infoY in
        timesheet.append(infoY)
        fetchInfo(delay: 3) { infoZ in
            timesheet.append(infoZ)
            debugPrint(timesheet)
            debugPrint("With closure it took \(Date().timeIntervalSince(closureStartDt)) seconds to finish")
        }
    }
}

The above Swift code depicts this situation using closures: we get the results asynchronously, and it also prints the time it took to finish the task.

Using DispatchGroup

func fetchInfo(delay seconds: Int, completion: @escaping (String) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: (.now() + .seconds(seconds))) {
        completion("Needed \(seconds) seconds")
    }
}

let workGroupStartDt = Date()
var timesheet: [String] = [String]()

let workGroup = DispatchGroup()

workGroup.enter()
fetchInfo(delay: 1) { infoX in
    timesheet.append(infoX)
    workGroup.leave()
}

workGroup.enter()
fetchInfo(delay: 2) { infoY in
    timesheet.append(infoY)
    workGroup.leave()
}

workGroup.enter()
fetchInfo(delay: 3) { infoZ in
    timesheet.append(infoZ)
    workGroup.leave()
}

workGroup.notify(queue: .main) {
    debugPrint(timesheet)
    debugPrint("With dispatch group it took \(Date().timeIntervalSince(workGroupStartDt)) seconds to finish")
}

The above Swift code depicts the same situation using DispatchGroup: we get the results asynchronously, and it also prints the time it took to finish the task.

Notice that the DispatchGroup version finishes faster because of concurrency: it executes all the tasks at the same time, in parallel, whereas the closure version finishes the tasks in serial fashion, one after another. That is the power of concurrency.

Finer control with OperationQueue

GCD is great when we want to dispatch one-off tasks or closures into a queue in a 'set-it-and-forget-it' fashion, and it provides a very lightweight way of doing so. But what if we want to create a repeatable, structured, long-running task that produces associated state or data? And what if we want to model this chain of operations such that they can be cancelled, suspended, and tracked, while still working with a closure-friendly API? That's where OperationQueue comes into the picture: it is a queue that regulates the execution of operations.

With OperationQueue, we can use Operation objects and queue them onto an OperationQueue, which is a high-level wrapper around DispatchQueue. It offers much more in comparison to the lower-level GCD API, as stated below:

  • The Operation and OperationQueue classes have a number of properties that can be observed using KVO (Key-Value Observing), which is useful if we want to monitor the state of an operation or operation queue.
  • Operations can be paused, resumed, and cancelled. Once we dispatch a task using Grand Central Dispatch, we no longer have control or insight into the execution of that task. The Operation API is more flexible in that respect, giving us control over the operation's life cycle.
  • OperationQueue allows us to specify the maximum number of queued operations that can run simultaneously, giving us a finer degree of control over the concurrency aspects.
  • It also allows us to create dependencies between tasks; we could do this with GCD as well, but here it is available as an object-oriented wrapper.

A basic example using BlockOperation:

func fetchInfo(delay seconds: Int, completion: @escaping (String) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: (.now() + .seconds(seconds))) {
        completion("Needed \(seconds) seconds")
    }
}

let operation = BlockOperation()
operation.qualityOfService = .utility

operation.addExecutionBlock {
    fetchInfo(delay: 1) { info in
        debugPrint(info)
    }
}

operation.addExecutionBlock {
    fetchInfo(delay: 2) { info in
        debugPrint(info)
    }
}

operation.addExecutionBlock {
    fetchInfo(delay: 3) { info in
        debugPrint(info)
    }
}

operation.start()

In the above Swift code, we have created an instance of the BlockOperation class, which is a subclass of Operation. We added three execution blocks to this instance, and to execute the operation we called the .start() method on the BlockOperation instance we created.

If we do not want to use an operation queue, we can execute an operation ourselves by calling its .start() method directly from code. Executing operations manually does put more of a burden on our code, because starting an operation that is not in the ready state triggers an exception. The isReady property reports on the operation's readiness.
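
A minimal sketch of that check, guarding a manual start (the operation here is hypothetical):

import Foundation

let manualOperation = BlockOperation {
    debugPrint("Running manually")
}

// start() raises an exception if the operation is not ready
// (for example, if it still has unfinished dependencies),
// so we check isReady before starting it ourselves.
if manualOperation.isReady {
    manualOperation.start()
}

Adding operations to an OperationQueue avoids this manual bookkeeping: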

func fetchInfo(delay seconds: Int, completion: @escaping (String) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: (.now() + .seconds(seconds))) {
        completion("Needed \(seconds) seconds")
    }
}

let operationX = BlockOperation()
operationX.addExecutionBlock {
    fetchInfo(delay: 1) { info in
        debugPrint(info)
    }
}

let operationY = BlockOperation()
operationY.addExecutionBlock {
    fetchInfo(delay: 2) { info in
        debugPrint(info)
    }
}

let operationZ = BlockOperation()
operationZ.addExecutionBlock {
    fetchInfo(delay: 3) { info in
        debugPrint(info)
    }
}

let operationQueue = OperationQueue()
operationQueue.qualityOfService = .utility
operationQueue.addOperations([operationX, operationY, operationZ], waitUntilFinished: false)

In the above Swift code, we have created instances of the BlockOperation class and added them to an instance of the OperationQueue class using the .addOperations() method; with the OperationQueue instance we now have much more control over the state of our operations. One more thing to notice: instead of assigning the qualityOfService property to each individual operation, we can assign it to the OperationQueue instance, which then manages the qualityOfService of all its operations for us.

Once we add an operation object to a queue, the queue assumes all responsibility for it, reducing our overhead of checking the operation's readiness.

func fetchInfo(delay seconds: Int, completion: @escaping (String) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: (.now() + .seconds(seconds))) {
        completion("Needed \(seconds) seconds")
    }
}

let operationX = BlockOperation()
operationX.addExecutionBlock {
    fetchInfo(delay: 1) { info in
        debugPrint(info)
    }
}

let operationY = BlockOperation()
operationY.addExecutionBlock {
    fetchInfo(delay: 2) { info in
        debugPrint(info)
    }
}

let operationZ = BlockOperation()
operationZ.addExecutionBlock {
    fetchInfo(delay: 3) { info in
        debugPrint(info)
    }
}

operationY.addDependency(operationX)
operationZ.addDependency(operationY)

let operationQueue = OperationQueue()
operationQueue.qualityOfService = .utility
operationQueue.addOperation(operationX)
operationQueue.addOperation(operationY)
operationQueue.addOperation(operationZ)

In the above Swift code, we have defined operation dependencies, and the operation queue will execute the operations in exactly that order: each dependency is resolved first, and only then does the dependent operation execute, all without us maintaining the operation order in our code.
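
The remaining control knobs mentioned earlier - limiting concurrency, suspending, and cancelling - look roughly like this; a minimal sketch:

import Foundation

let controlledQueue = OperationQueue()
controlledQueue.maxConcurrentOperationCount = 2 // at most two operations run at once

controlledQueue.isSuspended = true // queue up operations without starting them

for i in 1...5 {
    controlledQueue.addOperation {
        debugPrint("Operation \(i) executing")
    }
}

controlledQueue.isSuspended = false // start executing

// Cancels everything that has not yet started; already-running
// operations stop only if they check their isCancelled flag.
controlledQueue.cancelAllOperations()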

Always use the highest-level abstraction available, and drop down to lower-level abstractions only when they are needed. In short, use OperationQueue (the higher-level API) whenever it meets our needs, instead of DispatchQueue (the lower-level API), unless the lower level is genuinely required.

Semaphores

The word semaphore derives from Greek roots meaning "sign bearer" - a visual signal. In short, semaphores are signaling mechanisms and are commonly used to control access to a shared resource.

These visual signals are like traffic signals, or the flags used on railways and at airports to communicate with a driver or pilot. Let's take railway signaling as an example. Suppose a train named "Train X" is supposed to arrive at "Platform #1", but it is stopped before the station for a few minutes. That is because another train, "Train Y", is already standing at "Platform #1", so the platform is occupied. When "Train Y" leaves "Platform #1", the railway engineers give a green signal (semaphore) to "Train X" so that it can pull in to "Platform #1". Had the engineers not shown a red signal (semaphore) to "Train X" while "Train Y" was still on "Platform #1", there could have been a very big accident; signals exist to prevent exactly such accidents.

We can map this example onto our Swift code as well: consider "Train X" and "Train Y" as two threads trying to occupy a common resource, "Platform #1". Both threads want to perform some operation on this resource - say, read, write, or delete some data - and if we allow both threads to operate on the common resource at the same time, it causes an accident known in programming as a "race condition", where two different threads access a common resource at the same time.

In the above example, accident prevention is managed by the railway engineers; in programming, the same job is done by semaphores. A semaphore is an object that controls access to a shared resource: if one thread has occupied the resource, the semaphore makes sure that any other thread wanting the same resource waits until the first thread frees it.

struct RailwayStation {
    
    typealias Platform = (number: Int, train: String?)
    
    enum TransitType: String {
        case arrival = "arrival"
        case departure = "departure"
    }

    enum PlatformTransitError: Error {
        case PlatformAlreadyExist
        case PlatformDoesNotExist
        case PlatformAlreadyOccupied
        case NoTrainOnPlatformToDeparture
    }
    
    private var name: String
    private var code: String
    private var platforms: [Platform]
    
    init(name: String, code: String) {
        self.name = name
        self.code = code
        self.platforms = [Platform]()
    }
    
    func info(platform: Int? = nil) throws -> [String: Any] {
        guard let value = platform else {
            return ["name": self.name, "code": self.code, "platforms": self.platforms]
        }
        
        guard let index = self.platforms.firstIndex(where: { ($0.number == value) }) else {
            throw PlatformTransitError.PlatformDoesNotExist
        }
        
        return ["name": self.name, "code": self.code, "platform": self.platforms[index]]
    }
    
    mutating func add(platform: Int) throws {
        guard !self.platforms.contains(where: { ($0.number == platform) }) else {
            throw PlatformTransitError.PlatformAlreadyExist
        }
        
        self.platforms.append((platform, nil))
    }
    
    mutating func remove(platform: Int) throws {
        guard let index = self.platforms.firstIndex(where: { ($0.number == platform) }) else {
            throw PlatformTransitError.PlatformDoesNotExist
        }
        
        self.platforms.remove(at: index)
    }
    
    mutating func transit(train: String, on platform: Int, for transit: TransitType) throws {
        guard let index = self.platforms.firstIndex(where: { ($0.number == platform) }) else {
            throw PlatformTransitError.PlatformDoesNotExist
        }
        
        switch transit {
            case .arrival:
                if self.platforms[index].number == platform && self.platforms[index].train == nil {
                    self.platforms[index].train = train
                } else {
                    throw PlatformTransitError.PlatformAlreadyOccupied
                }
            case .departure:
                if self.platforms[index].number == platform && self.platforms[index].train == train {
                    self.platforms[index].train = nil
                } else {
                    throw PlatformTransitError.NoTrainOnPlatformToDeparture
                }
        }
    }
}

var ahmedabadJn = RailwayStation(name: "Ahmedabad Junction", code: "ADI")
do {
    for pn in 1...12 {
        try ahmedabadJn.add(platform: pn)
    }
    
    debugPrint(try ahmedabadJn.info())
} catch {
    debugPrint("Error: \(error)")
}

let queue = DispatchQueue(label: "inr.transit.queue", qos: .utility, attributes: .concurrent)
let semaphore = DispatchSemaphore(value: 1)

queue.async {
    // defer runs when this closure exits, releasing the semaphore
    // no matter how the block finishes.
    defer {
        semaphore.signal()
    }
    
    // Block until the shared station resource is free.
    semaphore.wait()
    do {
        try ahmedabadJn.transit(train: "KarnavatiExp", on: 1, for: .arrival)
        debugPrint(try ahmedabadJn.info(platform: 1))
        
    } catch {
        debugPrint("Error: \(error)")
    }
}

queue.async {
    defer {
        semaphore.signal()
    }
    
    semaphore.wait()
    do {
        try ahmedabadJn.transit(train: "KarnavatiExp", on: 1, for: .departure)
        debugPrint(try ahmedabadJn.info(platform: 1))
    } catch {
        debugPrint("Error: \(error)")
    }
}

queue.async {
    defer {
        semaphore.signal()
    }
    
    semaphore.wait()
    do {
        try ahmedabadJn.transit(train: "GujaratQueen", on: 1, for: .arrival)
        debugPrint(try ahmedabadJn.info(platform: 1))
    } catch {
        debugPrint("Error: \(error)")
    }
}

Semaphores usually aren't necessary for code like the one in our example, but they become more powerful when we need to enforce synchronous behavior while performing an asynchronous operation. Note that the semaphore only guarantees mutual exclusion here, not ordering: on a concurrent queue the three blocks are not guaranteed to start in submission order. The above code would work just as well with a custom OperationQueue with a maxConcurrentOperationCount, as sketched below.
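
A minimal sketch of that alternative, reusing the ahmedabadJn station from above (with a width of one, operations of equal priority generally run in the order they were added):

let transitOperationQueue = OperationQueue()
transitOperationQueue.maxConcurrentOperationCount = 1 // one transit at a time, like the semaphore

transitOperationQueue.addOperation {
    do {
        try ahmedabadJn.transit(train: "KarnavatiExp", on: 1, for: .arrival)
        debugPrint(try ahmedabadJn.info(platform: 1))
    } catch {
        debugPrint("Error: \(error)")
    }
}

transitOperationQueue.addOperation {
    do {
        try ahmedabadJn.transit(train: "KarnavatiExp", on: 1, for: .departure)
        debugPrint(try ahmedabadJn.info(platform: 1))
    } catch {
        debugPrint("Error: \(error)")
    }
}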

Conclusion

Adding concurrency to our code is beneficial for taking advantage of multi-core performance, but it also comes with a slew of other concerns, including:

  • Deadlock: a situation where a thread locks a critical portion of the code and can halt the application's run loop entirely. In the context of GCD, we should be very careful with queue.sync { } calls, as we can easily get ourselves into situations where two synchronous operations wait for each other - see the sketch after this list.
  • Priority Inversion: a condition where a lower-priority task blocks a higher-priority task from executing, effectively inverting their priorities. Since GCD allows different priority levels on its background queues, this is quite easily a possibility.
  • Producer-Consumer Problem: a race condition where one thread is creating a data resource while another thread is accessing it. This is a synchronization problem, and it can be solved using locks, semaphores, serial queues, or a barrier dispatch if we're using concurrent queues in GCD.
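
As a concrete illustration of the first and last pitfalls, the sketch below shows the classic GCD deadlock (left commented out, since it would hang the program) and a minimal thread-safe counter that uses a barrier on a concurrent queue; the names and labels are ours for illustration:

import Dispatch

let pitfallQueue = DispatchQueue(label: "io.github.iharishsuthar.queues.pitfall")

// Deadlock: synchronously dispatching onto the serial queue we are
// already running on blocks the thread while it waits for itself.
// pitfallQueue.sync {
//     pitfallQueue.sync { } // never runs - the outer block never finishes
// }

// Producer-consumer safety with a barrier: reads run concurrently,
// while a barrier write waits for in-flight reads and runs alone.
final class SafeCounter {
    private let queue = DispatchQueue(label: "io.github.iharishsuthar.queues.safe-counter",
                                      attributes: .concurrent)
    private var count = 0

    var value: Int {
        queue.sync { count } // concurrent read
    }

    func increment() {
        queue.async(flags: .barrier) { // exclusive write
            self.count += 1
        }
    }
}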

Thread safety is of the utmost concern when dealing with concurrency. It's best to profile our app thoroughly to ensure that concurrency is enhancing our app's performance and not degrading it.