Ambiguous Decoding

January 23rd, 2023
#networking

Dealing with a JSON network response in iOS projects used to be a pain - you would have to manually parse the response, extract the required values, ignore those that weren't needed, and build your model instances 🤮. At best, it was tedious work; at worst, it was a source of bugs. It wasn't long before a whole host of third-party solutions were developed to automate away the process of matching a JSON response to a Swift type. As sometimes happens when a third-party solution gains traction in the developer community, this functionality was pulled into Swift itself. In Swift 4, we got native support for encoding and decoding JSON in the form of JSONEncoder and JSONDecoder, which work side-by-side with two protocols, Encodable and Decodable, to make converting between Swift types and JSON easy-peasy.

Encodable and Decodable are often combined as Codable, a type alias for conforming to both protocols.

For encoding, take the type you want to encode, conform it to Encodable and pass it to an instance of JSONEncoder:

struct Example: Encodable {
    let title: String
}

let example = Example(title: "Stay wonderful!")

let encoded = try JSONEncoder().encode(example)

encoded is a Data value holding the JSON representation of example.
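
To see what encoded actually contains, the bytes can be converted into a string. This is a minimal sketch that repeats Example so it runs on its own; jsonString is a name introduced here purely for illustration:

```swift
import Foundation

struct Example: Encodable {
    let title: String
}

let example = Example(title: "Stay wonderful!")

// encode(_:) returns Data containing UTF-8 encoded JSON
let encoded = try JSONEncoder().encode(example)

// Convert the bytes into a String purely for inspection
let jsonString = String(decoding: encoded, as: UTF8.self)
print(jsonString) // {"title":"Stay wonderful!"}
```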

Decoding is equally simple. Take the below JSON response:

{
    "title":"Will do!"
}

The above JSON can be decoded into an Example instance by conforming Example to Decodable and passing it, along with the JSON response in Data form, to an instance of JSONDecoder:

struct Example: Encodable, Decodable {
    let title: String
}

let decoded = try JSONDecoder().decode(Example.self, from: data)

decoded now holds an Example instance with a title value of: Will do!.

We get a lot of functionality for the amount of code written above. We get that functionality because Encodable and Decodable are more than just protocols. Conforming a type to either protocol triggers the compiler to synthesise conformance to those protocols (notice how Example doesn't implement any of the methods defined in either Encodable or Decodable). In order to synthesise conformance, the compiler makes several assumptions about how our Swift type matches its JSON counterpart.
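
To make those assumptions visible, here is roughly what a hand-written equivalent of the synthesised conformance for Example could look like. Treat it as a sketch: the code the compiler really generates is an implementation detail, and the reencoded constant only exists here to show the round trip.

```swift
import Foundation

struct Example: Encodable, Decodable {
    let title: String

    // One key per stored property, named exactly after that property
    enum CodingKeys: String, CodingKey {
        case title
    }

    // Assumes the JSON is an object with a "title" key holding a string
    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        title = try container.decode(String.self, forKey: .title)
    }

    // Writes each stored property back out under its matching key
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(title, forKey: .title)
    }
}

let data = Data(#"{"title":"Will do!"}"#.utf8)
let decoded = try JSONDecoder().decode(Example.self, from: data)
let reencoded = try JSONEncoder().encode(decoded)
```

The assumptions are that the JSON is a keyed object, that each key matches a property name, and that each value has exactly the type the property declares.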

In this article, I want to explore what happens when one of those assumptions proves false: when automatic decoding/encoding isn't possible because the structure of the JSON representation can't be directly converted into a Swift representation, due to differences between how JSON and Swift treat data.

Overcoming differences 🤝

The array type in Swift is homogeneous, i.e. each element is of the same type; the array type in JSON is heterogeneous, i.e. elements can be of different types. This can present a tricky issue for us as consumers of a JSON endpoint that returns different types in the same array.

Let's take the below JSON response as an example:

{
   "media":[
      {
         "media_type":"text",
         "id":12,
         "text":"This is an example of text media"
      },
      {
         "media_type":"image",
         "id":2785,
         "caption":"This is an example of image media",
         "url":"https://example.com/images/2785.jpg"
      }
   ]
}

Here the array, media, is heterogeneous as it contains 2 different JSON objects: text and image. Directly converting the JSON media array into a Swift array isn't possible, as a Swift array has a single element type and no one synthesised Decodable type matches both objects.
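
To illustrate the problem, the sketch below tries to decode the response while wrongly assuming that every element is an image. ImageOnlyContent and missingKey are hypothetical names used only for this demonstration:

```swift
import Foundation

// Mirrors only the image object from the response
struct Image: Decodable {
    let id: Int
    let caption: String
    let url: URL
}

// Hypothetical container that wrongly assumes every element is an image
struct ImageOnlyContent: Decodable {
    let media: [Image]
}

let json = """
{
   "media":[
      { "media_type":"text", "id":12, "text":"This is an example of text media" },
      { "media_type":"image", "id":2785, "caption":"This is an example of image media", "url":"https://example.com/images/2785.jpg" }
   ]
}
"""

var missingKey: String?

do {
    _ = try JSONDecoder().decode(ImageOnlyContent.self, from: Data(json.utf8))
} catch DecodingError.keyNotFound(let key, _) {
    // The text element has no caption, so decoding the whole array fails
    missingKey = key.stringValue
    print("Missing key: \(key.stringValue)")
} catch {
    print("Unexpected error: \(error)")
}
```

A single failing element is enough to fail the whole decode, which is why each element needs to be inspected before choosing a type for it.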

However, it is possible to indirectly hold Swift representations of text and image in an array if those two types are grouped under a common type. In Swift, an enum is the perfect data structure for grouping a suite of distinct but related types.

Using an enum, it is possible to customise the decoding process to extract elements from the above JSON response as distinct objects and still keep them grouped in the same array.

Let's start by looking at how we determine what type each element in the media array is:

//1
struct Content: Decodable {
    let media: [Media]
}

//2
enum Media: Decodable {
    case text
    case image

    //3
    enum CodingKeys: String, CodingKey {
        case mediaType = "media_type"
    }

    // MARK: - Init

    //4
    init(from decoder: Decoder) throws {
        //5
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .mediaType)

        //6
        switch type {
        case "text":
            self = .text
        case "image":
            self = .image
        default:
            fatalError("Unexpected media type encountered")
        }
    }
}

Let's walk through the above code:

  1. Content, which conforms to Decodable, holds the array of all Media instances and is used to mirror the JSON structure.
  2. Media, which conforms to Decodable, is an artificial type that expresses each known media type as a case.
  3. Media has no stored properties to derive a media_type key from, so it defines its own coding keys - CodingKeys. CodingKeys conforms to CodingKey, which the JSONDecoder instance expects its keys to be. The CodingKeys enum only contains one case, as media_type is the only information from the JSON that Media needs in order to determine which case to be.
  4. In order to customise the decoding process, Media needs to implement its own init(from decoder: Decoder) throws method rather than depend on the synthesised version.
  5. A keyed container is created using the keys declared in the CodingKeys enum, and the media_type value is extracted from it as a String instance.
  6. type is switched over to compare against the 2 supported media types. If type is a match for the string representation, self is set to that case; if there is no match, fatalError crashes the app.

The fatalError could be replaced with an unknown/unsupported case if crashing the app here is undesired.
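
As a minimal sketch of that alternative, the version below adds a fallback case. The unsupported case and its associated String are hypothetical names, not part of the original project. Another option would be to throw DecodingError.dataCorruptedError(forKey:in:debugDescription:), although that fails the whole decode rather than just skipping the unknown element.

```swift
import Foundation

enum Media: Decodable, Equatable {
    case text
    case image
    // Hypothetical fallback case carrying the unrecognised media_type value
    case unsupported(String)

    enum CodingKeys: String, CodingKey {
        case mediaType = "media_type"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .mediaType)

        switch type {
        case "text":
            self = .text
        case "image":
            self = .image
        default:
            // Keep decoding the rest of the array instead of crashing
            self = .unsupported(type)
        }
    }
}

let json = #"[{"media_type":"text"},{"media_type":"video"}]"#
let media = try JSONDecoder().decode([Media].self, from: Data(json.utf8))
```

With this in place, a media type added by the server later on shows up as .unsupported("video") and the caller can decide whether to ignore it.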

While Media can determine which type each element in media is, it's not that useful on its own. Let's extend Media to capture the details of each element in media:

struct Text: Decodable {
    let id: Int
    let text: String
}

struct Image: Decodable {
    let id: Int
    let caption: String
    let url: URL
}

Text and Image each conform to Decodable and mirror their respective JSON object. Text and Image will be used as associated values to the cases in Media.

Note that we didn't need to implement init(from decoder: Decoder) throws here as the synthesised implementation is perfect for our needs.

Let's alter Media to make use of Text and Image:

enum Media: Decodable {
    //1
    case text(Text)
    case image(Image)

    //Omitting unchanged code

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .mediaType)

        //2
        switch type {
        case "text":
            let text = try Text(from: decoder)
            self = .text(text)
        case "image":
            let image = try Image(from: decoder)
            self = .image(image)
        default:
            fatalError("Unexpected media type encountered")
        }
    }
}

  1. Each case now has a dedicated type for that media type as an associated value.
  2. For each media type, the same Decoder instance is passed into Text or Image as needed, so that the remaining fields are decoded from the same JSON object.

We can test our decoding implementation with:

let json = """
{
   "media":[
      {
         "media_type":"text",
         "id":12,
         "text":"This is an example of text media"
      },
      {
         "media_type":"image",
         "id":2785,
         "caption":"This is an example of image media",
         "url":"https://example.com/images/2785.jpg"
      }
   ]
}
"""

let content = try JSONDecoder().decode(Content.self, from: json.data(using: .utf8)!)

If everything went well, content should contain the same data as media does in the JSON representation.
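
Once decoded, each element can be handled by switching over its case. The sketch below repeats the types from above so that it runs on its own; describe(_:) and descriptions are hypothetical names introduced for illustration:

```swift
import Foundation

struct Text: Decodable {
    let id: Int
    let text: String
}

struct Image: Decodable {
    let id: Int
    let caption: String
    let url: URL
}

enum Media: Decodable {
    case text(Text)
    case image(Image)

    enum CodingKeys: String, CodingKey {
        case mediaType = "media_type"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .mediaType)

        switch type {
        case "text":
            self = .text(try Text(from: decoder))
        case "image":
            self = .image(try Image(from: decoder))
        default:
            // Same behaviour as the article's version: crash on unknown types
            fatalError("Unexpected media type encountered")
        }
    }
}

struct Content: Decodable {
    let media: [Media]
}

let json = """
{
   "media":[
      { "media_type":"text", "id":12, "text":"This is an example of text media" },
      { "media_type":"image", "id":2785, "caption":"This is an example of image media", "url":"https://example.com/images/2785.jpg" }
   ]
}
"""

let content = try JSONDecoder().decode(Content.self, from: Data(json.utf8))

// Hypothetical helper that unwraps the associated value of each case
func describe(_ media: Media) -> String {
    switch media {
    case .text(let text):
        return "text \(text.id): \(text.text)"
    case .image(let image):
        return "image \(image.id): \(image.url.absoluteString)"
    }
}

let descriptions = content.media.map(describe)
descriptions.forEach { print($0) }
```

Because the switch is exhaustive, adding a new case to Media later makes the compiler point out every place that needs to handle it.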

You can see this in action by running the ContentTests in the linked project.

Now that the decoding side has been explored let's look at how to encode Content:

//1
struct Content: Decodable, Encodable {
    let media: [Media]
}

//2
enum Media: Decodable, Encodable {
    //Omitted unchanged properties and methods

    // MARK: - Encode

    //3
    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)

        let object: Codable
        let type: String

        switch self {
        case .text(let text):
            type = "text"
            object = text
        case .image(let image):
            type = "image"
            object = image
        }

        try container.encode(type, forKey: .mediaType)
        try object.encode(to: encoder)
    }
}

//4
struct Text: Decodable, Encodable {
    //Omitted unchanged properties
}

//5
struct Image: Decodable, Encodable {
    //Omitted unchanged properties
}

I could have used Codable, which combines Decodable and Encodable, but I kept them separate in this article for clarity.

  1. To support encoding, Content now needs to conform to Encodable.
  2. To support encoding, Media now needs to conform to Encodable.
  3. As the Swift implementation of a JSON media object is split across two types (Media and its associated value), a custom func encode(to encoder: Encoder) throws needs to be implemented to combine those two types back into one JSON object. Here a container is made from CodingKeys so that the media type can be encoded first. The encoder is then passed to the enum case's associated value, which adds its stored properties to the data that's already been encoded.
  4. To support encoding, Text now needs to conform to Encodable.
  5. To support encoding, Image now needs to conform to Encodable.
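
One way to check that encoding and decoding agree with each other is a round trip: encode a Content value, decode the result, and compare. This is a sketch under a few assumptions: the Equatable conformances are added here only to allow the comparison, and original and roundTripped are hypothetical names.

```swift
import Foundation

struct Text: Decodable, Encodable, Equatable {
    let id: Int
    let text: String
}

struct Image: Decodable, Encodable, Equatable {
    let id: Int
    let caption: String
    let url: URL
}

enum Media: Decodable, Encodable, Equatable {
    case text(Text)
    case image(Image)

    enum CodingKeys: String, CodingKey {
        case mediaType = "media_type"
    }

    init(from decoder: Decoder) throws {
        let container = try decoder.container(keyedBy: CodingKeys.self)
        let type = try container.decode(String.self, forKey: .mediaType)

        switch type {
        case "text":
            self = .text(try Text(from: decoder))
        case "image":
            self = .image(try Image(from: decoder))
        default:
            fatalError("Unexpected media type encountered")
        }
    }

    func encode(to encoder: Encoder) throws {
        var container = encoder.container(keyedBy: CodingKeys.self)

        switch self {
        case .text(let text):
            // Write media_type first, then let Text add its own fields
            try container.encode("text", forKey: .mediaType)
            try text.encode(to: encoder)
        case .image(let image):
            try container.encode("image", forKey: .mediaType)
            try image.encode(to: encoder)
        }
    }
}

struct Content: Decodable, Encodable, Equatable {
    let media: [Media]
}

let original = Content(media: [
    .text(Text(id: 12, text: "This is an example of text media")),
    .image(Image(id: 2785,
                 caption: "This is an example of image media",
                 url: URL(string: "https://example.com/images/2785.jpg")!))
])

// Encode to JSON, then decode that JSON back into Swift types
let data = try JSONEncoder().encode(original)
let roundTripped = try JSONDecoder().decode(Content.self, from: data)

print(roundTripped == original)
```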

Using an enum to bridge the gap between JSON and Swift isn't limited to elements in an array. Another common use case is where the structure of the JSON object is the same, but the type of a field changes, e.g. sometimes an Int, sometimes a String, etc. - in this case, we use an enum to represent the type in pretty much the same way as shown above.
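
A minimal sketch of that second use case might look like the following. Identifier and Item are hypothetical types, and the JSON is made up for the example. Each type is tried in turn against a single value container:

```swift
import Foundation

// Hypothetical type for a field that is sometimes a number, sometimes a string
enum Identifier: Decodable, Equatable {
    case int(Int)
    case string(String)

    init(from decoder: Decoder) throws {
        // The field is a single JSON value rather than a keyed object
        let container = try decoder.singleValueContainer()

        if let value = try? container.decode(Int.self) {
            self = .int(value)
        } else if let value = try? container.decode(String.self) {
            self = .string(value)
        } else {
            throw DecodingError.dataCorruptedError(
                in: container,
                debugDescription: "Expected an Int or a String"
            )
        }
    }
}

// Item can rely on synthesised decoding as Identifier handles the ambiguity
struct Item: Decodable, Equatable {
    let id: Identifier
}

let json = #"[{"id":7},{"id":"seven"}]"#
let items = try JSONDecoder().decode([Item].self, from: Data(json.utf8))
```

Here the discriminator is the value's own type rather than a separate media_type field, so there is nothing to switch over. The decoder simply tries each supported type in turn.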

And that's it! 🥳

Looking back

As we have seen, Decodable and Encodable are easy-to-use, powerful tools in the Swift toolkit. On the surface, they are just protocols, but combined with the functionality the compiler synthesises, they let us convert between JSON and Swift types in most cases with little or no customisation. Even our tricky JSON example, which proved too complex for the synthesised functionality to convert automatically, didn't need much code.

To see the above code snippets in a working example alongside unit tests, head over to the repo and clone the project.

What do you think? Let me know by getting in touch on Twitter - @wibosco