The privileges & challenges of being a primitive type for Codable in Swift

This is an elaboration of some posts made in the discussion of SE-425: 128-bit Integer Types. I think it warrants sharing a little more broadly – not because most people ever need to care about this, but rather because it’s interesting.

You might have noticed that the primitive types e.g.:

Bool
Int & UInt (and all fixed-sized variants, e.g. Int8)
String
Float
Double
nil (strictly the absence of a value rather than a type, but encoders treat it as a primitive)

…all conform to Codable (which is just a convenience type alias combining Encodable & Decodable). That means they have an encoding method and a decoding initialiser:

func encode(to encoder: any Encoder) throws
init(from decoder: any Decoder) throws

This is good – it ensures you can reason about Encodable and Decodable types and not have any weird exceptions for these primitives, e.g. an Array<MyCodableStruct> is just as Codable as Array<String>. Array doesn’t have to special-case the primitive types, it just calls encode on its Elements regardless.
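As a small illustration of that uniformity, here is a sketch using Foundation’s JSONEncoder. Point and jsonString(_:) are hypothetical names invented for this example, not anything from the standard library.

```swift
import Foundation

// Hypothetical type, used only for illustration.
struct Point: Codable {
    var x: Int
    var y: Int
}

// Hypothetical helper: encodes any Encodable value to a JSON string,
// with sorted keys so the output is deterministic.
func jsonString<T: Encodable>(_ value: T) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(value)
    return String(decoding: data, as: UTF8.self)
}
```

Calling jsonString(["a", "b"]) and jsonString([Point(x: 1, y: 2)]) goes through exactly the same generic entry point; nothing at the call site distinguishes the array of primitives from the array of structs.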

However, if you stop and think about it, it might seem like there’s an infinite loop here. For example, if you call encode(to:) on UInt64, it has only the Encoder APIs to work with – it has to use those APIs, because it doesn’t actually know how to serialise itself, because it has no idea what the serialisation format is – that’s defined by the particular Encoder / Decoder in use.

So, basically UInt64 has to call an encode method somewhere, like this one, which would surely then just call encode(to: self) on the original UInt64, right? Infinite recursion!

Of course, no, thankfully.

The first part of that thinking is correct – see for example UInt64’s actual implementation of Codable:

extension UInt64: Codable {
  public init(from decoder: any Decoder) throws {
    self = try decoder.singleValueContainer().decode(UInt64.self)
  }

  public func encode(to encoder: any Encoder) throws {
    var container = encoder.singleValueContainer()
    try container.encode(self)
  }
}

The trick to avoiding self-recursion is that encoders & decoders must special-case the primitive types. They must never call encode(to:) / init(from:) on a primitive value; they have to serialise it themselves.

😔 Alas the compiler does not enforce this – if this rule is broken, the consequence is indeed infinite recursion at runtime.
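To make the rule concrete, here is a minimal sketch of a toy encoder that only supports single values. TextEncoder, TextContainer and encodeToText are hypothetical names made up for this example. It assumes Swift 5.7 or later for the `is any BinaryInteger` casts, and it leans on one generic encode method to satisfy all the per-primitive requirements.

```swift
// Hypothetical single-value-only encoder; every name here is illustrative.
final class TextEncoder: Encoder {
    var codingPath: [any CodingKey] = []
    var userInfo: [CodingUserInfoKey: Any] = [:]
    var output = ""

    func container<Key: CodingKey>(keyedBy type: Key.Type) -> KeyedEncodingContainer<Key> {
        fatalError("keyed containers are outside the scope of this sketch")
    }

    func unkeyedContainer() -> any UnkeyedEncodingContainer {
        fatalError("unkeyed containers are outside the scope of this sketch")
    }

    func singleValueContainer() -> any SingleValueEncodingContainer {
        TextContainer(encoder: self)
    }
}

struct TextContainer: SingleValueEncodingContainer {
    let encoder: TextEncoder
    var codingPath: [any CodingKey] { encoder.codingPath }

    mutating func encodeNil() throws {
        encoder.output = "null"
    }

    // This single generic method also serves as the implementation of
    // encode(_: Bool), encode(_: UInt64) and friends, so it has to
    // special-case the primitives itself.
    mutating func encode<T: Encodable>(_ value: T) throws {
        switch value {
        case is Bool, is String, is any BinaryInteger, is any BinaryFloatingPoint:
            // Primitive: serialise it right here. Calling value.encode(to:)
            // instead would come straight back to this method, forever.
            encoder.output = String(describing: value)
        default:
            // Anything else describes itself in terms of primitives.
            try value.encode(to: encoder)
        }
    }
}

// Hypothetical convenience wrapper around the toy encoder.
func encodeToText<T: Encodable>(_ value: T) throws -> String {
    let encoder = TextEncoder()
    try value.encode(to: encoder)
    return encoder.output
}
```

Deleting the first case of that switch would still compile, which is exactly the hazard: encoding any primitive would then recurse until the stack runs out.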

The way they do this is two-fold:

There are specific overloads of encode (and decode) for the primitive types.

An encoder / decoder doesn’t technically have to provide those specialised overloads, because a single generic implementation satisfies the protocol requirements. The overloads are ultimately only an optimisation: either way, the encoder must serialise the primitive values itself.
The implementation of the generic methods must special-case the primitive types, e.g. the pertinent bit of JSONEncoder’s implementation:

func wrapGeneric<T: Encodable>(_ value: T, for node: _CodingPathNode, _ additionalKey: (some CodingKey)? = _CodingKey?.none) throws -> JSONReference? {
    switch T.self {
    case is Date.Type:
        // Respect Date encoding strategy
        return try self.wrap(value as! Date, for: node, additionalKey)
    case is Data.Type:
        // Respect Data encoding strategy
        return try self.wrap(value as! Data, for: node, additionalKey)
#if FOUNDATION_FRAMEWORK // TODO: Reenable once URL and Decimal are moved
    case is URL.Type:
        // Encode URLs as single strings.
        let url = value as! URL
        return self.wrap(url.absoluteString)
    case is Decimal.Type:
        let decimal = value as! Decimal
        return .number(decimal.description)
#endif // FOUNDATION_FRAMEWORK
    case is _JSONStringDictionaryEncodableMarker.Type:
        return try self.wrap(value as! [String : Encodable], for: node, additionalKey)
    case is _JSONDirectArrayEncodable.Type:
        let array = value as! _JSONDirectArrayEncodable
        if options.outputFormatting.contains(.prettyPrinted) {
            return .init(.directArray(array.individualElementRepresentation(options: options)))
        } else {
            return .init(.nonPrettyDirectArray(array.nonPrettyJSONRepresentation(options: options)))
        }
    default:
        break
    }

    return try _wrapGeneric({
        try value.encode(to: $0)
    }, for: node, additionalKey)
}

This specialisation inside the generic method is required even if there are specialised overloads, because the concrete type of T is not always known at compile time; sometimes the caller only has a generic parameter or an existential. Overloads are chosen statically, so the specialisations can’t always be called directly. In the case of JSONEncoder (above) it manually checks the dynamic type, casts the value, and invokes the appropriate specialisation.
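Here is a stripped-down sketch of why that manual unboxing is needed. Sink and forward(_:to:) are hypothetical stand-ins, not part of any real encoder. The point is only that Swift picks an overload from the static type it can see at the call site.

```swift
// Hypothetical stand-in for an encoder with one specialised overload.
struct Sink {
    // The specialised overload, chosen only when the argument is
    // statically known to be a UInt64.
    func put(_ value: UInt64) -> String {
        "specialised"
    }

    // The generic fallback, chosen whenever the caller only knows
    // "some Encodable type T".
    func put<T: Encodable>(_ value: T) -> String {
        // Check the dynamic type by hand and forward to the overload.
        if let unboxed = value as? UInt64 {
            return "generic, then " + put(unboxed)
        }
        return "generic"
    }
}

// A generic caller: inside here the concrete type of T is not visible,
// so sink.put(value) always resolves to the generic method.
func forward<T: Encodable>(_ value: T, to sink: Sink) -> String {
    sink.put(value)
}
```

Calling Sink().put(UInt64(1)) directly reaches the specialised overload, whereas forward(UInt64(1), to: Sink()) only gets there because the generic method casts and re-dispatches.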

Tangentially, notice how JSONEncoder has special-case handling of Date, Data, and string-keyed Dictionary, even though those types are not considered ‘primitives’ and do have their own, fully-functional Codable implementations in terms of the primitive types (e.g. Data’s). This is because JSONEncoder believes it can do a better job for those types, given its specific knowledge of the JSON format. Encoders & decoders are always allowed to specialise additional types beyond the primitive types.

For all types other than those primitive types, their Codable representation must ultimately be defined in terms of only those primitive types (with allowances for keyed containers (a la Dictionary) and unkeyed containers (a la Array) to provide structure).
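For instance, a custom type might choose to represent itself as a single primitive. The following is only a sketch; Celsius and roundTrip(_:) are hypothetical names made up for this example.

```swift
import Foundation

// Hypothetical wrapper type that encodes itself as one bare Double.
struct Celsius: Codable, Equatable {
    var degrees: Double

    init(degrees: Double) {
        self.degrees = degrees
    }

    init(from decoder: any Decoder) throws {
        // Decode the single primitive this type is defined in terms of.
        degrees = try decoder.singleValueContainer().decode(Double.self)
    }

    func encode(to encoder: any Encoder) throws {
        // Hand the primitive to the encoder; the encoder does the
        // actual serialisation.
        var container = encoder.singleValueContainer()
        try container.encode(degrees)
    }
}

// Hypothetical helper: encodes to JSON and decodes back again.
func roundTrip(_ readings: [Celsius]) throws -> (json: String, decoded: [Celsius]) {
    let data = try JSONEncoder().encode(readings)
    let decoded = try JSONDecoder().decode([Celsius].self, from: data)
    return (String(decoding: data, as: UTF8.self), decoded)
}
```

With JSONEncoder, an array of these comes out as a plain array of numbers, because Celsius never tells the encoder anything beyond “here is a Double”.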

Consider for example Optional’s Codable conformance:

public func encode(to encoder: any Encoder) throws {
  var container = encoder.singleValueContainer()
  switch self {
  case .none: try container.encodeNil()
  case .some(let wrapped): try container.encode(wrapped)
  }
}

public init(from decoder: any Decoder) throws {
  let container = try decoder.singleValueContainer()
  if container.decodeNil() {
    self = .none
  } else {
    let element = try container.decode(Wrapped.self)
    self = .some(element)
  }
}

It doesn’t have the luxury of a special-case guaranteed by all encoders & decoders, so it has to figure out how to represent itself using only the primitive types.

It demonstrates both ways of doing this: direct use of primitive types, and indirect (a.k.a. punting the problem downwards).

If the Optional is empty, it encodes itself as simply a nil value (one of the supported primitive types).

If it’s not empty, it simply defers to the wrapped value to do the work. That value must then do the same thing – figure out a way to represent itself using only the primitive types and/or punt the challenge to its component types.

So if you have e.g. Optional<Int64>, that essentially looks like:

Optional.encode(to: someJSONEncoder) which just calls…
Int64.encode(to: someJSONEncoder) which just calls…
someJSONEncoder.singleValueContainer().encode(self) (where self is the Int64 value), which has some further redirection through abstractions but ultimately does the actual serialisation.
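You can watch both branches of that chain with a real encoder. This is a minimal sketch; encodeOptionals and decodeOptionals are hypothetical helper names, and the values are wrapped in an array simply to keep the top-level JSON value a container.

```swift
import Foundation

// Hypothetical helper: encodes an array of optional integers to JSON text.
func encodeOptionals(_ values: [Int64?]) throws -> String {
    let data = try JSONEncoder().encode(values)
    return String(decoding: data, as: UTF8.self)
}

// Hypothetical helper: decodes the same shape back again.
func decodeOptionals(_ json: String) throws -> [Int64?] {
    try JSONDecoder().decode([Int64?].self, from: Data(json.utf8))
}
```

Non-empty elements end up as plain numbers (Optional punts to Int64, which punts to the encoder), while empty elements end up as null (Optional calls encodeNil() itself).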

Can new primitive types be added?

That is in fact what prompted this post and the discussion that precipitated it. Int128 & UInt128 are finally coming to Swift (properly – the standard library has had them internally for a while). But that raises the question of how they will be supported for Codable. Technically, the three options are:

Be added to the Codable system as primitive types (i.e. additional overloads for them on SingleValueEncodingContainer & friends).
Not be added as Codable-recognised primitive types, and instead implement their Codable conformance in terms of only the existing primitive types, e.g. as strings instead, or pairs of 64-bit integers, etc.
Not support Codable at all.

Option #3 is obviously highly undesirable. Note that if the standard library doesn’t provide Codable conformance for these types, no-one else can do so reliably. A conformance declared outside the modules that define either (a) the type in question or (b) the protocol in question is a ‘retroactive’ conformance, which Swift discourages (and newer compilers warn about) because it conflicts with any conformance that the owning module, or another module, adds later. Since the standard library defines both in this case, it is the only appropriate place for the conformance.

Option #2 is the only other option most types have; most don’t have the luxury of getting special treatment from encoders & decoders like the standard library’s primitive types do. But it’s a frustrating option for Int128 & UInt128 because it adds runtime and possibly wire-format overhead to their use as Codable, and makes their serialisation & deserialisation more complicated.
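To give a feel for what Option #2 could look like, here is a sketch that encodes a 128-bit value as a pair of 64-bit integers. Wide128 is a hypothetical stand-in so the example does not depend on the real Int128 / UInt128 being available; the high-then-low ordering is an arbitrary assumption, and this is not how the standard library implements anything.

```swift
import Foundation

// Hypothetical 128-bit value stored as two 64-bit halves.
struct Wide128: Codable, Equatable {
    var high: UInt64
    var low: UInt64

    init(high: UInt64, low: UInt64) {
        self.high = high
        self.low = low
    }

    init(from decoder: any Decoder) throws {
        // Read the two halves back in the same order they were written.
        var container = try decoder.unkeyedContainer()
        high = try container.decode(UInt64.self)
        low = try container.decode(UInt64.self)
    }

    func encode(to encoder: any Encoder) throws {
        // Only existing primitives (UInt64) are used, so this works with
        // any encoder, at the cost of a bulkier representation.
        var container = encoder.unkeyedContainer()
        try container.encode(high)
        try container.encode(low)
    }
}

// Hypothetical helper: the JSON text produced for a value.
func wideJSON(_ value: Wide128) throws -> String {
    String(decoding: try JSONEncoder().encode(value), as: UTF8.self)
}
```

The overhead is visible straight away: one number turns into a two-element array, and every reader of the format has to know which half comes first.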

Interestingly, it does not preclude them from being supported more efficiently & elegantly by encoders & decoders that do intrinsically support them because, as we saw with JSONEncoder, the encoder & decoder are always free to override the standard representation for any type.

Option #1 seems simple and obviously appropriate for these types that are, after all, just new siblings in the FixedWidthInteger standard library family. However, adding new requirements (properties, methods, etc) to a protocol is a breaking change; it is both source-incompatible and binary-incompatible, unless the new requirements come with default implementations. The problem is, what would the default implementation be?

There are technically three sub-options:

Have the default implementation be effectively the same as Option #2 above; implemented in terms of the existing primitive types.
Throw an exception (fortunately all the relevant encoding & decoding methods are already marked throws with no restriction on the thrown type, because they predate typed throws).
Crash (or hang, or similar).

At time of writing the debate is ongoing as to whether sub-option #1 or #2 is the best option for Int128 & UInt128. Personally I think throwing an exception is the best option (for reasons detailed in the forum thread), but only time will tell what Swift ultimately chooses.
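Here is a sketch of the throwing sub-option, using a toy protocol rather than the real container protocols. MiniContainer, Wide, LegacyContainer and UpdatedContainer are all hypothetical, and the error message is invented; only EncodingError itself is a real standard library type.

```swift
// Hypothetical 128-bit value; not a real standard library type.
struct Wide {
    var high: UInt64
    var low: UInt64
}

// Hypothetical, heavily simplified container protocol.
protocol MiniContainer {
    mutating func encode(_ value: UInt64) throws
    // The newly added requirement. Without the default implementation
    // below, every existing conformer would stop compiling.
    mutating func encode(_ value: Wide) throws
}

extension MiniContainer {
    // Default for conformers that predate the new requirement: fail
    // with an error rather than guess at a representation.
    mutating func encode(_ value: Wide) throws {
        throw EncodingError.invalidValue(
            value,
            EncodingError.Context(
                codingPath: [],
                debugDescription: "128-bit integers are not supported by this encoder"))
    }
}

// Written before the new requirement existed; still compiles unchanged.
struct LegacyContainer: MiniContainer {
    var output: [String] = []
    mutating func encode(_ value: UInt64) throws {
        output.append("\(value)")
    }
}

// Updated to support the new primitive natively.
struct UpdatedContainer: MiniContainer {
    var output: [String] = []
    mutating func encode(_ value: UInt64) throws {
        output.append("\(value)")
    }
    mutating func encode(_ value: Wide) throws {
        output.append("\(value.high):\(value.low)")
    }
}
```

The other sub-option would swap the body of that default implementation for something like the pair-of-UInt64 approach, so legacy encoders would keep working but with a representation they never chose.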

Either way, the important thing is for encoders & decoders to actually add support for the new primitives.
