Serde data model
The Serde data model is the API by which data structures and data formats interact. You can think of it as Serde's type system.
In code, the serialization half of the Serde data model is defined by the
Serializer trait and the deserialization half is defined by the
Deserializer trait. These are a way of mapping every Rust data structure
into one of 29 possible types. Each method of the Serializer trait corresponds
to one of the types of the data model.
When serializing a data structure to some format, the Serialize
implementation for the data structure is responsible for mapping the data
structure into the Serde data model by invoking exactly one of the Serializer
methods, while the Serializer implementation for the data format is
responsible for mapping the Serde data model into the intended output
representation.
When deserializing a data structure from some format, the Deserialize
implementation for the data structure is responsible for mapping the data
structure into the Serde data model by passing to the Deserializer a
Visitor implementation that can receive the various types of the data model,
while the Deserializer implementation for the data format is responsible for
mapping the input data into the Serde data model by invoking exactly one of the
Visitor methods.
Types
The Serde data model is a simplified form of Rust's type system. It consists of the following 29 types:
- 14 primitive types- bool
- i8, i16, i32, i64, i128
- u8, u16, u32, u64, u128
- f32, f64
- char
 
- string- UTF-8 bytes with a length and no null terminator. May contain 0-bytes.
- When serializing, all strings are handled equally. When deserializing, there are three flavors of strings: transient, owned, and borrowed. This distinction is explained in Understanding deserializer lifetimes and is a key way that Serde enabled efficient zero-copy deserialization.
 
- byte array - [u8]- Similar to strings, during deserialization byte arrays can be transient, owned, or borrowed.
 
- option- Either none or some value.
 
- unit- The type of ()in Rust. It represents an anonymous value containing no data.
 
- The type of 
- unit_struct- For example struct UnitorPhantomData<T>. It represents a named value containing no data.
 
- For example 
- unit_variant- For example the E::AandE::Binenum E { A, B }.
 
- For example the 
- newtype_struct- For example struct Millimeters(u8).
 
- For example 
- newtype_variant- For example the E::Ninenum E { N(u8) }.
 
- For example the 
- seq- A variably sized heterogeneous sequence of values, for example Vec<T>orHashSet<T>. When serializing, the length may or may not be known before iterating through all the data. When deserializing, the length is determined by looking at the serialized data. Note that a homogeneous Rust collection likevec![Value::Bool(true), Value::Char('c')]may serialize as a heterogeneous Serde seq, in this case containing a Serde bool followed by a Serde char.
 
- A variably sized heterogeneous sequence of values, for example 
- tuple- A statically sized heterogeneous sequence of values for which the length
will be known at deserialization time without looking at the serialized
data, for example (u8,)or(String, u64, Vec<T>)or[u64; 10].
 
- A statically sized heterogeneous sequence of values for which the length
will be known at deserialization time without looking at the serialized
data, for example 
- tuple_struct- A named tuple, for example struct Rgb(u8, u8, u8).
 
- A named tuple, for example 
- tuple_variant- For example the E::Tinenum E { T(u8, u8) }.
 
- For example the 
- map- A variably sized heterogeneous key-value pairing, for example BTreeMap<K, V>. When serializing, the length may or may not be known before iterating through all the entries. When deserializing, the length is determined by looking at the serialized data.
 
- A variably sized heterogeneous key-value pairing, for example 
- struct- A statically sized heterogeneous key-value pairing in which the keys are
compile-time constant strings and will be known at deserialization time
without looking at the serialized data, for example struct S { r: u8, g: u8, b: u8 }.
 
- A statically sized heterogeneous key-value pairing in which the keys are
compile-time constant strings and will be known at deserialization time
without looking at the serialized data, for example 
- struct_variant- For example the E::Sinenum E { S { r: u8, g: u8, b: u8 } }.
 
- For example the 
Mapping into the data model
In the case of most Rust types, their mapping into the Serde data model is
straightforward. For example the Rust bool type corresponds to Serde's bool
type. The Rust tuple struct Rgb(u8, u8, u8) corresponds to Serde's tuple
struct type.
But there is no fundamental reason that these mappings need to be
straightforward. The Serialize and Deserialize traits can perform any
mapping between Rust type and Serde data model that is appropriate for the use
case.
As an example, consider Rust's std::ffi::OsString type. This type represents
a platform-native string. On Unix systems they are arbitrary non-zero bytes and
on Windows systems they are arbitrary non-zero 16-bit values. It may seem
natural to map OsString into the Serde data model as one of the following
types:
- As a Serde string. Unfortunately serialization would be brittle because an
OsStringis not guaranteed to be representable in UTF-8 and deserialization would be brittle because Serde strings are allowed to contain 0-bytes.
- As a Serde byte array. This fixes both problems with using string, but now
if we serialize an OsStringon Unix and deserialize it on Windows we end up with the wrong string.
Instead the Serialize and Deserialize impls for OsString map into the
Serde data model by treating OsString as a Serde enum. Effectively it acts
as though OsString were defined as the following type, even though this does
not match its definition on any individual platform.
enum OsString {
    Unix(Vec<u8>),
    Windows(Vec<u16>),
    // and other platforms
}
The flexibility around mapping into the Serde data model is profound and
powerful. When implementing Serialize and Deserialize, be aware of the
broader context of your type that may make the most instinctive mapping not the
best choice.