Open
Changes from 28 commits
29 commits
1d3574c
WIP strucutred cbor
JesusMcCloud Jun 26, 2025
d561069
proper when for decoding
JesusMcCloud Jul 5, 2025
5857ef0
baseline tagging
JesusMcCloud Jul 5, 2025
718422f
some cleanups
JesusMcCloud Jul 5, 2025
6984f69
cleanups
JesusMcCloud Aug 4, 2025
998f01e
tree encoding half-done
JesusMcCloud Aug 5, 2025
1a46b06
tree encoding close
JesusMcCloud Aug 5, 2025
46c8258
polish encoder
JesusMcCloud Aug 5, 2025
f6bc421
visibility fixes
JesusMcCloud Aug 5, 2025
890b284
more checks
JesusMcCloud Aug 5, 2025
255f250
WIP decode from CborElement
JesusMcCloud Aug 5, 2025
4e15cc2
more AI slop
JesusMcCloud Aug 6, 2025
834fa67
cleanup after Junie
JesusMcCloud Aug 6, 2025
d13ec50
clean up more
JesusMcCloud Aug 6, 2025
9986e7b
fix structural issues
JesusMcCloud Aug 7, 2025
2a61e81
fix map size regression
JesusMcCloud Aug 7, 2025
342500b
streamlining and cleanups
JesusMcCloud Aug 7, 2025
f681698
benchmarks
JesusMcCloud Aug 7, 2025
b154640
add more tests
JesusMcCloud Aug 7, 2025
397c5c2
clarify faulty test vector
JesusMcCloud Aug 7, 2025
ebd87c9
fix test vector
JesusMcCloud Aug 7, 2025
8d49f2c
Fix Tagging and Simplify
JesusMcCloud Aug 7, 2025
6ae839d
finalize api
JesusMcCloud Aug 8, 2025
f9dc353
APIDUMP
JesusMcCloud Aug 8, 2025
259966f
docs + apidump
JesusMcCloud Aug 8, 2025
fba0348
refactor cborint
JesusMcCloud Aug 26, 2025
aeb4c4c
add CborEncoder.encoderCborElement
JesusMcCloud Aug 27, 2025
1d97928
pimp CborEncoder
JesusMcCloud Aug 27, 2025
1aec1ff
Apply suggestions from code review
JesusMcCloud Dec 4, 2025
14 changes: 14 additions & 0 deletions benchmark/src/jmh/kotlin/kotlinx/benchmarks/cbor/CborBaseLine.kt
@@ -52,11 +52,25 @@ open class CborBaseline {
}

val baseBytes = cbor.encodeToByteArray(KTestOuterMessage.serializer(), baseMessage)
val baseStruct = cbor.encodeToCborElement(KTestOuterMessage.serializer(), baseMessage)

@Benchmark
fun toBytes() = cbor.encodeToByteArray(KTestOuterMessage.serializer(), baseMessage)

@Benchmark
fun fromBytes() = cbor.decodeFromByteArray(KTestOuterMessage.serializer(), baseBytes)


@Benchmark
fun structToBytes() = cbor.encodeToByteArray(CborElement.serializer(), baseStruct)

@Benchmark
fun structFromBytes() = cbor.decodeFromByteArray(CborElement.serializer(), baseBytes)

@Benchmark
fun fromStruct() = cbor.decodeFromCborElement(KTestOuterMessage.serializer(), baseStruct)

@Benchmark
fun toStruct() = cbor.encodeToCborElement(KTestOuterMessage.serializer(), baseMessage)

}
129 changes: 128 additions & 1 deletion docs/formats.md
@@ -17,6 +17,11 @@ stable, these are currently experimental features of Kotlin Serialization.
* [Tags and Labels](#tags-and-labels)
* [Arrays](#arrays)
* [Custom CBOR-specific Serializers](#custom-cbor-specific-serializers)
* [CBOR Elements](#cbor-elements)
* [Encoding from/to `CborElement`](#encoding-fromto-cborelement)
* [Tagging `CborElement`s](#tagging-cborelements)
* [Caution](#caution)
* [Types of CBOR Elements](#types-of-cbor-elements)
* [ProtoBuf (experimental)](#protobuf-experimental)
* [Field numbers](#field-numbers)
* [Integer types](#integer-types)
@@ -308,13 +313,123 @@ When annotated with `@CborArray`, serialization of the same object will produce
```
This may be used to encode COSE structures, see [RFC 9052 2. Basic COSE Structure](https://www.rfc-editor.org/rfc/rfc9052#section-2).


### Custom CBOR-specific Serializers
Cbor encoders and decoders implement the interfaces [CborEncoder](CborEncoder.kt) and [CborDecoder](CborDecoder.kt), respectively.
These interfaces contain a single property, `cbor`, exposing the current CBOR serialization configuration.
This enables custom cbor-specific serializers to reuse the current `Cbor` instance to produce embedded byte arrays or
react to configuration settings such as `preferCborLabelsOverNames` or `useDefiniteLengthEncoding`, for example.


### CBOR Elements

Aside from direct conversions between byte arrays and CBOR objects, Kotlin serialization offers APIs that allow
other ways of working with CBOR in code. For example, you might need to tweak the data before it can be parsed,
or otherwise work with data so unstructured that it does not readily fit into the typesafe world of Kotlin
serialization.

The main concept in this part of the library is [CborElement]. Read on to learn what you can do with it.

#### Encoding from/to `CborElement`

Bytes can be decoded into an instance of `CborElement` with the [Cbor.decodeFromByteArray] function, either by manually
specifying `CborElement.serializer()` or by specifying [CborElement] as the generic type parameter.
It is also possible to encode arbitrary serializable structures to a `CborElement` through [Cbor.encodeToCborElement].

Since these operations use the same code paths as regular serialization (but with specialized serializers), the config flags
behave as expected:

```kotlin
fun main() {
    val element: CborElement = Cbor.decodeFromHexString("a165627974657343666f6f")
    println(element)
}
```

The above snippet prints the following diagnostic notation:

```text
CborMap(tags=[], content={CborString(tags=[], value=bytes)=CborByteString(tags=[], value=h'666f6f)})
```
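
To see where this output comes from, it can help to split the input bytes by hand. The following is a minimal sketch that uses only the Kotlin standard library, not the kotlinx.serialization API. `describeHead` is a hypothetical helper introduced for illustration: it reads the initial byte of a CBOR data item (major type in the upper three bits, additional information in the lower five, as laid out in RFC 8949) and, as an assumption to keep it short, only handles arguments below 24 that are stored directly in that byte.

```kotlin
// Hypothetical helper, not part of kotlinx.serialization: splits a CBOR initial byte
// into its major type (upper 3 bits) and additional information (lower 5 bits).
fun describeHead(byte: Int): String {
    val majorType = (byte and 0xFF) shr 5
    val info = byte and 0x1F
    // Assumption of this sketch: the argument is stored directly in the initial byte.
    require(info < 24) { "this sketch only handles arguments below 24" }
    val name = when (majorType) {
        0 -> "unsigned"
        1 -> "negative"
        2 -> "bytes"
        3 -> "text"
        4 -> "array"
        5 -> "map"
        6 -> "tag"
        else -> "simple/float"
    }
    return "$name, argument=$info"
}

fun main() {
    // The input splits into: a1 | 65 6279746573 | 43 666f6f
    println(describeHead(0xa1)) // map, argument=1   (one key/value pair)
    println(describeHead(0x65)) // text, argument=5  (the five characters "bytes")
    println(describeHead(0x43)) // bytes, argument=3 (the three bytes 66 6f 6f)
}
```

The three heads line up with the `CborMap`, `CborString`, and `CborByteString` shown in the output above.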

#### Tagging `CborElement`s

Every `CborElement` supports tags, whether it is used as a property, a value inside a collection, or even a complex key
inside a map (which is perfectly legal in CBOR). Tags can be specified by passing them as varargs parameters upon
`CborElement` creation.
For example, take the following structure (shown as an annotated hex dump):

<!--- TEST -->

```hexdump
bf # map(*)
61 # text(1)
61 # "a"
cc # tag(12)
1a 0fffffff # unsigned(268,435,455)
d8 22 # base64 encoded text, tag(34)
61 # text(1)
62 # "b"
# invalid length at 0 for base64
20 # negative(-1)
d8 38 # tag(56)
61 # text(1)
63 # "c"
d8 4e # typed array of i32, little endian, twos-complement, tag(78)
42 # bytes(2)
cafe # "\xca\xfe"
# invalid data length for typed array
61 # text(1)
64 # "d"
d8 5a # tag(90)
cc # tag(12)
6b # text(11)
48656c6c6f20576f726c64 # "Hello World"
ff # break
```

Decoding it results in the following CborElement (shown in manually formatted diagnostic notation):

```
CborMap(tags=[], content={
CborString(tags=[], value=a) = CborPositiveInt( tags=[12], value=268435455),
CborString(tags=[34], value=b) = CborNegativeInt( tags=[], value=-1),
CborString(tags=[56], value=c) = CborByteString( tags=[78], value=h'cafe),
CborString(tags=[], value=d) = CborString( tags=[90, 12], value=Hello World)
})
```
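
The tag lists in this output can be traced back to the bytes in the dump: each tag is a head of major type 6 placed in front of the item it applies to, and several tags may be stacked. The sketch below illustrates this with the Kotlin standard library only. `readLeadingTags` is a hypothetical helper, not a function of kotlinx.serialization, and it only collects the tag numbers without decoding the tagged item itself.

```kotlin
// Hypothetical helper, not part of kotlinx.serialization: collects the tag numbers
// that prefix a CBOR data item, outermost tag first.
fun readLeadingTags(bytes: ByteArray): List<ULong> {
    val tags = mutableListOf<ULong>()
    var pos = 0
    while (pos < bytes.size) {
        val head = bytes[pos].toInt() and 0xFF
        if ((head shr 5) != 6) break           // major type 6 marks a tag
        val info = head and 0x1F
        pos++
        // Number of bytes that follow the initial byte and hold the tag number.
        val extraBytes = when (info) {
            in 0..23 -> 0                       // tag number stored in the initial byte
            24 -> 1
            25 -> 2
            26 -> 4
            27 -> 8
            else -> error("invalid additional information $info for a tag")
        }
        require(pos + extraBytes <= bytes.size) { "truncated tag head" }
        var number = if (extraBytes == 0) info.toULong() else 0uL
        repeat(extraBytes) {
            // Big-endian: shift in one byte at a time.
            number = (number shl 8) or (bytes[pos++].toInt() and 0xFF).toULong()
        }
        tags += number
    }
    return tags
}

fun main() {
    // Start of the value stored under key "d": d8 5a (tag 90), cc (tag 12), 6b (text(11))
    val item = byteArrayOf(0xd8.toByte(), 0x5a, 0xcc.toByte(), 0x6b)
    println(readLeadingTags(item)) // [90, 12]
}
```

The result matches `tags=[90, 12]` on the last `CborString` above: tags are listed in the order in which they appear on the wire.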

##### Caution

Tags are properties of `CborElement`s, and it is possible to mix arbitrary serializable values with `CborElement`s that
contain tags inside a serializable structure. It is also possible to annotate any [CborElement] property
of a generic serializable class with `@ValueTags`.
**This can lead to asymmetric behavior when serializing and deserializing such structures!**

#### Types of CBOR Elements

The [CborElement] class has three direct subtypes, closely following CBOR grammar:

* [CborPrimitive] represents primitive CBOR elements, such as string, integer, float, boolean, and null.
  CBOR byte strings are also treated as primitives.
  Each primitive has a [value][CborPrimitive.value]. Depending on the concrete type of the primitive, it maps
  to corresponding Kotlin types such as `String`, `Int`, `Double`, etc.
  Note that CBOR discriminates between positive ("unsigned") and negative ("signed") integers!
  `CborPrimitive` is itself an umbrella type (a sealed class) for the following concrete primitives:
  * [CborNull] maps to a Kotlin `null`
  * [CborBoolean] maps to a Kotlin `Boolean`
  * [CborInt] represents signed CBOR integers (major type 1, encompassing `-2^64..-1`) and unsigned CBOR integers (major type 0, encompassing `0..2^64-1`).
    Since this exceeds the range of Kotlin's built-in `Long` type, `CborInt` consists of a `sign` (set to `CborInt.Sign.POSITIVE`, `CborInt.Sign.NEGATIVE`, or `CborInt.Sign.ZERO`) and a `value` representing the absolute value as a `ULong`. It also features a `toLong()` function, albeit with possible truncation for negative values exceeding `Long.MIN_VALUE`.
  * [CborString] maps to a Kotlin `String`
  * [CborFloat] maps to a Kotlin `Double`
  * [CborByteString] maps to a Kotlin `ByteArray` and is used to encode it as a CBOR byte string (in contrast to a list
    of individual bytes)

* [CborList] represents a CBOR array. It is a Kotlin `List` of `CborElement` items.

* [CborMap] represents a CBOR map/object. It is a Kotlin `Map` from `CborElement` keys to `CborElement` values.
This is typically the result of serializing an arbitrary
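
The range issue behind the `sign` plus `ULong` design can be seen with the Kotlin standard library alone. The sketch below is not the library's `CborInt`: `magnitudeToLongOrNull` is a hypothetical helper introduced for illustration, showing a checked conversion next to the wrapping behavior of a plain `toLong()` on a `ULong`.

```kotlin
// Hypothetical helper, not the library's CborInt.toLong(): converts a magnitude
// to Long only when it fits, instead of silently wrapping around.
fun magnitudeToLongOrNull(magnitude: ULong): Long? =
    if (magnitude <= Long.MAX_VALUE.toULong()) magnitude.toLong() else null

fun main() {
    val largest: ULong = ULong.MAX_VALUE        // 2^64 - 1, the largest CBOR unsigned integer
    println(largest)                            // 18446744073709551615
    println(largest.toLong())                   // -1, the bit pattern wraps around
    println(magnitudeToLongOrNull(largest))     // null, the value does not fit into Long
    println(magnitudeToLongOrNull(42uL))        // 42
}
```

Keeping the magnitude as a `ULong` and the sign separately avoids this wrap-around for values that CBOR can represent but `Long` cannot.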


## ProtoBuf (experimental)

[Protocol Buffers](https://developers.google.com/protocol-buffers) is a language-neutral binary format that normally
@@ -1673,5 +1788,17 @@ This chapter concludes [Kotlin Serialization Guide](serialization-guide.md).
[Cbor.decodeFromByteArray]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor/decode-from-byte-array.html
[CborBuilder.ignoreUnknownKeys]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-builder/ignore-unknown-keys.html
[ByteString]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-byte-string/index.html
[CborElement]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-element/index.html
[Cbor.encodeToCborElement]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/encode-to-cbor-element.html
[CborPrimitive]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-primitive/index.html
[CborPrimitive.value]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-primitive/value.html
[CborNull]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-null/index.html
[CborBoolean]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-boolean/index.html
[CborInt]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-int/index.html
[CborString]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-string/index.html
[CborFloat]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-float/index.html
[CborByteString]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-byte-string/index.html
[CborList]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-list/index.html
[CborMap]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-map/index.html

<!--- END -->
5 changes: 5 additions & 0 deletions docs/serialization-guide.md
@@ -154,6 +154,11 @@ Once the project is set up, we can start serializing some classes.
* <a name='tags-and-labels'></a>[Tags and Labels](formats.md#tags-and-labels)
* <a name='arrays'></a>[Arrays](formats.md#arrays)
* <a name='custom-cbor-specific-serializers'></a>[Custom CBOR-specific Serializers](formats.md#custom-cbor-specific-serializers)
* <a name='cbor-elements'></a>[CBOR Elements](formats.md#cbor-elements)
* <a name='encoding-fromto-cborelement'></a>[Encoding from/to `CborElement`](formats.md#encoding-fromto-cborelement)
* <a name='tagging-cborelements'></a>[Tagging `CborElement`s](formats.md#tagging-cborelements)
* <a name='caution'></a>[Caution](formats.md#caution)
* <a name='types-of-cbor-elements'></a>[Types of CBOR Elements](formats.md#types-of-cbor-elements)
* <a name='protobuf-experimental'></a>[ProtoBuf (experimental)](formats.md#protobuf-experimental)
* <a name='field-numbers'></a>[Field numbers](formats.md#field-numbers)
* <a name='integer-types'></a>[Integer types](formats.md#integer-types)