docs for bipack format
parent
569d62faa9
commit
2cd68276d4
89
docs/bipack.md
Normal file
@ -0,0 +1,89 @@
# Bipack: compact binary serialization

## Why?

Bipack was designed with the following main goals:

### Be as compact as possible

For this reason it is a binary format: it uses binary form for decimal numbers and can use a variety of encodings for integers:

#### Varint

A variable-length compact encoding is used internally in some cases. It uses the 0x80 bit of every byte to mark continuation. See `object Varint`.

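The continuation-bit idea can be illustrated with a minimal, self-contained sketch. The function names `encodeVarint` and `decodeVarint` are hypothetical and are not the API of `object Varint`. The sketch also assumes that 7-bit groups are written least-significant first, which the text above does not specify.

```kotlin
// Hypothetical helpers for illustration only; not Bipack's `Varint` API.
// Assumption: 7-bit groups are emitted least-significant group first.

fun encodeVarint(value: ULong): ByteArray {
    val out = ArrayList<Byte>()
    var rest = value
    do {
        var b = (rest and 0x7FuL).toInt()   // take the low 7 bits
        rest = rest shr 7
        if (rest != 0uL) b = b or 0x80      // set 0x80: more bytes follow
        out.add(b.toByte())
    } while (rest != 0uL)
    return out.toByteArray()
}

fun decodeVarint(bytes: ByteArray): ULong {
    var result = 0uL
    var shift = 0
    for (byte in bytes) {
        val b = byte.toInt() and 0xFF
        result = result or ((b and 0x7F).toULong() shl shift)
        if ((b and 0x80) == 0) break        // 0x80 clear: this was the last byte
        shift += 7
    }
    return result
}
```

Under these assumptions, values up to 127 take one byte, and 300 is encoded as the two bytes `0xAC 0x02`.
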
#### Smartint

A variable-length compact encoding for signed and unsigned integers that uses as few bytes as possible. It is used automatically when serializing integers. It is slightly more sophisticated than plain `Varint`.

### Do not reveal information about stored data

Many extendable formats, like JSON, BSON, BOSS and many others, keep data in key-value pairs. While this is good in many respects, it has clear disadvantages: it uses more space and it reveals the inner data structure to the world. It is possible to unpack such formats with zero information about the inner structure.

Bipack does not store field names, so it is not possible to unpack or interpret it without knowledge of the data structure; only probabilistic analysis remains. Let's not make the attacker's life easier :)

### Allow upgrading data structures with backward compatibility

The dark side of serialization formats of this kind is that you can't change the structures without either losing backward compatibility with already serialized data or using voluminous boilerplate code to implement some sort of versioning.

So as not to waste space or reveal more information than needed, Bipack allows classes marked as [@Extendable] to be extended with more data _appended to the end of the list of fields, with required default values_. For such classes Bipack stores the number of actually serialized fields and automatically uses default values for non-serialized ones when unpacking old data.

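The field-count mechanism can be sketched without the library. Everything below is hypothetical: `Foo`, `packOld`, `packNew` and `unpack` are illustrative names, and a `List<Any>` stands in for the binary stream, so this shows the idea only and not Bipack's actual wire format.

```kotlin
// Hypothetical illustration of count-prefixed fields; not Bipack's API or format.
data class Foo(val i: Int, val bar: String = "buzz")

// "Old" writer: knows only the first field, so it records a field count of 1.
fun packOld(i: Int): List<Any> = listOf(1, i)

// "New" writer: knows both fields, so it records a field count of 2.
fun packNew(foo: Foo): List<Any> = listOf(2, foo.i, foo.bar)

// Reader for the new class: reads as many fields as were stored
// and lets the Kotlin default value fill in the missing one.
fun unpack(data: List<Any>): Foo {
    val count = data[0] as Int
    val i = data[1] as Int
    return if (count >= 2) Foo(i, data[2] as String) else Foo(i)
}
```

With this sketch, `unpack(packOld(42))` yields `Foo(42, "buzz")`: the old data carries one field and the default supplies the second.
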
### Protect data with framing and CRC

When needed, the serialization library allows storing and checking a CRC32 tag of the structure name with `@Framed` (the name can be overridden as usual with `@SerialName`), or following the serialized binary data with its CRC32, which is checked on deserialization, using `@CrcProtected`. This allows checking data consistency out of the box and only where needed.

# Usage

Use kotlinx serialization as usual. The following Bipack-specific annotations are at your service. All class annotations can be combined.

## @Extendable

Classes marked this way store the number of their fields. This allows adding more fields to the class, at the end of the list and with default initializers, while keeping backward compatibility. For example, if you have serialized:

```kotlin
import kotlinx.serialization.Serializable
// @Extendable is provided by the Bipack library

@Serializable
@Extendable
data class Foo(val i: Int)
```

and then decided to add a field:

```kotlin
@Serializable
@Extendable
data class Foo(val i: Int, val bar: String = "buzz")
```

Bipack will properly deserialize the data serialized for the old version.

The annotation adds 1 or more bytes to the serialized data (the field count in `Varint` format).

## @CrcProtected

Bipack will calculate and store the CRC32 of the serialized data at the end, and automatically check it on deserialization, throwing `InvalidFrameCRCException` if it does not match.

It adds 4 bytes to the serialized data.

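The trailing-checksum scheme can be sketched on the JVM with `java.util.zip.CRC32`. The helpers `crc32Of`, `intToBytes`, `protect` and `unprotect` are hypothetical. The big-endian byte order and the use of `check` instead of `InvalidFrameCRCException` are assumptions made for this sketch, and the library's actual layout may differ.

```kotlin
import java.util.zip.CRC32

// Hypothetical helpers; Bipack's real byte order and exception type may differ.
fun crc32Of(data: ByteArray): Int {
    val crc = CRC32()
    crc.update(data)
    return crc.value.toInt()          // keep the low 32 bits
}

// Assumption: the checksum is written big-endian.
fun intToBytes(v: Int): ByteArray =
    ByteArray(4) { idx -> (v ushr (24 - 8 * idx)).toByte() }

// Append the CRC32 of the payload: 4 extra bytes.
fun protect(payload: ByteArray): ByteArray =
    payload + intToBytes(crc32Of(payload))

// Split off the last 4 bytes and verify them against the payload.
fun unprotect(packed: ByteArray): ByteArray {
    require(packed.size >= 4) { "too short to contain a CRC32" }
    val payload = packed.copyOfRange(0, packed.size - 4)
    val stored = packed.copyOfRange(packed.size - 4, packed.size)
    check(stored.contentEquals(intToBytes(crc32Of(payload)))) { "CRC32 mismatch" }
    return payload
}
```

Flipping any byte of the protected array makes `unprotect` fail, which is the consistency check the annotation provides.
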
## @Framed

Stores the CRC32 of the serialized class's name (`@SerialName` allows changing it as usual) and checks it on deserialization. Throws `InvalidFrameHeaderException` if it does not match.

It adds 4 bytes to the serialized data.

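A name-based frame tag can be sketched in the same spirit. The helpers `nameTag`, `frame` and `unframe` are hypothetical. Placing the tag before the payload, hashing the name as UTF-8, and writing it big-endian are all assumptions of this sketch rather than documented Bipack behaviour.

```kotlin
import java.util.zip.CRC32

// Hypothetical helpers; tag position, charset and byte order are assumptions.
fun nameTag(serialName: String): ByteArray {
    val crc = CRC32()
    crc.update(serialName.toByteArray(Charsets.UTF_8))
    val v = crc.value.toInt()
    return ByteArray(4) { idx -> (v ushr (24 - 8 * idx)).toByte() }
}

// Prefix the payload with the 4-byte tag of the class name.
fun frame(serialName: String, payload: ByteArray): ByteArray =
    nameTag(serialName) + payload

// Verify the tag against the expected class name, then return the payload.
fun unframe(expectedName: String, framed: ByteArray): ByteArray {
    require(framed.size >= 4) { "too short to contain a frame tag" }
    check(framed.copyOfRange(0, 4).contentEquals(nameTag(expectedName))) {
        "frame header mismatch"
    }
    return framed.copyOfRange(4, framed.size)
}
```

Decoding data framed as one class while expecting another fails early, before any field is interpreted.
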
## @Unsigned

This __field annotation__ allows storing __integer fields__ of any size more compactly by not saving the sign. It can be applied to both signed and unsigned integers of any size.