# Binary tools and BiPack serializer

A multiplatform collection of binary tools: portable serialization in the compact and fast [Bipack] format, plus utilities for working with binary data such as CRC-family checksums and dumps. It also works well in the browser and on native targets.

# Recent changes

- 0.1.0: uses modern Kotlin 1.9.*, fixes a problem with singleton or empty/object serialization

The last version for Kotlin 1.8 is 0.0.8; some fixes are not yet backported to it. Please open an issue if you need them.

# Usage

Add our Maven repository:

```kotlin
repositories {
    // ...
    maven("https://gitea.sergeych.net/api/packages/SergeychWorks/maven")
}
```

Then add the dependency in the appropriate place in your project, like this:

```kotlin
dependencies {
    // ...
    implementation("net.sergeych:mp_bintools:0.1.0")
}
```

# Bipack

## Why?

Bipack is a compact and efficient binary serialization library (and format) designed with the following main goals:

### Allow easy unpacking of existing binary structures

You describe your structure as `@Serializable` classes and, voila, Bipack decodes and encodes it for you! We aim to make it really easy to convert data from other binary formats by adding more format annotations.

### Be as compact as possible

For this reason it is a binary notation: it uses binary form for decimal numbers and can use a variety of encodings for integers:

#### Varint

A variable-length compact encoding is used internally in some cases. It uses the 0x80 bit in every byte to mark continuation. See `object Varint`.

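As an illustration of the continuation-bit idea, here is a minimal self-contained sketch. It assumes the common layout with 7 payload bits per byte, least-significant group first; the library's `Varint` object may order or pack the groups differently, and `encodeVarint`/`decodeVarint` are hypothetical names, not its API.

```kotlin
// Sketch only: a generic continuation-bit varint, not the library's `Varint` implementation.
// Assumption: 7 payload bits per byte, least-significant group first.

fun encodeVarint(value: ULong): ByteArray {
    val out = ArrayList<Byte>()
    var rest = value
    do {
        var b = (rest and 0x7FuL).toInt()
        rest = rest shr 7
        if (rest != 0uL) b = b or 0x80      // set 0x80: more bytes follow
        out.add(b.toByte())
    } while (rest != 0uL)
    return out.toByteArray()
}

fun decodeVarint(bytes: ByteArray): ULong {
    var result = 0uL
    var shift = 0
    for (byte in bytes) {
        val b = byte.toInt() and 0xFF
        result = result or ((b and 0x7F).toULong() shl shift)
        if ((b and 0x80) == 0) break        // 0x80 clear: this was the last byte
        shift += 7
    }
    return result
}

fun main() {
    val packed = encodeVarint(300uL)
    // 300 needs two bytes in this layout: ac 02
    println(packed.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') })
    check(decodeVarint(packed) == 300uL)
}
```

Small values fit into a single byte, which is why such encodings suit lengths and counters.
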
#### Smartint

A variable-length compact encoding for signed and unsigned integers that uses as few bytes as possible. It is used automatically when serializing integers. It is slightly more sophisticated than plain `Varint`.

### Do not reveal information about stored data

Many extendable formats, like JSON, BSON, BOSS and many others, keep data in key-value pairs. While this is good in many respects, it has clear disadvantages: it uses more space, and it reveals the inner data structure to the world. It is possible to unpack such formats with zero information about the inner structure.

Bipack does not store field names, so it is not possible to unpack or interpret it without knowledge of the data structure; only probabilistic analysis remains. Let's not make the attacker's life easier :)

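To make the difference concrete, this sketch encodes the same two values once as JSON text and once as a hand-rolled positional layout (a length-prefixed string followed by a single byte). The positional layout is a simplification for illustration, not Bipack's actual wire format, and `asJson`/`asPositional` are hypothetical helpers.

```kotlin
// Illustration only: the positional layout below is a simplification, not Bipack's real format.

// Key-value form: field names travel with the data and are readable by anyone.
fun asJson(login: String, age: Int): ByteArray =
    """{"login":"$login","age":$age}""".encodeToByteArray()

// Positional form: only the values, in declaration order.
// Assumes a short string (one length byte) and an age that fits into one byte.
fun asPositional(login: String, age: Int): ByteArray {
    val loginBytes = login.encodeToByteArray()
    return byteArrayOf(loginBytes.size.toByte()) + loginBytes + age.toByte()
}

fun main() {
    val json = asJson("bob", 42)
    val positional = asPositional("bob", 42)
    // prints "JSON: 24 bytes, positional: 5 bytes"
    println("JSON: ${json.size} bytes, positional: ${positional.size} bytes")
}
```

Without the class definition, the five positional bytes give no hint of what the fields are called or where one ends and the next begins.
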
### Allow upgrading data structures with backward compatibility

The dark side of serialization formats of this kind is that you can't change the structures without either losing backward compatibility with already serialized data or writing voluminous boilerplate code to implement some sort of versioning.

To avoid wasting space and revealing more information than needed, Bipack allows classes marked as [@Extendable] to be extended with more data _appended to the end of the list of fields, with required default values_. For such classes, Bipack stores the number of actually serialized fields and automatically uses default values for the non-serialized ones when unpacking old data.

### Protect data with framing and CRC

When needed, the serialization library can store and check a CRC32 tag of the structure name with `@Framed` (the name can be overridden as usual with `@SerialName`), or append a CRC32 of the serialized binary data, which is checked on deserialization, with `@CrcProtected`. This allows checking data consistency out of the box, and only where needed.

# Usage

Use kotlinx.serialization as usual. The following Bipack-specific annotations are at your disposal (they can be combined):

## @Extendable

Classes marked this way store the number of their fields. This lets you add more fields to the class, at the end of the list and with default initializers, while keeping backward compatibility. For example, if you have serialized:

```kotlin
@Serializable
@Extendable
data class Foo(val i: Int)
```

and then decided to add a field:

```kotlin
@Serializable
@Extendable
data class Foo(val i: Int, val bar: String = "buzz")
```

Bipack will properly deserialize the data serialized with the old version.

It adds 1 or more bytes to the serialized data (the field count in `Varint` format).

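The mechanism can be pictured with a small conceptual model. It is not the library's code or wire format: the "serialized" form here is just a list holding the field count followed by the field values, and `FooV1`, `FooV2`, `packV1`, `packV2` and `unpackV2` are hypothetical names standing in for the two versions of the class.

```kotlin
// Conceptual model only: a List<Any> stands in for the real binary encoding.

data class FooV1(val i: Int)                               // old version of the class
data class FooV2(val i: Int, val bar: String = "buzz")    // new version, one field appended

// Writers store the number of fields first, then the fields in declaration order.
fun packV1(foo: FooV1): List<Any> = listOf(1, foo.i)
fun packV2(foo: FooV2): List<Any> = listOf(2, foo.i, foo.bar)

// The new reader checks the stored count and falls back to defaults for missing fields.
fun unpackV2(data: List<Any>): FooV2 {
    val count = data[0] as Int
    val i = data[1] as Int
    return if (count >= 2) FooV2(i, data[2] as String) else FooV2(i)
}

fun main() {
    println(unpackV2(packV1(FooV1(42))))         // old data: FooV2(i=42, bar=buzz)
    println(unpackV2(packV2(FooV2(42, "bar"))))  // new data: FooV2(i=42, bar=bar)
}
```

This also shows why new fields must go at the end and must have defaults: the reader can only tell how many leading fields are present.
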
## @CrcProtected

Bipack will calculate and store the CRC32 of the serialized data at the end, and automatically check it on deserialization, throwing `InvalidFrameCRCException` if it does not match.

It adds 4 bytes to the serialized data.

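The idea of a trailing checksum can be sketched without the library. The code below uses the widely used CRC-32 variant (reflected polynomial 0xEDB88320) and appends it as 4 big-endian bytes; both choices are assumptions, as are the helper names `crc32`, `protect` and `unprotect`, and the library's actual variant and byte order may differ.

```kotlin
// Sketch only: trailing CRC-32 protection done by hand, not the library's implementation.
// Assumptions: zip/PNG-style CRC-32, stored as 4 big-endian bytes after the payload.

fun crc32(data: ByteArray): UInt {
    var crc = 0xFFFFFFFFu
    for (byte in data) {
        crc = crc xor (byte.toUInt() and 0xFFu)
        repeat(8) {
            crc = if ((crc and 1u) != 0u) (crc shr 1) xor 0xEDB88320u else crc shr 1
        }
    }
    return crc.inv()
}

fun protect(payload: ByteArray): ByteArray {
    val crc = crc32(payload)
    val trailer = ByteArray(4) { i -> (crc shr (24 - 8 * i)).toByte() }
    return payload + trailer
}

// Returns the payload, or throws IllegalArgumentException if the checksum does not match.
fun unprotect(packed: ByteArray): ByteArray {
    require(packed.size >= 4) { "data too short to contain a CRC" }
    val payload = packed.copyOfRange(0, packed.size - 4)
    require(protect(payload).contentEquals(packed)) { "CRC mismatch" }
    return payload
}

fun main() {
    val packed = protect("hello".encodeToByteArray())
    println(unprotect(packed).decodeToString())       // hello
    packed[0] = 'j'.code.toByte()                     // corrupt one byte
    println(runCatching { unprotect(packed) }.isFailure)  // true
}
```

The 4-byte overhead mentioned above corresponds to the trailer appended in `protect`.
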
## @Framed

Stores the CRC32 of the serialized class name (`@SerialName` can change it as usual) and checks it on deserialization. Throws `InvalidFrameHeaderException` if it does not match.

It adds 4 bytes to the serialized data.

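A type tag of this kind can be sketched as follows. For brevity the example is JVM-only and relies on `java.util.zip.CRC32`; it assumes the tag is written before the payload as 4 big-endian bytes, which may not match the library's layout. `nameTag`, `frame` and `unframe` are hypothetical helpers.

```kotlin
import java.util.zip.CRC32

// Sketch only, JVM-specific: a name-based frame tag built by hand, not the library's code.
// Assumption: the tag precedes the payload and is stored as 4 big-endian bytes.

fun nameTag(serialName: String): ByteArray {
    val crc = CRC32().apply { update(serialName.encodeToByteArray()) }.value
    return ByteArray(4) { i -> (crc shr (24 - 8 * i)).toByte() }
}

fun frame(serialName: String, payload: ByteArray): ByteArray =
    nameTag(serialName) + payload

// Returns the payload, or throws IllegalArgumentException if the tag belongs to another name.
fun unframe(serialName: String, framed: ByteArray): ByteArray {
    require(framed.size >= 4 && framed.copyOfRange(0, 4).contentEquals(nameTag(serialName))) {
        "frame header does not match $serialName"
    }
    return framed.copyOfRange(4, framed.size)
}

fun main() {
    val framed = frame("Foo", byteArrayOf(1, 2))
    println(unframe("Foo", framed).toList())               // [1, 2]
    println(runCatching { unframe("Bar", framed) }.isFailure)  // true
}
```

Such a tag catches attempts to decode data as the wrong class early, at the cost of the 4 extra bytes.
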
## @Unsigned

This __field annotation__ allows storing __integer fields__ of any size more compactly by not saving the sign. It can be applied to both signed and unsigned integers of any size.

## @FixedSize(size)

Use it with fixed-size collections (like hashes, keys, etc.) to avoid storing the collection size in the packed binary. It saves at least one byte.

## @Fixed

Can be used with any integer type to store/restore it as is, in fixed-size big-endian form:

- Short, UShort: 2 bytes
- Int, UInt: 4 bytes
- Long, ULong: 8 bytes

Note that without this modifier all integers are serialized in a variable-length compressed format, see class [Smartint] from this library.

Example:

~~~kotlin
@Serializable
class Foo(
    @Fixed
    val eightBytesLongInt: Long
)

// so:
assertEquals("00 00 00 01 00 00 00 02", BipackEncoder.encode(Foo(0x100000002)).encodeToHex())
~~~
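
The byte layout asserted above can be reproduced by hand. The following sketch does not use the library; `toBigEndian` and `hex` are hypothetical helpers that only illustrate what "fixed-size, big-endian" means for the sizes listed earlier.

```kotlin
// Sketch only: fixed-size big-endian layout computed by hand, without the library.

// Writes the lowest `size` bytes of `value`, most significant byte first.
fun toBigEndian(value: Long, size: Int): ByteArray =
    ByteArray(size) { i -> (value shr (8 * (size - 1 - i))).toByte() }

fun hex(bytes: ByteArray): String =
    bytes.joinToString(" ") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') }

fun main() {
    println(hex(toBigEndian(0x100000002, 8)))  // 00 00 00 01 00 00 00 02
    println(hex(toBigEndian(0x0102, 2)))       // 01 02
}
```

Unlike the variable-length encodings, the width here never depends on the value, which is convenient when matching an existing binary layout.
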