Collection of binary format tools and manipulations

Binary tools and BiPack serializer

beta version

A multiplatform collection of binary tools, including a portable serializer for the compact and fast [Bipack] format, which also works well in the browser and on native targets.

Usage

TODO: specify maven: how?

Bipack

Why?

Bipack is a compact and efficient binary serialization library (and format) designed with the following main goals:

- be as compact as possible

For this reason it is a binary notation: it uses binary form for decimal numbers and can use a variety of encodings for integers:

Varint

Variable-length compact encoding is used internally in some cases. It uses the 0x80 bit in every byte to mark continuation. See object Varint.
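To make the continuation-bit idea concrete, here is a minimal sketch of such an encoding. The helper names `encodeVarint` and `decodeVarint` are hypothetical, and the byte order (low-order 7-bit group first) is an assumption; the library's own Varint object may lay the bits out differently.

```kotlin
// Hypothetical helpers for illustration, not the library's Varint object.
// Assumption: 7 data bits per byte, low-order group first,
// and the 0x80 bit set on every byte except the last one.
fun encodeVarint(value: ULong): ByteArray {
    val out = ArrayList<Byte>()
    var rest = value
    while (rest >= 0x80uL) {
        // keep the low 7 bits and set the continuation bit
        out.add(((rest and 0x7FuL) or 0x80uL).toByte())
        rest = rest shr 7
    }
    out.add(rest.toByte()) // last byte: continuation bit is clear
    return out.toByteArray()
}

fun decodeVarint(bytes: ByteArray): ULong {
    var result = 0uL
    var shift = 0
    for (b in bytes) {
        val v = b.toULong() and 0xFFuL
        result = result or ((v and 0x7FuL) shl shift)
        if ((v and 0x80uL) == 0uL) break // no continuation bit: done
        shift += 7
    }
    return result
}
```

Under these assumptions, values below 128 take a single byte, and 300 is encoded as the two bytes AC 02.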

Smartint

A variable-length compact encoding for signed and unsigned integers that uses as few bytes as possible. It is used automatically when serializing integers. It is slightly more sophisticated than plain Varint.

- do not reveal information about stored data

Many extendable formats, like JSON, BSON, BOSS and many others, keep data in key-value pairs. While this is good in many respects, it has clear disadvantages: it uses more space, and it reveals the inner data structure to the world. Such formats can be unpacked with zero information about the inner structure.

Bipack does not store field names, so it is not possible to unpack or interpret the data without knowing its structure; only probabilistic analysis remains. Let's not make the attacker's life easier :)

- allow upgrading data structures with backward compatibility

The dark side of serialization formats of this kind is that you can't change the structures without either losing backward compatibility with already serialized data or writing voluminous boilerplate code to implement some sort of versioning.

To avoid wasting space and revealing more information than needed, Bipack allows classes marked as [@Extendable] to be extended with more fields appended to the end of the field list, each with a required default value. For such classes Bipack stores the number of actually serialized fields and automatically uses default values for non-serialized ones when unpacking old data.

- protect data with framing and CRC

When needed, the serialization library can store and check a CRC32 tag of the structure name with @Framed (the name can be overridden as usual with @SerialName), or append a CRC32 of the serialized binary data, which is checked on deserialization, with @CrcProtected. This lets you check data consistency out of the box, and only where needed.

Usage

Use kotlinx serialization as usual. The following Bipack-specific annotations are at your service. All class annotations can be combined.

@Extendable

Classes marked this way store the number of fields. This allows adding more fields to the class, at the end of the field list and with default initializers, while keeping backward compatibility. For example, if you have serialized:

@Serializable
@Extendable
data class Foo(val i: Int)

and then decided to add a field:

@Serializable
@Extendable
data class Foo(val i: Int, val bar: String = "buzz")

Bipack will properly deserialize data serialized by the old version, filling bar with its default value.

It adds 1 or more bytes to the serialized data (the field count in Varint format).
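The mechanism can be modeled by hand: store a field count, read only that many fields, and let defaults fill in the rest. The sketch below is an illustration only. The names `FooV1`, `FooV2`, `packV1`, `packV2` and `unpackV2` are hypothetical, and a plain list stands in for the real binary format.

```kotlin
// Illustration only: a hand-rolled model of the idea, not Bipack's wire format.
data class FooV1(val i: Int)
data class FooV2(val i: Int, val bar: String = "buzz")

// "Pack" as a field count followed by the field values
// (kept as a list instead of bytes for clarity).
fun packV1(foo: FooV1): List<Any> = listOf(1, foo.i)
fun packV2(foo: FooV2): List<Any> = listOf(2, foo.i, foo.bar)

// The new reader consumes only as many fields as were stored
// and falls back to the declared default for the missing ones.
fun unpackV2(packed: List<Any>): FooV2 {
    val count = packed[0] as Int
    val i = packed[1] as Int
    return if (count >= 2) FooV2(i, packed[2] as String) else FooV2(i)
}
```

In this model, `unpackV2(packV1(FooV1(42)))` yields `FooV2(i=42, bar=buzz)`: the old data carries a count of 1, so the second field takes its default.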

@CrcProtected

Bipack will calculate and store the CRC32 of the serialized data at the end, and automatically check it on deserialization, throwing InvalidFrameCRCException if it does not match.

It adds 4 bytes to the serialized data.
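As a rough picture of what a trailing checksum does, here is a JVM-only sketch built on `java.util.zip.CRC32`. The helpers `protect` and `unprotect` are hypothetical, the big-endian placement of the 4 checksum bytes is an assumption, and a plain `IllegalStateException` stands in for the library's InvalidFrameCRCException.

```kotlin
import java.util.zip.CRC32

// Hypothetical helpers; JVM-only because of java.util.zip.CRC32.
// Assumption: the checksum is appended as 4 big-endian bytes.
fun crc32Of(data: ByteArray): Int {
    val crc = CRC32()
    crc.update(data)
    return crc.value.toInt()
}

fun protect(payload: ByteArray): ByteArray {
    val crc = crc32Of(payload)
    val tail = ByteArray(4) { i -> (crc ushr (24 - 8 * i)).toByte() }
    return payload + tail
}

fun unprotect(packed: ByteArray): ByteArray {
    require(packed.size >= 4) { "too short to hold a CRC32" }
    val payload = packed.copyOfRange(0, packed.size - 4)
    // Recompute the checksum and compare with the stored one.
    if (!protect(payload).contentEquals(packed)) {
        throw IllegalStateException("CRC mismatch")
    }
    return payload
}
```

Any changed byte, in the payload or in the checksum itself, makes `unprotect` fail instead of returning corrupted data.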

@Framed

Stores the CRC32 of the serialized class name (@SerialName allows changing it as usual) and checks it on deserialization. Throws InvalidFrameHeaderException if it does not match.

It adds 4 bytes to the serialized data.
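The name tag can be pictured the same way: a checksum of the serial name is written with the data and compared on reading. This is a JVM-only sketch with hypothetical helpers `nameTag`, `frame` and `unframe`. Placing the tag first, as 4 big-endian bytes, is an assumption, and `check` throws a plain `IllegalStateException` where the library would throw InvalidFrameHeaderException.

```kotlin
import java.util.zip.CRC32

// Hypothetical helpers; JVM-only because of java.util.zip.CRC32.
// Assumption: the tag is 4 big-endian bytes placed before the payload.
fun nameTag(serialName: String): ByteArray {
    val crc = CRC32().apply { update(serialName.encodeToByteArray()) }.value.toInt()
    return ByteArray(4) { i -> (crc ushr (24 - 8 * i)).toByte() }
}

fun frame(serialName: String, payload: ByteArray): ByteArray =
    nameTag(serialName) + payload

fun unframe(serialName: String, packed: ByteArray): ByteArray {
    require(packed.size >= 4) { "too short to hold a frame tag" }
    // Reject data that was written for a differently named structure.
    check(packed.copyOfRange(0, 4).contentEquals(nameTag(serialName))) { "frame tag mismatch" }
    return packed.copyOfRange(4, packed.size)
}
```

Data framed as "Foo" unpacks only when the reader also expects "Foo", which catches attempts to decode bytes as the wrong type.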

@Unsigned

This field annotation allows storing integer fields of any size more compactly by not saving the sign. It can be applied to both signed and unsigned integers of any size.

@FixedSize(size)

Use it with fixed-size collections (hashes, keys, etc.) to omit the collection size from the packed binary. This saves at least one byte.

@Fixed

Can be used with any integer type to store/restore it as is, fixed-size, big-endian:

  • Short, UShort: 2 bytes
  • Int, UInt: 4 bytes
  • Long, ULong: 8 bytes

Note that without this modifier all integers are serialized into a variable-length compressed format, see class [Smartint] from this library.

Example:

@Serializable
class Foo(
     @Fixed
     val eightBytesLongInt: Long
)

// so:
assertEquals("00 00 00 01 00 00 00 02", BipackEncoder.encode(Foo(0x100000002)).encodeToHex())
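The byte layout in the assertion above can be reproduced without the library. This is a minimal sketch of a fixed-size big-endian encoding of a Long; `longToBigEndian` and `toHex` are hypothetical helpers, not part of Bipack.

```kotlin
// Hypothetical helpers, independent of the library.
// Writes the most significant byte first (big-endian), always 8 bytes.
fun longToBigEndian(value: Long): ByteArray =
    ByteArray(8) { i -> (value ushr (56 - 8 * i)).toByte() }

// Formats bytes as space-separated two-digit hex values.
fun toHex(bytes: ByteArray): String =
    bytes.joinToString(" ") { b -> (b.toInt() and 0xFF).toString(16).padStart(2, '0') }

fun main() {
    println(toHex(longToBigEndian(0x100000002))) // prints "00 00 00 01 00 00 00 02"
}
```

The value 0x100000002 has 1 in its upper 32 bits and 2 in its lower 32 bits, which is why the two 4-byte halves read 00 00 00 01 and 00 00 00 02.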