# FB2 Import/Export Specification

## Scope

Toread supports FictionBook 2.0 files as plain XML (`.fb2`) and as a standard ZIP archive containing one FB2 XML file (`.fb2.zip`).

The vendored schema files live in `shared/src/commonMain/resources/fb2/`:

- `FictionBook.xsd`
- `FictionBookGenres.xsd`
- `FictionBookLang.xsd`
- `FictionBookLinks.xsd`

These files were copied from `https://github.com/gribuser/fb2` so builds and validation references do not depend on the upstream repository remaining available.

## Import

The common API is `Fb2Format.parse(input: ByteArray, fileName: String? = null)`.

Import detection:

- A file is treated as ZIP when its bytes start with the ZIP local-file signature `PK\003\004` or the provided filename ends with `.zip`.
- Otherwise bytes are decoded as UTF-8 XML.
- In ZIP archives, the first entry ending with `.fb2` is used. If no such entry exists, the first non-directory entry is used.
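
The detection rules above can be sketched in a few lines of Kotlin. This is a minimal illustration, not the actual `Fb2Format` implementation: `Fb2Container`, `detectContainer`, and `pickEntry` are hypothetical names, and the case-sensitive suffix checks are an assumption that the rules do not spell out.

```kotlin
// Hypothetical helper names; only the detection rules come from this spec.
enum class Fb2Container { ZIP, XML }

// ZIP local-file signature "PK\003\004" as raw bytes.
private val ZIP_LOCAL_FILE_SIGNATURE = byteArrayOf(0x50, 0x4B, 0x03, 0x04)

fun detectContainer(input: ByteArray, fileName: String? = null): Fb2Container {
    val hasZipSignature = input.size >= ZIP_LOCAL_FILE_SIGNATURE.size &&
        ZIP_LOCAL_FILE_SIGNATURE.indices.all { input[it] == ZIP_LOCAL_FILE_SIGNATURE[it] }
    // Assumption: the suffix check is case-sensitive.
    val hasZipName = fileName?.endsWith(".zip") == true
    return if (hasZipSignature || hasZipName) Fb2Container.ZIP else Fb2Container.XML
}

/** Picks the entry to parse from ZIP entry names given in archive order. */
fun pickEntry(entryNames: List<String>): String? =
    entryNames.firstOrNull { it.endsWith(".fb2") }
        // Directory entries in a ZIP archive have names ending with "/".
        ?: entryNames.firstOrNull { !it.endsWith("/") }
```

With these helpers, `detectContainer("<?xml".encodeToByteArray(), "book.fb2.zip")` selects the ZIP path on the filename alone, which mirrors the "signature or filename" rule.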

ZIP support:

- Stored ZIP entries are supported on every multiplatform target.
- Deflated ZIP entries are supported on JVM and Android through `java.util.zip.Inflater`.
- Deflated ZIP entries currently fail with `Fb2ParseException` on JS and Wasm targets until a common/browser inflater is added.
- ZIP64 and encrypted archives are not supported.
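
On JVM and Android, inflating a deflated entry might look like the sketch below. It assumes the entry's compressed bytes have already been sliced out of the archive. `inflateRaw` is a hypothetical helper, and `IllegalArgumentException` stands in for the project's `Fb2ParseException`. ZIP entries carry raw DEFLATE data without a zlib header, hence `Inflater(true)`.

```kotlin
import java.io.ByteArrayOutputStream
import java.util.zip.DataFormatException
import java.util.zip.Inflater

/**
 * Inflates the raw DEFLATE payload of a single ZIP entry (JVM/Android only).
 * `expectedSize` is only a buffer-size hint, e.g. the uncompressed size from the entry header.
 */
fun inflateRaw(compressed: ByteArray, expectedSize: Int): ByteArray {
    val inflater = Inflater(true) // nowrap = true: no zlib header or checksum
    try {
        inflater.setInput(compressed)
        val out = ByteArrayOutputStream(expectedSize.coerceAtLeast(16))
        val buffer = ByteArray(8 * 1024)
        while (!inflater.finished()) {
            val n = inflater.inflate(buffer)
            // No progress and nothing left to read: the stream is truncated or unsupported.
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                throw IllegalArgumentException("Truncated or unsupported deflate stream")
            }
            out.write(buffer, 0, n)
        }
        return out.toByteArray()
    } catch (e: DataFormatException) {
        throw IllegalArgumentException("Corrupt deflate stream", e)
    } finally {
        inflater.end() // release native zlib resources
    }
}
```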

XML mapping:

- `description/title-info/book-title` maps to `Fb2Book.title`.
- `description/title-info/author` maps to `Fb2Book.authors`.
- `genre`, `lang`, `keywords`, `date`, `annotation`, and `sequence` are imported from `title-info`.
- `src-lang`, `translator`, and `coverpage/image` are imported from `title-info`.
- `description/document-info` maps to `Fb2DocumentInfo`.
- The first non-notes `body` is imported as the readable body.
- Direct `body/image`, `body/title`, `section/title`, direct `section/image`, direct `section/p`, and nested `section` elements are preserved.
- `binary` elements are imported with `id`, `content-type`, and whitespace-normalized Base64 content.
- Image references keep their `href`; `Fb2ImageRef.binaryId` resolves `#cover.jpg` to `cover.jpg`, and `Fb2Book.binaryFor(image)` returns the corresponding embedded binary when present.
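
The image-to-binary lookup described in the last two items can be illustrated with simplified stand-ins for the model classes. The class shapes below are assumptions made for this sketch: only `href`, `binaryId`, `binaryFor`, `id`, and the content type are named in this spec. Returning `null` for hrefs without a leading `#` and removing all whitespace from Base64 payloads are likewise guesses at the intended behavior.

```kotlin
// Simplified stand-ins; the real Toread model classes may have more fields.
data class Fb2Binary(val id: String, val contentType: String, val base64: String)

data class Fb2ImageRef(val href: String) {
    /** `#cover.jpg` -> `cover.jpg`; an href without a leading `#` has no embedded binary. */
    val binaryId: String?
        get() = if (href.startsWith("#")) href.substring(1) else null
}

data class Fb2Book(val binaries: List<Fb2Binary>) {
    /** Returns the embedded binary the image points at, or null when it is absent. */
    fun binaryFor(image: Fb2ImageRef): Fb2Binary? =
        image.binaryId?.let { id -> binaries.firstOrNull { it.id == id } }
}

/** Assumed normalization: drop all whitespace from line-wrapped Base64 content. */
fun normalizeBase64(raw: String): String = raw.filterNot { it.isWhitespace() }
```

Keeping the original `href` on the reference and deriving `binaryId` from it means the model never has to rewrite links in order to resolve them.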

The importer is intentionally structural, not a full XSD validator.

## Export

The common API is:

- `Fb2Format.exportXml(book: Fb2Book)` for plain `.fb2` XML.
- `Fb2Format.exportZip(book: Fb2Book, entryName: String = "book.fb2")` for `.fb2.zip`.

Export behavior:

- XML is emitted as UTF-8 FictionBook 2.0 with the FB2 and XLink namespaces.
- Required FB2 description fields are emitted from the `Fb2Book` model.
- Missing `document-info` fields are filled with deterministic defaults: date `1970-01-01`, id `toread-generated`, version `1.0`.
- ZIP export uses a standard stored ZIP entry, so it is portable without requiring a common deflater.
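
The default-filling rule for `document-info` could be expressed as below. The `Fb2DocumentInfo` shape shown here is a hypothetical three-field stand-in, `withExportDefaults` is an invented name, and treating only `null` (not blank strings) as "missing" is an assumption.

```kotlin
// Hypothetical stand-in; the real Fb2DocumentInfo likely carries more fields.
data class Fb2DocumentInfo(
    val date: String? = null,
    val id: String? = null,
    val version: String? = null,
)

/** Fills missing fields with the fixed defaults listed in this spec. */
fun Fb2DocumentInfo?.withExportDefaults(): Fb2DocumentInfo = Fb2DocumentInfo(
    date = this?.date ?: "1970-01-01",
    id = this?.id ?: "toread-generated",
    version = this?.version ?: "1.0",
)
```

Because the defaults are constants rather than a timestamp or a random id, filling them in yields the same values on every export of the same model.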

Round-trip guarantees:

- Imported title, authors, language, source language, translators, genres, document info, cover images, body title/images, sections, paragraph text, and binaries are represented in the model.
- Formatting, comments, stylesheets, tables, inline style markup, cover references, and unknown FB2 extension elements are not preserved by the current model.