
.. _module-pw_snapshot-design_discussion:

=================
Design Discussion
=================
There were a handful of key requirements going into the design of pw_snapshot:

* **Pre-established file format** - Building and maintaining tooling to parse a
  custom binary snapshot format is a high maintenance burden, which makes a
  pre-existing, widely known and supported format far more appealing.
* **Incremental writing** - Needing to build an entire snapshot before
  committing it as a finished file is a big limitation on embedded devices where
  RAM is often very constrained. It is important that a snapshot can be built in
  smaller in-memory segments that can be committed incrementally to a larger
  sink (e.g. UART, off-chip flash).
* **Extensible** - Pigweed doesn't know everything users might want to capture
  in a snapshot. It's important that users have ways to include their own
  information into snapshots with minimal friction.
* **Relatively compact** - It's important that snapshots can contain useful
  information even when they are limited to a few hundred bytes in size.

Why Proto?
==========
Protobufs are widely used and supported across many languages and platforms.
This greatly reduces the encode/decode tooling maintenance introduced by using
custom or unstructured formats. While using a format like JSON provides
similarly wide tooling support, encoding the same information as a proto
significantly reduces the final file size.
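
As a rough illustration of the size difference, the sketch below hand-encodes
two integer fields using the protobuf wire format and compares the result with
compact JSON. The field names and numbers (``uptime_ms`` = 1,
``reset_count`` = 2) are hypothetical and are not taken from the Snapshot
proto.

.. code-block:: python

   import json


   def encode_varint(value: int) -> bytes:
       """Encode a non-negative integer as a protobuf base-128 varint."""
       out = bytearray()
       while True:
           byte = value & 0x7F
           value >>= 7
           if value:
               out.append(byte | 0x80)  # More bytes follow.
           else:
               out.append(byte)
               return bytes(out)


   def varint_field(field_number: int, value: int) -> bytes:
       """Encode a varint field: tag (field number, wire type 0), then value."""
       return encode_varint((field_number << 3) | 0) + encode_varint(value)


   # Hypothetical snapshot contents, used only for this comparison.
   record = {"uptime_ms": 123456, "reset_count": 3}

   json_bytes = json.dumps(record, separators=(",", ":")).encode()
   proto_bytes = varint_field(1, 123456) + varint_field(2, 3)

   print(len(json_bytes), len(proto_bytes))

The proto encoding omits field names entirely and stores integers as varints,
which is where most of the savings come from in this example.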

While protobuf messages aren't truly streamable (i.e. they can't be written
without any intermediate buffers) because each nested message is prefixed with
its encoded length, a large message can still be written incrementally as long
as there's enough buffer space to encode the largest single sub-message in the
proto.
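
A minimal sketch of this idea, assuming sub-messages are produced one at a
time and the sink is any object with a ``write()`` method: each sub-message is
encoded on its own, framed with its tag and length, and flushed before the
next one is built. The helper names, field numbers, and payloads here are
hypothetical and are not part of pw_snapshot or pw_protobuf.

.. code-block:: python

   import io


   def encode_varint(value: int) -> bytes:
       """Encode a non-negative integer as a protobuf base-128 varint."""
       out = bytearray()
       while True:
           byte = value & 0x7F
           value >>= 7
           if value:
               out.append(byte | 0x80)  # More bytes follow.
           else:
               out.append(byte)
               return bytes(out)


   def length_delimited_field(field_number: int, payload: bytes) -> bytes:
       """Frame an encoded sub-message as tag (wire type 2), length, payload."""
       tag = encode_varint((field_number << 3) | 2)
       return tag + encode_varint(len(payload)) + payload


   def example_submessages():
       """Yield (field number, encoded sub-message) pairs one at a time."""
       yield 1, b"\x08\x2a"      # Hypothetical sub-message: field 1 = 42.
       yield 2, b"\x0a\x03abc"   # Hypothetical sub-message: field 1 = "abc".


   def write_incrementally(sink, submessages) -> None:
       """Write each sub-message to the sink as soon as it is encoded."""
       for field_number, payload in submessages:
           # Only one encoded sub-message is held in memory at a time.
           sink.write(length_delimited_field(field_number, payload))


   sink = io.BytesIO()  # Stand-in for a UART or flash writer.
   write_incrementally(sink, example_submessages())

The bytes that reach the sink are the same as if the whole top-level message
had been serialized at once, because a serialized proto message is simply the
concatenation of its encoded fields.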

Why overlay multiple protos?
============================
Proto 2 supported a feature called "extensions" that explicitly allowed
multiple proto definitions to be overlaid onto a single message. While proto 3
removed this feature, it doesn't disallow the old behavior of serializing two
"overlaid" protos to the same data stream. Proto 3 recommends using an "Any"
proto instead of extensions, as it is more explicit and eliminates the issue of
field number collisions between proto messages. Unfortunately, proto "Any"
messages introduce unacceptable overhead. For a single integer that would
encode to a few bytes using extensions, an Any submessage quickly expands to
tens of bytes.
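
The overhead can be estimated by hand-encoding both forms. The sketch below
assumes the integer is wrapped in ``google.protobuf.Int32Value`` with the
default ``type.googleapis.com/`` type URL prefix. The field numbers chosen for
the parent message (100 for the overlaid field, 1 for the Any field) are
hypothetical.

.. code-block:: python

   def encode_varint(value: int) -> bytes:
       """Encode a non-negative integer as a protobuf base-128 varint."""
       out = bytearray()
       while True:
           byte = value & 0x7F
           value >>= 7
           if value:
               out.append(byte | 0x80)  # More bytes follow.
           else:
               out.append(byte)
               return bytes(out)


   def tag(field_number: int, wire_type: int) -> bytes:
       return encode_varint((field_number << 3) | wire_type)


   def varint_field(field_number: int, value: int) -> bytes:
       return tag(field_number, 0) + encode_varint(value)


   def bytes_field(field_number: int, payload: bytes) -> bytes:
       return tag(field_number, 2) + encode_varint(len(payload)) + payload


   # Overlay style: the integer is a plain field of the snapshot itself.
   direct = varint_field(100, 42)

   # Any style: google.protobuf.Any carries a type URL (field 1) and the
   # serialized payload (field 2), here an Int32Value holding the integer.
   type_url = b"type.googleapis.com/google.protobuf.Int32Value"
   int32_value = varint_field(1, 42)
   any_message = bytes_field(1, type_url) + bytes_field(2, int32_value)
   any_field = bytes_field(1, any_message)

   print(len(direct), len(any_field))

With these assumptions the overlaid field takes 3 bytes while the Any-wrapped
value takes 54 bytes, most of which is the type URL string.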

pw_snapshot's proto format takes advantage of "extensions" from proto 2 without
explicitly relying on the feature. To reduce the risk of collisions and maximize
encoding efficiency, certain field number ranges are reserved to allow Pigweed
to grow while ensuring downstream customers have equivalent flexibility when
using the Snapshot proto format.
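
The overlay approach works because a proto decoder skips field numbers it does
not recognize. The sketch below illustrates this with a toy decoder that only
handles varint fields. The two field tables and their numbers (1 and 1000) are
hypothetical stand-ins for a Pigweed-owned range and a project-owned range, and
are not the ranges reserved by the Snapshot proto.

.. code-block:: python

   def encode_varint(value: int) -> bytes:
       """Encode a non-negative integer as a protobuf base-128 varint."""
       out = bytearray()
       while True:
           byte = value & 0x7F
           value >>= 7
           if value:
               out.append(byte | 0x80)  # More bytes follow.
           else:
               out.append(byte)
               return bytes(out)


   def varint_field(field_number: int, value: int) -> bytes:
       return encode_varint((field_number << 3) | 0) + encode_varint(value)


   def decode_varint(data: bytes, pos: int):
       """Decode one varint starting at pos; return (value, next position)."""
       result = 0
       shift = 0
       while True:
           byte = data[pos]
           pos += 1
           result |= (byte & 0x7F) << shift
           if not byte & 0x80:
               return result, pos
           shift += 7


   def decode_known(data: bytes, known: dict) -> dict:
       """Decode varint fields, keeping only the field numbers in `known`."""
       fields = {}
       pos = 0
       while pos < len(data):
           key, pos = decode_varint(data, pos)
           field_number, wire_type = key >> 3, key & 0x7
           if wire_type != 0:
               raise ValueError("this sketch only handles varint fields")
           value, pos = decode_varint(data, pos)
           if field_number in known:
               fields[known[field_number]] = value
           # Unknown field numbers are skipped, as a proto decoder would do.
       return fields


   # Hypothetical field tables for two protos overlaid on one stream.
   CORE_FIELDS = {1: "core_value"}
   PROJECT_FIELDS = {1000: "battery_mv"}

   stream = varint_field(1, 7) + varint_field(1000, 3700)

   core_view = decode_known(stream, CORE_FIELDS)
   project_view = decode_known(stream, PROJECT_FIELDS)

Each decoder recovers only its own fields from the shared stream, which is why
keeping the two sets of field numbers in disjoint ranges is enough to avoid
collisions.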

Why no file header?
===================
Right now it's assumed that anything that is storing or transferring a
serialized snapshot implicitly tracks its size (and a checksum, if desired).
While a container format might be introduced independently, pw_snapshot focuses
on treating an encoded snapshot as raw serialized proto data.
