pixi

Version: v0.0.16
Published: Jan 3, 2025 License: BSD-3-Clause Imports: 12 Imported by: 0

README

pixi

This repository contains the specification of the Pixi file format. To start, think of Pixi as an opinionated cloud optimized GeoTIFF, but with explicit support for more than two dimensions and fewer built-in specifics for image-particular interpretation concerns.

Design Considerations

  1. More than images: Pixi is a format for tiled multidimensional raster data, which can be in dimensions higher than 2 or even 3. The interpretation of each dimension, even for dimensions 2 and 3, is defined on a per-file basis and cannot necessarily be assumed. Viewer applications may assume conventions for such files, but this is not specified by this document.

  2. Queryable: it should be possible to transmit only a portion of a file (a 'tile') without needing to transmit the rest of the data. This is especially important for cloud optimized scenarios or files that are accessed by many different machines.

  3. Robustness to transmission errors: it should be possible to detect datastream transmission errors reliably.

  4. Portability: encoding, decoding, and transmission should be software and hardware platform independent.

  5. Performance: any filtering and compression should be aimed at efficient decoding. Fast encoding is a less important goal than fast decoding. Decoding speed may be achieved at the expense of encoding speed.

  6. Compression: files should be compressed effectively, consistent with the other design goals.

  7. Interchangeability: any standard-conforming Pixi decoder shall be capable of reading all conforming Pixi datastreams.

  8. Freedom from legal restrictions: no algorithms should be used that are not freely available.

Terminology

Concepts

Layers
Fields
Separation
Dimensions and Tiling
Robustness and Errors

Layout

This section details the byte-level layout of a Pixi file.

Pixi Header

Every Pixi file should start with four bytes: "PIXI". This is followed by the version number, written as a two-byte UTF-8 decimal string. If the printed number is shorter than two characters, it should be padded with leading zeros.

Following this magic sequence and version number is the offset size indicator. This is a single byte, indicating the number of bytes that will make up offset values later in this file, used to point to different byte indices within the file. Currently, the only supported values of the offset size indicator are 4 and 8 (for 32-bit and 64-bit requirements respectively).

Then the endianness indicator follows, another single byte. This indicates the endianness of all multibyte values that follow in the data stream. The two supported values are 0x00 for little endian and 0xff for big endian.

Following this is the first layer offset, which will be an integer composed of the number of bytes specified by the offset size indicator. This will be the byte offset in the file, with index 0 equal to the start of the file, at which the first layer's first byte can be found.

Following this offset is the tagging offset. This will be the offset in the file at which the tagging section can start being read.

Layer Header
Tagging Section
Field Header

Compression

Conformance

Viewers

Editors

Documentation

Index

Constants

const (
	FileType string = "pixi" // Every file starts with these four bytes.
	Version  int    = 1      // Every file has a version number as the second set of four bytes.
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Compression

type Compression uint32

Represents the compression method used to shrink the data persisted to a layer in a Pixi file.

const (
	CompressionNone   Compression = 0 // No compression
	CompressionFlate  Compression = 1 // Standard FLATE compression
	CompressionLzwLsb Compression = 2 // Least-significant-bit Lempel-Ziv-Welch compression from Go standard lib
	CompressionLzwMsb Compression = 3 // Most-significant-bit Lempel-Ziv-Welch compression from Go standard lib
)

func (Compression) ReadChunk added in v0.0.12

func (c Compression) ReadChunk(r io.Reader, chunk []byte) (int, error)

Reads a compressed chunk of data into the given slice, which must be the size of the desired uncompressed data. Returns the number of bytes read, or an error if the read failed.

func (Compression) String added in v0.0.12

func (c Compression) String() string

func (Compression) WriteChunk added in v0.0.12

func (c Compression) WriteChunk(w io.Writer, chunk []byte) (int, error)

Compresses the given chunk of data according to the selected compression scheme, and writes the compressed data to the writer. Returns the number of compressed bytes written, or an error if the write failed.

type Dimension

type Dimension struct {
	Name     string // Friendly name to refer to the dimension in the layer.
	Size     int    // The total number of elements in the dimension.
	TileSize int    // The size of the tiles in the dimension. Does not need to be a factor of Size.
}

Represents an axis along which tiled, gridded data is stored in a Pixi file. Data sets can have one or more dimensions, but never zero. If a dimension is not tiled, then the TileSize should be the same as the total Size.

func (Dimension) HeaderSize added in v0.0.12

func (d Dimension) HeaderSize(h PixiHeader) int

Get the size in bytes of this dimension description as it is laid out and written to disk.

func (*Dimension) Read added in v0.0.12

func (d *Dimension) Read(r io.Reader, h PixiHeader) error

Reads a description of the dimension from the given binary stream, according to the specification in the Pixi header h.

func (Dimension) Tiles

func (d Dimension) Tiles() int

Returns the number of tiles in this dimension. The number of tiles is calculated by dividing the size of the dimension by the tile size, rounding up to the next whole number if any remaining elements do not fill a complete tile.
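The rounding rule amounts to an integer ceiling division. A minimal sketch, where tilesAlong is a hypothetical helper name and tileSize is assumed to be positive:

```go
package main

import "fmt"

// tilesAlong returns how many tiles are needed to cover size elements when
// each tile holds tileSize elements. A partially filled final tile still
// counts as a whole tile.
func tilesAlong(size, tileSize int) int {
	return (size + tileSize - 1) / tileSize
}

func main() {
	fmt.Println(tilesAlong(1000, 256)) // three full tiles plus one partial tile
	fmt.Println(tilesAlong(512, 256))  // divides evenly, no partial tile
}
```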

func (*Dimension) Write added in v0.0.12

func (d *Dimension) Write(w io.Writer, h PixiHeader) error

Writes the binary description of the dimension to the given stream, according to the specification in the Pixi header h.

type DimensionSet added in v0.0.12

type DimensionSet []Dimension

func (DimensionSet) SampleCoordinates added in v0.0.12

func (set DimensionSet) SampleCoordinates() iter.Seq[SampleCoordinate]

Iterate over the sample indices of the dimensions in the order the dimensions are laid out. That is, the index increments for the size of the first dimension, then the second (nesting the first), then the third (nesting the second (nesting the first)), and so on.
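The nesting order described here, with the first dimension varying fastest, can be sketched without the package. The sampleCoordinates function below is a hypothetical illustration that takes plain dimension sizes and returns a slice instead of an iterator.

```go
package main

import "fmt"

// sampleCoordinates lists every coordinate for the given dimension sizes,
// with the first dimension varying fastest and each later dimension nesting
// the ones before it.
func sampleCoordinates(sizes []int) [][]int {
	total := 1
	for _, s := range sizes {
		total *= s
	}
	coords := make([][]int, 0, total)
	for index := 0; index < total; index++ {
		coord := make([]int, len(sizes))
		rest := index
		for d, s := range sizes {
			coord[d] = rest % s // position along dimension d
			rest /= s           // carry the remainder to the next dimension
		}
		coords = append(coords, coord)
	}
	return coords
}

func main() {
	for _, c := range sampleCoordinates([]int{3, 2}) {
		fmt.Println(c)
	}
}
```

For sizes 3 and 2, the first coordinate runs through 0, 1, 2 before the second coordinate advances from 0 to 1.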

func (DimensionSet) Samples added in v0.0.12

func (d DimensionSet) Samples() int

The total number of samples in the data set. If the size of any dimension is not a multiple of its tile size, the 'padding' samples are not included in the count.

func (DimensionSet) TileCoordinates added in v0.0.12

func (set DimensionSet) TileCoordinates() iter.Seq[TileCoordinate]

func (DimensionSet) TileSamples added in v0.0.12

func (d DimensionSet) TileSamples() int

The number of samples per tile in the data set. Each tile has the same number of samples, regardless of whether the data is stored separated or contiguous.

func (DimensionSet) Tiles added in v0.0.12

func (d DimensionSet) Tiles() int

Computes the number of non-separated tiles in the data set. This number is the same regardless of how the tiles are laid out on disk; use the DiskTiles() method to determine the number of tiles actually stored on disk. Note that DiskTiles() >= Tiles() by definition.
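The three counts on a dimension set (samples, samples per tile, and tiles) are all products over the dimensions. The sketch below is a standalone illustration; the dim type and the samples, tileSamples, and tiles functions are hypothetical names that mirror the Size and TileSize fields of Dimension.

```go
package main

import "fmt"

// dim mirrors the Size and TileSize fields of Dimension for illustration.
type dim struct {
	Size     int
	TileSize int
}

// samples is the product of the dimension sizes, excluding padding.
func samples(set []dim) int {
	n := 1
	for _, d := range set {
		n *= d.Size
	}
	return n
}

// tileSamples is the product of the tile sizes.
func tileSamples(set []dim) int {
	n := 1
	for _, d := range set {
		n *= d.TileSize
	}
	return n
}

// tiles is the product of the per-dimension tile counts, each rounded up.
func tiles(set []dim) int {
	n := 1
	for _, d := range set {
		n *= (d.Size + d.TileSize - 1) / d.TileSize
	}
	return n
}

func main() {
	set := []dim{{Size: 1000, TileSize: 256}, {Size: 500, TileSize: 250}}
	fmt.Println(samples(set), tileSamples(set), tiles(set))
}
```

In this example tiles(set) * tileSamples(set) exceeds samples(set); the difference is the padding in the partially filled tiles along the first dimension.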

type Field

type Field struct {
	Name string    // A friendly name for this field, to help guide interpretation of the data.
	Type FieldType // The type of data stored in each element of this field.
}

Describes a set of values in a data set with a common shape. Similar to a field of a record in a database, but with a more restricted set of available types per field.

func (Field) BytesToValue added in v0.0.12

func (f Field) BytesToValue(raw []byte, order binary.ByteOrder) any

Decodes a value of this field's type from the provided raw byte slice, using the given byte order. How the bytes are interpreted depends on the field type, so the dynamic type of the returned value varies with it.

func (Field) HeaderSize added in v0.0.12

func (d Field) HeaderSize(h PixiHeader) int

Get the size in bytes of this field description as it is laid out and written to disk.

func (*Field) Read

func (d *Field) Read(r io.Reader, h PixiHeader) error

Reads a description of the field from the given binary stream, according to the specification in the Pixi header h.

func (Field) Size

func (f Field) Size() int

Returns the size of a field in bytes.

func (*Field) Write

func (d *Field) Write(w io.Writer, h PixiHeader) error

Writes the binary description of the field to the given stream, according to the specification in the Pixi header h.

func (Field) WriteValue added in v0.0.12

func (f Field) WriteValue(raw []byte, val any)

Writes a value into the provided byte slice, encoded according to this field's FieldType. Panics if the FieldType is unknown or unsupported.

type FieldType

type FieldType uint32

Describes the size and interpretation of a field.

const (
	FieldUnknown FieldType = 0  // Generally indicates an error.
	FieldInt8    FieldType = 1  // An 8-bit signed integer.
	FieldUint8   FieldType = 2  // An 8-bit unsigned integer.
	FieldInt16   FieldType = 3  // A 16-bit signed integer.
	FieldUint16  FieldType = 4  // A 16-bit unsigned integer.
	FieldInt32   FieldType = 5  // A 32-bit signed integer.
	FieldUint32  FieldType = 6  // A 32-bit unsigned integer.
	FieldInt64   FieldType = 7  // A 64-bit signed integer.
	FieldUint64  FieldType = 8  // A 64-bit unsigned integer.
	FieldFloat32 FieldType = 9  // A 32-bit floating point number.
	FieldFloat64 FieldType = 10 // A 64-bit floating point number.
)

func (FieldType) BytesToValue added in v0.0.12

func (f FieldType) BytesToValue(raw []byte, o binary.ByteOrder) any

Decodes a value of this FieldType from the provided raw byte slice, using the given byte order. How the bytes are interpreted depends on the field type, so the dynamic type of the returned value varies with it.

func (FieldType) ReadValue added in v0.0.12

func (f FieldType) ReadValue(r io.Reader, o binary.ByteOrder) (any, error)

func (FieldType) Size

func (f FieldType) Size() int

This function returns the size of each element in a field in bytes.

func (FieldType) String added in v0.0.12

func (f FieldType) String() string

func (FieldType) WriteValue added in v0.0.12

func (f FieldType) WriteValue(raw []byte, val any)

Writes a value into the provided byte slice, encoded according to this FieldType. Panics if the FieldType is unknown or unsupported.

type FormatError

type FormatError string

func (FormatError) Error

func (e FormatError) Error() string

type IntegrityError added in v0.0.12

type IntegrityError struct {
	TileIndex int
	LayerName string
}

func (IntegrityError) Error added in v0.0.12

func (e IntegrityError) Error() string

type Layer added in v0.0.12

type Layer struct {
	Name string // Friendly name of the layer
	// Indicates whether the fields of the dataset are stored separated or contiguously. If true,
	// values for each field are stored next to each other. If false, the default, values for each
	// index are stored next to each other, with values for different fields stored next to each
	// other at the same index.
	Separated   bool
	Compression Compression // The type of compression used on this dataset (e.g., Flate, LZW).
	// A slice of Dimension structs representing the dimensions and tiling of this dataset.
	// No dimensions equals an empty dataset. Dimensions are stored and iterated such that the
	// samples for the first dimension are the closest together in memory, with progressively
	// higher dimensions samples becoming further apart.
	Dimensions     DimensionSet
	Fields         []Field // An array of Field structs representing the fields in this dataset.
	TileBytes      []int64 // An array of byte counts representing (compressed) size of each tile in bytes for this dataset.
	TileOffsets    []int64 // An array of byte offsets representing the position in the file of each tile in the dataset.
	NextLayerStart int64   // The byte-index offset of the next layer in the file, from the start of the file. 0 if this is the last layer in the file.
}

Pixi files are composed of one or more layers. Generally, layers are used to represent the same data set at different 'zoom levels'. For example, a large digital elevation model data set might have a layer that shows a zoomed-out view of the terrain at a much smaller footprint, useful for thumbnails and previews. Layers are also useful if data sets of different resolutions should be stored together in the same file.

func NewLayer added in v0.0.12

func NewLayer(name string, separated bool, compression Compression, dimensions []Dimension, fields []Field) *Layer

Helper constructor to ensure that certain invariants in a layer are maintained when it is created.

func (*Layer) DataSize added in v0.0.12

func (d *Layer) DataSize() int64

The on-disk size in bytes of the (potentially compressed) data set. Does not include the dataset header size.

func (*Layer) DiskTileSize added in v0.0.12

func (d *Layer) DiskTileSize(tileIndex int) int

The size of the requested disk tile in bytes. For contiguous files, the size of each tile is always the same. However, for separated data sets, each field is tiled (so the number of on-disk tiles is actually fieldCount * Tiles()). Hence, the tile size changes depending on which field is being accessed.

func (*Layer) DiskTiles added in v0.0.12

func (d *Layer) DiskTiles() int

The number of discrete data tiles actually stored in the backing file. This number differs based on whether fields are stored 'contiguous' or 'separated'; in the former case, DiskTiles() == Tiles(), in the latter case, DiskTiles() == Tiles() * number of fields.
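The relationship between logical tiles and on-disk tiles can be sketched as follows. This is a standalone illustration: layerShape and its methods are hypothetical names, sizes are uncompressed, and the mapping from a disk tile index to a field in separated mode (all tiles of the first field, then all tiles of the second, and so on) is an assumption made for the example, not something this documentation states.

```go
package main

import "fmt"

// layerShape holds just the values needed for the tile arithmetic.
type layerShape struct {
	Separated   bool
	Tiles       int   // number of non-separated tiles
	TileSamples int   // samples per tile
	FieldSizes  []int // size in bytes of each field
}

// diskTiles returns the number of tiles stored on disk.
func (l layerShape) diskTiles() int {
	if l.Separated {
		return l.Tiles * len(l.FieldSizes)
	}
	return l.Tiles
}

// diskTileSize returns the uncompressed size in bytes of one on-disk tile.
// Assumption: separated tiles are ordered field by field.
func (l layerShape) diskTileSize(tileIndex int) int {
	if l.Separated {
		field := tileIndex / l.Tiles
		return l.TileSamples * l.FieldSizes[field]
	}
	sampleSize := 0 // a sample holds one element of every field
	for _, s := range l.FieldSizes {
		sampleSize += s
	}
	return l.TileSamples * sampleSize
}

func main() {
	l := layerShape{Tiles: 8, TileSamples: 64000, FieldSizes: []int{4, 1}}
	fmt.Println(l.diskTiles(), l.diskTileSize(0))
	l.Separated = true
	fmt.Println(l.diskTiles(), l.diskTileSize(0), l.diskTileSize(8))
}
```

Either way the total uncompressed data volume is the same; separation only changes how it is split into tiles.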

func (*Layer) HeaderSize added in v0.0.12

func (d *Layer) HeaderSize(h PixiHeader) int

Get the total number of bytes that will be occupied in the file by this layer's header.

func (*Layer) OverwriteHeader added in v0.0.12

func (l *Layer) OverwriteHeader(w io.WriteSeeker, h PixiHeader, headerStartOffset int64) error

For a layer header which has already been written to the given position, writes the layer header again to the same location before returning the stream cursor to the position it was at previously. Generally this is used to update tile byte counts and tile offsets after they've been written to a stream.

func (*Layer) OverwriteTile added in v0.0.12

func (l *Layer) OverwriteTile(w io.WriteSeeker, h PixiHeader, tileIndex int, data []byte) error

func (*Layer) ReadLayer added in v0.0.12

func (d *Layer) ReadLayer(r io.Reader, h PixiHeader) error

Reads a description of the layer from the given binary stream, according to the specification in the Pixi header h.

func (*Layer) ReadTile added in v0.0.12

func (l *Layer) ReadTile(r io.ReadSeeker, h PixiHeader, tileIndex int, data []byte) error

Read a raw tile (not yet decoded into sample fields) at the given tile index. The tile must have been previously written (either in this session or a previous one) for this operation to succeed. The data is verified for integrity using a four-byte checksum placed directly after the saved tile data, and an error is returned (along with the data read into the chunk) if the checksum check fails.

func (*Layer) SampleSize added in v0.0.12

func (d *Layer) SampleSize() int

The size in bytes of each sample in the data set. Each field has a fixed size, and a sample is made up of one element of each field, so the sample size is the sum of all field sizes.

func (*Layer) WriteHeader added in v0.0.12

func (d *Layer) WriteHeader(w io.Writer, h PixiHeader) error

Writes the binary description of the layer to the given stream, according to the specification in the Pixi header h.

func (*Layer) WriteTile added in v0.0.12

func (l *Layer) WriteTile(w io.WriteSeeker, h PixiHeader, tileIndex int, data []byte) error

Write the encoded tile data to the current stream position, updating the offset and byte count for this tile in the layer header (but not writing those offsets to the stream just yet). The data is written with a 4-byte checksum directly after it, which is used to verify data integrity when reading the tile later.

type Pixi added in v0.0.12

type Pixi struct {
	Header PixiHeader    // The metadata about the file version and how to read information from the file.
	Layers []*Layer      // The metadata information about each layer in the file.
	Tags   []*TagSection // The string tags of the file, broken up into sections for easy appending.
}

Represents a single Pixi file composed of one or more layers. Functions as a handle to access the description of each layer as well as the data stored in each layer.

func ReadPixi added in v0.0.12

func ReadPixi(r io.ReadSeeker) (Pixi, error)

Convenience function to read all the metadata information from a Pixi file into a single containing struct.

func (*Pixi) DiskDataBytes added in v0.0.12

func (d *Pixi) DiskDataBytes() int64

The total size of the data portions of the file in bytes. Does not count header information as part of the size.

func (*Pixi) LayerOffset added in v0.0.12

func (d *Pixi) LayerOffset(l *Layer) int64

Gets the byte-index offset from the start of the file at which the layer header begins.

type PixiHeader added in v0.0.12

type PixiHeader struct {
	Version          int
	OffsetSize       int
	ByteOrder        binary.ByteOrder
	FirstLayerOffset int64
	FirstTagsOffset  int64
}

Contains information used to read or write the rest of a Pixi data file. This information is always found at the start of a stream of Pixi data.

func (*PixiHeader) OverwriteOffsets added in v0.0.12

func (h *PixiHeader) OverwriteOffsets(w io.WriteSeeker, firstLayer int64, firstTags int64) error

func (*PixiHeader) Read added in v0.0.12

func (s *PixiHeader) Read(r io.Reader, val any) error

Reads a fixed-size value, or a slice of such values, using the byte order given in the header.

func (*PixiHeader) ReadFriendly added in v0.0.12

func (s *PixiHeader) ReadFriendly(r io.Reader) (string, error)

func (*PixiHeader) ReadHeader added in v0.0.12

func (h *PixiHeader) ReadHeader(r io.Reader) error

Read Pixi header information into this struct from the current position in the reader stream. Will return an error if the reading fails, or if there are format errors in the Pixi header.

func (*PixiHeader) ReadOffset added in v0.0.12

func (s *PixiHeader) ReadOffset(r io.Reader) (int64, error)

Reads a file offset from the current position in the reader, based on the offset size read earlier in the file. Panics if the file offset size has not yet been set, and returns an error if reading fails.

func (*PixiHeader) ReadOffsets added in v0.0.12

func (s *PixiHeader) ReadOffsets(r io.Reader, offsets []int64) error

Reads a slice of offsets from the current position in the reader, based on the offset size read earlier in the file. Panics if the file offset size has not yet been set, and returns an error if reading fails.

func (*PixiHeader) Write added in v0.0.12

func (s *PixiHeader) Write(w io.Writer, val any) error

Writes a fixed size value, or a slice of such values, using the byte order given in the header.

func (*PixiHeader) WriteFriendly added in v0.0.12

func (s *PixiHeader) WriteFriendly(w io.Writer, friendly string) error

func (*PixiHeader) WriteHeader added in v0.0.12

func (h *PixiHeader) WriteHeader(w io.Writer) error

Write the information in this header to the current position in the writer stream.

func (*PixiHeader) WriteOffset added in v0.0.12

func (s *PixiHeader) WriteOffset(w io.Writer, offset int64) error

Writes a file offset to the current position in the writer stream, based on the offset size specified in the header. Panics if the file offset size has not yet been set, and returns an error if writing fails.

func (*PixiHeader) WriteOffsets added in v0.0.12

func (s *PixiHeader) WriteOffsets(w io.Writer, offsets []int64) error

Writes a slice of offsets to the current position in the writer stream, based on the offset size specified in the header. Panics if the file offset size has not yet been set, and returns an error if writing fails.

type SampleCoordinate added in v0.0.12

type SampleCoordinate []int

func (SampleCoordinate) ToSampleIndex added in v0.0.12

func (coord SampleCoordinate) ToSampleIndex(set DimensionSet) SampleIndex

func (SampleCoordinate) ToTileCoordinate added in v0.0.12

func (coord SampleCoordinate) ToTileCoordinate(set DimensionSet) TileCoordinate

func (SampleCoordinate) ToTileSelector added in v0.0.12

func (coord SampleCoordinate) ToTileSelector(set DimensionSet) TileSelector

type SampleIndex added in v0.0.12

type SampleIndex int

func (SampleIndex) ToSampleCoordinate added in v0.0.12

func (index SampleIndex) ToSampleCoordinate(set DimensionSet) SampleCoordinate

type TagSection added in v0.0.12

type TagSection struct {
	Tags          map[string]string // The tags for this section.
	NextTagsStart int64             // A byte-index offset from the start of the file pointing to the next tag section. 0 if this is the last tag section.
}

Pixi files can contain zero or more tag sections, used for extraneous non-data related metadata to help describe the file or indicate context of the file's ownership and lifespan. While the tags are conceptually just a flat list of string pairs, the layout in the file is done in sections with offsets pointing to further sections, allowing easier 'appending' of additional tags regardless of where in the file previous tags are stored.
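Following the chain of tag sections can be sketched as a simple linked-list walk. In this standalone illustration the tagSection type mirrors the struct above, a map from file offset to section stands in for seeking and reading from a file, and collectTags is a hypothetical helper; letting later sections override earlier keys is an assumption of the example.

```go
package main

import "fmt"

// tagSection mirrors the TagSection struct for illustration.
type tagSection struct {
	Tags          map[string]string
	NextTagsStart int64
}

// collectTags walks the chain of sections starting at offset first and
// merges all tags into one map. An offset of 0 ends the chain.
func collectTags(sections map[int64]tagSection, first int64) map[string]string {
	all := map[string]string{}
	seen := map[int64]bool{} // guards against a cyclic chain in a corrupt file
	for offset := first; offset != 0 && !seen[offset]; {
		seen[offset] = true
		section, ok := sections[offset]
		if !ok {
			break
		}
		for k, v := range section.Tags {
			all[k] = v // assumption: later sections override earlier ones
		}
		offset = section.NextTagsStart
	}
	return all
}

func main() {
	sections := map[int64]tagSection{
		100: {Tags: map[string]string{"owner": "alice"}, NextTagsStart: 900},
		900: {Tags: map[string]string{"source": "survey"}, NextTagsStart: 0},
	}
	fmt.Println(len(collectTags(sections, 100)))
}
```

Appending tags under this layout means writing a new section at the end of the file and updating one NextTagsStart offset, leaving earlier sections in place.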

func (*TagSection) Read added in v0.0.12

func (t *TagSection) Read(r io.Reader, h PixiHeader) error

Reads a tag section from the given binary stream, according to the specification in the Pixi header h.

func (*TagSection) Write added in v0.0.12

func (t *TagSection) Write(w io.Writer, h PixiHeader) error

Writes the tag section in binary to the given stream, according to the specification in the Pixi header h.

type TileCoordinate added in v0.0.12

type TileCoordinate struct {
	Tile   []int
	InTile []int
}

func (TileCoordinate) ToSampleCoordinate added in v0.0.12

func (coord TileCoordinate) ToSampleCoordinate(set DimensionSet) SampleCoordinate

func (TileCoordinate) ToTileSelector added in v0.0.12

func (coord TileCoordinate) ToTileSelector(set DimensionSet) TileSelector

type TileIndex added in v0.0.12

type TileIndex int

type TileSelector added in v0.0.12

type TileSelector struct {
	Tile   int
	InTile int
}

func (TileSelector) ToTileCoordinate added in v0.0.12

func (s TileSelector) ToTileCoordinate(set DimensionSet) TileCoordinate

func (TileSelector) ToTileIndex added in v0.0.12

func (s TileSelector) ToTileIndex(set DimensionSet) TileIndex

type UnsupportedError

type UnsupportedError string

func (UnsupportedError) Error

func (e UnsupportedError) Error() string

Directories

Path Synopsis
cmd
internal
