Overview¶
The Anscombe Transform¶
The Anscombe Transform is a variance-stabilizing transformation specifically designed for data with Poisson noise. In photon-limited imaging, the noise variance grows linearly with the signal mean (characteristic of Poisson statistics), which makes compression difficult because different intensity levels have different noise characteristics.
The Problem¶
In photon-limited data:

- Low intensity regions have low noise variance
- High intensity regions have high noise variance
- This heteroscedastic noise makes efficient compression challenging
The Solution¶
The Anscombe Transform applies a square-root-like transformation that:

1. Equalizes noise variance across all intensity levels
2. Reduces the number of unique grayscale values needed
3. Improves compressibility without losing signal accuracy
Mathematically, the transform is:

$$
A(x) = 2\sqrt{x + \tfrac{3}{8}}
$$
For our codec, we adapt this to account for camera parameters: the raw value x is first converted to a photon count using `conversion_gain` and `zero_level` (taking `conversion_gain` as raw units per photon), so the transform becomes:

$$
f(x) = 2\sqrt{\frac{x - \text{zero\_level}}{\text{conversion\_gain}} + \tfrac{3}{8}}
$$
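The variance-stabilizing effect is easy to verify numerically. Below is a small NumPy demonstration (illustrative only, not part of the codec):

```python
# Demonstration: the Anscombe Transform stabilizes Poisson noise variance.
import numpy as np

rng = np.random.default_rng(0)

for mean in [2, 10, 50, 200]:
    x = rng.poisson(mean, size=100_000)   # Poisson noise: variance ~= mean
    y = 2.0 * np.sqrt(x + 3.0 / 8.0)      # Anscombe Transform
    print(f"mean={mean:4d}  raw var={x.var():7.1f}  transformed var={y.var():.3f}")
```

The raw variance grows linearly with the mean, while the transformed variance stays close to 1 everywhere except the very dimmest signals.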
Codec Architecture¶
The codec is implemented in two versions to support both Zarr V2 and V3:
Zarr V2: AnscombeTransformV2¶
- Implements the `numcodecs.Codec` interface
- Used as a compressor in Zarr V2 arrays
- Registered with ID `"anscombe-v1"`
Zarr V3: AnscombeTransformV3¶
- Implements the `ArrayArrayCodec` interface
- Used as a filter before compression in Zarr V3 arrays
- Registered with the same ID `"anscombe-v1"`
Both share the same core `encode()` and `decode()` functions, ensuring consistent behavior.
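A usage sketch is shown below. The import path and exact constructor signature are assumptions based on the class and parameter names in this document; consult the package reference for the real API.

```python
import zarr
from anscombe_codec import AnscombeTransformV2  # hypothetical import path

# Parameters come from camera calibration (see "How It Works" below)
codec = AnscombeTransformV2(
    conversion_gain=2.3,    # raw units per photon (assumed convention)
    zero_level=100.0,       # detector offset in raw units
    encoded_dtype="uint8",  # dtype of the quantized representation
)

# Zarr V2: the codec acts as the array's compressor
# (in Zarr V3 it would instead sit in the filter/codec chain)
z = zarr.open(
    "movie.zarr",
    mode="w",
    shape=(1000, 512, 512),
    chunks=(10, 512, 512),
    dtype="uint16",
    compressor=codec,
)
```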
How It Works¶
Encoding Pipeline¶
- Normalize: Convert raw data to photon counts using `conversion_gain` and `zero_level`
- Transform: Apply the Anscombe Transform to stabilize variance
- Quantize: Discretize the transformed values to `encoded_dtype` (typically `uint8`)
- Compress: Apply additional compression (e.g., Blosc, Zstd), as sketched below
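In NumPy terms, the first three steps look roughly like this. This is a simplified sketch: the real codec uses a precomputed lookup table, and the `beta` quantization step is an assumption based on the Accuracy section below (a step of `beta` stabilized-noise-sigmas).

```python
import numpy as np

def encode_sketch(raw, conversion_gain, zero_level, beta=0.5):
    # 1. Normalize: raw units -> photon counts
    photons = (raw.astype(np.float64) - zero_level) / conversion_gain
    # 2. Transform: Anscombe Transform, stabilized noise sigma ~= 1
    stabilized = 2.0 * np.sqrt(np.maximum(photons, 0.0) + 3.0 / 8.0)
    # 3. Quantize: step of `beta` stabilized-sigmas, clipped into uint8
    return np.clip(np.round(stabilized / beta), 0, 255).astype(np.uint8)
    # 4. Compress: the uint8 output is then handed to Blosc/Zstd
```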
Decoding Pipeline¶
- Decompress: Uncompress the data
- Lookup: Apply the inverse transform via lookup table
- Denormalize: Convert back to original units using `conversion_gain` and `zero_level`
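The corresponding inverse, again as a simplified sketch. The algebraic inverse is used here for clarity; the codec itself goes through its inverse lookup table, which also lets it correct the small bias the plain algebraic inverse carries at low counts.

```python
import numpy as np

def decode_sketch(encoded, conversion_gain, zero_level, beta=0.5):
    # 1. Decompress: happens before this step (Blosc/Zstd)
    # 2. Lookup: here replaced by the algebraic inverse of the transform
    stabilized = encoded.astype(np.float64) * beta
    photons = (stabilized / 2.0) ** 2 - 3.0 / 8.0
    # 3. Denormalize: photon counts -> original raw units
    return photons * conversion_gain + zero_level
```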
Lookup Tables¶
The codec uses pre-computed lookup tables for efficiency:

- Forward lookup: Maps input values to transformed values
- Inverse lookup: Maps transformed values back to original values
These tables are computed once during codec initialization and reused for all encode/decode operations.
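One way such tables can be built, sketched here for `uint16` input and `uint8` codes; the codec's actual table construction may differ (for example, in how it centers each inverse bin).

```python
import numpy as np

def build_tables(conversion_gain, zero_level, beta=0.5):
    # Forward table: one entry per possible uint16 input value
    raw = np.arange(2**16, dtype=np.float64)
    photons = np.maximum((raw - zero_level) / conversion_gain, 0.0)
    forward = np.clip(
        np.round(2.0 * np.sqrt(photons + 3.0 / 8.0) / beta), 0, 255
    ).astype(np.uint8)

    # Inverse table: map each uint8 code back to the mean raw value
    # of the inputs that produced it
    inverse = np.zeros(256)
    for code in range(256):
        members = raw[forward == code]
        if members.size:
            inverse[code] = members.mean()
    return forward, inverse

# Encoding a uint16 chunk is then a single indexing op: forward[chunk]
# Decoding likewise: inverse[codes]
```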
Performance Characteristics¶
Compression Ratios¶
Typical compression ratios (Anscombe + Blosc/Zstd):

- 3-8x for typical multiphoton microscopy data
- 6-10x for astronomy data
- 3-6x for radiography data
The exact ratio depends on:

- Signal-to-noise ratio of the data
- Spatial correlation in the images
- Choice of secondary compressor
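A quick way to measure the ratio on your own data, using numcodecs' Blosc (with Zstd inside) as the secondary compressor on Anscombe-quantized values:

```python
import numpy as np
from numcodecs import Blosc

# Synthetic photon-limited data: Poisson noise around 30 photons per pixel
data = np.random.default_rng(0).poisson(30, size=(100, 256, 256)).astype("uint16")

# Anscombe Transform + quantization with beta = 0.5 (see encoding sketch above)
encoded = np.clip(
    np.round(2.0 * np.sqrt(data + 3.0 / 8.0) / 0.5), 0, 255
).astype("uint8")

compressor = Blosc(cname="zstd", clevel=5, shuffle=Blosc.BITSHUFFLE)
ratio = data.nbytes / len(compressor.encode(encoded))
print(f"compression ratio: {ratio:.1f}x")
```

Spatially uncorrelated synthetic noise like this is close to a worst case; real images with spatial structure compress better.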
Speed¶
The codec is designed for speed:

- Encoding: ~500-1000 MB/s (single-threaded)
- Decoding: ~800-1500 MB/s (single-threaded)
- Scales well with chunk-based parallel processing
Accuracy¶
The codec is designed to be nearly lossless for photon-limited data:
- Max absolute error: ~0.25 noise-sigma per pixel (for `beta=0.5`)
- Error scales with the quantization step (the `beta` parameter)
- For the default parameters (`beta=0.5`), the noise variance increases by only ~1% relative to the original noise variance, and no bias is introduced.
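The per-pixel error bound can be checked with the simplified sketches above (the exact variance and bias figures depend on the codec's actual table construction):

```python
import numpy as np

beta = 0.5
x = np.random.default_rng(1).poisson(100, size=1_000_000).astype(np.float64)

y = 2.0 * np.sqrt(x + 3.0 / 8.0)                             # sigma ~= 1
x_hat = (np.round(y / beta) * beta / 2.0) ** 2 - 3.0 / 8.0   # quantize + invert

err = np.abs(x_hat - x) / np.sqrt(np.maximum(x, 1.0))  # per-pixel noise sigma
print(f"max error = {err.max():.3f} noise-sigma")      # ~0.25 for beta = 0.5
```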
When to Use This Codec¶
Good Use Cases ✅¶
- Multiphoton microscopy movies
- Astronomy images with photon counting detectors
- Radiography/X-ray imaging
- Any data with Poisson noise, where the variance approximately equals the mean signal
- Data where you know or can estimate `conversion_gain` and `zero_level`
Not Recommended ❌¶
- Data with non-Poisson or non-stationary noise (e.g., pre-processed images)
- Data where detector parameters are unknown and can't be estimated
- Data that has been transformed with a non-linear function (e.g., gamma correction)