
Virtual chunks that just work — but securely
Delivering enterprise-grade access control for virtual chunks: surprisingly subtle!

Software Engineer

Staff Engineer
21 posts

Access billions of chunks of satellite imagery as a single Zarr store, without copying any data!

Software Engineer
We released Icechunk-ERA5, a performance-optimized, daily updating ERA5 data cube available now in the Earthmover data marketplace.

CEO & Co-founder
A new extension to Zarr just landed: the rectilinear chunk grid lets you specify arbitrarily sized chunks along each axis, aligning chunk boundaries with the natural structure of your data instead of forcing a regular grid.

CTO & Co-founder
Cloud Engineer @ DevSeed

Software Engineer (Freelance)
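
As a rough illustration of the rectilinear chunk grid idea described in the excerpt above, the sketch below maps an element index along one axis to a chunk number when chunk sizes vary. It is plain Python, not the Zarr API: the function names `chunk_ends` and `locate`, and the choice of a daily time axis chunked by calendar month, are assumptions made only for this example.

```python
from bisect import bisect_right
from itertools import accumulate


def chunk_ends(chunk_sizes):
    """Cumulative end offsets for one axis, e.g. [31, 28, 31] -> [31, 59, 90]."""
    return list(accumulate(chunk_sizes))


def locate(index, chunk_sizes):
    """Map an element index on one axis to (chunk_number, offset_within_chunk).

    Hypothetical helper for illustration; a regular grid would instead use
    divmod(index, chunk_size).
    """
    ends = chunk_ends(chunk_sizes)
    if not 0 <= index < ends[-1]:
        raise IndexError(f"index {index} outside axis of length {ends[-1]}")
    # The first chunk whose end offset is strictly greater than index holds it.
    chunk = bisect_right(ends, index)
    start = ends[chunk - 1] if chunk else 0
    return chunk, index - start


# Assumed example: a daily time axis for a non-leap year, one chunk per month.
month_lengths = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

jan_1 = locate(0, month_lengths)      # first day of January
feb_1 = locate(31, month_lengths)     # first day of February
dec_31 = locate(364, month_lengths)   # last day of December
```

With chunk boundaries aligned to months, a request for "all of February" touches exactly one chunk along this axis, whereas a regular grid would generally split it across two.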

How we engineer rigor into Icechunk and its upstream brethren using property and stateful testing techniques. Also, a cautionary tale.

Forward Deployed Engineer

Staff Engineer

The Company Kettle is not a typical insurance company. Using AI to build smarter insurance products, Kettle provides insurance for property owners in areas affected by catastrophic climate events, with a particular focus on wildfire. Their AI models consume over 130 terabytes of satellite, weather,

COO

Eoliann builds proprietary climate risk models that estimate the physical impact of extreme weather events — floods, wildfires, and storms — on critical physical infrastructure: electricity transmission lines, substations, and gas pipelines…

COO

When we released Icechunk 1.0 last July, we declared it production-ready and committed to format stability. Since then, adoption has exceeded our expectations. Teams across weather forecasting, climate science, neuroscience, and AI/ML have pushed Icechunk into scenarios we didn't fully anticipate--r

CEO & Co-founder

Staff Engineer

Zarr Python with Icechunk or Obstore now fully saturates the network between EC2 and S3, achieving the maximum physically possible throughput for reading and writing tensor data in the cloud. Benchmarks compare Zarr, Tensorstore, TileDB, and Parquet stacks across a range of chunk sizes and instance types.

CEO & Co-founder

Earthmover co-organizes the Zarr Summit in Rome, bringing together developers and adopters to advance the open-source cloud-native array format as adoption accelerates across major organizations like ESA, NASA, and NVIDIA.

CEO & Co-founder

Zarr lacks built-in support for concurrent readers and writers, leading to inconsistent reads and conflicting writes in team settings. Icechunk solves this by adding atomic updates, consistent snapshots, and Git-like version control on top of Zarr.

Software Engineer

Introducing the Radar DataTree, a new data model that organizes thousands of fragmented weather radar scans into a single time-aware, cloud-native, version-controlled dataset using xarray-datatree, Zarr, and Icechunk.

Data Scientist

Icechunk 1.0 is now stable and production-ready, bringing transactional safety, efficient versioning, high-performance Rust-based I/O, and virtual references for HDF5 and NetCDF to cloud-native array storage. The release includes manifest splitting, distributed writes, conflict resolution, and a 30 TB ERA5 sample dataset.

CEO & Co-founder

Zarr is an open-source, cloud-native protocol for storing chunked, compressed N-dimensional arrays. This guide covers how Zarr works, its ecosystem of tools like Xarray and Icechunk, and when to use it for large-scale scientific and ML data.

Software Engineer

At the 2025 Cloud-Native Geospatial conference, Zarr adoption was surging across the geospatial domain, with Copernicus Sentinel, USGS Landsat, Google Earth Engine, and ESRI ArcGIS all embracing the format for cloud-optimized array data.

CTO & Co-founder

A practical walkthrough of how Icechunk uses transactions and conflict detection to guarantee data consistency when multiple processes write concurrently. The post demonstrates optimistic concurrency control and the rebase workflow using a bank-account transfer example.

Staff Engineer
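
The optimistic concurrency pattern mentioned in the excerpt above can be sketched generically with a bank-account transfer. This is not Icechunk's API: `VersionedStore`, `ConflictError`, and `transfer` are hypothetical names for this example, and the "rebase" step is simplified here to retrying the whole transfer from a fresh snapshot.

```python
import threading


class ConflictError(Exception):
    """Raised when a commit is based on a version that is no longer current."""


class VersionedStore:
    """Toy in-memory store: a commit succeeds only from the latest version."""

    def __init__(self, state):
        self._lock = threading.Lock()
        self._version = 0
        self._state = dict(state)

    def snapshot(self):
        """Return (version, copy of state) as one consistent read."""
        with self._lock:
            return self._version, dict(self._state)

    def commit(self, base_version, new_state):
        """Atomically replace the state if nobody committed since base_version."""
        with self._lock:
            if base_version != self._version:
                raise ConflictError(
                    f"based on v{base_version}, store is at v{self._version}"
                )
            self._state = dict(new_state)
            self._version += 1
            return self._version


def transfer(store, src, dst, amount, max_retries=5):
    """Move funds optimistically: read, modify locally, commit, retry on conflict."""
    for _ in range(max_retries):
        version, accounts = store.snapshot()
        if accounts[src] < amount:
            raise ValueError("insufficient funds")
        accounts[src] -= amount
        accounts[dst] += amount
        try:
            return store.commit(version, accounts)
        except ConflictError:
            continue  # another writer won; redo the work on a fresh snapshot
    raise ConflictError("gave up after repeated conflicts")


store = VersionedStore({"alice": 100, "bob": 50})
stale_version, stale_state = store.snapshot()   # a second writer reads v0
new_version = transfer(store, "alice", "bob", 30)  # first writer commits v1
```

A commit attempted from `stale_version` would now raise `ConflictError`, which is the point of the pattern: no writer holds a lock while working, and conflicts are detected only at commit time.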

Why traditional scientific file formats like NetCDF perform poorly on cloud object storage, and how cloud-optimized formats like Zarr and Icechunk solve the problem by separating metadata and chunking data.

Software Engineer
zarr-python’s performance paradox

Last month, we released Zarr-Python 3.0, a ground-up rewrite of the library (read more about it in this post). Beyond the exciting new features in Zarr V3, we put a lot of work into addressing some long-standing performance issues with Zarr-Python 2. With the improvements described in this blog post, we’ve achieved a 14x speedup in loading the ARCO ERA5 dataset! Zarr-Python 2 had a paradoxical performance quirk: although the library could generate massive petabyte-scale datasets, it struggled to perform well when managing large or highly nested hierarchies. For example, listing the contents of a large Zarr group could be painfully slow, particularly if that Zarr group was stored on a high-latency storage backend. Zarr users would experience this as long

Software Engineer (Freelance)

Zarr-Python 3.0 is released with full support for the Zarr V3 specification, chunk-sharding for more flexible storage, major performance improvements from a fully asynchronous core, and a modernized extensible codebase.

CTO & Co-founder

Earthmover announces Icechunk, an open-source transactional storage engine for Zarr that brings ACID transactions, time travel, data versioning, and high-performance Rust-based I/O to multidimensional array data in cloud object storage.

CEO & Co-founder

The Zarr-Python project is undergoing a major refactor toward version 3.0, bringing full support for the Zarr V3 specification, new asynchronous APIs for better performance, and a modernized plugin system for codecs and storage backends.

CTO & Co-founder