MBTiles Mask Dataset
MBTilesMaskWindowedDataset supplies training samples for segmentation models
from MBTiles imagery and GeoTIFF masks. The mask is the reference grid: each
sampled mask window defines the output CRS, transform, resolution, and shape.
Source imagery is warped into that grid before the sample is returned.
Use this dataset after validating alignment with
export-mbtiles-mask-aligned.
If you still need to generate the masks themselves from vector data, use
MBTiles Multiclass Mask Builder first.
When to Use
| Data layout | Recommended approach |
|---|---|
| Imagery already stored as aligned GeoTIFFs | RasterPatchDataset |
| MBTiles imagery + GeoTIFF masks | MBTilesMaskWindowedDataset |
| Custom patch CSV over aligned rasters | CSVWindowedSegmentationDataset |
Sample Contract
Each item returns:
{
    "image": FloatTensor[C, H, W],
    "mask": LongTensor[H, W],
    "mask_path": "...",
    "row_off": 1024,
    "col_off": 2048,
}
Without augmentations, uint8 and uint16 imagery are normalized to [0, 1].
When n_classes == 2, mask values greater than zero become foreground class
1. For multiclass masks, set n_classes to the real number of classes so
class IDs are preserved.
Hydra Example
train_dataset:
  _target_: pytorch_segmentation_models_trainer.dataset_loader.mbtiles_mask_dataset.MBTilesMaskWindowedDataset
  mbtiles_path: /data/source_imagery.mbtiles
  mask_dir: /data/train_masks
  patch_size: 512
  stride: 512
  selected_bands: [1, 2, 3]
  image_dtype: uint8
  image_resampling: bilinear
  window_index_cache: /data/cache/train_mbtiles_mask_windows.parquet
  n_classes: 2
Window Index Cache
Set window_index_cache to avoid scanning masks and recomputing the sliding
window index on every dataset construction:
train_dataset:
  _target_: pytorch_segmentation_models_trainer.dataset_loader.mbtiles_mask_dataset.MBTilesMaskWindowedDataset
  mbtiles_path: /data/source_imagery.mbtiles
  mask_dir: /data/train_masks
  patch_size: 512
  stride: 512
  window_index_cache: /data/cache/train_windows.parquet
Supported formats are .csv and .parquet. When the cache exists, the dataset
loads these columns and skips mask discovery:
mask_path,row_off,col_off,width,height
For pre-selected square windows, the cache may provide patch_size instead of
width,height. Extra columns are ignored, so coreset or sampling CSVs can be
used directly:
mask_path,row_off,col_off,patch_size,class_entropy,sampler_weight
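As a rough sketch of how such rows could be normalized into window records, the snippet below reads CSV text with the standard library, falls back to patch_size when width and height are absent, and drops any extra columns. The function name parse_pixel_rows and the error messages are hypothetical; the dataset's real loader may behave differently.

```python
import csv
import io
from typing import Dict, List

REQUIRED = ("mask_path", "row_off", "col_off")


def parse_pixel_rows(text: str) -> List[Dict[str, object]]:
    """Parse CSV text into window records, ignoring columns the index does not need."""
    windows = []
    for row in csv.DictReader(io.StringIO(text)):
        missing = [key for key in REQUIRED if not row.get(key)]
        if missing:
            raise ValueError(f"missing columns: {missing}")
        if row.get("width") and row.get("height"):
            width, height = int(row["width"]), int(row["height"])
        elif row.get("patch_size"):
            # Square window: patch_size stands in for both dimensions.
            width = height = int(row["patch_size"])
        else:
            raise ValueError("row needs width,height or patch_size")
        windows.append({
            "mask_path": row["mask_path"],
            "row_off": int(row["row_off"]),
            "col_off": int(row["col_off"]),
            "width": width,
            "height": height,
        })
    return windows


sample = (
    "mask_path,row_off,col_off,patch_size,class_entropy,sampler_weight\n"
    "masks/a.tif,0,512,256,0.83,1.5\n"
)
print(parse_pixel_rows(sample))
```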
The cache can also describe windows by world-coordinate bounds in the mask CRS. This is useful when sampling outputs already contain patch footprints:
mask_path,minx,miny,maxx,maxy,crs
Sampling outputs that use tile_minx, tile_miny, tile_maxx, and
tile_maxy are also accepted in bounds mode.
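To make bounds mode concrete, here is a minimal sketch of the arithmetic that turns world-coordinate bounds into a pixel window on a north-up mask. The names PixelWindow and bounds_to_window are hypothetical, the mask georeferencing is reduced to an origin plus positive pixel sizes, and rounding to the nearest pixel is an assumption. For real rasters, rasterio.windows.from_bounds covers the general case.

```python
from typing import NamedTuple


class PixelWindow(NamedTuple):
    col_off: int
    row_off: int
    width: int
    height: int


def bounds_to_window(
    minx: float, miny: float, maxx: float, maxy: float,
    origin_x: float, origin_y: float, res_x: float, res_y: float,
) -> PixelWindow:
    """Convert bounds in the mask CRS to a pixel window.

    origin_x/origin_y is the top-left corner of the mask, and res_x/res_y are
    positive pixel sizes. Rows grow downward, so the row offset is measured
    from the top edge (origin_y) down to maxy.
    """
    col_off = round((minx - origin_x) / res_x)
    row_off = round((origin_y - maxy) / res_y)
    width = round((maxx - minx) / res_x)
    height = round((maxy - miny) / res_y)
    return PixelWindow(col_off, row_off, width, height)


# A 256 m x 256 m footprint on a 0.5 m mask whose top-left corner is (500000, 4200000).
print(bounds_to_window(500256.0, 4199488.0, 500512.0, 4199744.0,
                       500000.0, 4200000.0, 0.5, 0.5))
```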
By default, the dataset auto-detects pixel windows first and then bounds. For
CSV files where the mask path column has another name, set
window_index_mask_path_key. To force bounds interpretation:
train_dataset:
  window_index_cache: /data/splits/val.csv
  window_index_mask_path_key: image_path
  window_index_coordinate_mode: bounds
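The detection order described above could look roughly like the following. This is an illustrative sketch only: detect_coordinate_mode is a hypothetical helper, and the mode name pixel is an assumption (only bounds appears in the configuration above).

```python
from typing import Iterable, Optional

PIXEL_KEYS = {"row_off", "col_off"}
BOUNDS_KEYS = {"minx", "miny", "maxx", "maxy"}
TILE_BOUNDS_KEYS = {"tile_minx", "tile_miny", "tile_maxx", "tile_maxy"}


def detect_coordinate_mode(columns: Iterable[str], forced: Optional[str] = None) -> str:
    """Pick 'pixel' or 'bounds' from the cache columns, honoring a forced mode."""
    cols = set(columns)
    has_size = {"width", "height"} <= cols or "patch_size" in cols
    available = {
        "pixel": PIXEL_KEYS <= cols and has_size,
        "bounds": BOUNDS_KEYS <= cols or TILE_BOUNDS_KEYS <= cols,
    }
    if forced is not None:
        if not available.get(forced, False):
            raise ValueError(f"columns do not support mode {forced!r}")
        return forced
    for mode in ("pixel", "bounds"):  # pixel windows are tried first
        if available[mode]:
            return mode
    raise ValueError("no pixel-window or bounds columns found")


both = ["mask_path", "row_off", "col_off", "patch_size", "minx", "miny", "maxx", "maxy"]
print(detect_coordinate_mode(both))                   # pixel columns win by default
print(detect_coordinate_mode(both, forced="bounds"))  # explicit override
```

Forcing the mode matters mainly for files that carry both kinds of columns, where the default order would otherwise pick the pixel windows.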
When the cache does not exist, the dataset scans the masks, builds the index, and writes the cache for future runs.
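The load-or-build behavior can be sketched as below. This is a simplified stand-in, not the dataset's code: it handles only .csv caches, the helper names are hypothetical, and it assumes the sliding window keeps only full patches and drops partial windows at the right and bottom edges.

```python
import csv
import tempfile
from pathlib import Path
from typing import Callable, Dict, List

FIELDS = ["mask_path", "row_off", "col_off", "width", "height"]
Row = Dict[str, object]


def sliding_windows(mask_path: str, height: int, width: int,
                    patch_size: int, stride: int) -> List[Row]:
    """Enumerate full patch_size windows over a mask of the given pixel shape."""
    return [
        {"mask_path": mask_path, "row_off": r, "col_off": c,
         "width": patch_size, "height": patch_size}
        for r in range(0, height - patch_size + 1, stride)
        for c in range(0, width - patch_size + 1, stride)
    ]


def load_or_build_index(cache_path: Path, build: Callable[[], List[Row]]) -> List[Row]:
    """Read the cache when it exists; otherwise build the index and write it."""
    if cache_path.exists():
        with cache_path.open(newline="") as handle:
            return [
                {"mask_path": row["mask_path"],
                 **{key: int(row[key]) for key in FIELDS[1:]}}
                for row in csv.DictReader(handle)
            ]
    rows = build()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    with cache_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "cache" / "train_windows.csv"
    build = lambda: sliding_windows("masks/a.tif", 1024, 1024, 512, 512)
    first = load_or_build_index(cache, build)   # scans and writes the cache
    second = load_or_build_index(cache, build)  # reads the cache back
    print(len(first), first == second)
```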
Implementation Notes
For each mask window:
- Read the mask window directly.
- Compute mask_src.window_transform(window).
- Open the MBTiles/source raster through rasterio.
- Use rasterio WarpedVRT to read source pixels into the mask window grid.
- Return tensors in the same shape expected by the training Model.
This avoids assuming that MBTiles tile row/column indexes match mask pixel coordinates.
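The window transform step is plain affine arithmetic, sketched here without rasterio so the numbers are easy to follow. The helpers window_affine and window_bounds are hypothetical, and the bounds helper assumes a north-up mask (no rotation terms). In the dataset, rasterio's window_transform provides the transform, and the resulting CRS, transform, width, and height are what WarpedVRT needs to resample the source imagery onto the mask window grid.

```python
from typing import Tuple

# Affine coefficients (a, b, c, d, e, f):
#   x = a * col + b * row + c
#   y = d * col + e * row + f
Affine6 = Tuple[float, float, float, float, float, float]


def window_affine(transform: Affine6, col_off: int, row_off: int) -> Affine6:
    """Shift the raster transform so pixel (0, 0) is the window's top-left corner."""
    a, b, c, d, e, f = transform
    return (a, b, c + a * col_off + b * row_off,
            d, e, f + d * col_off + e * row_off)


def window_bounds(transform: Affine6, col_off: int, row_off: int,
                  width: int, height: int) -> Tuple[float, float, float, float]:
    """Return (left, bottom, right, top) of the window, assuming b == d == 0 and e < 0."""
    a, _, c, _, e, f = window_affine(transform, col_off, row_off)
    return (c, f + e * height, c + a * width, f)


# A 0.5 m north-up mask with its top-left corner at (500000, 4200000).
mask_transform = (0.5, 0.0, 500000.0, 0.0, -0.5, 4200000.0)
print(window_affine(mask_transform, col_off=2048, row_off=1024))
print(window_bounds(mask_transform, col_off=2048, row_off=1024, width=512, height=512))
```

Because the target grid is derived from the mask alone, the MBTiles tiling scheme never enters the calculation.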
Resampling
- Use bilinear or cubic for RGB imagery.
- Use nearest for categorical source rasters.
- If the MBTiles native zoom is much coarser than the mask resolution, output still aligns but image detail is lost.
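To judge whether a zoom level is too coarse, compare its pixel size with the mask resolution. The sketch below uses the standard Web Mercator tile pyramid and assumes 256-pixel tiles and a mask resolution expressed in Web Mercator meters; the function names are hypothetical. Ground resolution away from the equator is smaller by roughly cos(latitude).

```python
import math

# Circumference of the Web Mercator sphere (radius 6378137 m).
WEB_MERCATOR_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0


def zoom_resolution(zoom: int, tile_size: int = 256) -> float:
    """Pixel size in Web Mercator meters at the given zoom level."""
    return WEB_MERCATOR_CIRCUMFERENCE_M / (tile_size * 2 ** zoom)


def upsampling_factor(zoom: int, mask_resolution_m: float, tile_size: int = 256) -> float:
    """How many mask pixels span one source pixel along each axis."""
    return zoom_resolution(zoom, tile_size) / mask_resolution_m


for zoom in (15, 17, 18):
    factor = upsampling_factor(zoom, mask_resolution_m=0.5)
    print(f"zoom {zoom}: {zoom_resolution(zoom):.3f} m/px, factor {factor:.2f}")
```

A factor near 1 means the imagery and mask have comparable detail. A factor well above 1 means each source pixel is stretched across many mask pixels, so the warped patch aligns but looks blurred.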