MBTiles Mask Dataset

MBTilesMaskWindowedDataset supplies training samples for segmentation models from MBTiles imagery and GeoTIFF masks. The mask is the reference grid: each sampled mask window defines the output CRS, transform, resolution, and shape. Source imagery is warped into that grid before a training sample is returned.

Use this dataset after validating alignment with export-mbtiles-mask-aligned. If you still need to generate the masks themselves from vector data, use MBTiles Multiclass Mask Builder first.

When to Use

| Data layout | Recommended approach |
| --- | --- |
| Images already stored as aligned GeoTIFFs | `RasterPatchDataset` |
| MBTiles imagery + GeoTIFF masks | `MBTilesMaskWindowedDataset` |
| Custom patch CSV over aligned rasters | `CSVWindowedSegmentationDataset` |

Sample Contract

Each item returns:

{
    "image": FloatTensor[C, H, W],
    "mask": LongTensor[H, W],
    "mask_path": "...",
    "row_off": 1024,
    "col_off": 2048,
}

Without augmentations, uint8 and uint16 imagery are normalized to [0, 1]. When n_classes == 2, mask values greater than zero become foreground class 1. For multiclass masks, set n_classes to the real number of classes so class IDs are preserved.
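
The conversions above can be sketched as follows. This is a minimal illustration, not the dataset's actual code: it assumes NumPy arrays, assumes that normalization divides by the dtype's maximum value (255 or 65535), and uses hypothetical function names.

```python
import numpy as np


def normalize_image(image: np.ndarray) -> np.ndarray:
    """Scale uint8/uint16 pixels to float32 in [0, 1].

    Assumption: normalization divides by the dtype maximum.
    """
    if image.dtype in (np.uint8, np.uint16):
        return image.astype(np.float32) / np.iinfo(image.dtype).max
    return image.astype(np.float32)


def prepare_mask(mask: np.ndarray, n_classes: int) -> np.ndarray:
    """Binarize the mask for two classes, otherwise keep class IDs as they are."""
    if n_classes == 2:
        # Any value greater than zero becomes foreground class 1.
        return (mask > 0).astype(np.int64)
    return mask.astype(np.int64)
```

With n_classes == 2, a mask holding the values 0, 3, and 255 collapses to 0, 1, and 1, which is why multiclass masks need the real class count.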

Hydra Example

conf/examples/mbtiles_mask_windowed_segmentation.yaml
train_dataset:
  _target_: pytorch_segmentation_models_trainer.dataset_loader.mbtiles_mask_dataset.MBTilesMaskWindowedDataset
  mbtiles_path: /data/source_imagery.mbtiles
  mask_dir: /data/train_masks
  patch_size: 512
  stride: 512
  selected_bands: [1, 2, 3]
  image_dtype: uint8
  image_resampling: bilinear
  window_index_cache: /data/cache/train_mbtiles_mask_windows.parquet
  n_classes: 2

Window Index Cache

Set window_index_cache to avoid scanning masks and recomputing the sliding window index on every dataset construction:

train_dataset:
  _target_: pytorch_segmentation_models_trainer.dataset_loader.mbtiles_mask_dataset.MBTilesMaskWindowedDataset
  mbtiles_path: /data/source_imagery.mbtiles
  mask_dir: /data/train_masks
  patch_size: 512
  stride: 512
  window_index_cache: /data/cache/train_windows.parquet

Supported formats are .csv and .parquet. When the cache exists, the dataset loads these columns and skips mask discovery:

mask_path,row_off,col_off,width,height

For pre-selected square windows, the cache may provide patch_size instead of width and height. Extra columns are ignored, so coreset or sampling CSVs can be used directly:

mask_path,row_off,col_off,patch_size,class_entropy,sampler_weight
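
As a rough illustration of how such rows could be interpreted, the sketch below parses pixel-window rows with the standard csv module. The function load_pixel_windows and its mask_path_key argument are hypothetical stand-ins, not the dataset's real loader.

```python
import csv
import io


def load_pixel_windows(csv_text, mask_path_key="mask_path"):
    """Parse pixel-window rows, accepting either width/height or patch_size."""
    windows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if "width" in row and "height" in row:
            width, height = int(row["width"]), int(row["height"])
        else:
            # Square window: patch_size stands in for both dimensions.
            width = height = int(row["patch_size"])
        # Only the required columns are copied, so extra columns are ignored.
        windows.append({
            "mask_path": row[mask_path_key],
            "row_off": int(row["row_off"]),
            "col_off": int(row["col_off"]),
            "width": width,
            "height": height,
        })
    return windows


sample = (
    "mask_path,row_off,col_off,patch_size,class_entropy,sampler_weight\n"
    "/data/train_masks/a.tif,1024,2048,512,0.73,1.5\n"
)
windows = load_pixel_windows(sample)
```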

The cache can also describe windows by world-coordinate bounds in the mask CRS. This is useful when sampling outputs already contain patch footprints:

mask_path,minx,miny,maxx,maxy,crs

Sampling outputs that use tile_minx, tile_miny, tile_maxx, and tile_maxy are also accepted in bounds mode.
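
To make bounds mode concrete, here is a sketch of converting world-coordinate bounds into a pixel window. It assumes a north-up mask (no rotation) described by its top-left origin and positive pixel sizes; both function names are hypothetical. With a real rasterio transform, rasterio.windows.from_bounds performs the equivalent conversion.

```python
def row_bounds(row):
    """Read bounds from a row dict, accepting plain or tile_-prefixed column names."""
    prefix = "tile_" if "tile_minx" in row else ""
    return tuple(float(row[prefix + key]) for key in ("minx", "miny", "maxx", "maxy"))


def bounds_to_window(minx, miny, maxx, maxy, origin_x, origin_y, res_x, res_y):
    """Convert bounds in the mask CRS to (row_off, col_off, width, height).

    Assumes a north-up grid: x grows with columns, y shrinks with rows.
    """
    col_off = round((minx - origin_x) / res_x)
    row_off = round((origin_y - maxy) / res_y)
    width = round((maxx - minx) / res_x)
    height = round((maxy - miny) / res_y)
    return row_off, col_off, width, height


bounds = row_bounds({
    "tile_minx": "500512.0", "tile_miny": "3999488.0",
    "tile_maxx": "500768.0", "tile_maxy": "3999744.0",
})
# Mask with top-left corner at (500000, 4000000) and 0.5-unit pixels.
window = bounds_to_window(*bounds, origin_x=500000.0, origin_y=4000000.0,
                          res_x=0.5, res_y=0.5)
```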

By default, the dataset checks for pixel-window columns first and falls back to bounds columns. For CSV files where the mask path column has another name, set window_index_mask_path_key. The following example sets the path column and forces bounds interpretation:

train_dataset:
  window_index_cache: /data/splits/val.csv
  window_index_mask_path_key: image_path
  window_index_coordinate_mode: bounds

When the cache does not exist, the dataset scans the masks, builds the index, and writes the cache for future runs.
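
The index built during that scan can be pictured with the sketch below. It is an illustration only: it assumes that windows which would extend past the mask edge are dropped, which may differ from the dataset's actual edge handling, and sliding_windows is a hypothetical name.

```python
def sliding_windows(height, width, patch_size, stride):
    """Yield (row_off, col_off) for every full patch inside a height x width mask.

    Assumption: partial windows at the right and bottom edges are skipped.
    """
    for row_off in range(0, height - patch_size + 1, stride):
        for col_off in range(0, width - patch_size + 1, stride):
            yield row_off, col_off


# One cache row per window of a 1024 x 1536 mask.
rows = [
    {"mask_path": "/data/train_masks/a.tif", "row_off": r, "col_off": c,
     "width": 512, "height": 512}
    for r, c in sliding_windows(1024, 1536, patch_size=512, stride=512)
]
```

Setting stride equal to patch_size gives non-overlapping windows; a smaller stride produces overlapping patches and a larger index.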

Implementation Notes

For each mask window:

1. Read the mask window directly.
2. Compute mask_src.window_transform(window).
3. Open the MBTiles/source raster through rasterio.
4. Use rasterio WarpedVRT to read source pixels into the mask window grid.
5. Return tensors in the shape expected by the training model.

This avoids assuming that MBTiles tile row/column indexes match mask pixel coordinates.
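
The steps above can be sketched with rasterio as follows. This is a simplified approximation rather than the dataset's implementation: read_aligned_sample and its parameters are hypothetical, it returns NumPy arrays instead of tensors, and it assumes the local GDAL build can open the MBTiles file.

```python
def read_aligned_sample(mask_path, source_path, row_off, col_off, size,
                        bands=(1, 2, 3), resampling="bilinear"):
    """Read a mask window and warp the source imagery onto the same grid."""
    # Imported lazily so this module can be loaded where rasterio/GDAL is absent.
    import rasterio
    from rasterio.enums import Resampling
    from rasterio.vrt import WarpedVRT
    from rasterio.windows import Window

    window = Window(col_off, row_off, size, size)

    # Steps 1-2: read the mask window and derive its georeferencing.
    with rasterio.open(mask_path) as mask_src:
        mask = mask_src.read(1, window=window)
        transform = mask_src.window_transform(window)
        crs = mask_src.crs

    # Steps 3-4: warp the source pixels into the mask window grid.
    with rasterio.open(source_path) as src:
        with WarpedVRT(src, crs=crs, transform=transform,
                       width=size, height=size,
                       resampling=Resampling[resampling]) as vrt:
            image = vrt.read(indexes=list(bands))  # shape: (C, H, W)

    return image, mask
```

Because the VRT is defined by the mask window's CRS, transform, width, and height, the returned image has the same spatial shape as the mask regardless of how the source tiles are laid out.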

Resampling

Use bilinear or cubic for RGB imagery.
Use nearest for categorical source rasters.
If the MBTiles native zoom is much coarser than the mask resolution, the output still aligns with the mask, but the upsampled imagery lacks fine detail.
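
One way to judge whether a zoom level is too coarse is to compare its pixel size with the mask resolution. The sketch below uses the standard Web Mercator tile pyramid formula and assumes 256-pixel tiles; the helper names are hypothetical. The result is in Web Mercator units at the equator, so for masks in other projected CRSs treat the ratio as approximate.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis used by Web Mercator


def zoom_resolution(zoom, tile_size=256):
    """Pixel size in Web Mercator metres for a given zoom level."""
    return 2 * math.pi * EARTH_RADIUS_M / (tile_size * 2 ** zoom)


def upsampling_factor(zoom, mask_resolution):
    """How many times coarser the tile pixels are than the mask pixels."""
    return zoom_resolution(zoom) / mask_resolution


# Zoom 15 tiles against a 0.5 m mask: each source pixel covers several mask pixels.
factor = upsampling_factor(15, 0.5)
```

A factor close to 1 means the imagery roughly matches the mask grid; a factor well above 1 means each source pixel is stretched across many mask pixels.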
