MBTiles Crops Dataset

MBTilesCropsGeoTifMaskDataset supplies training samples for segmentation models: each sample is a pre-selected crop window from the image raster paired with a spatially aligned mask read from any rasterio-readable source.

Use this dataset when your training windows are already defined — either as pixel-space offsets in a CSV/Parquet file, or as geographic features in a vector file — and your masks are uncropped GeoTIFFs, a VRT mosaicking multiple files, or another MBTile.

Key properties

| Property | Behaviour |
| --- | --- |
| Window source | CSV/Parquet (pixel offsets) or vector file (bbox-snap) |
| Image source | Any rasterio-readable raster (MBTile, GeoTIFF, VRT, …) |
| Mask source | Any rasterio-readable raster (VRT, GeoTIFF, MBTile, …) |
| Patch size | Always fixed (patch_size × patch_size) |
| CRS mismatch | Resolved automatically via WarpedVRT |
| Resolution mismatch | Resolved automatically via WarpedVRT |
| Mask resampling | Always nearest-neighbour (class integrity) |
| Image resampling | Configurable (default bilinear) |

Window sources

CSV / Parquet

The file must contain at least two columns: row_off and col_off (pixel-space offsets relative to the image raster). Column names are configurable via row_off_key and col_off_key.

row_off,col_off
0,0
0,256
256,0
256,256
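
If no window file exists yet, a regular grid is one simple way to produce one. The following is a minimal sketch, assuming a non-overlapping grid and a known raster size; the helper names grid_offsets and write_windows_csv are hypothetical and not part of the library.

```python
import csv
import io


def grid_offsets(height, width, patch_size):
    """Return (row_off, col_off) pairs for every full patch inside the raster."""
    return [
        (row, col)
        for row in range(0, height - patch_size + 1, patch_size)
        for col in range(0, width - patch_size + 1, patch_size)
    ]


def write_windows_csv(stream, offsets):
    """Write offsets using the default column names row_off and col_off."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["row_off", "col_off"])
    writer.writerows(offsets)


# Hypothetical 512 x 512 raster tiled into 256 x 256 patches.
buffer = io.StringIO()
write_windows_csv(buffer, grid_offsets(512, 512, 256))
csv_text = buffer.getvalue()
```

For a 512 × 512 raster and a patch size of 256 this yields the same four rows as the sample file above. Partial patches at the right and bottom edges are dropped in this sketch.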

Vector file (GeoPackage, Shapefile, GeoJSON, …)

Each feature defines one training window. The feature bounding-box centre is projected to image pixel space and a patch_size × patch_size window is centred on that point (bbox-snap), then clamped to the raster bounds. Features with empty geometries are skipped.
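
The bbox-snap step can be illustrated with plain arithmetic. This is a sketch under stated assumptions, not the library's actual code: it assumes a north-up raster described by an origin and a pixel size, a raster at least patch_size pixels wide and tall, and rounding to the nearest pixel. The function name snap_window is hypothetical.

```python
def snap_window(bounds, transform, patch_size, raster_width, raster_height):
    """Return (row_off, col_off) of a patch centred on the bbox centre.

    bounds    -- (minx, miny, maxx, maxy) in the image CRS
    transform -- (origin_x, origin_y, pixel_width, pixel_height) of a
                 north-up raster; pixel_height is given as a positive number
    """
    minx, miny, maxx, maxy = bounds
    origin_x, origin_y, pixel_width, pixel_height = transform

    # Centre of the bounding box in map coordinates.
    centre_x = (minx + maxx) / 2.0
    centre_y = (miny + maxy) / 2.0

    # Map coordinates -> fractional pixel coordinates (rows grow downwards).
    centre_col = (centre_x - origin_x) / pixel_width
    centre_row = (origin_y - centre_y) / pixel_height

    # Top-left corner of a patch centred on that pixel.
    # Rounding to the nearest pixel is an assumption of this sketch.
    col_off = int(round(centre_col - patch_size / 2.0))
    row_off = int(round(centre_row - patch_size / 2.0))

    # Clamp so the whole patch stays inside the raster bounds.
    col_off = max(0, min(col_off, raster_width - patch_size))
    row_off = max(0, min(row_off, raster_height - patch_size))
    return row_off, col_off
```

A feature near a raster edge therefore still produces a full patch_size × patch_size window; the window is shifted inwards instead of being padded or truncated.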

Mask types

Multi-band (RGB/RGBA color-coded)

Provide a color_map: a list of [R, G, B, class_idx] entries, one per class colour. Pixels whose colour is not in the map are assigned class 0 (background). color_map is required when the mask has more than one band; omitting it raises ValueError at instantiation.
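
To make the colour lookup concrete, here is a minimal NumPy sketch of how [R, G, B, class_idx] entries could be turned into a class-index mask. It assumes a channel-last (H, W, 3) array and ignores any alpha band; the function name rgb_mask_to_classes is hypothetical, and the dataset's internal implementation may differ.

```python
import numpy as np


def rgb_mask_to_classes(rgb, color_map):
    """Convert an (H, W, 3+) colour-coded mask to an (H, W) class-index array."""
    # Pixels that match no entry keep the initial value 0 (background).
    classes = np.zeros(rgb.shape[:2], dtype=np.int64)
    for r, g, b, class_idx in color_map:
        # True where all three colour channels equal this entry's colour.
        match = np.all(rgb[..., :3] == (r, g, b), axis=-1)
        classes[match] = class_idx
    return classes


color_map = [[255, 0, 0, 1], [0, 255, 0, 2], [0, 0, 255, 3]]
rgb = np.array(
    [
        [[255, 0, 0], [0, 255, 0]],
        [[0, 0, 255], [10, 20, 30]],  # last pixel is not in the map
    ],
    dtype=np.uint8,
)
class_mask = rgb_mask_to_classes(rgb, color_map)
```

In this example the three mapped colours become classes 1, 2 and 3, and the unmapped pixel falls back to class 0.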

Single-band (integer class indices)

Omit color_map. Pixel values are used directly as class indices. Set n_classes: 2 to binarise all non-zero values to foreground class 1.
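
The single-band rule can be sketched in a few lines of NumPy. This is an illustration of the behaviour described above, assuming the band has already been read into an array; single_band_to_classes is a hypothetical name.

```python
import numpy as np


def single_band_to_classes(band, n_classes):
    """Return an int64 class mask from a single-band raster array."""
    if n_classes == 2:
        # Any non-zero value becomes foreground class 1.
        return (band != 0).astype(np.int64)
    # Otherwise pixel values are taken as class indices unchanged.
    return band.astype(np.int64)


binary = single_band_to_classes(np.array([[0, 7], [255, 0]], dtype=np.uint8), 2)
multi = single_band_to_classes(np.array([[0, 3], [2, 1]], dtype=np.uint8), 4)
```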

Using a VRT to mosaic multiple GeoTIFFs

When your masks span multiple GeoTIFF files, build a VRT with GDAL and pass it as mask_path:

gdalbuildvrt /data/masks/mask.vrt /data/masks/*.tif

The WarpedVRT inside the dataset handles CRS and resolution differences automatically.

Configuration reference

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| image_mbtiles_path | str | required | Path to the image raster (reference grid) |
| mask_path | str | required | Path to the mask raster (VRT / GeoTIFF / MBTile) |
| crops_path | str | required | Path to CSV/Parquet or vector file |
| patch_size | int | required | Fixed patch height and width in pixels |
| color_map | list | null | [[R,G,B,class], …]; required for multi-band masks |
| n_classes | int | 2 | Classes for single-band masks; 2 binarises non-zero values |
| selected_bands | list | null | 1-based image band indices (null = all bands) |
| image_dtype | str | "uint8" | One of "uint8", "uint16", "float32", "native" |
| image_resampling | str | "bilinear" | Resampling for image warping |
| crops_layer | str | null | Layer name for multi-layer vector files (GPKG) |
| col_off_key | str | "col_off" | Column name for horizontal pixel offset |
| row_off_key | str | "row_off" | Column name for vertical pixel offset |
| augmentation_list | list | [] | Albumentations transforms |
| data_loader | dict | | DataLoader config (shuffle, num_workers, …) |
| return_metadata | bool | false | Add row_off/col_off to each sample |
| window_index_cache | str | null | .csv or .parquet path to persist window index |

Example: CSV windows + VRT mask

train_dataset:
  _target_: pytorch_segmentation_models_trainer.dataset_loader.mbtiles_crops_dataset.MBTilesCropsGeoTifMaskDataset
  image_mbtiles_path: /data/imagery.mbtiles
  mask_path: /data/masks/mask.vrt
  crops_path: /data/crops/windows.csv
  patch_size: 256
  color_map:
    - [255, 0, 0, 1]
    - [0, 255, 0, 2]
    - [0, 0, 255, 3]
  augmentation_list:
    - _target_: albumentations.HorizontalFlip
      p: 0.5
    - _target_: albumentations.Normalize
      mean: [0.485, 0.456, 0.406]
      std: [0.229, 0.224, 0.225]
    - _target_: albumentations.pytorch.ToTensorV2
  data_loader:
    shuffle: true
    num_workers: 8

Example: vector windows + single-band GeoTIFF mask

train_dataset:
  _target_: pytorch_segmentation_models_trainer.dataset_loader.mbtiles_crops_dataset.MBTilesCropsGeoTifMaskDataset
  image_mbtiles_path: /data/imagery.mbtiles
  mask_path: /data/masks/mask.tif
  crops_path: /data/crops/regions.gpkg
  crops_layer: training_windows
  patch_size: 256
  n_classes: 2
  augmentation_list:
    - _target_: albumentations.HorizontalFlip
      p: 0.5
    - _target_: albumentations.pytorch.ToTensorV2
  data_loader:
    shuffle: true
    num_workers: 8

Output format

Each __getitem__ returns a dict:

| Key | Shape | dtype | Condition |
| --- | --- | --- | --- |
| "image" | (C, H, W) | float32 | always |
| "mask" | (H, W) | int64 | always |
| "metadata" | | dict | only when return_metadata: true |

metadata contains row_off and col_off (int) in image pixel coordinates.
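
When wiring the dataset into a training loop, a quick sanity check on one sample can catch shape mistakes early. The helper below is a hypothetical sketch that only relies on the layout documented above; it works on any array-like object exposing ndim and shape (PyTorch tensors and NumPy arrays both do), and the example uses NumPy arrays as stand-ins for real samples.

```python
import numpy as np


def check_sample(sample, patch_size):
    """Raise AssertionError if a sample does not match the documented layout."""
    image, mask = sample["image"], sample["mask"]
    assert image.ndim == 3, "image should be (C, H, W)"
    assert tuple(image.shape[1:]) == (patch_size, patch_size), "image H/W mismatch"
    assert tuple(mask.shape) == (patch_size, patch_size), "mask should be (H, W)"
    if "metadata" in sample:
        # Present only when return_metadata is enabled.
        assert {"row_off", "col_off"} <= set(sample["metadata"])
    return True


# Stand-in sample with the documented keys and shapes.
fake_sample = {
    "image": np.zeros((3, 256, 256), dtype=np.float32),
    "mask": np.zeros((256, 256), dtype=np.int64),
    "metadata": {"row_off": 0, "col_off": 256},
}
ok = check_sample(fake_sample, 256)
```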