🌵 Desert Semantic Segmentation using SegFormer (MiT-B2)

A SegFormer transformer model fine-tuned on the Offroad Segmentation Training Dataset for 10-class semantic segmentation of desert terrain, built for autonomous navigation of unmanned ground vehicles (UGVs) in off-road environments.


🧠 Model Architecture

| Component | Detail |
|---|---|
| Framework | HuggingFace Transformers |
| Model | SegFormer |
| Backbone | MiT-B2 (nvidia/mit-b2) |
| Parameters | 27,354,314 (all trainable) |
| Decoder | Lightweight MLP Head |
| Classes | 10 |
| Input Size | 512 × 512 |
| GPU | NVIDIA A100-PCIE-40GB |

🗂 Dataset Classes (10 Categories)

| Class ID | Raw Mask Value | Label |
|---|---|---|
| 0 | 100 | Trees |
| 1 | 200 | Lush Bushes |
| 2 | 300 | Dry Grass |
| 3 | 500 | Dry Bushes |
| 4 | 550 | Ground Clutter |
| 5 | 600 | Flowers |
| 6 | 700 | Logs |
| 7 | 800 | Rocks |
| 8 | 7100 | Landscape |
| 9 | 10000 | Sky |
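
Because the masks store raw values rather than contiguous class IDs, they need to be remapped before they can serve as training labels. A minimal sketch of that step, assuming NumPy and a lookup built from the table above. The function name `remap_mask` and the ignore value 255 for unlisted raw values are illustrative assumptions, not taken from the training code.

```python
import numpy as np

# Raw mask value -> class ID, copied from the class table.
RAW_TO_CLASS = {
    100: 0, 200: 1, 300: 2, 500: 3, 550: 4,
    600: 5, 700: 6, 800: 7, 7100: 8, 10000: 9,
}
IGNORE_INDEX = 255  # assumption: raw values not in the table are ignored

# Lookup table covering the full uint16 range; unlisted values map to IGNORE_INDEX.
_LUT = np.full(65536, IGNORE_INDEX, dtype=np.uint8)
for _raw_value, _class_id in RAW_TO_CLASS.items():
    _LUT[_raw_value] = _class_id


def remap_mask(raw_mask: np.ndarray) -> np.ndarray:
    """Convert a uint16 raw-value mask of shape (H, W) into a uint8 class-ID mask."""
    return _LUT[raw_mask.astype(np.int64)]


if __name__ == "__main__":
    demo = np.array([[100, 200], [10000, 42]], dtype=np.uint16)
    print(remap_mask(demo))
```

A lookup table keeps the remapping to a single indexing operation per mask, which matters when it runs inside a data loader.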

📊 Dataset Statistics

| Split | Samples | Proportion |
|---|---|---|
| Train | 2,142 | 75% |
| Validation | 286 | 10% |
| Test | 429 | 15% |
| Total | 2,857 | - |

  • Image resolution: 960 × 540 (RGB)
  • Mask format: uint16 with raw class value encoding
  • Total annotated instances: 16,951

🎨 Augmentation Pipeline

11 augmentations specifically chosen for desert and off-road conditions:

| Augmentation | Purpose |
|---|---|
| Color Jitter | Handles varying sun angles and color temperatures |
| Gamma Change | Simulates over/under-exposed outdoor scenes |
| Gaussian Noise | Robustness to sensor noise in UGV cameras |
| Motion / Gaussian / Median Blur | Motion blur from vehicle movement |
| Random Shadows | Shadows from rocks, vegetation, terrain |
| Random Fog | Dust storms and atmospheric haze |
| Brightness/Contrast | Atmospheric and lighting variations |
| Texture Mixup | Prevents overfitting to specific terrain patterns |
| Horizontal Flip | Improves directional generalization |
| Shift / Scale / Rotate | Spatial robustness |
| Coarse Dropout | Simulates sensor occlusion |
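
Geometric augmentations such as Horizontal Flip have to be applied to the image and its mask together, while photometric ones such as Gamma Change touch only the image. The NumPy sketch below illustrates that distinction for those two augmentations. It is not the pipeline used for training, and the function name, flip probability, and gamma range are illustrative assumptions.

```python
import numpy as np


def augment_pair(image, mask, rng, flip_prob=0.5, gamma_range=(0.7, 1.5)):
    """Apply a random horizontal flip and a random gamma change to one sample.

    image: (H, W, 3) uint8 RGB array; mask: (H, W) integer class-ID array.
    rng:   a numpy.random.Generator, e.g. np.random.default_rng(seed).
    """
    # Geometric transform: flip image and mask together so labels stay aligned.
    if rng.random() < flip_prob:
        image = image[:, ::-1]
        mask = mask[:, ::-1]

    # Photometric transform: changes pixel intensities only, never the mask.
    gamma = rng.uniform(*gamma_range)
    scaled = image.astype(np.float32) / 255.0
    image = np.clip(np.power(scaled, gamma) * 255.0, 0, 255).round().astype(np.uint8)

    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(540, 960, 3), dtype=np.uint8)
    mask = rng.integers(0, 10, size=(540, 960), dtype=np.uint8)
    aug_image, aug_mask = augment_pair(image, mask, rng)
    print(aug_image.shape, aug_mask.shape)
```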

βš™οΈ Training Configuration

Parameter Value
Epochs 50
Batch Size 8
Learning Rate 6e-5
Optimizer AdamW
Warmup Steps 500
Weight Decay 0.01
FP16 βœ… Enabled
Best Model Metric mean_iou
Eval Strategy Per epoch
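
For a sense of scale, the configuration above implies a fixed number of optimizer steps. The sketch below works that out from the listed values and the training split size. It assumes a single GPU, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these are stated in this card.

```python
import math

TRAIN_SAMPLES = 2142  # training split size from the dataset statistics
BATCH_SIZE = 8
EPOCHS = 50
WARMUP_STEPS = 500

# Assumption: one GPU, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
warmup_fraction = WARMUP_STEPS / total_steps
warmup_epochs = WARMUP_STEPS / steps_per_epoch

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch}")
    print(f"total steps:     {total_steps}")
    print(f"warmup share:    {warmup_fraction:.1%} ({warmup_epochs:.2f} epochs)")
```

Under these assumptions, training runs 268 steps per epoch and 13,400 steps in total, so the 500 warmup steps cover roughly the first 3.7% of training, a little under two epochs.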

📈 Evaluation Results

Evaluated on the validation split (286 images) using COCO-style mean IoU.

| Metric | Value |
|---|---|
| Mean IoU | 0.6529 |
| Mean Accuracy | 0.7592 |

Per-Class IoU

| Class | IoU |
|---|---|
| Trees | 0.8517 |
| Lush Bushes | 0.6990 |
| Dry Grass | 0.7007 |
| Dry Bushes | 0.4873 |
| Ground Clutter | 0.3647 |
| Flowers | 0.7246 |
| Logs | 0.5591 |
| Rocks | 0.4544 |
| Landscape | 0.7014 |
| Sky | 0.9860 |

Best class: Sky (0.9860), which covers large, uniform regions.
Hardest class: Ground Clutter (0.3647), which consists of small, heterogeneous objects.
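
The reported mean IoU is consistent with an unweighted average of the ten per-class values. A minimal NumPy sketch of how per-class IoU can be computed from a confusion matrix follows, assuming class-ID masks of equal shape whose values all lie in `[0, num_classes)`. The helper name `per_class_iou` is hypothetical, and this is not the evaluation code that produced the numbers above.

```python
import numpy as np

NUM_CLASSES = 10


def per_class_iou(pred, target, num_classes=NUM_CLASSES):
    """Return one IoU value per class (NaN for classes absent from both masks)."""
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)

    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    index = target * num_classes + pred
    confusion = np.bincount(index, minlength=num_classes ** 2)
    confusion = confusion.reshape(num_classes, num_classes)

    intersection = np.diag(confusion).astype(np.float64)
    # Union = predicted pixels + ground-truth pixels - pixels counted twice.
    union = confusion.sum(axis=0) + confusion.sum(axis=1) - intersection
    return np.where(union > 0, intersection / np.maximum(union, 1), np.nan)


# Sanity check on the reported figures: average the per-class IoUs.
reported = [0.8517, 0.6990, 0.7007, 0.4873, 0.3647,
            0.7246, 0.5591, 0.4544, 0.7014, 0.9860]
mean_of_reported = float(np.mean(reported))

if __name__ == "__main__":
    print(f"mean of per-class IoU: {mean_of_reported:.4f}")
```

Averaging with `np.nanmean` over the returned array gives a mean IoU that skips classes absent from a given image or split.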


βš™οΈ Inference

from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
from PIL import Image
import torch
import torch.nn.functional as F

# Load model
processor = SegformerImageProcessor.from_pretrained("PUSHPENDAR/desert-segformer")
model = SegformerForSemanticSegmentation.from_pretrained("PUSHPENDAR/desert-segformer")
model.eval()

# Load image
image = Image.open("desert_scene.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits  # (1, num_classes, H/4, W/4)

# Upsample to original size
upsampled = F.interpolate(
    logits,
    size=(image.height, image.width),
    mode="bilinear",
    align_corners=False
)
pred_mask = upsampled.argmax(dim=1)[0].numpy()  # (H, W)
print("Predicted class map shape:", pred_mask.shape)
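
To make the predicted class map easier to read, the IDs can be mapped back to the labels listed under Dataset Classes. A minimal sketch, assuming the input is an (H, W) array of class IDs like `pred_mask` from the snippet above, and that the model's output indices follow the Class ID order in that table (worth checking against `model.config.id2label`). `class_coverage` is a hypothetical helper name.

```python
import numpy as np

# Class ID -> label, copied from the Dataset Classes table.
ID2LABEL = {
    0: "Trees", 1: "Lush Bushes", 2: "Dry Grass", 3: "Dry Bushes",
    4: "Ground Clutter", 5: "Flowers", 6: "Logs", 7: "Rocks",
    8: "Landscape", 9: "Sky",
}


def class_coverage(mask):
    """Return {label: fraction of pixels} for each class present in the mask."""
    counts = np.bincount(np.asarray(mask).ravel(), minlength=len(ID2LABEL))
    total = counts.sum()
    return {
        ID2LABEL[class_id]: float(counts[class_id] / total)
        for class_id in range(len(ID2LABEL))
        if counts[class_id] > 0
    }


if __name__ == "__main__":
    # Tiny stand-in for pred_mask: two sky pixels, one landscape, one trees.
    demo_mask = np.array([[9, 9], [8, 0]])
    for label, fraction in class_coverage(demo_mask).items():
        print(f"{label:15s} {fraction:.1%}")
```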

📦 Repository Files

| File / Folder | Description |
|---|---|
| pytorch_model.bin | Fine-tuned SegFormer weights |
| config.json | Model configuration |
| preprocessor_config.json | Image processor settings |
| outputs/validation_metrics.json | Saved evaluation metrics |
| outputs/training_curves.png | Loss and mIoU training curves |
| outputs/test_predictions/ | Per-image prediction masks |

🚀 Run Locally

git clone https://huggingface.co/PUSHPENDAR/desert-segformer
cd desert-segformer
pip install transformers torch pillow
python app.py

πŸ“ Citation

If you use this model or dataset, please cite:

@misc{desert-segformer-2025,
  title     = {Desert Semantic Segmentation with SegFormer (MiT-B2)},
  author    = {Pushpendar Choudhary},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/PUSHPENDAR/desert-segformer}
}

📄 License

Apache 2.0. See LICENSE for details.
