🌵 Desert Semantic Segmentation using SegFormer (MiT-B2)

A SegFormer transformer model fine-tuned on the Offroad Segmentation Training Dataset for 10-class semantic segmentation of desert terrain — built for UGV (Unmanned Ground Vehicle) autonomous navigation in off-road environments.

🧠 Model Architecture

Component	Detail
Framework	HuggingFace Transformers
Model	SegFormer
Backbone	MiT-B2 (`nvidia/mit-b2`)
Parameters	27,354,314 (all trainable)
Decoder	Lightweight MLP Head
Classes	10
Input Size	512 × 512
GPU	NVIDIA A100-PCIE-40GB

🗂 Dataset Classes (10 Categories)

Class ID	Raw Mask Value	Label
0	100	Trees
1	200	Lush Bushes
2	300	Dry Grass
3	500	Dry Bushes
4	550	Ground Clutter
5	600	Flowers
6	700	Logs
7	800	Rocks
8	7100	Landscape
9	10000	Sky
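
The raw mask values above are sparse uint16 codes, so they have to be remapped to contiguous class IDs before training. A minimal sketch of that remapping with NumPy follows. The names `RAW_TO_CLASS_ID` and `remap_mask` are illustrative, and the ignore value 255 for raw values missing from the table is an assumption; the source does not say how such pixels are handled.

```python
import numpy as np

# Raw mask value -> class ID, copied from the table above.
RAW_TO_CLASS_ID = {
    100: 0, 200: 1, 300: 2, 500: 3, 550: 4,
    600: 5, 700: 6, 800: 7, 7100: 8, 10000: 9,
}
IGNORE_INDEX = 255  # assumption: marker for raw values not listed in the table


def remap_mask(raw_mask: np.ndarray) -> np.ndarray:
    """Convert a uint16 raw-value mask into a uint8 class-ID mask."""
    # Lookup table spanning the full uint16 range, so any raw value is a valid index.
    lut = np.full(65536, IGNORE_INDEX, dtype=np.uint8)
    for raw_value, class_id in RAW_TO_CLASS_ID.items():
        lut[raw_value] = class_id
    return lut[raw_mask]


raw = np.array([[100, 10000], [7100, 42]], dtype=np.uint16)
class_ids = remap_mask(raw)  # 42 is not a known raw value, so it maps to IGNORE_INDEX
```

A lookup table keeps the conversion to a single vectorized indexing operation per mask.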

📊 Dataset Statistics

Split	Samples	Proportion
Train	2,142	75%
Validation	286	10%
Test	429	15%
Total	2,857	—

Image resolution: 960 × 540 (RGB)
Mask format: uint16 with raw class value encoding
Total annotated instances: 16,951

🎨 Augmentation Pipeline

11 augmentations specifically chosen for desert and off-road conditions:

Augmentation	Purpose
Color Jitter	Handles varying sun angles and color temperatures
Gamma Change	Simulates over/under-exposed outdoor scenes
Gaussian Noise	Robustness to sensor noise in UGV cameras
Motion / Gaussian / Median Blur	Motion blur from vehicle movement
Random Shadows	Shadows from rocks, vegetation, terrain
Random Fog	Dust storms and atmospheric haze
Brightness/Contrast	Atmospheric and lighting variations
Texture Mixup	Prevents overfitting to specific terrain patterns
Horizontal Flip	Improves directional generalization
Shift / Scale / Rotate	Spatial robustness
Coarse Dropout	Simulates sensor occlusion
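
One detail the table implies but does not spell out: geometric augmentations (flip, shift/scale/rotate) must be applied identically to the image and its mask, while photometric ones (gamma, noise, color) touch only the image. The NumPy sketch below illustrates that distinction with just a horizontal flip and a gamma change. The function names, probability, and gamma range are illustrative assumptions, not the author's pipeline.

```python
import numpy as np


def random_horizontal_flip(image, mask, rng, p=0.5):
    """Geometric: flip image and mask together so labels stay aligned."""
    if rng.random() < p:
        image = image[:, ::-1].copy()
        mask = mask[:, ::-1].copy()
    return image, mask


def random_gamma(image, rng, gamma_range=(0.7, 1.5)):
    """Photometric: change exposure of a uint8 image; the mask is not involved."""
    gamma = rng.uniform(*gamma_range)
    scaled = (image.astype(np.float32) / 255.0) ** gamma
    return np.clip(scaled * 255.0, 0, 255).astype(np.uint8)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(540, 960, 3), dtype=np.uint8)  # dummy RGB frame
mask = rng.integers(0, 10, size=(540, 960), dtype=np.uint8)       # dummy class-ID mask

# p=1.0 forces the flip so the effect is visible in this demo.
aug_image, aug_mask = random_horizontal_flip(image, mask, rng, p=1.0)
aug_image = random_gamma(aug_image, rng)
```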

⚙️ Training Configuration

Parameter	Value
Epochs	50
Batch Size	8
Learning Rate	6e-5
Optimizer	AdamW
Warmup Steps	500
Weight Decay	0.01
FP16	✅ Enabled
Best Model Metric	mean_iou
Eval Strategy	Per epoch
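
To put the warmup setting in context, the sketch below derives the step counts implied by the table and models the learning rate over training. It assumes a single GPU, no gradient accumulation, a kept final partial batch, and a linear warmup followed by linear decay; the source states only the warmup length, so the decay shape is an assumption.

```python
import math

TRAIN_SAMPLES = 2142  # from the dataset statistics
BATCH_SIZE = 8
EPOCHS = 50
BASE_LR = 6e-5
WARMUP_STEPS = 500

# Assumption: one GPU, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step: int) -> float:
    """Linear warmup to BASE_LR, then linear decay to zero (assumed schedule)."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - WARMUP_STEPS)


print(steps_per_epoch, total_steps)
```

Under these assumptions the script prints `268 13400`, so the 500 warmup steps cover a little under two epochs.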

📈 Evaluation Results

Evaluated on the validation split (286 images) using COCO-style mean IoU, reported here as the unweighted average of the ten per-class IoU values.

Metric	Value
Mean IoU	0.6529
Mean Accuracy	0.7592

Per-Class IoU

Class	IoU
Trees	0.8517
Lush Bushes	0.6990
Dry Grass	0.7007
Dry Bushes	0.4873
Ground Clutter	0.3647
Flowers	0.7246
Logs	0.5591
Rocks	0.4544
Landscape	0.7014
Sky	0.9860

Best class: Sky (0.9860) — large uniform regions
Hardest class: Ground Clutter (0.3647) — small, heterogeneous objects
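
For reference, per-class IoU is the intersection of predicted and ground-truth pixels for a class divided by their union. The sketch below computes it from a confusion matrix and confirms that the reported mean IoU is the plain average of the per-class values. The function name `per_class_iou` is illustrative, and this is not necessarily the exact metric code used for the numbers above.

```python
import numpy as np


def per_class_iou(pred, target, num_classes=10):
    """IoU per class from a confusion matrix of flattened class-ID maps."""
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    # Rows index the ground truth class, columns the predicted class.
    conf = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    # Classes absent from both maps get NaN so they can be excluded from a mean.
    return np.where(union > 0, intersection / np.maximum(union, 1), np.nan)


# Per-class IoU values copied from the table above.
reported = [0.8517, 0.6990, 0.7007, 0.4873, 0.3647,
            0.7246, 0.5591, 0.4544, 0.7014, 0.9860]
mean_iou = float(np.mean(reported))  # rounds to the reported 0.6529
```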

⚙️ Inference

from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
from PIL import Image
import torch
import torch.nn.functional as F

# Load model
processor = SegformerImageProcessor.from_pretrained("PUSHPENDAR/desert-segformer")
model = SegformerForSemanticSegmentation.from_pretrained("PUSHPENDAR/desert-segformer")
model.eval()

# Load image
image = Image.open("desert_scene.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Predict
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits  # (1, num_classes, H/4, W/4) of the preprocessed input, not the original image

# Upsample to original size
upsampled = F.interpolate(
    logits,
    size=(image.height, image.width),
    mode="bilinear",
    align_corners=False
)
pred_mask = upsampled.argmax(dim=1)[0].cpu().numpy()  # (H, W) class IDs; .cpu() keeps this working if the model runs on a GPU
print("Predicted class map shape:", pred_mask.shape)
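
The predicted mask holds class IDs 0 to 9, so a quick way to sanity-check an output is to translate the IDs back to labels and see how much of the frame each class covers. The sketch below does that with NumPy. The helper name `class_coverage` is illustrative, and a small synthetic mask stands in for `pred_mask` so the block runs on its own.

```python
import numpy as np

# Class ID -> label, copied from the dataset class table.
CLASS_NAMES = ["Trees", "Lush Bushes", "Dry Grass", "Dry Bushes", "Ground Clutter",
               "Flowers", "Logs", "Rocks", "Landscape", "Sky"]


def class_coverage(pred_mask: np.ndarray) -> dict:
    """Return the fraction of pixels assigned to each class present in the mask."""
    counts = np.bincount(pred_mask.ravel(), minlength=len(CLASS_NAMES))
    return {
        CLASS_NAMES[i]: float(counts[i]) / pred_mask.size
        for i in range(len(CLASS_NAMES))
        if counts[i] > 0
    }


# Stand-in for pred_mask: top half Sky (9), bottom half Landscape (8).
demo_mask = np.full((4, 6), 8, dtype=np.int64)
demo_mask[:2] = 9
coverage = class_coverage(demo_mask)
```

With a real prediction, pass `pred_mask` from the inference code in place of `demo_mask`.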

📦 Repository Files

File / Folder	Description
`pytorch_model.bin`	Fine-tuned SegFormer weights
`config.json`	Model configuration
`preprocessor_config.json`	Image processor settings
`outputs/validation_metrics.json`	Saved evaluation metrics
`outputs/training_curves.png`	Loss and mIoU training curves
`outputs/test_predictions/`	Per-image prediction masks

🚀 Run Locally

git clone https://huggingface.co/PUSHPENDAR/desert-segformer
cd desert-segformer
pip install transformers torch pillow
python app.py

📝 Citation

If you use this model or dataset, please cite:

@misc{desert-segformer-2025,
  title     = {Desert Semantic Segmentation with SegFormer (MiT-B2)},
  author    = {Pushpendar Choudhary},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/PUSHPENDAR/desert-segformer}
}

📄 License

Apache 2.0 — see LICENSE for details.
