🌴 Evaluating TerraMind's Thinking-in-Modality capability for oil palm segmentation

#18
by geografif - opened

Background

If you like chocolate spread on your toast, chances are you consume palm oil. This oil is produced from oil palms, often grown in areas that used to be rainforest and where cloud cover persists. Stakeholders interested in monitoring these palms, e.g.:

  • Environmental NGOs
  • Local governments and law enforcement
  • Sustainability certifiers
  • Service providers for compliance monitoring under the upcoming European Union Deforestation Regulation

often work in a conventional way, relying on cloud-free optical images. This hinders frequent monitoring.

The TerraMind foundation model, with its Thinking-in-Modality (TiM) approach, could help overcome the persistent cloud-cover problem in optical oil palm monitoring by generating an intermediate Synthetic Aperture Radar (SAR) modality. SAR images are less affected by clouds and carry a strong signal from the palms' distinctive canopy shape, making it easier to differentiate oil palm from surrounding non-palm vegetation.

How we used TerraMind

Most TerraMind demonstrations have focused on objects that are clearly distinguishable in optical images, such as burn scars and flooded areas. We evaluated the ability of TerraMind's TiM to segment oil palm from optical Sentinel-2 RGB images, in which oil palm looks as green as other vegetation. We obtained the oil palm masks and followed the image preprocessing steps from Descals et al. (2021). Experiments were conducted with a frozen terramind_v1_base_tim backbone; only the decoder was fine-tuned. For comparison, we also trained U-Net baselines on Sentinel-2 RGB and Sentinel-1 SAR images. Models trained on optical images were evaluated on both clear and cloudy test sets.
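To make the "frozen backbone, fine-tuned decoder" setup concrete, here is a minimal PyTorch sketch. It is not our training code: the two small modules are stand-ins for the TerraMind backbone and the segmentation decoder, and the optimizer, learning rate, loss, and tensor shapes are assumptions chosen only for illustration.

```python
import torch
from torch import nn

# Stand-in modules (assumption): NOT TerraMind, just small layers with the
# same roles as a pretrained backbone and a trainable segmentation decoder.
backbone = nn.Sequential(nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU())
decoder = nn.Conv2d(16, 2, kernel_size=1)  # 2 classes: non-palm, oil palm

# Freeze the backbone: no gradients are computed for its weights.
for param in backbone.parameters():
    param.requires_grad_(False)
backbone.eval()

# Only the decoder parameters are handed to the optimizer.
# AdamW and lr=1e-4 are illustrative choices, not the values used in the study.
optimizer = torch.optim.AdamW(decoder.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()


def train_step(images: torch.Tensor, masks: torch.Tensor) -> float:
    """Run one optimization step that updates the decoder only."""
    decoder.train()
    with torch.no_grad():  # backbone acts as a fixed feature extractor
        features = backbone(images)
    logits = decoder(features)  # shape: (batch, classes, H, W)
    loss = criterion(logits, masks)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Dummy RGB batch and binary masks, only to show the expected shapes.
images = torch.randn(4, 3, 32, 32)
masks = torch.randint(0, 2, (4, 32, 32))

backbone_before = [p.clone() for p in backbone.parameters()]
loss_value = train_step(images, masks)
```

Because the backbone weights never receive gradients, only the decoder has to be trained and stored per task, which keeps fine-tuning cheap compared with updating the whole model.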

Results

Accuracy metrics on the test set

| Model | Input | F1-score | mIoU |
|---|---|---|---|
| U-Net | Sentinel-2 RGB (clear) | 0.89 | 0.51 |
| U-Net | Sentinel-2 RGB (cloudy) | 0.79 | 0.29 |
| U-Net | Sentinel-1 SAR | 0.95 | 0.74 |
| TerraMind TiM base | Sentinel-2 RGB (clear) | 0.92 | 0.81 |
| TerraMind TiM base | Sentinel-2 RGB (cloudy) | 0.88 | 0.74 |
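For readers less familiar with the two metrics, the sketch below shows one common way to compute F1 and mean IoU from a predicted and a reference mask. The averaging choices are assumptions made for illustration (F1 for the oil palm class, mIoU as the unweighted mean over the two classes); the helper names and the toy masks are hypothetical and are not taken from our evaluation code.

```python
def confusion_counts(pred, truth, cls):
    """Count true positives, false positives, and false negatives for one class."""
    tp = fp = fn = 0
    for p, t in zip(pred, truth):
        if p == cls and t == cls:
            tp += 1
        elif p == cls:
            fp += 1
        elif t == cls:
            fn += 1
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def iou(tp, fp, fn):
    """IoU = TP / (TP + FP + FN); defined as 0.0 when the class never occurs."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def mean_iou(pred, truth, classes=(0, 1)):
    """Unweighted mean of the per-class IoU values."""
    scores = [iou(*confusion_counts(pred, truth, c)) for c in classes]
    return sum(scores) / len(scores)


# Toy flattened masks: 0 = non-palm, 1 = oil palm.
truth = [0, 0, 0, 0, 1, 1, 1, 1]
pred = [0, 0, 0, 1, 1, 1, 1, 0]

palm_f1 = f1_score(*confusion_counts(pred, truth, 1))   # 0.75
palm_iou = iou(*confusion_counts(pred, truth, 1))       # 0.6
miou = mean_iou(pred, truth)                            # 0.6
```

For a single class, the two metrics are tied by IoU = F1 / (2 - F1), so IoU is never higher than F1. Averaging over classes, as mIoU does, can move the two numbers apart.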

Snapshots of model predictions on test images (images: test_00001, testc_00001, test_00007, testc_00007)

Lessons learned and future outlook

  • Cloudy optical images are a poor fit for the U-Net baseline: F1 drops from 0.89 to 0.79 and mIoU from 0.51 to 0.29
  • However, TerraMind with SAR TiM seems to hold up much better under cloud cover: F1 goes from 0.92 to 0.88 and mIoU from 0.81 to 0.74
  • We plan to test a larger version of the TiM model and extend our evaluation to different regions
  • It is a bit of a stretch, but it would be great to lower the barrier for stakeholders to use this fancy method for oil palm monitoring by integrating the model into a desktop GIS plugin that produces oil palm segments when fed cloudy, near-real-time optical images. Base model inference should be feasible on a consumer-grade computer.

Authors

Afif Fauzan, student, University of Tartu (fauzan [at] ut [dot] ee)
Deha Umarhadi, student, Ludwig-Maximilians-Universität München (deha.agus.u [at] mail [dot] ugm [dot] ac [dot] id)

Acknowledgement

This is an extension of an earlier study supervised by Iris van Duren and Raian Vargas Maretto. We thank Serkan Girgin for recovering the data used in the study.

