Xobdo Boroxa 1.0 — Assamese Neural TTS Model

Xobdo Boroxa 1.0 is an Assamese neural Text-to-Speech (TTS) model developed as a collaborative project between Assam AI Initiative (AAII-আই) and Xobdo.org.

The model was trained on about 4 hours of high-quality Assamese speech data. It uses an ESPnet-based pipeline with FastSpeech2 as the acoustic model and Multi-band MelGAN as the vocoder.


Model Overview

The TTS system is based on:

  • Acoustic model: FastSpeech2
  • Vocoder: Multi-band MelGAN
  • Toolkit: ESPnet
  • Language: Assamese
  • Training data: Around 4 hours of high-quality Assamese speech
  • Input: Assamese Unicode text
  • Output: Assamese speech waveform
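
Because the model expects Assamese Unicode text, an application may want to check its input before synthesis. The following is a minimal sketch of such a check, not code from this project. It relies on the fact that Assamese is encoded in the Unicode Bengali block (U+0980 to U+09FF); the function names and the 0.5 threshold are illustrative assumptions. Note that Bengali shares this block, so the check cannot tell the two languages apart.

```python
import unicodedata

# Assamese is written in the Bengali-Assamese script, which Unicode
# encodes in the "Bengali" block, U+0980 to U+09FF.
BLOCK_START = 0x0980
BLOCK_END = 0x09FF


def in_assamese_block(ch: str) -> bool:
    """Return True if the character lies in the Bengali-Assamese block."""
    return BLOCK_START <= ord(ch) <= BLOCK_END


def assamese_ratio(text: str) -> float:
    """Fraction of letters and combining marks that fall in the block.

    Whitespace, digits, and punctuation are ignored, so a sentence
    ending in a danda or a question mark is not penalised.
    """
    chars = [ch for ch in text if unicodedata.category(ch)[0] in ("L", "M")]
    if not chars:
        return 0.0
    hits = sum(1 for ch in chars if in_assamese_block(ch))
    return hits / len(chars)


def looks_assamese(text: str, threshold: float = 0.5) -> bool:
    """Hypothetical helper: True when most letters are in the block.

    The default threshold of 0.5 is an arbitrary illustrative choice.
    """
    return assamese_ratio(text) >= threshold
```

For example, `looks_assamese("অসমীয়া ভাষা")` returns `True`, while `looks_assamese("Hello world")` returns `False`.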

The goal of this model is to provide a lightweight and usable Assamese neural TTS system for browser-based and community-focused language technology applications.


Applications Built Using This Model

Using this model, we have developed two free public-facing Assamese reading tools.

1. Assamese Text-to-Speech Web App

A browser-based Assamese TTS web application that can read Assamese Unicode text directly from webpages or pasted text.

Key features:

  • Reads Assamese Unicode text
  • Supports webpage URL input
  • Supports direct text input
  • Runs speech synthesis inside the browser
  • No server-side API call required for synthesis
  • Model is cached in the browser after first load
  • Can work offline after the model is loaded
  • Useful for accessibility, reading support, education, and Assamese digital content consumption

Project page: https://www.xobdo.org/project_pathak/
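
The "cached after first load, then usable offline" behaviour listed above follows a common cache-first pattern. The sketch below shows that pattern in Python using the local filesystem. It is only an illustration of the idea: the web app runs in the browser and its actual storage mechanism is not described here, and `load_model_bytes`, `cache_path`, and `fetch` are hypothetical names.

```python
from pathlib import Path
from typing import Callable


def load_model_bytes(cache_path: Path, fetch: Callable[[], bytes]) -> bytes:
    """Return model bytes, downloading only when the cache is empty.

    cache_path: where the model file is stored locally (hypothetical).
    fetch: a zero-argument function that downloads the model bytes.
    """
    if cache_path.exists():
        # Cache hit: no network access is needed, so this works offline.
        return cache_path.read_bytes()

    # Cache miss: download once, then store for later runs.
    data = fetch()
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    cache_path.write_bytes(data)
    return data
```

Passing the download step in as `fetch` keeps the caching logic independent of any particular URL or HTTP client, and makes it easy to test with a stand-in function.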


2. Assamese TTS Chrome Extension

A Chrome extension for reading Assamese text directly from webpages.

Key features:

  • Select Assamese text on any webpage and listen to it
  • Right-click and choose full-page reading
  • Runs locally inside the browser
  • No API call required for speech synthesis
  • Model is downloaded once and cached locally
  • Works offline after the first model load
  • Designed for desktop Chrome users
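
When reading a long selection or a full page, it is often practical to synthesize one sentence at a time. Whether the extension does this is not stated here, so the following is only a sketch under that assumption. It splits on the danda (।), the question mark, the exclamation mark, and line breaks; `split_sentences` is a hypothetical helper name.

```python
import re
from typing import List

# A sentence is a run of characters that are not terminators or newlines,
# optionally followed by one terminator: danda (U+0964), "?" or "!".
_SENTENCE = re.compile(r"[^\u0964?!\n]+[\u0964?!]?")


def split_sentences(text: str) -> List[str]:
    """Split Assamese text into sentences, keeping the end punctuation."""
    pieces = (match.strip() for match in _SENTENCE.findall(text))
    # Drop pieces that were only whitespace.
    return [piece for piece in pieces if piece]
```

Each returned sentence could then be passed to the synthesizer in turn, so playback can start before the whole page has been processed.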

Links:

Acknowledgement

This model and the related applications are part of a collaborative effort by Assam AI Initiative (AAII-আই) and Xobdo.org to support Assamese language technology and make Assamese digital content more accessible.


Disclaimer

This is an early version of the Assamese TTS model. It aims to generate clear and natural Assamese speech, but pronunciation, prosody, and expressiveness still have room to improve, which more training data and future model updates may address.
