Qwen3-ASR-1.7B — Core AI

Qwen3-ASR-1.7B speech-to-text, converted for Apple Core AI and running on-device (iPhone + Mac). It is the zoo's first ASR model: an AuT audio encoder feeds a Qwen3 decoder on the pipelined engine. Audio embeddings are bound to one static input buffer, and the decoder emits output in the form {lang}<asr_text>{text}. Clips of up to 30 s, 52 languages, automatic language detection.
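The {lang}<asr_text>{text} output shape described above can be split with a few lines of plain Swift. This is a minimal sketch, not the kit's own parser: `ParsedTranscript` and `parseASROutput` are hypothetical names, and the exact spelling of the language tag in the raw decoder string is an assumption. The kit's result type already exposes `text` and `language`, so this only matters if you handle raw decoder strings yourself.

```swift
import Foundation

/// Language tag and transcript split out of one raw decoder string.
/// (Hypothetical type for illustration; not part of CoreAIKit.)
struct ParsedTranscript: Equatable {
    let language: String?
    let text: String
}

/// Splits "{lang}<asr_text>{text}" at the first "<asr_text>" marker.
/// If the marker is missing, the whole string is treated as text.
func parseASROutput(_ raw: String) -> ParsedTranscript {
    let marker = "<asr_text>"
    guard let range = raw.range(of: marker) else {
        return ParsedTranscript(
            language: nil,
            text: raw.trimmingCharacters(in: .whitespacesAndNewlines))
    }
    // Everything before the marker is the language tag, everything after is the transcript.
    let lang = raw[..<range.lowerBound].trimmingCharacters(in: .whitespacesAndNewlines)
    let text = raw[range.upperBound...].trimmingCharacters(in: .whitespacesAndNewlines)
    return ParsedTranscript(language: lang.isEmpty ? nil : lang, text: text)
}
```

For an assumed raw string such as "English<asr_text>Hello.", this yields the language "English" and the text "Hello.".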

Use it

▶️ Run it (source) — the Transcribe runner (GUI + CLI, one app for every speech-to-text model in the catalog):

git clone https://github.com/john-rocky/coreai-kit
open coreai-kit/Examples/Transcribe/Transcribe.xcodeproj
# → Run, then pick "Qwen3-ASR 1.7B" in the model picker

# agents / headless (macOS):
cd coreai-kit/Examples/Transcribe
swift run transcribe-cli --model qwen3-asr-1.7b --audio sample.wav

💻 Build with it: a complete example. The glue is kit API, so it runs as pasted:

import CoreAIKit
import Foundation

// Placeholder path: point this at your own audio file.
let url = URL(fileURLWithPath: "sample.wav")

let transcriber = try await KitTranscriber(catalog: "qwen3-asr-1.7b")
let samples = try AudioFile.pcm16kMono(url)  // any wav/m4a/mp3 → 16 kHz mono Float
let result = try await transcriber.transcribe(samples: samples)
// result.text, result.language (52 languages)

The take-home is Examples/Transcribe/Sources/QuickStart.swift: the same code as a single typed function with no UI, called by both the runner's GUI and its CLI. To record, use MicRecorder (kit API), which captures mic audio as 16 kHz mono [Float]; the record button and permission prompt are your app's own chrome.
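Because the model takes clips of at most 30 s and the kit works on 16 kHz mono [Float], a longer recording has to be cut into windows of at most 480,000 samples before it is transcribed. The card does not say whether the kit does this for you, so the sketch below is an assumption-laden illustration only: `chunkSamples` is a hypothetical helper, and it makes hard cuts that can split a word across two windows.

```swift
/// Splits 16 kHz mono samples into consecutive windows of at most `maxSeconds`.
/// (Hypothetical helper; hard cuts, no overlap, no silence detection.)
func chunkSamples(_ samples: [Float],
                  sampleRate: Int = 16_000,
                  maxSeconds: Int = 30) -> [[Float]] {
    let window = sampleRate * maxSeconds
    guard window > 0, !samples.isEmpty else { return [] }
    // Walk the buffer in steps of `window`; the last chunk may be shorter.
    return stride(from: 0, to: samples.count, by: window).map { start in
        Array(samples[start..<min(start + window, samples.count)])
    }
}
```

Each chunk can then be passed to `transcribe(samples:)` in turn and the texts joined. Cutting at silences rather than at fixed offsets would likely give cleaner boundaries.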

Integration checklist

SPM: https://github.com/john-rocky/coreai-kit → product CoreAIKit
Info.plist: NSMicrophoneUsageDescription — only if you record
Entitlements: none needed (macOS)
First run downloads the model — 3.1 GB (Mac) — then it loads from the local cache (Application Support; progress via the downloadProgress callback)
Measure in Release — Debug is ~3× slower on per-token host work
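For a package-based project, the SPM line of the checklist might translate into a manifest like the sketch below. The repository URL and the CoreAIKit product name come from the checklist; the target name `MyTranscriber`, the tools version, and the `main` branch are assumptions, and minimum platform versions are left out because the card does not state them.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyTranscriber",  // hypothetical app name
    // Assumption: add `platforms:` to match whatever the kit's own manifest requires.
    dependencies: [
        // URL from the checklist. The branch is an assumption; pin a tag if the repo publishes one.
        .package(url: "https://github.com/john-rocky/coreai-kit", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "MyTranscriber",
            dependencies: [
                // Product name from the checklist; the package identity is derived from the URL.
                .product(name: "CoreAIKit", package: "coreai-kit"),
            ]
        ),
    ]
)
```

In an Xcode app project the same dependency is added through File > Add Package Dependencies with the URL above.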

Driven by CoreAIKit KitASRModel:

let asr = try await KitASRModel(model: .qwen3ASR1_7B)
let r = try await asr.transcribe(samples: pcm16kMono)   // -> (language, text)
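When following the checklist's advice to measure in Release, a small timing wrapper around a call like the one above is usually enough. The sketch below is not part of the kit: `timed` and `realTimeFactor` are hypothetical helpers that report wall-clock seconds and the ratio of processing time to audio duration.

```swift
import Foundation

/// Real-time factor: processing seconds per second of audio (below 1 is faster than real time).
func realTimeFactor(processingSeconds: Double,
                    sampleCount: Int,
                    sampleRate: Double = 16_000) -> Double {
    let audioSeconds = Double(sampleCount) / sampleRate
    return audioSeconds > 0 ? processingSeconds / audioSeconds : .infinity
}

/// Runs `work` once and returns its result together with the elapsed wall-clock seconds.
func timed<T>(_ work: () async throws -> T) async rethrows -> (result: T, seconds: Double) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = try await work()
    let elapsed = DispatchTime.now().uptimeNanoseconds - start
    return (result, Double(elapsed) / 1_000_000_000)
}
```

A call might look like `let (r, seconds) = try await timed { try await asr.transcribe(samples: pcm16kMono) }`, followed by `realTimeFactor(processingSeconds: seconds, sampleCount: pcm16kMono.count)`. The first run includes model load (and possibly the download), so it makes sense to time a second call.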

Layout: gpu-pipelined/ holds the decoder bundle (*_decode_int8hu_n390_s1, int8) + the paired AuT encoder (*_audio_encoder_fp16_k30, fp16). Same bundles on iOS and macOS.

App: coreai-audio (Transcribe tab — pick Qwen3-ASR or Whisper large-v3-turbo). Card: zoo/qwen3-asr.md.
