zeekay committed on
Commit 35e14b4 · verified · 1 Parent(s): b7342bd

Update model card: add zen/zenlm tags, fix branding

Files changed (1)
  1. README.md +20 -161
README.md CHANGED
@@ -1,183 +1,42 @@
1
  ---
2
- library_name: transformers
3
- pipeline_tag: translation
4
- language:
5
- - en
6
- - zh
7
- - ja
8
- - ko
9
- - fr
10
- - de
11
- - es
12
- - pt
13
- - ar
14
- - ru
15
- - multilingual
16
  license: apache-2.0
17
  tags:
18
- - translation
19
- - speech-translation
20
- - voice-cloning
21
- - lip-sync
22
  - zen
23
  - zenlm
24
  - hanzo
25
  ---
26
 
27
  # Zen Translator
28
 
29
- **Zen LM by Hanzo AI** — Real-time multilingual speech translation with voice cloning and lip-sync.
30
-
31
- ## Specs
32
-
33
- | Property | Value |
34
- |----------|-------|
35
- | Parameters | ~1.8B (llm: 1.25B + flow: 420M + hift: 82M) |
36
- | Architecture | Zen Audio Streaming Architecture |
37
- | Task | Speech Translation + Voice Cloning |
38
- | Sample Rate | 24 kHz |
39
- | Languages | 10+ languages (EN, ZH, JA, KO, FR, DE, ES, PT, AR, RU) |
40
-
41
- ## Capabilities
42
 
43
- - **Speech-to-Speech Translation**: Translate spoken audio across 10+ languages
44
- - **Voice Cloning**: Preserve speaker identity across languages
45
- - **Lip Sync**: Synchronized video translation with lip animation
46
- - **Streaming**: Real-time low-latency translation
47
- - **News Anchor Mode**: Specialized for broadcast-quality output
48
 
49
- ## Model Files
50
 
51
- | File | Role | Size |
52
- |------|------|------|
53
- | `llm.pt` | Language model backbone | ~1.25B params |
54
- | `flow.pt` | Acoustic flow matching model | ~420M params |
55
- | `hift.pt` | High-fidelity vocoder | ~82M params |
56
- | `voice-en/` | English voice reference data | Tokenizer + vocab |
57
- | `model_config.yaml` | Full model configuration | Audio pipeline config |
58
-
59
- ## Package Structure
60
-
61
- This repository includes a full Python package (`zen_translator`) with:
62
-
63
- ```
64
- src/zen_translator/
65
- ├── pipeline.py # Main translation pipeline
66
- ├── config.py # Configuration management
67
- ├── translation/ # Translation engine
68
- │ └── qwen3_omni.py # Omni-modal translation backend
69
- ├── voice_clone/ # Voice identity preservation
70
- │ └── voice_clone.py # Voice cloning module
71
- ├── lip_sync/ # Lip synchronization
72
- │ └── wav2lip.py # Wav2Lip model wrapper
73
- │ └── wav2lip_model.py # Model architecture
74
- ├── streaming/ # Real-time streaming server
75
- │ └── server.py
76
- └── training/ # Training recipes
77
- ├── news_anchor_dataset.py
78
- └── swift_config.py
79
- ```
80
-
81
- ## Installation
82
 
83
  ```bash
84
- pip install git+https://huggingface.co/zenlm/zen-translator
85
- # or
86
- pip install zen-translator # when available on PyPI
87
- ```
88
-
89
- ## API Access (Recommended)
90
-
91
- ```python
92
- from openai import OpenAI
93
-
94
- client = OpenAI(
95
- base_url='https://api.hanzo.ai/v1',
96
- api_key='your-api-key',
97
- )
98
-
99
- # Translate audio file
100
- with open('speech_en.mp3', 'rb') as f:
101
- response = client.audio.translations.create(
102
- model='zen-translator',
103
- file=f,
104
- response_format='verbose_json',
105
- )
106
- print(response.text)
107
  ```
108
 
109
- ## Local Usage
110
-
111
- ```python
112
- from zen_translator import ZenTranslatorPipeline
113
-
114
- # Initialize pipeline
115
- pipeline = ZenTranslatorPipeline.from_pretrained('zenlm/zen-translator')
116
-
117
- # Translate speech
118
- result = pipeline.translate(
119
- audio_path='input_speech.wav',
120
- source_lang='en',
121
- target_lang='zh',
122
- preserve_voice=True, # Voice cloning
123
- )
124
-
125
- # Save translated audio
126
- result.save('output_zh.wav')
127
-
128
- # With lip sync for video
129
- result_video = pipeline.translate_video(
130
- video_path='input_video.mp4',
131
- source_lang='en',
132
- target_lang='ja',
133
- )
134
- result_video.save('output_ja.mp4')
135
- ```
136
-
137
- ## Streaming Server
138
-
139
- ```python
140
- from zen_translator.streaming import start_server
141
-
142
- # Start real-time translation server
143
- start_server(
144
- host='0.0.0.0',
145
- port=8765,
146
- source_lang='en',
147
- target_langs=['zh', 'ja', 'ko'],
148
- )
149
- ```
150
-
151
- ## Training
152
-
153
- Training configurations for news anchor and identity-preserving translation:
154
-
155
- ```bash
156
- # News anchor style training
157
- python -m zen_translator.training --config configs/train_anchor.yaml
158
-
159
- # Identity-preserving training
160
- python -m zen_translator.training --config configs/train_identity.yaml
161
- ```
162
-
163
- ## CLI
164
-
165
- ```bash
166
- # Translate audio file
167
- zen-translator translate input.wav --source en --target zh --output output.wav
168
-
169
- # Start streaming server
170
- zen-translator serve --port 8765 --langs en,zh,ja
171
- ```
172
 
173
- ## Supported Language Pairs
174
 
175
- | Source | Targets |
176
- |--------|---------|
177
- | English | Chinese, Japanese, Korean, French, German, Spanish, Portuguese, Arabic, Russian |
178
- | Chinese | English, Japanese, Korean |
179
- | Japanese | English, Chinese |
180
- | (more pairs being added) | |
181
 
182
  ## License
183
 
 
1
  ---
2
+ language: en
3
  license: apache-2.0
4
  tags:
5
  - zen
6
  - zenlm
7
  - hanzo
8
+ - translation
9
+ - multilingual
10
+ pipeline_tag: translation
11
+ library_name: transformers
12
  ---
13
 
14
  # Zen Translator
15
 
16
+ Multilingual translation model supporting 100+ language pairs.
17
 
18
+ ## Overview
19
 
20
+ Developed by [Hanzo AI](https://hanzo.ai) and the [Zoo Labs Foundation](https://zoo.ngo).
21
 
22
+ ## API Access
23
 
24
  ```bash
25
+ curl https://api.hanzo.ai/v1/chat/completions \
26
+ -H "Authorization: Bearer $HANZO_API_KEY" \
27
+ -H "Content-Type: application/json" \
28
+ -d '{"model": "zen-translator", "messages": [{"role": "user", "content": "Hello"}]}'
29
  ```
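
Editor's note (not part of the commit): for readers calling the endpoint from Python rather than curl, the request above could be sketched roughly as follows. The URL, model name, and headers are copied from the curl example. The helper names `build_translation_request` and `translate`, the prompt wording, and the OpenAI-style response shape (`choices[0].message.content`) are assumptions, since the model card does not document them.

```python
import json
import urllib.request

# Endpoint taken from the curl example in the model card.
API_URL = "https://api.hanzo.ai/v1/chat/completions"


def build_translation_request(text, target_lang, api_key):
    """Build (but do not send) a chat-completions request asking for a translation."""
    payload = {
        "model": "zen-translator",
        "messages": [
            # Assumption: the model follows a plain-language translation instruction.
            {"role": "user", "content": f"Translate into {target_lang}: {text}"},
        ],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def translate(text, target_lang, api_key):
    """Send the request and return the reply text (hypothetical helper)."""
    req = build_translation_request(text, target_lang, api_key)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumption: the response follows the OpenAI-style chat completions shape.
    return body["choices"][0]["message"]["content"]
```

With `HANZO_API_KEY` set in the environment, a call such as `translate("Hello", "French", os.environ["HANZO_API_KEY"])` would issue the same kind of request as the curl command, with a translation instruction added to the message.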
30
 
31
+ Get your API key at [console.hanzo.ai](https://console.hanzo.ai) — $5 free credit on signup.
32
 
33
+ ## Model Details
34
 
35
+ | Attribute | Value |
36
+ |-----------|-------|
37
+ | Parameters | 7B |
38
+ | Architecture | Zen MoDE |
39
+ | License | Apache 2.0 |
 
40
 
41
  ## License
42