Upload README.md
README.md CHANGED
@@ -7,9 +7,22 @@ License: **CC-BY-NC-4.0** — non-commercial use only. See the upstream model ca
 ## Files
 
 - `voice_vocalset_b2048_r48000_z16.ts` — **Voice (VocalSet)**. Voice timbre trained on the VocalSet corpus — covers vocal techniques across multiple singers. Use for the canonical 'make this sound like a voice' transfer.
+- `voice-multi-b2048-r48000-z11.ts` — **Voice (Multi-speaker)**. Aggregated multi-speaker voice corpus. Wider speaker diversity than VocalSet — produces more 'average human' renders.
+- `voice_hifitts_b2048_r48000_z16.ts` — **Voice (HiFi-TTS)**. HiFi-TTS — high-fidelity expressive English speech corpus. Cleaner, more articulate than the multi-speaker model.
+- `voice_jvs_b2048_r44100_z16.ts` — **Voice (JVS, Japanese)**. JVS — Japanese multi-speaker voice corpus at 44.1 kHz. Use for Japanese-language sources or non-Latin phoneme structure.
+- `voice_vctk_b2048_r44100_z22.ts` — **Voice (VCTK, English)**. VCTK — English multi-speaker corpus from CSTR Edinburgh, 44.1 kHz. High 22-dim latent — captures more speaker idiosyncrasies.
+- `birds_motherbird_b2048_r48000_z16.ts` — **Birds (Motherbird)**. Bird-vocalization corpus — chirps + textural transients. The canonical 'weird' pick: produces wildly warped output for any arbitrary input.
+- `birds_dawnchorus_b2048_r48000_z8.ts` — **Birds (Dawn Chorus)**. Dense overlapping bird vocalizations recorded at dawn. Smaller 8-dim latent — outputs lean ensemble-textural over individual calls.
+- `birds_pluma_b2048_r48000_z12.ts` — **Birds (Pluma)**. Lighter, individual bird-call timbres. Mid-size 12-dim latent balances character + clarity.
+- `humpbacks_pondbrain_b2048_r48000_z20.ts` — **Humpback Whales**. Humpback-whale song. Long, slow, hauntingly-deep vocal contours — pairs well with sustained input.
+- `marinemammals_pondbrain_b2048_r48000_z20.ts` — **Marine Mammals**. Mixed marine-mammal vocalizations — dolphins, orcas, sea-life clicks and cries.
+- `guitar_iil_b2048_r48000_z16.ts` — **Guitar (IIL)**. Acoustic / electric guitar timbre. Good demo for transferring voice or synth input into a plucked-string voice.
 - `organ_bach_b2048_r48000_z16.ts` — **Organ (Bach)**. Pipe-organ timbre trained on Bach repertoire. Sustained harmonic textures — pairs well with melodic input.
-- `
-- `
+- `organ_archive_b2048_r48000_z16.ts` — **Organ (Archive)**. Historical pipe-organ recordings — broader, dustier textures than the Bach model. Good for film-score atmospheres.
+- `sax_soprano_franziskaschroeder_b2048_r48000_z20.ts` — **Soprano Sax (Schroeder)**. Soprano-saxophone extended techniques by Franziska Schroeder. Multiphonics, growls, key clicks. 20-dim latent — captures fine-grained articulation.
 - `water_pondbrain_b2048_r48000_z16.ts` — **Water (PondBrain)**. Water / aquatic textures. Treats any input as if it were running through liquid — bubbles, ripples, splashes.
+- `magnets_b2048_r48000_z8.ts` — **Magnets**. Ferromagnetic / electromagnetic resonance textures — metallic hums, distant industrial buzz, magnetized-string ringing.
+- `mrp_strengjavera_b2048_r44100_z16.ts` — **Magnetic Resonator Piano (Strengjavera)**. Magnetic Resonator Piano. Sustained metallic-string overtones produced by electromagnetically driving piano strings — 44.1 kHz.
+- `crozzoli_bigensemblesmusic_18d.ts` — **Big Ensemble Music (Crozzoli)**. Big-ensemble orchestral music (M. Crozzoli). Broad 18-dim latent for hugely-textured renders. Sample rate not embedded in filename — defaults to 48000; override via panel if needed.
 
 Each `.ts` checkpoint is accompanied by a `<stem>.json` sidecar with name, license, sample-rate, latent-dim, source URL, and a one-line description.
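The checkpoint names in the file list appear to encode the sample rate as `r<rate>` and the latent size as `z<dim>` (or `<dim>d` in the Crozzoli name), with a stated fallback of 48000 when no rate is present. The sketch below reads those fields from a filename and derives the `<stem>.json` sidecar name. The helper `parse_checkpoint_name` and its result keys are hypothetical, not part of the repository, and the `b2048` field is deliberately left uninterpreted because the README does not define it.

```python
import re
from pathlib import Path

# Fallback stated in the file list for names that carry no r<rate> field.
DEFAULT_SAMPLE_RATE = 48000


def parse_checkpoint_name(filename):
    """Hypothetical helper: read sample rate and latent size from a checkpoint name."""
    stem = Path(filename).stem
    info = {
        "stem": stem,
        "sample_rate": DEFAULT_SAMPLE_RATE,
        "latent_dim": None,           # stays None if the name has no latent field
        "sidecar": f"{stem}.json",    # sidecar naming as described in the README
    }
    # Most names separate fields with "_", one uses "-", so split on both.
    for field in re.split(r"[_-]", stem):
        rate = re.fullmatch(r"r(\d+)", field)
        latent = re.fullmatch(r"z(\d+)", field) or re.fullmatch(r"(\d+)d", field)
        if rate:
            info["sample_rate"] = int(rate.group(1))
        elif latent:
            info["latent_dim"] = int(latent.group(1))
    return info


if __name__ == "__main__":
    for name in (
        "voice_vctk_b2048_r44100_z22.ts",
        "voice-multi-b2048-r48000-z11.ts",
        "crozzoli_bigensemblesmusic_18d.ts",
    ):
        print(parse_checkpoint_name(name))
```

Treat the parsed values only as a convenience: the sidecar JSON is the documented source for sample rate and latent size, and its exact key names are not given here, so the sketch does not read it.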