AEmotionStudio/rave-models

@@ -7,9 +7,22 @@ License: **CC-BY-NC-4.0** — non-commercial use only. See the upstream model ca
 ## Files
 - `voice_vocalset_b2048_r48000_z16.ts` — **Voice (VocalSet)**. Voice timbre trained on the VocalSet corpus — covers vocal techniques across multiple singers. Use for the canonical 'make this sound like a voice' transfer.
 - `organ_bach_b2048_r48000_z16.ts` — **Organ (Bach)**. Pipe-organ timbre trained on Bach repertoire. Sustained harmonic textures — pairs well with melodic input.
-- `guitar_iil_b2048_r48000_z16.ts` — **Guitar (IIL)**. Acoustic / electric guitar timbre. Good demo for transferring voice / synth input into a plucked-string voice.
-- `birds_motherbird_b2048_r48000_z16.ts` — **Birds (Motherbird)**. Bird-vocalization corpus — chirps + textural transients. The 'weird' pick: produces wildly warped output for arbitrary input.
 - `water_pondbrain_b2048_r48000_z16.ts` — **Water (PondBrain)**. Water / aquatic textures. Treats any input as if it were running through liquid — bubbles, ripples, splashes.
 Each `.ts` checkpoint is accompanied by a `<stem>.json` sidecar with name, license, sample-rate, latent-dim, source URL, and a one-line description.

 ## Files
 - `voice_vocalset_b2048_r48000_z16.ts` — **Voice (VocalSet)**. Voice timbre trained on the VocalSet corpus — covers vocal techniques across multiple singers. Use for the canonical 'make this sound like a voice' transfer.
+- `voice-multi-b2048-r48000-z11.ts` — **Voice (Multi-speaker)**. Aggregated multi-speaker voice corpus. Wider speaker diversity than VocalSet — produces more 'average human' renders.
+- `voice_hifitts_b2048_r48000_z16.ts` — **Voice (HiFi-TTS)**. HiFi-TTS — high-fidelity expressive English speech corpus. Cleaner, more articulate than the multi-speaker model.
+- `voice_jvs_b2048_r44100_z16.ts` — **Voice (JVS, Japanese)**. JVS — Japanese multi-speaker voice corpus at 44.1 kHz. Use for Japanese-language sources or non-Latin phoneme structure.
+- `voice_vctk_b2048_r44100_z22.ts` — **Voice (VCTK, English)**. VCTK — English multi-speaker corpus from CSTR Edinburgh, 44.1 kHz. High 22-dim latent — captures more speaker idiosyncrasies.
+- `birds_motherbird_b2048_r48000_z16.ts` — **Birds (Motherbird)**. Bird-vocalization corpus — chirps + textural transients. The canonical 'weird' pick: produces wildly warped output for any arbitrary input.
+- `birds_dawnchorus_b2048_r48000_z8.ts` — **Birds (Dawn Chorus)**. Dense overlapping bird vocalizations recorded at dawn. Smaller 8-dim latent — outputs lean ensemble-textural over individual calls.
+- `birds_pluma_b2048_r48000_z12.ts` — **Birds (Pluma)**. Lighter, individual bird-call timbres. Mid-size 12-dim latent balances character + clarity.
+- `humpbacks_pondbrain_b2048_r48000_z20.ts` — **Humpback Whales**. Humpback-whale song. Long, slow, hauntingly-deep vocal contours — pairs well with sustained input.
+- `marinemammals_pondbrain_b2048_r48000_z20.ts` — **Marine Mammals**. Mixed marine-mammal vocalizations — dolphins, orcas, sea-life clicks and cries.
+- `guitar_iil_b2048_r48000_z16.ts` — **Guitar (IIL)**. Acoustic / electric guitar timbre. Good demo for transferring voice or synth input into a plucked-string voice.
 - `organ_bach_b2048_r48000_z16.ts` — **Organ (Bach)**. Pipe-organ timbre trained on Bach repertoire. Sustained harmonic textures — pairs well with melodic input.
+- `organ_archive_b2048_r48000_z16.ts` — **Organ (Archive)**. Historical pipe-organ recordings — broader, dustier textures than the Bach model. Good for film-score atmospheres.
+- `sax_soprano_franziskaschroeder_b2048_r48000_z20.ts` — **Soprano Sax (Schroeder)**. Soprano-saxophone extended techniques by Franziska Schroeder. Multiphonics, growls, key clicks. 20-dim latent — captures fine-grained articulation.
 - `water_pondbrain_b2048_r48000_z16.ts` — **Water (PondBrain)**. Water / aquatic textures. Treats any input as if it were running through liquid — bubbles, ripples, splashes.
+- `magnets_b2048_r48000_z8.ts` — **Magnets**. Ferromagnetic / electromagnetic resonance textures — metallic hums, distant industrial buzz, magnetized-string ringing.
+- `mrp_strengjavera_b2048_r44100_z16.ts` — **Magnetic Resonator Piano (Strengjavera)**. Magnetic Resonator Piano. Sustained metallic-string overtones produced by electromagnetically driving piano strings — 44.1 kHz.
+- `crozzoli_bigensemblesmusic_18d.ts` — **Big Ensemble Music (Crozzoli)**. Big-ensemble orchestral music (M. Crozzoli). Broad 18-dim latent for hugely-textured renders. Sample rate not embedded in filename — defaults to 48000; override via panel if needed.
 Each `.ts` checkpoint is accompanied by a `<stem>.json` sidecar with name, license, sample-rate, latent-dim, source URL, and a one-line description.
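Most filenames in the list above follow a `<name>_b<N>_r<sample-rate>_z<latent-dim>.ts` pattern, with two exceptions: `voice-multi-b2048-r48000-z11.ts` uses hyphens, and `crozzoli_bigensemblesmusic_18d.ts` carries only a latent dimension and falls back to 48000 Hz. The following is a minimal Python sketch of how a loader might recover these values from a filename. The function and type names are hypothetical, the meaning of the `b` field is not stated in this README, and the `<stem>.json` sidecar remains the authoritative source for sample rate and latent dimension.

```python
import re
from typing import NamedTuple, Optional

# Fallback stated in the README for names without an r<rate> field.
DEFAULT_SAMPLE_RATE = 48000


class CheckpointInfo(NamedTuple):
    stem: str
    b_value: Optional[int]      # the b<N> field; its meaning is not documented here
    sample_rate: int            # Hz
    latent_dim: Optional[int]   # the z<N> (or <N>d) field


# Matches a trailing "_b2048_r48000_z16" (underscore- or hyphen-separated).
_FULL = re.compile(r"[_-]b(\d+)[_-]r(\d+)[_-]z(\d+)$")
# Matches a trailing "_18d" as used by the Crozzoli checkpoint.
_DIM_ONLY = re.compile(r"[_-](\d+)d$")


def parse_checkpoint_name(filename: str) -> CheckpointInfo:
    """Extract sample rate and latent dim from a checkpoint filename (hypothetical helper)."""
    stem = filename[:-3] if filename.endswith(".ts") else filename
    match = _FULL.search(stem)
    if match:
        b_value, sample_rate, latent_dim = (int(g) for g in match.groups())
        return CheckpointInfo(stem, b_value, sample_rate, latent_dim)
    # No rate in the name: use the documented default and look for "<N>d".
    match = _DIM_ONLY.search(stem)
    latent_dim = int(match.group(1)) if match else None
    return CheckpointInfo(stem, None, DEFAULT_SAMPLE_RATE, latent_dim)


def sidecar_name(filename: str) -> str:
    """Return the `<stem>.json` sidecar name for a checkpoint."""
    return parse_checkpoint_name(filename).stem + ".json"


if __name__ == "__main__":
    for name in (
        "voice_vctk_b2048_r44100_z22.ts",
        "voice-multi-b2048-r48000-z11.ts",
        "crozzoli_bigensemblesmusic_18d.ts",
    ):
        print(parse_checkpoint_name(name), sidecar_name(name))
```

Parsing the filename is only a convenience for cases where the sidecar is missing; when the sidecar is present, its values should take precedence over anything inferred from the name.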