AEmotionStudio committed on
Commit 38455ae · verified · 1 Parent(s): e3b0627

Upload README.md

Files changed (1)
  1. README.md +15 -2
README.md CHANGED
@@ -7,9 +7,22 @@ License: **CC-BY-NC-4.0** — non-commercial use only. See the upstream model ca
  ## Files
 
  - `voice_vocalset_b2048_r48000_z16.ts` — **Voice (VocalSet)**. Voice timbre trained on the VocalSet corpus — covers vocal techniques across multiple singers. Use for the canonical 'make this sound like a voice' transfer.
+ - `voice-multi-b2048-r48000-z11.ts` — **Voice (Multi-speaker)**. Aggregated multi-speaker voice corpus. Wider speaker diversity than VocalSet — produces more 'average human' renders.
+ - `voice_hifitts_b2048_r48000_z16.ts` — **Voice (HiFi-TTS)**. HiFi-TTS — high-fidelity expressive English speech corpus. Cleaner, more articulate than the multi-speaker model.
+ - `voice_jvs_b2048_r44100_z16.ts` — **Voice (JVS, Japanese)**. JVS — Japanese multi-speaker voice corpus at 44.1 kHz. Use for Japanese-language sources or non-Latin phoneme structure.
+ - `voice_vctk_b2048_r44100_z22.ts` — **Voice (VCTK, English)**. VCTK — English multi-speaker corpus from CSTR Edinburgh, 44.1 kHz. High 22-dim latent — captures more speaker idiosyncrasies.
+ - `birds_motherbird_b2048_r48000_z16.ts` — **Birds (Motherbird)**. Bird-vocalization corpus — chirps + textural transients. The canonical 'weird' pick: produces wildly warped output for any arbitrary input.
+ - `birds_dawnchorus_b2048_r48000_z8.ts` — **Birds (Dawn Chorus)**. Dense overlapping bird vocalizations recorded at dawn. Smaller 8-dim latent — outputs lean toward ensemble textures rather than individual calls.
+ - `birds_pluma_b2048_r48000_z12.ts` — **Birds (Pluma)**. Lighter, individual bird-call timbres. Mid-size 12-dim latent balances character + clarity.
+ - `humpbacks_pondbrain_b2048_r48000_z20.ts` — **Humpback Whales**. Humpback-whale song. Long, slow, hauntingly deep vocal contours — pairs well with sustained input.
+ - `marinemammals_pondbrain_b2048_r48000_z20.ts` — **Marine Mammals**. Mixed marine-mammal vocalizations — dolphins, orcas, sea-life clicks and cries.
+ - `guitar_iil_b2048_r48000_z16.ts` — **Guitar (IIL)**. Acoustic / electric guitar timbre. Good demo for transferring voice or synth input into a plucked-string voice.
  - `organ_bach_b2048_r48000_z16.ts` — **Organ (Bach)**. Pipe-organ timbre trained on Bach repertoire. Sustained harmonic textures — pairs well with melodic input.
- - `guitar_iil_b2048_r48000_z16.ts` — **Guitar (IIL)**. Acoustic / electric guitar timbre. Good demo for transferring voice / synth input into a plucked-string voice.
- - `birds_motherbird_b2048_r48000_z16.ts` — **Birds (Motherbird)**. Bird-vocalization corpus chirps + textural transients. The 'weird' pick: produces wildly warped output for arbitrary input.
+ - `organ_archive_b2048_r48000_z16.ts` — **Organ (Archive)**. Historical pipe-organ recordings: broader, dustier textures than the Bach model. Good for film-score atmospheres.
+ - `sax_soprano_franziskaschroeder_b2048_r48000_z20.ts` — **Soprano Sax (Schroeder)**. Soprano-saxophone extended techniques by Franziska Schroeder. Multiphonics, growls, key clicks. 20-dim latent captures fine-grained articulation.
  - `water_pondbrain_b2048_r48000_z16.ts` — **Water (PondBrain)**. Water / aquatic textures. Treats any input as if it were running through liquid — bubbles, ripples, splashes.
+ - `magnets_b2048_r48000_z8.ts` — **Magnets**. Ferromagnetic / electromagnetic resonance textures — metallic hums, distant industrial buzz, magnetized-string ringing.
+ - `mrp_strengjavera_b2048_r44100_z16.ts` — **Magnetic Resonator Piano (Strengjavera)**. Sustained metallic-string overtones produced by electromagnetically driving piano strings, at 44.1 kHz.
+ - `crozzoli_bigensemblesmusic_18d.ts` — **Big Ensemble Music (Crozzoli)**. Big-ensemble orchestral music (M. Crozzoli). Broad 18-dim latent for hugely textured renders. Sample rate not embedded in filename — defaults to 48000 Hz; override via panel if needed.
 
  Each `.ts` checkpoint is accompanied by a `<stem>.json` sidecar with name, license, sample-rate, latent-dim, source URL, and a one-line description.
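
The filenames in the list appear to encode the sample rate (`r48000`, `r44100`) and the latent dimension (`z16`, `z22`), and the README states a 48000 fallback when no rate is embedded. A minimal Python sketch of reading those tags might look like the following. The regular expression, the function names, and the returned keys are all assumptions made for illustration and are not part of the repository; the meaning of the `b2048` tag is not stated in the README, so it is returned as a plain number under the key `b`.

```python
import re
from pathlib import Path

# Assumed pattern, inferred only from the filenames listed in the README:
#   <name>_b<number>_r<sample rate>_z<latent dim>.ts
# "-" is accepted as a separator too, because one listed file uses hyphens.
_TAGS = re.compile(
    r"^(?P<name>.+?)[_-]b(?P<b>\d+)[_-]r(?P<rate>\d+)[_-]z(?P<latent>\d+)$"
)

# Fallback the README mentions for files with no embedded sample rate.
DEFAULT_SAMPLE_RATE = 48000


def parse_checkpoint_name(filename: str) -> dict:
    """Extract the tags from a checkpoint filename (hypothetical helper)."""
    stem = Path(filename).stem
    match = _TAGS.match(stem)
    if match is None:
        # No recognisable tags: keep the stem and fall back to the default rate.
        return {
            "name": stem,
            "b": None,
            "sample_rate": DEFAULT_SAMPLE_RATE,
            "latent_dim": None,
        }
    return {
        "name": match["name"],
        "b": int(match["b"]),
        "sample_rate": int(match["rate"]),
        "latent_dim": int(match["latent"]),
    }


def sidecar_path(filename: str) -> Path:
    """Return the `<stem>.json` path that sits next to a `.ts` checkpoint."""
    return Path(filename).with_suffix(".json")
```

Where a sidecar is present, its values are probably the safer source; parsing the filename is only a convenience for when the sidecar has not been downloaded. The sketch leaves the latent dimension unset for `crozzoli_bigensemblesmusic_18d.ts`, since that file does not follow the tagged pattern.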