Julia K

juliak115

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers 23 days ago

End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions

Paper • 2601.17640 • Published 25 days ago • 5

daVinci-Dev: Agent-native Mid-training for Software Engineering

Paper • 2601.18418 • Published 24 days ago • 124

upvoted 9 papers 28 days ago

Quantifying Speaker Embedding Phonological Rule Interactions in Accented Speech Synthesis

Paper • 2601.14417 • Published 30 days ago • 5

HeartMuLa: A Family of Open Sourced Music Foundation Models

Paper • 2601.10547 • Published Jan 15 • 42

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 47

Motion Attribution for Video Generation

Paper • 2601.08828 • Published Jan 13 • 71

mHC: Manifold-Constrained Hyper-Connections

Paper • 2512.24880 • Published Dec 31, 2025 • 307

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Paper • 2601.06943 • Published Jan 11 • 212

BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 196

STEP3-VL-10B Technical Report

Paper • 2601.09668 • Published Jan 14 • 193

Rethinking Video Generation Model for the Embodied World

Paper • 2601.15282 • Published 29 days ago • 43

upvoted 2 papers 7 months ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published Aug 4, 2025 • 272

Voxlect: A Speech Foundation Model Benchmark for Modeling Dialects and Regional Languages Around the Globe

Paper • 2508.01691 • Published Aug 3, 2025 • 10

liked 6 models 9 months ago

speechbrain/spkrec-xvect-voxceleb

Audio Classification • Updated Feb 25, 2024 • 26.3k • 66

tiantiaf/whisper-large-v3-narrow-accent

Audio Classification • 2B • Updated Aug 10, 2025 • 230 • 4

tiantiaf/whisper-large-v3-msp-podcast-emotion

Audio Classification • 2B • Updated Aug 10, 2025 • 948 • 5

ByteDance-Seed/BAGEL-7B-MoT

Any-to-Any • 15B • Updated Jan 9 • 561 • 1.18k

nvidia/parakeet-tdt-0.6b-v2

Automatic Speech Recognition • Updated Nov 27, 2025 • 194k • 1.42k

mistralai/Devstral-Small-2505

24B • Updated Aug 18, 2025 • 92.9k • 861