Egor Petrov

moderntalker

https://moderntalker.github.io/

AI & ML interests

None yet

Organizations

None yet

Recent Activity

authored a paper 2 days ago

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Paper • 2606.30634 • Published 4 days ago • 20

upvoted a paper 3 days ago

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Paper • 2606.30634 • Published 4 days ago • 20

updated a model 3 days ago

moderntalker/efficient_pretrain_checkpoints

Updated 3 days ago

published a model about 2 months ago

moderntalker/efficient_pretrain_checkpoints

Updated 3 days ago

liked a model 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 11 days ago • 1.18M • 5.13k

upvoted a paper 3 months ago

Reasoning Shift: How Context Silently Shortens LLM Reasoning

Paper • 2604.01161 • Published Apr 1 • 32