Egor Petrov

moderntalker
2 1
https://moderntalker.github.io/
  • egor__petrov
  • modernTalker

AI & ML interests

None yet

Recent Activity

authored a paper 2 days ago
One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining
upvoted a paper 3 days ago
One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining
updated a model 3 days ago
moderntalker/efficient_pretrain_checkpoints

Organizations

None yet

authored a paper 2 days ago

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Paper • 2606.30634 • Published 4 days ago • 20
upvoted a paper 3 days ago

One-Step Gradient Delay is Not a Barrier for Large-Scale Asynchronous Pipeline Parallel LLM Pretraining

Paper • 2606.30634 • Published 4 days ago • 20
updated a model 3 days ago

moderntalker/efficient_pretrain_checkpoints

Updated 3 days ago
published a model about 2 months ago

moderntalker/efficient_pretrain_checkpoints

Updated 3 days ago
liked a model 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 11 days ago • 1.18M • 5.13k
upvoted a paper 3 months ago

Reasoning Shift: How Context Silently Shortens LLM Reasoning

Paper • 2604.01161 • Published Apr 1 • 32