Enrico Shippole
conceptofmind
160 followers · 3 following
https://www.teraflopai.com/
EnricoShippole
conceptofmind
AI & ML interests
None yet
Recent Activity
reacted to tomaarsen's post with 🔥 about 2 hours ago
🤗 Announcing the Ettin Reranker family: six new state-of-the-art CrossEncoder rerankers for search from 17M to 1B parameters, plus the full training data and the ~150-line recipe. Built on the Ettin ModernBERT encoders, Apache 2.0.
Details: All six were trained with the same single-stage pointwise MSE distillation recipe, with mixedbread-ai/mxbai-rerank-large-v2 (1.54B) as the teacher. Only the learning rate and per-device batch size change between sizes.
The 1B student matches the teacher within 0.0001 NDCG@10 on MTEB(eng, v2) Retrieval, the 150M is the strongest reranker I tested in the under-600M range, and the 17M beats the 33M ms-marco-MiniLM-L12-v2 by +0.051 NDCG@10 at roughly half the parameter count.
Speed matters as much as quality for a reranker, since it determines whether the model fits the latency budget between retrieval and showing results. Our 17M is the fastest reranker in the whole comparison at 7517 pairs/sec on an H100. Our 150M runs 2.3x faster than the two other 150M ModernBERT-base rerankers (gte-reranker-modernbert-base and granite-embedding-reranker-english-r2) because the modular Transformer module propagates unpadded inputs through every layer rather than just the FA2 attention kernel. And our 1B is 2.4x faster than its 1.5B teacher while matching it on quality.
I bootstrapped the training recipe with the new train-sentence-transformers Agent Skill shipped in Sentence Transformers v5.5.0. Install it with `hf skills add train-sentence-transformers --claude` and ask Claude Code (or Codex / Cursor / Gemini CLI) to fine-tune a SentenceTransformer, CrossEncoder, or SparseEncoder model on your data.
I wrote a blog post walking through usage, results across six embedder pairings, the speed story, and the complete training script. Check it out, or just point your Agent to the URL: https://huggingface.co/blog/ettin-reranker
Collection: https://huggingface.co/collections/cross-encoder/ettin-rerankers
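For readers unfamiliar with the terms in the post above, here is a minimal sketch of two ideas it mentions: a pointwise MSE distillation objective (the student regresses the teacher's score for each query-document pair) and a back-of-the-envelope latency estimate from a pairs/sec throughput figure. This is not the author's training script. The function names, the example scores, and the assumption that latency scales linearly with the number of pairs are all hypothetical and for illustration only.

```python
from typing import Sequence


def pointwise_mse_loss(student_scores: Sequence[float],
                       teacher_scores: Sequence[float]) -> float:
    """Mean squared error between student and teacher relevance scores.

    Each position holds the score for one (query, document) pair, so the
    loss is "pointwise": pairs are compared one by one, not as ranked lists.
    """
    if len(student_scores) != len(teacher_scores):
        raise ValueError("student and teacher must score the same pairs")
    if len(student_scores) == 0:
        raise ValueError("need at least one scored pair")
    squared_errors = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(squared_errors) / len(squared_errors)


def rerank_latency_ms(num_pairs: int, pairs_per_sec: float) -> float:
    """Estimate the time to score num_pairs at a given throughput.

    Assumes throughput stays constant regardless of batch size, which is a
    simplification; real latency depends on batching and sequence length.
    """
    if pairs_per_sec <= 0:
        raise ValueError("pairs_per_sec must be positive")
    return 1000.0 * num_pairs / pairs_per_sec


# Hypothetical scores for three (query, document) pairs.
teacher = [0.9, 0.1, 0.5]
student = [0.7, 0.2, 0.5]
loss = pointwise_mse_loss(student, teacher)

# Throughput figure quoted in the post for the 17M model on an H100;
# the 100-candidate list size is an assumption.
latency = rerank_latency_ms(100, 7517)
print(f"loss={loss:.4f}, estimated latency={latency:.1f} ms")
```

Under these assumptions, reranking 100 retrieved candidates at the quoted 7517 pairs/sec would take on the order of 13 ms, which gives a sense of the latency budget the post refers to.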
updated a dataset about 5 hours ago
TeraflopAI/caselaw-evaluation
published a dataset about 8 hours ago
TeraflopAI/caselaw-evaluation
conceptofmind's activity
New activity in perplexity-ai/pplx-embed-v1-0.6b, 10 days ago
AttributeError: module 'transformers_modules.pplx_hyphen_embed_hyphen_v1_hyphen_0_dot_6b.654137389e247d9f.st_quantize' has no attribute 'FlexibleQuantizer'
#12 opened 10 days ago by conceptofmind
New activity in TeraflopAI/SEC-EDGAR, about 1 month ago
Corrupted files
2
#2 opened about 1 month ago by Draknof
New activity in nvidia/parakeet-tdt-0.6b-v2, 8 months ago
Bug: NumPy 2.0 breaks nvidia/parakeet-tdt-0.6b-v2
9
#12 opened about 1 year ago by david44099
New activity in free-law/Caselaw_Access_Project, 9 months ago
Approval?
👍
➕
3
8
#2 opened 11 months ago by waters17
New activity in common-pile/caselaw_access_project, 10 months ago
Set this up with ApertureDB Croissant ingestion and build RAG
1
#2 opened 10 months ago by vishakha041
New activity in TeraflopAI/test, over 1 year ago
Set image type
#1 opened over 1 year ago by lhoestq
New activity in free-law/Caselaw_Access_Project, about 2 years ago
Dataset documentation
8
#1 opened about 2 years ago by yjernite
New activity in HuggingFaceM4/WebSight, over 2 years ago
Prompt the LLM to always use Tailwind
❤️
11
2
#6 opened over 2 years ago by julien-c
New activity in 4bit/Qwen-VL-Chat-Int4, over 2 years ago
Qwen base model weights
1
#1 opened over 2 years ago by conceptofmind
New activity in fastllm/Qwen-7B-Chat-int8.flm, over 2 years ago
Qwen base model weights
#1 opened over 2 years ago by conceptofmind
New activity in YeungNLP/firefly-qwen-7b, over 2 years ago
Base Qwen Weights
#3 opened over 2 years ago by conceptofmind
New activity in X-D-Lab/MindChat-Qwen-7B, over 2 years ago
Qwen base model weights
2
#1 opened over 2 years ago by conceptofmind
New activity in elyza/ELYZA-japanese-Llama-2-7b-fast-instruct, over 2 years ago
Average of vectors
5
#2 opened over 2 years ago by conceptofmind
New activity in openerotica/Qwen-7B-Chat-GPTQ, over 2 years ago
Qwen base model weights
3
#1 opened over 2 years ago by conceptofmind