Enrico Shippole

conceptofmind

https://www.teraflopai.com/

Recent Activity

updated a model 25 days ago

TeraflopAI/t128k_uncased_tokenizer

published a model about 1 month ago

TeraflopAI/t128k_uncased_tokenizer

reacted to tomaarsen's post with 🔥 about 2 months ago

🤗 Announcing the Ettin Reranker family: six new state-of-the-art CrossEncoder rerankers for search from 17M to 1B parameters, plus the full training data and the ~150-line recipe. Built on the Ettin ModernBERT encoders, Apache 2.0.

Details: All six were trained with the same single-stage pointwise MSE distillation recipe, with mixedbread-ai/mxbai-rerank-large-v2 (1.54B) as the teacher. Only the learning rate and per-device batch size change between sizes.

The 1B student matches the teacher within 0.0001 NDCG@10 on MTEB(eng, v2) Retrieval, the 150M is the strongest reranker I tested in the under-600M range, and the 17M beats the 33M ms-marco-MiniLM-L12-v2 by +0.051 NDCG@10 at roughly half the parameter count.

Speed matters as much as quality for a reranker, since it determines whether the model fits the latency budget between retrieval and showing results. Our 17M is the fastest reranker in the whole comparison at 7517 pairs/sec on an H100. Our 150M runs 2.3x faster than the two other 150M ModernBERT-base rerankers (gte-reranker-modernbert-base and granite-embedding-reranker-english-r2) because the modular Transformer module propagates unpadded inputs through every layer rather than just the FA2 attention kernel. And our 1B is 2.4x faster than its 1.5B teacher while matching it on quality.

I bootstrapped the training recipe with the new train-sentence-transformers Agent Skill shipped in Sentence Transformers v5.5.0. Install it with `hf skills add train-sentence-transformers --claude` and ask Claude Code (or Codex / Cursor / Gemini CLI) to fine-tune a SentenceTransformer, CrossEncoder, or SparseEncoder model on your data.

I wrote a blog post walking through usage, results across six embedder pairings, the speed story, and the complete training script. Check it out, or just point your Agent to the URL: https://huggingface.co/blog/ettin-reranker

Collection: https://huggingface.co/collections/cross-encoder/ettin-rerankers
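The "pointwise MSE distillation" named in the post can be illustrated with a minimal sketch. This is not the training script from the blog post; it only shows the loss idea under the assumption that the student and the teacher each produce one relevance score per (query, document) pair, and that the student is trained to reproduce the teacher's score for each pair independently. The function names `pointwise_mse_distillation_loss` and `loss_gradient` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def pointwise_mse_distillation_loss(
    student_scores: Sequence[float],
    teacher_scores: Sequence[float],
) -> float:
    """Mean squared error between student and teacher scores.

    "Pointwise" means each (query, document) pair is compared on its own:
    no pairwise or listwise ranking term is involved.
    Hypothetical helper, not part of any library.
    """
    if len(student_scores) != len(teacher_scores):
        raise ValueError("student and teacher must score the same pairs")
    if len(student_scores) == 0:
        raise ValueError("at least one scored pair is required")
    squared_errors = [
        (s - t) ** 2 for s, t in zip(student_scores, teacher_scores)
    ]
    return sum(squared_errors) / len(squared_errors)


def loss_gradient(
    student_scores: Sequence[float],
    teacher_scores: Sequence[float],
) -> List[float]:
    """Gradient of the loss above with respect to each student score."""
    n = len(student_scores)
    return [2.0 * (s - t) / n for s, t in zip(student_scores, teacher_scores)]


# Toy scores for three (query, document) pairs; the values are made up.
teacher = [2.0, 0.0, -1.0]
student = [1.0, 0.0, 1.0]

loss = pointwise_mse_distillation_loss(student, teacher)
grad = loss_gradient(student, teacher)
print(loss, grad)
```

In an actual training run the student scores would come from a CrossEncoder forward pass and the gradient would be computed by the framework's autograd; the sketch only makes explicit which quantity is being minimized.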

New activity in perplexity-ai/pplx-embed-v1-0.6b 2 months ago

AttributeError: module 'transformers_modules.pplx_hyphen_embed_hyphen_v1_hyphen_0_dot_6b.654137389e247d9f.st_quantize' has no attribute 'FlexibleQuantizer'

#12 opened 2 months ago by

New activity in TeraflopAI/SEC-EDGAR 3 months ago

Corrupted files

#2 opened 3 months ago by

New activity in nvidia/parakeet-tdt-0.6b-v2 10 months ago

Bug: NumPy 2.0 breaks nvidia/parakeet-tdt-0.6b-v2

#12 opened about 1 year ago by

New activity in free-law/Caselaw_Access_Project 11 months ago

Approval?

#2 opened about 1 year ago by

New activity in common-pile/caselaw_access_project 12 months ago

Set this up with ApertureDB Croissant ingestion and build RAG

#2 opened 12 months ago by

New activity in TeraflopAI/test over 1 year ago

Set image type

#1 opened over 1 year ago by

New activity in free-law/Caselaw_Access_Project over 2 years ago

Dataset documentation

#1 opened over 2 years ago by

New activity in HuggingFaceM4/WebSight over 2 years ago

Prompt the LLM to always use Tailwind

#6 opened over 2 years ago by

New activity in 4bit/Qwen-VL-Chat-Int4 almost 3 years ago

Qwen base model weights

#1 opened almost 3 years ago by

New activity in fastllm/Qwen-7B-Chat-int8.flm almost 3 years ago

Qwen base model weights

#1 opened almost 3 years ago by

New activity in YeungNLP/firefly-qwen-7b almost 3 years ago

Base Qwen Weights

#3 opened almost 3 years ago by

New activity in X-D-Lab/MindChat-Qwen-7B almost 3 years ago

Qwen base model weights

#1 opened almost 3 years ago by

New activity in elyza/ELYZA-japanese-Llama-2-7b-fast-instruct almost 3 years ago

Average of vectors

#2 opened almost 3 years ago by

New activity in openerotica/Qwen-7B-Chat-GPTQ almost 3 years ago

Qwen base model weights

#1 opened almost 3 years ago by