
Alexandros Liapatis

alexliap

AI & ML interests

Generative AI + Traditional ML

Recent Activity



upvoted a collection 7 days ago

ILSP Greek Evaluation Suite

Collection

A collection of test sets for evaluating base and chat LLMs (incl. VLMs) on Greek generation and understanding capabilities • 23 items • Updated 7 days ago • 7

liked a model 10 days ago

google/diffusiongemma-26B-A4B-it

Image-Text-to-Text • 26B • Updated 12 days ago • 874k • 1.04k

New activity in nvidia/nemotron-3.5-asr-streaming-0.6b 12 days ago

Lora funetuning

#10 opened 12 days ago by alexliap

reacted to eabdullin's post with 🤗 12 days ago

Post

5708

I’m doing a PhD in AI, which sounds impressive until you realize it mostly means I spend three years trying to make a computer say something slightly less stupid than it said yesterday.

People hear "AI researcher" and they think I’m building the future. No. I’m in a basement at 2 a.m. Googling, "CUDA error what the f**k does this mean."

And the worst part about AI research now is compute. You don’t even ask, "Is this idea good?" anymore. You ask, "Can I afford for this idea to be wrong?"

My advisor comes to me one day and says, "I think we should fine-tune our own language model."

I said, "Professor, with what money? I’m a PhD student. I have two bank accounts: checking and emotionally checking."

He goes, "Don’t worry. We have compute."

Now, in academia, "don’t worry" is never the beginning of a good sentence.

I said, "What do you mean we have compute?"

He said, "My friend knows the cluster admin. He can get us on the GPUs."

I said, "Okay… what do we have to do?"

He goes, "Nothing crazy. Just be very grateful in the acknowledgements."

I said, "How grateful?"

He said, "Maybe put him as co-author."

I said, "Co-author? Are we using the cluster, or is the cluster using us?"

Because at that point, that’s not a favor. That’s academic child support.

So I go to the server room, and the cluster admin walks up to me and goes, "So you’re the NLP student."

And in my head I’m like, "No, tonight you’re the principal investigator. You’re the provider. I’m just a little token waiting to be attended to."

Because whoever controls the GPUs controls the relationship. That’s lab romance.

He starts setting things up, and I’m trying to act casual, but I don’t understand any of the numbers he’s saying.

He’s like, "Yeah, I can probably give you four H100s for the weekend."

I’m nodding like, "Mmm. Four. Weekend. H. One hundred. Absolutely."

Inside I’m like, "Is that good? Is that prison time? Why did he say it like he was offering me organs?"

[Continue in comments...]

1 reply

liked a model 13 days ago

Supertone/supertonic-3

Text-to-Speech • Updated May 18 • 55.5k • 846

liked a model 14 days ago

xxrickyxx/Ailo152m-v2

Text Generation • 0.2B • Updated 17 days ago • 935 • 2

liked a model 19 days ago

lightonai/LightOnOCR-2-1B-base

Image-Text-to-Text • 1B • Updated Jan 21 • 7.34k • 14

updated a dataset 20 days ago

alexliap/greek-synth-v1

Viewer • Updated 20 days ago • 2.06M • 162 • 1

liked a dataset 21 days ago

openbmb/UltraData-SFT-2605

Updated 25 days ago • 47.9k • 350

New activity in alexliap/greek-synth-v1 29 days ago

[bot] Conversion to Parquet

#1 opened 30 days ago by parquet-converter

upvoted a paper about 1 month ago

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Paper • 2605.22791 • Published May 21 • 33

published a dataset about 1 month ago

alexliap/greek-synth-v1

Viewer • Updated 20 days ago • 2.06M • 162 • 1

liked a dataset about 1 month ago

Crownelius/High-Coder-SFT-Medium

Preview • Updated Mar 16 • 90 • 12

reacted to Crownelius's post with 🔥 about 1 month ago

Post

4687

Howdy,
CompactAI-O is launching a tiny Model Golf, and the winner walks away with $50 in RunPod credits. Monthly. Every month. Show up, build, somebody wins.

What it is

Build the best language model you can under 100 million parameters, with at least a 1028-token context window. That's it. Any architecture, any tokenizer, any training scheme you can dream up at 3am. The only catch is it's gotta be open source (MIT, GPL, Apache, AGPL; take your pick).

It scratches the same itch as a Kaggle comp without the dataset/leaderboard nonsense. No fixed benchmark to game. No llama.cpp compatibility hoops. If you wanna train a 50M-param MoE with five experts and a tokenizer built on cookbooks, you can do that. Nothing stopping you.

The rules are listed in the discord and on the organization page if you're interested.
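For anyone sizing up an entry, here is a rough sketch of how the two numeric constraints above (under 100 million parameters, context of at least 1028 tokens) could be checked before training. Everything in it is an assumption for illustration: `TinyConfig`, `estimate_params`, and `is_eligible` are hypothetical names, and the count assumes a plain dense decoder-only transformer with learned positional embeddings, bias-free linear layers, and tied input/output embeddings. The official rules decide how parameters are actually counted.

```python
from dataclasses import dataclass

PARAM_LIMIT = 100_000_000  # "under 100 million parameters", as stated in the post
MIN_CONTEXT = 1028         # minimum context window, as stated in the post


@dataclass
class TinyConfig:
    """Hypothetical config for a dense decoder-only transformer."""
    vocab_size: int
    d_model: int
    n_layers: int
    d_ff: int
    context_len: int
    tied_embeddings: bool = True


def estimate_params(cfg: TinyConfig) -> int:
    """Approximate parameter count under the assumptions in the lead-in."""
    embed = cfg.vocab_size * cfg.d_model       # token embedding table
    pos = cfg.context_len * cfg.d_model        # learned positional embeddings
    attn = 4 * cfg.d_model * cfg.d_model       # Q, K, V, and output projections
    mlp = 2 * cfg.d_model * cfg.d_ff           # up and down projections
    norms = 2 * cfg.d_model                    # two norm gain vectors per layer
    per_layer = attn + mlp + norms
    final_norm = cfg.d_model
    # An untied output head adds a second vocab-sized matrix.
    head = 0 if cfg.tied_embeddings else cfg.vocab_size * cfg.d_model
    return embed + pos + cfg.n_layers * per_layer + final_norm + head


def is_eligible(cfg: TinyConfig) -> bool:
    """True if the config fits both numeric constraints from the post."""
    return estimate_params(cfg) < PARAM_LIMIT and cfg.context_len >= MIN_CONTEXT


if __name__ == "__main__":
    cfg = TinyConfig(vocab_size=32_000, d_model=512, n_layers=12,
                     d_ff=2048, context_len=1028)
    print(f"{estimate_params(cfg):,} parameters, eligible: {is_eligible(cfg)}")
```

With a real model, counting the tensors directly is more reliable than an estimate like this; the sketch is only meant for quick back-of-the-envelope sizing.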

Why $50????

It's symbolic. It ain't gonna make anyone rich. But it's enough to cover a weekend of GPU time, enough to keep enthusiasts coming back, and not so much that it pulls in people who are just there for the money. Enthusiasts build interesting things. Interesting things move the field forward. A little incentive. I'd do it for $50 lol.

How to join

First round opens soon. Landing page is here:

→ https://huggingface.co/spaces/CompactAI-O/Tiny-model-golf

For questions or to swap ideas, the Discord's open:

→ https://discord.gg/y2jTct6Cxv

Excited to see what y'all come up with. ♥

— Shane

8 replies

New activity in ilsp/llama-krikri-8b-ag-mg-qlora about 1 month ago

Dataset sources

#1 opened about 1 month ago by alexliap

reacted to HannesVonEssen's post with ❤️ about 1 month ago

Post

11649

📣 Hugging Face Visualizer, now as Chrome extension!
https://hfviewer.com

✨ After installing, Hugging Face model pages show an architecture visualization directly on the page!

🔗 Link:
https://chromewebstore.google.com/detail/hugging-face-viewer/mmadlggmpkpiockpjfepaohcllbnakej

Thanks for all the nice feedback so far! ❤️

5 replies

liked a model about 1 month ago

ilsp/llama-krikri-8b-ag-mg-qlora

Translation • Updated May 14 • 2

reacted to qgallouedec's post with 🔥 about 1 month ago

Post

10395

Shipped hf-sandbox! 🥡

🧪 Running an eval that executes model-generated C on a few thousand prompts? You probably don't want any of that on your laptop.
Just shipped hf-sandbox, a Modal-style sandbox API on top of Hugging Face Jobs. Spin up an isolated, ephemeral container, run untrusted code, get the result back. No Docker on your laptop, no infra to manage.

Just pip install hf-sandbox.

Early days (v0.1); feedback and issues very welcome:
👉 https://github.com/huggingface/hf-sandbox