Building on HF

HAQ NAWAZ MALIK

Omarrran

https://haq-nawaz-malik.github.io/

Recent Activity

updated a model about 16 hours ago

Omarrran/Kashmiri_Tokenizers

published a model about 16 hours ago

Omarrran/Kashmiri_Tokenizers

authored a paper 3 days ago

ks-pret-5m: A 5 million word, 12 million token Kashmiri pretraining dataset

Paper • 2604.11066 • Published 4 days ago

authored 3 papers 3 months ago

synthocr-gen: A synthetic OCR dataset generator for low-resource languages- breaking the data barrier

Paper • 2601.16113 • Published Jan 22

ks-lit-3m: A 3.1 million word Kashmiri text dataset for large language model pretraining

Paper • 2601.01091 • Published Jan 3

600k-ks-ocr: A large-scale synthetic dataset for optical character recognition in Kashmiri script

Paper • 2601.01088 • Published Jan 3