Norelec
Norelec7
Update dataset card for v2: Scylla v2 + SP4096/8192/12288/16384
1
#4 opened 2 months ago by Norelec7
Add fineweb_sp16384/ (SentencePiece BPE 16384) (fineweb_train_000000.bin)
1
#9 opened 2 months ago by Norelec7
Add fineweb_sp12288/ (SentencePiece BPE 12288) (fineweb_train_000000.bin)
1
#8 opened 2 months ago by Norelec7
Add fineweb_byte260/ (pure-byte PureByteTokenizer pad=256 bos=257) (fineweb_train_000000.bin)
1
#7 opened 2 months ago by Norelec7
Add fineweb_sp4096/ (SentencePiece BPE 4096) (fineweb_train_000000.bin)
1
#6 opened 2 months ago by Norelec7
Add fineweb_sp8192/ (SentencePiece BPE 8192) (fineweb_train_000000.bin)
1
#5 opened 2 months ago by Norelec7
Add fineweb_scylla_v2/ (corrected byte-exact 1254-token vocab, PR #1314)
1
#3 opened 2 months ago by Norelec7
Add tokenizers/scylla_v2/scylla.meta.npz (PR #1314 byte-exact Scylla)
1
#2 opened 2 months ago by Norelec7
Add tokenizers/scylla_v2/scylla.yaml (PR #1314 byte-exact Scylla)
1
#1 opened 2 months ago by Norelec7