Initial upload: distiluse-base-multilingual-cased-v2 XNNPACK fp32 for RNE v0.9.0
- .gitattributes +1 -0
- README.md +33 -0
- config.json +25 -0
- tokenizer.json +0 -0
- tokenizer_config.json +1 -0
- xnnpack/distiluse-base-multilingual-cased-v2_xnnpack_fp32.pte +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+xnnpack/distiluse-base-multilingual-cased-v2_xnnpack_fp32.pte filter=lfs diff=lfs merge=lfs -text
README.md
ADDED
@@ -0,0 +1,33 @@
---
license: apache-2.0
---

# Introduction

This repository hosts the [distiluse-base-multilingual-cased-v2](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2/tree/main) model for the [React Native ExecuTorch](https://www.npmjs.com/package/react-native-executorch) library. It includes the model exported for the XNNPACK backend in `.pte` format, ready for use in the **ExecuTorch** runtime.

If you'd like to run this model in your own ExecuTorch runtime, refer to the [official documentation](https://pytorch.org/executorch/stable/index.html) for setup instructions.

## Compatibility

If you intend to use this model outside of React Native ExecuTorch, make sure your runtime is compatible with the **ExecuTorch** version used to export the `.pte` files. For more details, see the compatibility note in the [ExecuTorch GitHub repository](https://github.com/pytorch/executorch/blob/11d1742fdeddcf05bc30a6cfac321d2a2e3b6768/runtime/COMPATIBILITY.md?plain=1#L4). If you work with React Native ExecuTorch, the model constants exported by the library guarantee compatibility with the runtime it uses internally.

This model was exported using React Native ExecuTorch `v0.9.0`, which ships an ExecuTorch runtime derived from the `v1.2.0` release branch. **No forward compatibility** is guaranteed: older versions of the runtime may not work with these files.

## Repository Structure

- `xnnpack/distiluse-base-multilingual-cased-v2_xnnpack_fp32.pte`: ExecuTorch program partitioned for the XNNPACK delegate, fp32. Pass this as the `modelSource` argument.
- `tokenizer.json`: HuggingFace fast-tokenizer dump (WordPiece + BertNormalizer). Pass this as `tokenizerSource`.
- `config.json`, `tokenizer_config.json`: upstream model and tokenizer configs, kept for reference and for non-RNE consumers.

## Model details

- Architecture: DistilBERT multilingual cased + mean pooling + Dense (768→512, Tanh) + L2 norm.
- Output dimension: **512**.
- Max sequence length: **126** tokens (128 − 2 for `[CLS]` / `[SEP]`).
- Languages: 50+ (multilingual).
- Typical strength: cross-lingual sentence similarity and medium-length sentence retrieval. Short single-word queries in non-English languages are this model's weakest case — for those, longer sentences and/or English inputs give markedly better ranking.
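
The architecture bullet above names three steps applied after DistilBERT: mean pooling, a Dense layer with Tanh, and L2 normalization. The following is a minimal pure-Python sketch of that head, not the exported graph itself. The function names are hypothetical, the weights are made up, and toy sizes (4 inputs, 2 outputs) stand in for the real 768 and 512.

```python
import math

def mean_pool(token_vecs):
    """Average the token vectors; without padding, every token counts equally."""
    n = len(token_vecs)
    dim = len(token_vecs[0])
    return [sum(vec[i] for vec in token_vecs) / n for i in range(dim)]

def dense_tanh(x, weight, bias):
    """Dense layer followed by Tanh; `weight` holds one row per output unit."""
    return [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weight, bias)
    ]

def l2_normalize(x):
    """Scale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]

def sentence_embedding(token_vecs, weight, bias):
    """Mean pooling -> Dense + Tanh -> L2 norm, in the order listed in the README."""
    return l2_normalize(dense_tanh(mean_pool(token_vecs), weight, bias))

# Toy data: two token vectors of size 4, projected down to size 2.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 2.0, 0.0, 0.0]]
weight = [[0.5, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # made-up weights
bias = [0.0, 0.0]

emb = sentence_embedding(tokens, weight, bias)
print(emb)
```

Because the final step normalizes to unit length, the cosine similarity of two embeddings reduces to their dot product.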

## Export notes

The exported program skips HuggingFace's internal attention-mask-to-4D conversion because the RNE runtime never pads at inference (single sentence, no batching). This preserves bit-exactness with the PyTorch reference (RMSE 0 on fp32 random input) while trimming ~27% off the forward wall-time and keeping XNNPACK delegation around 89–91% of graph runtime.
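
The reason dropping the mask conversion can stay exact is that an attention mask with no padded positions contributes nothing to the attention scores. The sketch below illustrates this with a toy additive mask in pure Python. It is not the HuggingFace implementation; `masked_softmax` and the `-1e30` bias are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of attention scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def masked_softmax(scores, mask):
    """Additive masking: positions with mask == 0 receive a large negative bias."""
    large_negative = -1e30  # illustrative stand-in for "minus infinity"
    biased = [s + (0.0 if m == 1 else large_negative) for s, m in zip(scores, mask)]
    return softmax(biased)

scores = [0.3, -1.2, 2.5, 0.0]

# No padding: the mask adds 0.0 everywhere, so the result is identical.
unpadded = masked_softmax(scores, [1, 1, 1, 1])
print(unpadded == softmax(scores))

# With padding, the masked position gets zero attention weight.
padded = masked_softmax(scores, [1, 1, 1, 0])
print(padded[3])
```

With a single unpadded sentence per call, the masked and unmasked paths compute the same values, which is consistent with the RMSE of 0 reported above.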
config.json
ADDED
@@ -0,0 +1,25 @@
{
  "_name_or_path": "old_models/distiluse-base-multilingual-cased-v2/0_DistilBERT",
  "activation": "gelu",
  "architectures": [
    "DistilBertModel"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "output_hidden_states": true,
  "output_past": true,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.7.0",
  "vocab_size": 119547
  
}
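
As a rough sanity check, the fields in this config can be used to estimate the fp32 parameter footprint and compare it with the size of the `.pte` file added in this commit. The sketch below assumes the standard DistilBERT layer layout (four attention projections, a two-layer feed-forward block, two LayerNorms per block, learned position embeddings) and a 768→512 Dense layer with bias. Those layout details are assumptions, not read from the exported program.

```python
# Values copied from config.json in this commit.
CONFIG = {
    "dim": 768,
    "hidden_dim": 3072,
    "n_layers": 6,
    "n_heads": 12,
    "max_position_embeddings": 512,
    "vocab_size": 119547,
}
DENSE_OUT = 512              # from the README: Dense (768 -> 512)
PTE_SIZE_BYTES = 540641536   # from the Git LFS pointer in this commit

def linear(n_in, n_out):
    """Parameters of a linear layer with bias."""
    return n_in * n_out + n_out

def layer_norm(dim):
    """Scale and shift vectors of a LayerNorm."""
    return 2 * dim

def count_parameters(cfg, dense_out):
    """Estimate parameters, assuming the standard DistilBERT layout."""
    d, h = cfg["dim"], cfg["hidden_dim"]
    embeddings = (cfg["vocab_size"] + cfg["max_position_embeddings"]) * d + layer_norm(d)
    attention = 4 * linear(d, d)              # query, key, value, output projections
    feed_forward = linear(d, h) + linear(h, d)
    block = attention + feed_forward + 2 * layer_norm(d)
    return embeddings + cfg["n_layers"] * block + linear(d, dense_out)

total = count_parameters(CONFIG, DENSE_OUT)
fp32_bytes = 4 * total
head_dim = CONFIG["dim"] // CONFIG["n_heads"]

print(f"parameters:      {total:,}")
print(f"fp32 weights:    {fp32_bytes:,} bytes")
print(f"file overhead:   {PTE_SIZE_BYTES - fp32_bytes:,} bytes")
print(f"head dimension:  {head_dim}")
```

Under these assumptions the weights account for nearly all of the file, which is what one would expect from an unquantized fp32 export.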
tokenizer.json
ADDED
The diff for this file is too large to render.
tokenizer_config.json
ADDED
@@ -0,0 +1 @@
{"do_lower_case": false, "unk_token": "[UNK]", "sep_token": "[SEP]", "pad_token": "[PAD]", "cls_token": "[CLS]", "mask_token": "[MASK]", "tokenize_chinese_chars": true, "strip_accents": null, "max_len": 512, "special_tokens_map_file": "/home/reimers/.cache/torch/sentence_transformers/sbert.net_models_distiluse-base-multilingual-cased/0_DistilBERT/special_tokens_map.json", "full_tokenizer_file": null, "name_or_path": "old_models/distiluse-base-multilingual-cased-v2/0_DistilBERT", "do_basic_tokenize": true, "never_split": null}
xnnpack/distiluse-base-multilingual-cased-v2_xnnpack_fp32.pte
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:9cb996370e33f8e76b1e597cdab904bf4562dcdd0237efcda41eacadba87a0a0
size 540641536
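
The three lines above are a Git LFS pointer, which records the hash algorithm, digest, and byte size of the real `.pte` file. A downloaded copy can be checked against it with a short script along these lines. This is a sketch using only the Python standard library; `parse_lfs_pointer` and `verify_download` are hypothetical helper names, and the local file path is up to the caller.

```python
import hashlib
import os

POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:9cb996370e33f8e76b1e597cdab904bf4562dcdd0237efcda41eacadba87a0a0
size 540641536
"""

def parse_lfs_pointer(text):
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}

def verify_download(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        # Read in chunks so a ~540 MB file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `verify_download("path/to/model.pte", pointer)` on a local copy returns `False` for a truncated or corrupted download.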