VIT: Optimized for Qualcomm Devices

VIT is a machine learning model that can classify images from the ImageNet dataset. It can also be used as a backbone in building more complex models for specific use cases.

This model is based on the implementation of VIT found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

| Runtime | Precision | Chipset | SDK Versions | Download |
| --- | --- | --- | --- | --- |
| ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a16 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a8 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| ONNX | w8a8_mixed_int16 | Universal | QAIRT 2.42, ONNX Runtime 1.25.0 | Download |
| QNN_DLC | float | Universal | QAIRT 2.45 | Download |
| QNN_DLC | w8a16 | Universal | QAIRT 2.45 | Download |
| QNN_DLC | w8a8 | Universal | QAIRT 2.45 | Download |
| TFLITE | float | Universal | QAIRT 2.45 | Download |
| TFLITE | w8a8 | Universal | QAIRT 2.45 | Download |

For more device-specific assets and performance metrics, visit VIT on Qualcomm® AI Hub.
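
As a rough illustration of what using a downloaded asset can look like, the sketch below loads the float ONNX variant with ONNX Runtime and prints the top-scoring class indices. It rests on several assumptions that are not stated on this page: that `onnxruntime` and `numpy` are installed, that the download is a standard ONNX file that loads on the CPU execution provider, that it takes a single float32 image tensor, and that it returns one vector of class scores. The file name in the final comment is hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, k=5):
    """Return the k (class_index, probability) pairs with the highest probability."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def classify(model_path, k=5):
    """Run one inference on random data and return the top-k classes.

    Assumption: the model has one float32 input and one score-vector output.
    """
    # Imported lazily so the helpers above work without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    model_input = session.get_inputs()[0]
    # Read the input shape from the model; replace symbolic dimensions with 1.
    shape = [dim if isinstance(dim, int) else 1 for dim in model_input.shape]
    # Random stand-in for a real, preprocessed 224x224 image.
    image = np.random.rand(*shape).astype(np.float32)
    scores = session.run(None, {model_input.name: image})[0]
    return top_k(softmax(scores.reshape(-1).tolist()), k)


# Example call (hypothetical file name; use the path of the asset you downloaded):
# print(classify("vit_float.onnx"))
```

This runs on the CPU only, so it is a functional check rather than a way to reproduce the NPU timings reported below. ONNX Runtime provides a separate QNN execution provider for Qualcomm® NPUs; setting it up is outside the scope of this sketch.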

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for VIT on GitHub for usage instructions.
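
For orientation, the sketch below assembles a command line for an export run. The module path `qai_hub_models.models.vit.export` and the flag names `--target-runtime`, `--precision`, and `--device` are assumptions based on the usual layout of the Qualcomm® AI Hub Models package, and the device name is only an example value; check the repository instructions or the script's `--help` output before relying on any of them. Options for custom weights and input shapes are deliberately left out because their exact form is not given here.

```python
import sys


def build_export_command(target_runtime=None, precision=None, device=None):
    """Build the argv list for an export run.

    Assumption: the module path and flag names below match the installed
    qai_hub_models package. Verify with `--help` before use.
    """
    cmd = [sys.executable, "-m", "qai_hub_models.models.vit.export"]
    if target_runtime:
        cmd += ["--target-runtime", target_runtime]
    if precision:
        cmd += ["--precision", precision]
    if device:
        cmd += ["--device", device]
    return cmd


# Example: request a TFLite w8a8 export for one (example) device name.
command = build_export_command("tflite", "w8a8", "Samsung Galaxy S24")
print(" ".join(command))
```

Passing the resulting list to `subprocess.run(command, check=True)` would start the export, which also requires the package to be installed and a Qualcomm® AI Hub account to be configured.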

Model Details

Model Type: Image classification

Model Stats:

  • Model checkpoint: ImageNet
  • Input resolution: 224x224
  • Number of parameters: 86.6M
  • Model size (float): 330 MB
  • Model size (w8a16): 86.2 MB
  • Model size (w8a8): 83.2 MB
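
The size figures above can be sanity-checked from the parameter count. The sketch below assumes that "float" means 32-bit weights, that the w8 variants store one byte per weight, and that the reported "MB" values are mebibytes (1024 × 1024 bytes); none of these conventions is stated on this page.

```python
BYTES_PER_MIB = 1024 * 1024


def estimated_weight_size_mib(num_params, bytes_per_weight):
    """Estimate the storage needed for the weights alone, in MiB."""
    return num_params * bytes_per_weight / BYTES_PER_MIB


NUM_PARAMS = 86.6e6  # "Number of parameters" from the stats above

float_size = estimated_weight_size_mib(NUM_PARAMS, 4)  # assumed 32-bit floats
int8_size = estimated_weight_size_mib(NUM_PARAMS, 1)   # assumed 8-bit weights

print(f"float weights: {float_size:.1f} MiB")
print(f"8-bit weights: {int8_size:.1f} MiB")
```

Under these assumptions the float estimate comes out at roughly 330 MiB, in line with the reported 330 MB, and the 8-bit estimate at roughly 83 MiB, close to the reported 83.2 MB and 86.2 MB. The remaining difference would plausibly come from quantization parameters and any tensors kept at higher precision, which this estimate ignores.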

Performance Summary

| Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit |
| --- | --- | --- | --- | --- | --- | --- |
| VIT | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.697 ms | 1 - 339 MB | NPU |
| VIT | ONNX | float | Snapdragon® X2 Elite | 3.034 ms | 181 - 181 MB | NPU |
| VIT | ONNX | float | Snapdragon® X Elite | 7.464 ms | 170 - 170 MB | NPU |
| VIT | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 5.029 ms | 1 - 364 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS8550 (Proxy) | 7.121 ms | 0 - 203 MB | NPU |
| VIT | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 3.392 ms | 0 - 338 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS9075 | 10.178 ms | 1 - 46 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS8750 | 3.392 ms | 0 - 338 MB | NPU |
| VIT | ONNX | float | Qualcomm® QCS7181 | 7.464 ms | 170 - 170 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 3.119 ms | 0 - 288 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® X2 Elite | 3.182 ms | 182 - 182 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® X Elite | 8.564 ms | 150 - 150 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Gen 3 Mobile | 5.714 ms | 0 - 362 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS6490 | 1103.987 ms | 45 - 59 MB | CPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS8550 (Proxy) | 8.233 ms | 0 - 114 MB | NPU |
| VIT | ONNX | w8a16 | Snapdragon® 7 Gen 4 Mobile | 578.785 ms | 58 - 74 MB | CPU |
| VIT | ONNX | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 4.399 ms | 0 - 281 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCM6690 | 590.634 ms | 98 - 116 MB | CPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS9075 | 8.765 ms | 0 - 45 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS7790 | 578.785 ms | 58 - 74 MB | CPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS8750 | 4.399 ms | 0 - 281 MB | NPU |
| VIT | ONNX | w8a16 | Qualcomm® QCS7181 | 8.564 ms | 150 - 150 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.57 ms | 0 - 425 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® X Elite | 13.583 ms | 151 - 151 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.856 ms | 0 - 537 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS6490 | 300.53 ms | 20 - 68 MB | CPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS8550 (Proxy) | 13.037 ms | 0 - 162 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS9075 | 13.86 ms | 0 - 45 MB | NPU |
| VIT | ONNX | w8a8 | Snapdragon® 7 Gen 4 Mobile | 124.909 ms | 36 - 58 MB | CPU |
| VIT | ONNX | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 6.999 ms | 0 - 427 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCM6690 | 132.52 ms | 21 - 44 MB | CPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS7790 | 124.909 ms | 36 - 58 MB | CPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS8750 | 6.999 ms | 0 - 427 MB | NPU |
| VIT | ONNX | w8a8 | Qualcomm® QCS7181 | 13.583 ms | 151 - 151 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite Gen 5 Mobile | 50.474 ms | 47 - 345 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Gen 3 Mobile | 69.609 ms | 57 - 452 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS6490 | 758.29 ms | 99 - 128 MB | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS9075 | 105.247 ms | 52 - 110 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 7 Gen 4 Mobile | 380.895 ms | 92 - 115 MB | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Snapdragon® 8 Elite For Galaxy Mobile | 59.936 ms | 60 - 340 MB | NPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCM6690 | 395.353 ms | 103 - 124 MB | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS7790 | 380.895 ms | 92 - 115 MB | CPU |
| VIT | ONNX | w8a8_mixed_int16 | Qualcomm® QCS8750 | 59.936 ms | 60 - 340 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 2.919 ms | 1 - 156 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® X2 Elite | 3.522 ms | 1 - 1 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® X Elite | 8.311 ms | 1 - 1 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 5.402 ms | 0 - 249 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8275 | 34.803 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 7.711 ms | 1 - 3 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8775P | 10.485 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8650P | 10.485 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8255P | 10.485 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 3.744 ms | 0 - 155 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 13.028 ms | 0 - 228 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA7255P | 34.803 ms | 1 - 168 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS9075 | 10.64 ms | 3 - 5 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® SA8295P | 12.955 ms | 1 - 145 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS8750 | 3.744 ms | 0 - 155 MB | NPU |
| VIT | QNN_DLC | float | Qualcomm® QCS7181 | 8.311 ms | 1 - 1 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 8 Elite Gen 5 Mobile | 2.83 ms | 0 - 276 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® X2 Elite | 3.582 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® X Elite | 8.281 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 8 Gen 3 Mobile | 5.079 ms | 0 - 320 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS8275 | 17.354 ms | 0 - 258 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS8550 (Proxy) | 7.654 ms | 0 - 2 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® SA8775P | 7.768 ms | 0 - 259 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® SA8650P | 7.768 ms | 0 - 259 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® SA8255P | 7.768 ms | 0 - 259 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 7 Gen 4 Mobile | 13.167 ms | 0 - 403 MB | NPU |
| VIT | QNN_DLC | w8a16 | Snapdragon® 8 Elite For Galaxy Mobile | 3.903 ms | 0 - 260 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCM6690 | 119.493 ms | 0 - 404 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® SA7255P | 17.354 ms | 0 - 258 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS9075 | 7.949 ms | 2 - 4 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS7790 | 13.167 ms | 0 - 403 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS8750 | 3.903 ms | 0 - 260 MB | NPU |
| VIT | QNN_DLC | w8a16 | Qualcomm® QCS7181 | 8.281 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 3.493 ms | 0 - 209 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® X2 Elite | 4.472 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® X Elite | 10.247 ms | 0 - 0 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Gen 3 Mobile | 6.51 ms | 0 - 306 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS6490 | 50.478 ms | 0 - 2 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8275 | 28.499 ms | 0 - 202 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8550 (Proxy) | 9.598 ms | 0 - 7 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA8775P | 8.655 ms | 0 - 202 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA8650P | 8.655 ms | 0 - 202 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA8255P | 8.655 ms | 0 - 202 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS9075 | 9.409 ms | 0 - 2 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8450 (Proxy) | 12.938 ms | 0 - 311 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 7 Gen 4 Mobile | 19.767 ms | 0 - 355 MB | NPU |
| VIT | QNN_DLC | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 5.438 ms | 0 - 203 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCM6690 | 221.972 ms | 2 - 498 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA8295P | 15.579 ms | 0 - 212 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® SA7255P | 28.499 ms | 0 - 202 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS7790 | 19.767 ms | 0 - 355 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS8750 | 5.438 ms | 0 - 203 MB | NPU |
| VIT | QNN_DLC | w8a8 | Qualcomm® QCS7181 | 10.247 ms | 0 - 0 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 3.029 ms | 0 - 169 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 5.285 ms | 0 - 311 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8275 | 34.603 ms | 0 - 168 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 7.235 ms | 0 - 3 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA8775P | 10.172 ms | 0 - 184 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA8650P | 10.172 ms | 0 - 184 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA8255P | 10.172 ms | 0 - 184 MB | NPU |
| VIT | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 3.684 ms | 0 - 176 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 12.977 ms | 0 - 291 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA7255P | 34.603 ms | 0 - 168 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS9075 | 10.819 ms | 0 - 173 MB | NPU |
| VIT | TFLITE | float | Qualcomm® SA8295P | 13.28 ms | 0 - 156 MB | NPU |
| VIT | TFLITE | float | Qualcomm® QCS8750 | 3.684 ms | 0 - 176 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Elite Gen 5 Mobile | 4.536 ms | 0 - 379 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Gen 3 Mobile | 8.66 ms | 0 - 486 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS6490 | 133.07 ms | 1 - 100 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8275 | 35.05 ms | 0 - 385 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8550 (Proxy) | 12.383 ms | 0 - 3 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8775P | 12.217 ms | 0 - 388 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8650P | 12.217 ms | 0 - 388 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8255P | 12.217 ms | 0 - 388 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS9075 | 13.795 ms | 0 - 88 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8450 (Proxy) | 20.321 ms | 0 - 448 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 7 Gen 4 Mobile | 29.571 ms | 1 - 214 MB | NPU |
| VIT | TFLITE | w8a8 | Snapdragon® 8 Elite For Galaxy Mobile | 6.972 ms | 0 - 380 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCM6690 | 202.521 ms | 2 - 282 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA8295P | 19.243 ms | 0 - 345 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® SA7255P | 35.05 ms | 0 - 385 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS7790 | 29.571 ms | 1 - 214 MB | NPU |
| VIT | TFLITE | w8a8 | Qualcomm® QCS8750 | 6.972 ms | 0 - 380 MB | NPU |
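
One way to read the table above is to pick the lowest-latency runtime and precision for a given chipset. The sketch below does this for a subset of rows copied from the table (every Snapdragon® 8 Elite Gen 5 Mobile and Qualcomm® QCS6490 entry) and converts latency to an approximate throughput. The throughput figure is an assumption-laden upper bound: it treats inferences as running back to back on single images with no pre- or post-processing overhead.

```python
# Subset of rows copied from the Performance Summary table:
# (runtime, precision, chipset, inference time in ms, primary compute unit)
ROWS = [
    ("ONNX", "float", "Snapdragon® 8 Elite Gen 5 Mobile", 2.697, "NPU"),
    ("ONNX", "w8a16", "Snapdragon® 8 Elite Gen 5 Mobile", 3.119, "NPU"),
    ("ONNX", "w8a8", "Snapdragon® 8 Elite Gen 5 Mobile", 4.57, "NPU"),
    ("ONNX", "w8a8_mixed_int16", "Snapdragon® 8 Elite Gen 5 Mobile", 50.474, "NPU"),
    ("QNN_DLC", "float", "Snapdragon® 8 Elite Gen 5 Mobile", 2.919, "NPU"),
    ("QNN_DLC", "w8a16", "Snapdragon® 8 Elite Gen 5 Mobile", 2.83, "NPU"),
    ("QNN_DLC", "w8a8", "Snapdragon® 8 Elite Gen 5 Mobile", 3.493, "NPU"),
    ("TFLITE", "float", "Snapdragon® 8 Elite Gen 5 Mobile", 3.029, "NPU"),
    ("TFLITE", "w8a8", "Snapdragon® 8 Elite Gen 5 Mobile", 4.536, "NPU"),
    ("ONNX", "w8a16", "Qualcomm® QCS6490", 1103.987, "CPU"),
    ("ONNX", "w8a8", "Qualcomm® QCS6490", 300.53, "CPU"),
    ("ONNX", "w8a8_mixed_int16", "Qualcomm® QCS6490", 758.29, "CPU"),
    ("QNN_DLC", "w8a8", "Qualcomm® QCS6490", 50.478, "NPU"),
    ("TFLITE", "w8a8", "Qualcomm® QCS6490", 133.07, "NPU"),
]


def fastest_per_chipset(rows):
    """Map each chipset to its row with the lowest inference time."""
    best = {}
    for row in rows:
        chipset, latency_ms = row[2], row[3]
        if chipset not in best or latency_ms < best[chipset][3]:
            best[chipset] = row
    return best


def throughput_per_second(latency_ms):
    """Idealized inferences per second for back-to-back single-image runs."""
    return 1000.0 / latency_ms


for chipset, row in fastest_per_chipset(ROWS).items():
    runtime, precision, _, latency_ms, unit = row
    print(f"{chipset}: {runtime} {precision} on {unit}, {latency_ms} ms "
          f"(~{throughput_per_second(latency_ms):.0f} inferences/s)")
```

On this subset, the choice of runtime matters most on the QCS6490, where the ONNX variants run on the CPU while the QNN_DLC and TFLITE variants run on the NPU.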

License

  • The license for the original implementation of VIT can be found here.
