Sync license-plate-recognition from metro-analytics-catalog


Files changed (3)

LICENSE +56 -0
README.md +230 -5
export_and_quantize.sh +89 -0

LICENSE CHANGED

	@@ -0,0 +1,56 @@

+This directory contains three categories of content under different licenses.
+Scripts and Documentation
+-------------------------
+The scripts (export_and_quantize.sh) and documentation (README.md) in this
+directory are original works by Intel Corporation, licensed under the
+MIT License.
+    Copyright (C) Intel Corporation
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to deal
+    in the Software without restriction, including without limitation the rights
+    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+    copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+    The above copyright notice and this permission notice shall be included in
+    all copies or substantial portions of the Software.
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+    THE SOFTWARE.
+License Plate Detector Model (yolov8_license_plate_detector)
+------------------------------------------------------------
+The YOLOv8 license plate detector weights are distributed by the Intel Edge
+AI Resources project and are based on the Ultralytics YOLOv8 framework,
+licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
+    Source:  https://github.com/open-edge-platform/edge-ai-resources
+    Upstream framework: https://github.com/ultralytics/ultralytics
+    License: https://github.com/ultralytics/ultralytics/blob/main/LICENSE
+    Docs:    https://docs.ultralytics.com/models/yolov8/
+Users must comply with the AGPL-3.0 license terms when using, modifying,
+or distributing the YOLOv8 model weights or Ultralytics software.
+For commercial licensing options, see https://www.ultralytics.com/license.
+OCR Model (ch_PP-OCRv4_rec_infer)
+---------------------------------
+The PaddleOCR PP-OCRv4 recognition model is developed by PaddlePaddle and
+licensed under the Apache License, Version 2.0.
+    Source:  https://github.com/PaddlePaddle/PaddleOCR
+    License: https://github.com/PaddlePaddle/PaddleOCR/blob/main/LICENSE

README.md CHANGED

@@ -1,5 +1,230 @@
----
-license: other
-license_name: other
-license_link: LICENSE
----

+# License Plate Recognition -- Detection and OCR on Intel Hardware
+> **Reference pipeline:** [DLStreamer License Plate Recognition sample](https://github.com/open-edge-platform/dlstreamer/tree/main/samples/gstreamer/gst_launch/license_plate_recognition)
+>
+> **Validated with:** OpenVINO 2026.0.0, NNCF 3.0.0, DLStreamer 2025.2, Python 3.11+
+| Property | Value |
+|---|---|
+| **Category** | Object Detection + Optical Character Recognition |
+| **Source Framework** | PyTorch (Ultralytics YOLOv8), PaddlePaddle (PP-OCRv4) |
+| **Supported Precisions** | FP32, FP16-INT8 (detector) |
+| **Inference Engine** | OpenVINO |
+| **Hardware** | CPU, GPU, NPU |
+---
+## Overview
+License Plate Recognition (LPR) is a Metro Analytics use case that locates vehicle license plates in a video stream and reads their alphanumeric content.
+The pipeline composes two specialized models:
+- **License Plate Detector** -- [`yolov8_license_plate_detector`](https://github.com/open-edge-platform/edge-ai-resources), a YOLOv8 model fine-tuned to localize license plates as oriented bounding boxes.
+- **OCR Recognizer** -- [`ch_PP-OCRv4_rec_infer`](https://github.com/PaddlePaddle/PaddleOCR), the PaddleOCR PP-OCRv4 multilingual text recognizer that converts each cropped plate into a text string.
+Typical Metro deployments include:
+- **Tolling and Access Control** -- read plates at gates, depots, and parking entries.
+- **Vehicle Search and Forensics** -- index plates seen at a station for investigative lookup.
+- **Fleet and Bus Monitoring** -- correlate detected plates with operational schedules.
+The detector returns one bounding box per plate; the OCR stage runs as a downstream classifier on the cropped region, attaching the recognized string as inference metadata on the frame.
+> **Note:** Plate detector accuracy depends on the regional distribution of training data.
+> The bundled detector was trained primarily on European and US plates.
+> For other regions, fine-tune the YOLOv8 detector on a representative dataset before quantization.
+---
+## Prerequisites
+- [Install OpenVINO 2026.0.0](https://docs.openvino.ai/2026/get-started/install-openvino.html)
+- [Install Intel DLStreamer](https://dlstreamer.github.io/get_started/install/install-guide-ubuntu.html)
+---
+## Getting Started
+### Download and Quantize the Detector
+Run the provided script to download the license plate detector OpenVINO IR and quantize it to INT8:
+```bash
+chmod +x export_and_quantize.sh
+./export_and_quantize.sh ./models
+```
+The script performs the following steps:
+1. Installs dependencies (`openvino`, `nncf`).
+2. Downloads the `license-plate-reader` archive from the Intel Edge AI Resources project and extracts it under `./models/yolov8_license_plate_detector/license-plate-reader/`.
+   The archive bundles both the YOLOv8 plate detector (`models/yolov8n/yolov8n_retrained.xml`) and the converted PaddleOCR recognizer (`models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml`), so no separate OCR download step is required.
+3. Quantizes the detector to INT8 using NNCF post-training quantization, producing `./models/yolov8_license_plate_detector/yolov8_license_plate_detector_int8.xml`.
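+4. Checks that the OCR recognizer IR is present and prints download instructions if it is not.
+5. Runs `benchmark_app` on CPU to measure INT8 detector throughput.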
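A quick way to confirm the script produced what the steps above describe is to check for the IR files by path. The sketch below assumes the default layout documented in this README; the helper names `expected_artifacts` and `missing_artifacts` are illustrative and are not part of the script.

```python
from pathlib import Path


def expected_artifacts(models_dir: str) -> dict[str, Path]:
    """Return the IR paths this README documents, rooted at models_dir."""
    root = Path(models_dir) / "yolov8_license_plate_detector"
    archive = root / "license-plate-reader" / "models"
    return {
        "detector_int8": root / "yolov8_license_plate_detector_int8.xml",
        "detector_fp32": archive / "yolov8n" / "yolov8n_retrained.xml",
        "ocr": archive / "ch_PP-OCRv4_rec_infer" / "ch_PP-OCRv4_rec_infer.xml",
    }


def missing_artifacts(models_dir: str) -> list[str]:
    """Return the names of documented IR files that do not exist yet."""
    return [
        name
        for name, path in expected_artifacts(models_dir).items()
        if not path.is_file()
    ]


if __name__ == "__main__":
    missing = missing_artifacts("./models")
    print("all models present" if not missing else f"missing: {missing}")
```

If the archive layout changes upstream, the paths in `expected_artifacts` would need to be adjusted to match.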
+> **Note:** For production accuracy, replace the random calibration tensors in
+> `export_and_quantize.sh` with a representative sample of frames from the
+> target deployment site.
+> The INT8 detector produced from random calibration in the bundled script may
+> miss small or low-contrast plates; if you need maximum recall before tuning
+> calibration, point the pipeline at the FP32 IR
+> (`models/yolov8_license_plate_detector/license-plate-reader/models/yolov8n/yolov8n_retrained.xml`).
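The note above suggests calibrating on real frames. A minimal sketch of a replacement for the script's `transform_fn` follows. It assumes frames arrive as HxWx3 `uint8` BGR arrays (as OpenCV would decode them) and that the detector expects RGB input scaled to [0, 1] at 640x640; this matches the script's fallback shape and its `np.random.rand` value range, but the source does not confirm it. `frame_to_tensor` is a hypothetical helper name, and the nearest-neighbour resize is used only to keep the example free of extra dependencies.

```python
import numpy as np


def frame_to_tensor(frame: np.ndarray, size: int = 640) -> np.ndarray:
    """Letterbox a BGR uint8 frame into a 1x3xSIZExSIZE float32 tensor.

    Assumptions (not confirmed by the script): RGB channel order, values
    scaled to [0, 1], gray (114) padding as in common YOLOv8 preprocessing.
    """
    h, w = frame.shape[:2]
    scale = min(size / h, size / w)
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))

    # Nearest-neighbour resize with plain NumPy indexing (no OpenCV needed).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = frame[rows][:, cols]

    # Centre the resized image on a gray canvas.
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)
    top, left = (size - new_h) // 2, (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized

    rgb = canvas[:, :, ::-1]          # BGR -> RGB
    chw = rgb.transpose(2, 0, 1)      # HWC -> CHW
    return chw[np.newaxis].astype(np.float32) / 255.0
```

Under the same assumptions, `nncf.Dataset(frames, frame_to_tensor)` would stand in for `nncf.Dataset(list(range(300)), transform_fn)` in the script, where `frames` is a list of a few hundred frames sampled from the deployment site.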
+### Locating the OCR Recognizer
+The PaddleOCR recognizer ships inside the same archive:
+```text
+./models/yolov8_license_plate_detector/license-plate-reader/models/ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml
+```
+> **Note:** PaddleOCR PP-OCRv4 is a CTC sequence model.
+> To convert its raw tensor output into a recognized plate string, DLStreamer's
+> `gvaclassify` element requires a `model-proc` JSON with a CTC decoder
+> converter and a character labels file.
+> Neither file is bundled with the archive or with the DLStreamer 2026.0.0
+> sample model_procs.
+> Without them the pipeline still runs end-to-end and produces per-plate ROI metadata,
+> but the OCR `label` field on each detected plate is an empty string.
+> For a production deployment, supply your own `model-proc` (see
+> [DLStreamer model_proc reference](https://dlstreamer.github.io/dev_guide/model_proc_file.html))
+> with the PaddleOCR character dictionary; until then, treat the OCR stage as
+> a placeholder.
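To illustrate what a CTC decoder step does with the recognizer's raw output, here is a minimal greedy-decoding sketch in NumPy. It is not DLStreamer's converter. It assumes the per-plate output has shape `(T, C)` after dropping the batch dimension, that class index 0 is the CTC blank, and that `charset[i - 1]` is the character for class `i`; all three should be checked against the actual model and dictionary. `ctc_greedy_decode` is a hypothetical name.

```python
import numpy as np

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(logits: np.ndarray, charset: list[str]) -> str:
    """Greedy CTC decoding: argmax per step, merge repeats, drop blanks.

    logits:  array of shape (T, C), one score row per time step.
    charset: characters for class indices 1..C-1.
    """
    best = logits.argmax(axis=1)
    chars = []
    previous = BLANK
    for idx in best:
        # Emit a character only when the class is not blank and differs
        # from the class chosen at the previous time step.
        if idx != BLANK and idx != previous:
            chars.append(charset[idx - 1])
        previous = idx
    return "".join(chars)


if __name__ == "__main__":
    demo_charset = list("ABC123")
    # One-hot scores for the class sequence A A _ A B _ _ 1 1 2.
    demo_logits = np.eye(7)[[1, 1, 0, 1, 2, 0, 0, 4, 4, 5]]
    print(ctc_greedy_decode(demo_logits, demo_charset))  # prints AAB12
```

The blank between the two `A` steps is what lets a doubled character survive the repeat-merging rule.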
+### DLStreamer Sample
+The sample below builds the two-stage detection plus OCR pipeline using the Python GStreamer bindings.
+The `gvadetect` element runs the license plate detector; `gvaclassify` then runs the PaddleOCR recognizer on each detected plate region.
+A buffer probe extracts the recognized text from the `GstGVAJSONMeta` payload attached to each frame.
+```python
+import json
+import os
+import gi
+gi.require_version("Gst", "1.0")
+from gi.repository import Gst
+Gst.init(None)
+MODELS_DIR = os.path.abspath("./models/yolov8_license_plate_detector")
+DETECTOR_XML = f"{MODELS_DIR}/yolov8_license_plate_detector_int8.xml"
+OCR_XML = (
+    f"{MODELS_DIR}/license-plate-reader/models/"
+    "ch_PP-OCRv4_rec_infer/ch_PP-OCRv4_rec_infer.xml"
+)
+INPUT_VIDEO = "test_video.mp4"
+pipeline_str = (
+    f"filesrc location={INPUT_VIDEO} ! decodebin3 ! videoconvert ! "
+    f"video/x-raw,format=BGR ! "
+    f"gvadetect model={DETECTOR_XML} device=CPU threshold=0.5 ! queue ! "
+    f"gvaclassify model={OCR_XML} device=CPU inference-region=roi-list ! "
+    f"queue ! gvametaconvert format=json add-tensor-data=false ! "
+    f"gvawatermark ! videoconvert ! autovideosink name=sink"
+)
+pipeline = Gst.parse_launch(pipeline_str)
+def on_buffer(pad, info):
+    buf = info.get_buffer()
+    meta_iter = buf.iterate_meta()
+    while True:
+        ok, meta = meta_iter.next()
+        if not ok:
+            break
+        if meta.__gtype__.name != "GstGVAJSONMetaAPI":
+            continue
+        try:
+            payload = json.loads(meta.get_message())
+        except (AttributeError, ValueError):
+            continue
+        for obj in payload.get("objects", []):
+            label = obj.get("detection", {}).get("label", "")
+            text = ""
+            for tensor in obj.get("tensors", []):
+                if tensor.get("layer_name") and "label" in tensor:
+                    text = tensor["label"]
+                    break
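+            # Print every labelled plate; `text` stays empty until a CTC
+            # model-proc is supplied (see the OCR note above).
+            if label:
+                print(f"Plate: {text}  bbox={obj.get('x')},{obj.get('y')}")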
+    return Gst.PadProbeReturn.OK
+sink = pipeline.get_by_name("sink")
+sink_pad = sink.get_static_pad("sink")
+sink_pad.add_probe(Gst.PadProbeType.BUFFER, on_buffer)
+pipeline.set_state(Gst.State.PLAYING)
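+bus = pipeline.get_bus()
+msg = bus.timed_pop_filtered(
+    Gst.CLOCK_TIME_NONE,
+    Gst.MessageType.EOS | Gst.MessageType.ERROR,
+)
+# Report pipeline failures instead of exiting silently.
+if msg is not None and msg.type == Gst.MessageType.ERROR:
+    err, debug = msg.parse_error()
+    print(f"Pipeline error: {err.message} ({debug})")
+pipeline.set_state(Gst.State.NULL)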
+```
+To run on an integrated GPU, change both `device=CPU` properties to `device=GPU` and insert `vapostproc` after `decodebin3` for zero-copy color conversion.
+### Try It on a Sample Video
+Download a publicly hosted Intel sample clip that contains vehicles with visible license plates:
+```bash
+wget -O test_video.mp4 \
+  https://github.com/intel-iot-devkit/sample-videos/raw/master/car-detection.mp4
+```
+Run the DLStreamer sample above.
+A window opened by `autovideosink` shows each decoded frame with a green bounding box drawn by `gvawatermark` around every detected plate.
+The buffer probe prints one line per detected plate per frame.
+> **Note:** The INT8 detector built by `export_and_quantize.sh` with random
+> calibration tensors typically detects only one or two plates across this
+> short clip at the documented `threshold=0.5`.
+> For a richer demo run, swap `DETECTOR_XML` to the bundled FP32 IR and lower
+> the threshold:
+>
+> ```python
+> DETECTOR_XML = (
+>     f"{MODELS_DIR}/license-plate-reader/models/yolov8n/"
+>     "yolov8n_retrained.xml"
+> )
+> ```
+>
+> and change `threshold=0.5` to `threshold=0.3` in `pipeline_str`.
+Without a custom `model-proc` for PP-OCRv4 (see the OCR note above), the recognized `text` field is empty even though the detector and the OCR network both run on every plate ROI:
+```text
+Plate:   bbox=395,373
+Plate:   bbox=520,419
+```
+Once you supply a CTC model-proc and PaddleOCR character labels, the same lines will include the decoded plate string, for example:
+```text
+Plate: ABC1234  bbox=812,442
+Plate: ZN98YX   bbox=305,388
+```
+If you only need the structured output and not the live preview, replace `autovideosink` with `fakesink` in `pipeline_str` and pipe the console output to a file.
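If the console output is captured to a file as described above, the lines can be turned back into records with a small parser. The sketch below assumes the exact `Plate: <text>  bbox=<x>,<y>` format printed by the buffer probe in this README; `PlateRead` and `parse_plate_line` are illustrative names.

```python
import re
from typing import NamedTuple, Optional


class PlateRead(NamedTuple):
    text: str
    x: int
    y: int


# Matches the probe's output, e.g. "Plate: ABC1234  bbox=812,442".
_LINE = re.compile(
    r"^Plate:\s*(?P<text>.*?)\s+bbox=(?P<x>-?\d+),(?P<y>-?\d+)\s*$"
)


def parse_plate_line(line: str) -> Optional[PlateRead]:
    """Parse one console line; return None for lines that do not match."""
    match = _LINE.match(line)
    if match is None:
        return None
    return PlateRead(match["text"], int(match["x"]), int(match["y"]))
```

Lines with an empty plate string, as produced when no `model-proc` is supplied, parse to a record whose `text` is `""`, so they can be filtered out downstream.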
+---
+## License
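+Copyright (C) Intel Corporation. All rights reserved.
+The scripts and documentation are licensed under the MIT License. The YOLOv8 detector weights (AGPL-3.0) and the PP-OCRv4 recognizer (Apache-2.0) carry their own licenses. See [LICENSE](LICENSE) for details.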
+## References
+- [DLStreamer License Plate Recognition Sample](https://github.com/open-edge-platform/dlstreamer/tree/main/samples/gstreamer/gst_launch/license_plate_recognition)
+- [Intel Edge AI Resources -- License Plate Reader Model](https://github.com/open-edge-platform/edge-ai-resources)
+- [PaddleOCR PP-OCRv4](https://github.com/PaddlePaddle/PaddleOCR)
+- [Ultralytics YOLOv8 Documentation](https://docs.ultralytics.com/models/yolov8/)
+- [OpenVINO Documentation](https://docs.openvino.ai/)
+- [NNCF Post-Training Quantization](https://docs.openvino.ai/latest/nncf_ptq_introduction.html)
+- [Intel DLStreamer](https://dlstreamer.github.io/)

export_and_quantize.sh ADDED

	@@ -0,0 +1,89 @@

+#!/usr/bin/env bash
+# SPDX-License-Identifier: MIT
+# Copyright (C) Intel Corporation
+#
+# Download the YOLOv8 license plate detector, quantize it to INT8 with NNCF,
+# and stage the PaddleOCR PP-OCRv4 recognizer for use with Intel DLStreamer.
+# Usage: ./export_and_quantize.sh [MODELS_DIR]
+# Example: ./export_and_quantize.sh ./models
+set -euo pipefail
+MODELS_DIR="${1:-./models}"
+LP_DETECTOR_NAME="yolov8_license_plate_detector"
+LP_DETECTOR_URL="https://github.com/open-edge-platform/edge-ai-resources/raw/main/models/license-plate-reader.zip"
+OCR_NAME="ch_PP-OCRv4_rec_infer"
+mkdir -p "${MODELS_DIR}"
+echo "--- Installing dependencies ---"
+pip install -qU "openvino>=2026.0.0" "nncf>=3.0.0"
+echo "--- Downloading ${LP_DETECTOR_NAME} (OpenVINO IR) ---"
+LP_DIR="${MODELS_DIR}/${LP_DETECTOR_NAME}"
+mkdir -p "${LP_DIR}"
+TMP_ZIP="$(mktemp --suffix=.zip)"
+trap 'rm -f "${TMP_ZIP}"' EXIT
+curl -fsSL "${LP_DETECTOR_URL}" -o "${TMP_ZIP}"
+unzip -oq "${TMP_ZIP}" -d "${LP_DIR}"
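+# Exclude the INT8 output so a re-run does not pick up an already quantized model.
+LP_XML="$(find "${LP_DIR}" \( -name "*retrained*.xml" -o -name "*license*plate*.xml" \) ! -name "*_int8.xml" | head -n1)"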
+if [[ -z "${LP_XML}" ]]; then
+    LP_XML="$(find "${LP_DIR}" -path "*/yolov8n/*.xml" | head -n1)"
+fi
+if [[ -z "${LP_XML}" ]]; then
+    echo "Error: license plate detector .xml not found under ${LP_DIR}" >&2
+    exit 1
+fi
+echo "Found detector model: ${LP_XML}"
+echo "--- Quantizing license plate detector to INT8 with NNCF ---"
+LP_INT8_XML="${LP_DIR}/${LP_DETECTOR_NAME}_int8.xml"
+python3 - <<PY
+import nncf
+import numpy as np
+import openvino as ov
+core = ov.Core()
+model = core.read_model("${LP_XML}")
+input_shape = model.inputs[0].partial_shape
+h = int(input_shape[2].get_length()) if input_shape[2].is_static else 640
+w = int(input_shape[3].get_length()) if input_shape[3].is_static else 640
+def transform_fn(_):
+    return np.random.rand(1, 3, h, w).astype(np.float32)
+calibration_dataset = nncf.Dataset(list(range(300)), transform_fn)
+quantized = nncf.quantize(
+    model,
+    calibration_dataset,
+    preset=nncf.QuantizationPreset.MIXED,
+    subset_size=300,
+)
+ov.save_model(quantized, "${LP_INT8_XML}")
+print("Quantization complete: ${LP_INT8_XML}")
+PY
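+echo "--- Staging OCR model (${OCR_NAME}) ---"
+OCR_DIR="${MODELS_DIR}/${OCR_NAME}"
+# The archive extracted above normally bundles the converted recognizer, so
+# search the extraction directory before falling back to a standalone copy.
+OCR_XML="$(find "${LP_DIR}" -name "${OCR_NAME}.xml" | head -n1)"
+if [[ -z "${OCR_XML}" && -f "${OCR_DIR}/${OCR_NAME}.xml" ]]; then
+    OCR_XML="${OCR_DIR}/${OCR_NAME}.xml"
+fi
+if [[ -n "${OCR_XML}" ]]; then
+    echo "OCR model found at ${OCR_XML}"
+else
+    cat <<EOM
+The PaddleOCR PP-OCRv4 recognizer was not found in the extracted archive; it requires Paddle to OpenVINO IR conversion.
+Use the official Intel DLStreamer downloader to fetch and convert it:
+    export MODELS_PATH="\$(pwd)/${MODELS_DIR}"
+    /opt/intel/dlstreamer/samples/download_public_models.sh ${OCR_NAME}
+The converted model will be placed under \${MODELS_PATH}/public/${OCR_NAME}/.
+EOM
+fi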
+echo "--- Benchmarking license plate detector ---"
+benchmark_app -m "${LP_INT8_XML}" -d CPU -niter 50 -api async
+echo "--- Done ---"