---
language:
- en
base_model:
- ds4sd/docling-models
pipeline_tag: object-detection
---
# Docling Model for Layout

This is the **Docling model for layout detection**, packaged so that it can be loaded and used through the Transformers library like any other Hugging Face model.

This model is part of the [Docling repository](https://huggingface.co/ds4sd/docling-models), which provides document layout analysis tools.

## **Usage Example**
The example below loads the model, runs it on a page image, and prints the detected layout elements:

```python
import torch
from PIL import Image
from transformers import RTDetrForObjectDetection, RTDetrImageProcessor

# Load the model and its matching image processor from the Hub
image_processor = RTDetrImageProcessor.from_pretrained("HuggingPanda/docling-layout")
model = RTDetrForObjectDetection.from_pretrained("HuggingPanda/docling-layout")

# Load a page image and make sure it has three colour channels
image = Image.open("hocr_output_page-0001.jpg").convert("RGB")

# Preprocess the image, resizing it to 640x640
resize = {"height": 640, "width": 640}
inputs = image_processor(
    images=image,
    return_tensors="pt",
    size=resize,
)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Post-process into boxes in original-image pixel coordinates.
# PIL's image.size is (width, height); target_sizes expects (height, width).
results = image_processor.post_process_object_detection(
    outputs,
    target_sizes=torch.tensor([image.size[::-1]]),
    threshold=0.3,
)

# Print each detection as "<label>: <score> [x_min, y_min, x_max, y_max]"
for result in results:
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        score, label = score.item(), label_id.item()
        box = [round(i, 2) for i in box.tolist()]
        # Note: the label index is offset by one when looking up id2label
        print(f"{model.config.id2label[label+1]}: {score:.2f} {box}")
```

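To make the `target_sizes` argument in the example more concrete, the sketch below shows the kind of rescaling that post-processing performs for a single box. It assumes the raw prediction is a normalized `(cx, cy, w, h)` box, as is common for DETR-style detectors; the helper name `to_pixel_xyxy` and the sample numbers are illustrative and not part of the Transformers API.

```python
def to_pixel_xyxy(box_cxcywh, image_width, image_height):
    """Convert a normalized (cx, cy, w, h) box to pixel (x_min, y_min, x_max, y_max).

    Hypothetical helper for illustration only; in practice
    post_process_object_detection handles this step.
    """
    cx, cy, w, h = box_cxcywh
    x_min = (cx - w / 2) * image_width
    y_min = (cy - h / 2) * image_height
    x_max = (cx + w / 2) * image_width
    y_max = (cy + h / 2) * image_height
    return [x_min, y_min, x_max, y_max]


# Made-up example: a box centred on a 1000x800 (width x height) page
example_box = to_pixel_xyxy((0.5, 0.5, 0.5, 0.25), image_width=1000, image_height=800)
print(example_box)
```

Because the boxes come back in pixel coordinates of the original image, they can usually be passed straight to drawing or cropping code without further scaling.
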
## **Model Information**
- **Base Model:** RT-DETR (Real-Time Detection Transformer)
- **Intended Use:** Layout detection for documents
- **Framework:** [Hugging Face Transformers](https://huggingface.co/docs/transformers/index)
- **Dataset Used:** Internal dataset for document structure recognition
- **License:** Apache 2.0

## **Citing This Model**
If you use this model in your work, please cite the main **Docling repository**:

```
@misc{docling2024,
  title={Docling Models for Document Layout Analysis},
  author={DS4SD Team},
  year={2024},
  howpublished={Hugging Face Repository},
  url={https://huggingface.co/ds4sd/docling-models}
}
```

For more details, visit the main repo: [ds4sd/docling-models](https://huggingface.co/ds4sd/docling-models).

## **Contact**
For questions or issues, please open a discussion on Hugging Face or contact [pandahd75@gmail.com](mailto:pandahd75@gmail.com).