
docling-project/docling

docling

Get your documents ready for gen AI

Builder

docling-project • individual

Stars

56,953

Using upstream star count

Forks

3,861

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jul 9, 2024

Project creation date

README Summary

Docling is a Python library that converts documents from various formats (PDF, DOCX, PPTX, images, HTML) into structured formats optimized for generative AI applications. It provides advanced PDF understanding capabilities including layout analysis, table structure recognition, and metadata extraction. The tool offers both programmatic APIs and CLI interfaces for seamless document processing workflows.
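The programmatic API mentioned in the summary can be sketched roughly as follows. This is a minimal illustration, assuming the `docling` package is installed; `SUPPORTED_SUFFIXES`, `is_candidate`, and `to_markdown` are hypothetical names introduced here, and the suffix set is only inferred from the formats the summary lists, not taken from Docling's own format registry.

```python
from pathlib import Path

# Suffixes inferred from the formats named in the summary (PDF, DOCX, PPTX,
# images, HTML). Illustrative assumption, not Docling's authoritative list.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".pptx", ".html", ".png", ".jpg", ".jpeg"}


def is_candidate(path: str) -> bool:
    """Return True if the file suffix matches one of the formats in the summary."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def to_markdown(source: str) -> str:
    """Convert a single document to Markdown using Docling.

    The import is deferred so this module still loads when docling is absent.
    """
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)  # source may be a local path or a URL
    return result.document.export_to_markdown()
```

A caller would typically filter inputs with `is_candidate(...)` and pass the survivors to `to_markdown(...)`. The CLI route is the `docling` command followed by a file path or URL; its options are not covered on this page.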

AI Dev Skills

Unmapped

Document Processing, Optical Character Recognition (OCR), Computer Vision for Document Analysis, Multi-modal Content Extraction, Retrieval-Augmented Generation (RAG) Preprocessing, Document Layout Analysis, Table Structure Recognition, PDF Processing, Image-to-Text Conversion

Tags

Document Processing, Optical Character Recognition (OCR), Computer Vision for Document Analysis, Multi-modal Content Extraction, Retrieval-Augmented Generation (RAG) Preprocessing, Document Layout Analysis, Table Structure Recognition, PDF Processing, Image-to-Text Conversion, Image, Document Digitization and Archival, Self-hosted, Financial Services, Retrieval-Augmented Generation (RAG), RAG System Data Preparation, Tabular, Enterprise Document Management, Multimodal AI, Document Content Search and Indexing, Text, Automated Document Processing Pipelines, Table Data Extraction, Document Question Answering, Docker Containers, Healthcare, Enterprise Knowledge Base Creation, Multimodal, Research and Academia, Compliance and Audit, Document AI, Compound AI Systems, Cloud API, On-premise, Legal Tech, Multi-format Document Ingestion, Python, CLI

Taxonomy

Recent Activity

Updated 24 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
beta

Categories

MLOps & Infrastructure (Primary), Dev Tools & Automation, Learning Resources, RAG & Retrieval, Evals & Benchmarking, NLP & Text, ML Platform & Infrastructure, Healthcare & Biology, Finance & Legal, Multimodal AI, Edge & Mobile AI, Search & Knowledge, Other AI / ML, Foundation Models, Computer Vision

PM Skills

Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created
Jul 9, 2024
Forked
Mar 22, 2026
Your last push
24 days ago
Upstream last push
8 days ago
Tracked since
Mar 20, 2026
