PaddlePaddle/PaddleOCR

PaddleOCR

Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.

Builder

PaddlePaddle

PaddlePaddle • individual

Stars

74,770

Using upstream star count

Forks

10,170

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

May 8, 2020

Project creation date

README Summary

PaddleOCR is a comprehensive OCR (Optical Character Recognition) toolkit developed by PaddlePaddle that can extract text from images and PDF documents. It supports over 100 languages and provides both text detection and recognition capabilities with pretrained models. The toolkit is designed to convert visual documents into structured text data that can be easily processed by AI systems and large language models.

AI Dev Skills

Unmapped

Optical Character Recognition (OCR), Computer Vision, Deep Learning, Text Detection and Recognition, Document Processing, Convolutional Neural Networks, Model Optimization, Mobile AI Deployment, Multilingual NLP, Document Layout Analysis

Tags

Optical Character Recognition (OCR), Computer Vision, Deep Learning, Text Detection and Recognition, Document Processing, Convolutional Neural Networks, Model Optimization, Mobile AI Deployment, Multilingual NLP, Document Layout Analysis, Self-hosted, Cross-lingual AI, Document Digitization, Form Recognition, Content Moderation, Image, Invoice Processing, Legal Tech, Cloud API, Document AI, Automated Data Entry, Model Compression, PDF Text Extraction, Document Question Answering, Government, Insurance, Text, Publishing, Edge AI, Business Card Processing, Edge/Mobile, On-premise, Healthcare, Multimodal, License Plate Recognition, Education, Docker, Multimodal AI, On-device AI, FinTech, Serverless, Document Management, Receipt Scanning, Python

Taxonomy

Recent Activity

Updated 28 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: production

Categories

MLOps & Infrastructure (Primary), Industry: FinTech, RAG & Retrieval, Inference & Serving, Healthcare & Biology, Finance & Legal, Multimodal AI, Edge & Mobile AI, Search & Knowledge, Other AI / ML, NLP & Text, Computer Vision, Robotics, Foundation Models

PM Skills

Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created: May 8, 2020
Forked: Mar 16, 2026
Your last push: 28 days ago
Upstream last push: 7 days ago
Tracked since: Mar 16, 2026
