tesseract-ocr/tesseract
tesseract
Tesseract Open Source OCR Engine (main repository)
Builder

tesseract-ocr • individual
Stars
73,290
Forks
10,573
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Aug 12, 2014
README Summary
Tesseract is an open-source Optical Character Recognition (OCR) engine originally developed at HP, developed by Google from 2006 to 2018, and now maintained by the open-source community. It supports over 100 languages and recognizes text in images such as PNG, JPEG, and TIFF, with output as plain text, hOCR, PDF, or TSV. The engine provides both a command-line tool and a library API for integration into applications.
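As a rough illustration of the command-line route mentioned in the summary, the sketch below wraps the `tesseract` executable from Python. It assumes the binary is installed and on `PATH`; the helper names `build_tesseract_command` and `ocr_image` and the file name `scan.png` are hypothetical and not part of the project. The `stdout` output base and the `-l` and `--psm` options are standard Tesseract CLI arguments.

```python
import shutil
import subprocess
from typing import List, Optional


def build_tesseract_command(
    image_path: str, lang: str = "eng", psm: Optional[int] = None
) -> List[str]:
    """Build the argument list for one OCR run (hypothetical helper).

    Passing "stdout" as the output base makes tesseract print the
    recognized text instead of writing an output file.
    """
    cmd = ["tesseract", image_path, "stdout", "-l", lang]
    if psm is not None:
        # --psm selects the page segmentation mode.
        cmd += ["--psm", str(psm)]
    return cmd


def ocr_image(image_path: str, lang: str = "eng") -> str:
    """Run tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract fails
    )
    return result.stdout


if __name__ == "__main__":
    # Only the command is printed here; nothing is executed.
    print(build_tesseract_command("scan.png", lang="eng", psm=6))
```

Keeping command construction separate from execution makes the arguments easy to inspect or log before any image is processed. Applications that need tighter integration would normally link against the library API instead of spawning a process.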
AI Dev Skills
Unmapped
Optical Character Recognition, Computer Vision, Image Preprocessing, Feature Extraction, Pattern Recognition, Neural Networks for Text Recognition, Language Modeling, Deep Learning for OCR
Tags
Optical Character Recognition, Computer Vision, Image Preprocessing, Feature Extraction, Pattern Recognition, Neural Networks for Text Recognition, Language Modeling, Deep Learning for OCR, Text Extraction from Images, Automated Data Entry, Document Digitization, Accessibility Text Reading, Government, On-premise, Edge Computing, Self-hosted, Cloud API, Traditional Machine Learning, Education, Edge/Mobile, Historical Document Preservation, On-device AI, Document Management, Publishing, Form Processing, License Plate Recognition, Healthcare, Financial Services, Image, Legal Tech, Archival Services, Invoice Processing, Text, C++, CLI
Recent Activity
Updated 28 days ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality: high
- Maturity: production
Categories
Healthcare & Biology (Primary), Finance & Legal, Edge & Mobile AI, Other AI / ML, Computer Vision
PM Skills
Product Discovery
Languages
C++ 100.0%
Timeline
- Project created: Aug 12, 2014
- Forked: Mar 16, 2026
- Your last push: 28 days ago
- Upstream last push: 15 days ago
- Tracked since: Mar 16, 2026