madmaze/pytesseract

pytesseract

A Python wrapper for Google Tesseract

Builder

madmaze • individual

Stars

6,328

Using upstream star count

Forks

753

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Oct 27, 2010

Project creation date

README Summary

pytesseract is a Python wrapper for Google's Tesseract-OCR Engine that allows developers to extract text from images using optical character recognition. The library provides a simple interface to perform OCR operations on various image formats and includes functionality for detecting text orientation, script identification, and confidence scoring. It supports multiple output formats including plain text, bounding boxes, and structured data formats.
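As a rough illustration of the structured output and confidence scoring mentioned in the summary, the sketch below keeps only the words that Tesseract reports with reasonable confidence. It assumes the dictionary layout returned by `pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT)`, which is a set of parallel lists under keys such as `text`, `conf`, `left`, `top`, `width`, and `height`. The helper names `confident_words` and `ocr_words` and the threshold of 60 are hypothetical choices for this example, not part of the library.

```python
from typing import Any, Dict, List


def confident_words(data: Dict[str, List[Any]], min_conf: float = 60.0) -> List[Dict[str, Any]]:
    """Keep non-empty words whose confidence is at least min_conf.

    `data` is assumed to follow the image_to_data DICT layout:
    parallel lists keyed by 'text', 'conf', 'left', 'top', 'width', 'height'.
    """
    words = []
    rows = zip(data["text"], data["conf"], data["left"],
               data["top"], data["width"], data["height"])
    for text, conf, left, top, width, height in rows:
        # Layout rows (blocks, lines) carry empty text and a confidence of -1.
        if not str(text).strip():
            continue
        # Confidence may arrive as int, float, or string depending on version.
        if float(conf) < min_conf:
            continue
        words.append({
            "text": text,
            "conf": float(conf),
            "box": (left, top, width, height),
        })
    return words


def ocr_words(image_path: str, min_conf: float = 60.0) -> List[Dict[str, Any]]:
    """Run OCR on an image file and return confident words (hypothetical helper).

    Requires the Tesseract binary plus the pytesseract and Pillow packages,
    so the imports are deferred until the function is actually called.
    """
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return confident_words(data, min_conf)
```

Keeping the filtering step separate from the OCR call means it can be exercised on a hand-built dictionary, without Tesseract installed. For plain text alone, `pytesseract.image_to_string(Image.open(path))` is the shorter route.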

AI Dev Skills

Unmapped

Optical Character Recognition (OCR), Computer Vision, Image Preprocessing, Text Extraction, Document Analysis

Tags

Optical Character Recognition (OCR), Computer Vision, Image Preprocessing, Text Extraction, Document Analysis, On-premise, Document Digitization, Text, Receipt Parsing, Financial Services, Text Extraction from Screenshots, Publishing, Government, Education, Edge Computing, Form Processing, Self-hosted, Document Management, PDF Text Extraction, Healthcare, Edge/Mobile, Cloud API, Document AI, Image, Invoice Processing, License Plate Recognition, Legal Tech, On-device AI, Handwritten Text Recognition, Python

Taxonomy

Recent Activity

Updated 28 days ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
production

Categories

NLP & Text (primary), Healthcare & Biology, Finance & Legal, Edge & Mobile AI, Other AI / ML, Computer Vision

PM Skills

Product Discovery

Languages

Python 100.0%

Timeline

Project created
Oct 27, 2010
Forked
Mar 23, 2026
Your last push
28 days ago
Upstream last push
28 days ago
Tracked since
Mar 16, 2026
