Reporium

ocrmypdf/OCRmyPDF

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched.

Builder

ocrmypdf • individual

Stars

33,740

Using upstream star count

Forks

2,338

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Dec 20, 2013

Project creation date

README Summary

README copyright: 2014 Julien Pfefferkorn, 2015 James R. Barlow. README license (SPDX): CC-BY-SA-4.0.

AI Dev Skills

Unmapped

Computer Vision for Text Recognition • Document Image Processing • Image Preprocessing and Enhancement • Optical Character Recognition (OCR) • PDF Manipulation

Tags

Computer Vision for Text Recognition • Document Image Processing • Image Preprocessing and Enhancement • Optical Character Recognition (OCR) • PDF Manipulation • Apple Vision • CLI Tool • Docker • Forked • Node.js • Open Source • PyTorch • Python

Taxonomy

AI Trends

On-device AI • Document AI

category

Dev Tools & Automation • Model Training • Spatial & XR • MLOps & Infrastructure • Learning Resources

Deployment Context

Self-hosted • On-premise • Cloud

Industries

Legal Tech • Healthcare • Finance • Government • Education • Publishing • Records Management

Modalities

Image • Text

Skill Areas

Optical Character Recognition (OCR) • Document Image Processing • PDF Manipulation • Computer Vision for Text Recognition • Image Preprocessing and Enhancement

tag

Apple Vision • CLI Tool • Docker • Forked • Node.js • Open Source • PyTorch • Python

Use Cases

Document Digitization • Legacy Document Search • PDF Text Extraction • Archive Digitization • Compliance Document Processing • Legal Discovery

Recent Activity

Updated 2 months ago

7 Days

0

30 Days

0

90 Days

3

Fix verapdf NotADirectoryError crash on some platforms

James R. Barlow • Mar 10, 2026

57bb554

Add --no-overwrite / -n option to prevent overwriting output files

James R. Barlow • Mar 10, 2026

5b9d6f9

Fix optimize=2/3 crash when using Python API

James R. Barlow • Mar 10, 2026

b588e3b

Quality

Quality: high
Maturity: production

Categories

Dev Tools & Automation (Primary) • Spatial & XR • MLOps & Infrastructure • Learning Resources • Model Training • Computer Vision

PM Skills

Scale & Reliability • Developer Platform

Languages

Python 100.0%

Timeline

Project created: Dec 20, 2013
Forked: Mar 16, 2026
Your last push: 2 months ago
Upstream last push: 22 days ago
Tracked since: Mar 16, 2026

Similar Repos

pgvector cosine similarity · $0