Reporium

pymupdf/PyMuPDF

PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.


Builder

pymupdf • individual

Stars

9,857

Using upstream star count

Forks

729

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Oct 6, 2012

Project creation date

README Summary

**PyMuPDF** is a high performance **Python** library for data extraction, analysis, conversion & manipulation of [PDF (and other) documents](https://pymupdf.readthedocs.io/en/latest/the-basics.html#supported-file-types).


AI Dev Skills

Unmapped

Binary File Handling, Computer Vision, Computer Vision for Document Analysis, Content Analysis and Transformation, Data Extraction, Data Extraction from Documents, Document Processing, Document Understanding, Format Conversion, Image Recognition and OCR Preparation, Layout Analysis, Metadata Extraction, PDF Parsing, PDF Parsing and Analysis, PDF Processing, Performance Optimization, Text Extraction and Analysis, Text Recognition

Tags

Binary File Handling, Computer Vision, Computer Vision for Document Analysis, Content Analysis and Transformation, Data Extraction, Data Extraction from Documents, Document Processing, Document Understanding, Format Conversion, Image Recognition and OCR Preparation, Layout Analysis, Metadata Extraction, PDF Parsing, PDF Parsing and Analysis, PDF Processing, Performance Optimization, Text Extraction and Analysis, Text Recognition, Forked, Grasping, Python

Taxonomy

AI Trends

Document AI, Compound AI Systems, Multimodal Reasoning, Retrieval-Augmented Generation, Document Intelligence, Multimodal Data Processing

Category

Robotics

Deployment Context

Self-hosted, On-premise, Serverless, Cloud API, Batch Processing

Industries

Legal Tech, FinTech, Healthcare, Document Management, Enterprise Automation, Education, Enterprise Content Management, Healthcare Records Processing, Archival Systems, Content Management Systems

Modalities

Text, Image, Tabular, Document

Skill Areas

Document Understanding, Data Extraction, Computer Vision, Text Recognition, PDF Processing, Format Conversion, Metadata Extraction, Layout Analysis, PDF Parsing, Document Processing, PDF Parsing and Analysis, Data Extraction from Documents, Image Recognition and OCR Preparation, Content Analysis and Transformation, Text Extraction and Analysis, Computer Vision for Document Analysis, Binary File Handling, Performance Optimization

Tag

Active, Forked, Grasping, Python

Use Cases

Document data extraction from PDFs, Automated form processing, Document format conversion, Text and image extraction from documents, Document analysis and parsing, OCR preparation and support, Document manipulation and editing, Metadata extraction and analysis, PDF Data Extraction, Document Conversion and Format Transformation, Text and Image Extraction from Documents, Document Analysis and Metadata Extraction, Preparation of Documents for Machine Learning, Automated Document Processing Pipelines, Content Indexing and Search, Automated PDF data extraction, Text and content extraction from PDFs, Document analysis and indexing, Form field extraction and processing, Document preprocessing for ML pipelines, Batch document processing

Recent Activity

Updated 2 months ago

7 Days

0

30 Days

0

90 Days

20

tests/test_pixmap.py: fix test_4435() with latest mupdf master.

Julian Smith • Mar 24, 2026

564f14d

tests/test_codespell.py: reduce length of codespell command line.

Julian Smith • Mar 23, 2026

b2b528d

docs/faq/index.rst: new item: Can I use multithreading with PyMuPDF, perhaps with free-threading Pyt

Julian Smith • Mar 23, 2026

a531669

Quality

Quality
high
Maturity
production

Categories

Robotics (Primary)

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created
Oct 6, 2012
Forked
Mar 29, 2026
Your last push
2 months ago
Upstream last push
23 days ago
Tracked since
Mar 27, 2026

Similar Repos

pgvector cosine similarity · $0
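
The label above indicates that similar repos are ranked by cosine similarity over vectors stored with pgvector. The page does not say which embeddings or queries are used, so the following is only a minimal pure-Python sketch of the underlying measure. The function names, repo names, and three-dimensional vectors are all made up for illustration. For reference, pgvector's `<=>` operator returns cosine distance, which is 1 minus the value computed here.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_similar(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k candidate names with their similarity, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; real ones would have hundreds of dimensions.
query_embedding = [1.0, 0.0, 0.0]
repo_embeddings = {
    "repo_a": [0.9, 0.1, 0.0],   # points almost the same way as the query
    "repo_b": [0.0, 1.0, 0.0],   # orthogonal to the query
    "repo_c": [-1.0, 0.0, 0.0],  # points the opposite way
}

ranking = rank_similar(query_embedding, repo_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice for ranking text embeddings.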
