
opendataloader-project/opendataloader-pdf

opendataloader-pdf

PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.

Builder

opendataloader-project

opendataloader-project • individual

Stars

11,019

Using upstream star count

Forks

827

Using upstream fork count

Open Issues

0

Activity Score

0/100

131 commits in 30d

Created

May 13, 2025

Project creation date

README Summary

OpenDataLoader PDF is an open-source, Java-based PDF parser that converts PDF documents into AI-ready data formats. The project focuses on automating PDF accessibility and making document content easy for AI systems and data processing pipelines to consume.

AI Dev Skills

Unmapped

Document Processing, Data Extraction, Text Processing, Document Understanding, Data Pipeline Engineering

Tags

Document Processing, Data Extraction, Text Processing, Document Understanding, Data Pipeline Engineering, Government, AI Data Preparation, PDF Accessibility Enhancement, Document Digitization, Document Preprocessing for AI, Data Pipeline Automation, Legal Tech, Healthcare, Document AI, Document, Structured Data Generation, Text, Education, On-premise, Financial Services, Document Content Extraction, Self-hosted, Java

Taxonomy

Recent Activity

Updated 24 days ago

7 Days

13

30 Days

131

90 Days

246

Quality

Quality
medium
Maturity
prototype

Categories

MLOps & Infrastructure (Primary), Dev Tools & Automation, RAG & Retrieval, ML Platform & Infrastructure, Healthcare & Biology, Finance & Legal, Other AI / ML

PM Skills

Scale & Reliability, Developer Platform

Languages

Java 100.0%

Timeline

Project created
May 13, 2025
Forked
Mar 22, 2026
Your last push
24 days ago
Upstream last push
7 days ago
Tracked since
Mar 20, 2026

Similar Repos

pgvector cosine similarity · $0
