opendataloader-project/opendataloader-pdf
opendataloader-pdf
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
Builder

opendataloader-project • individual
Stars: 11,019 (upstream count)
Forks: 827 (upstream count)
Open Issues: 0
Activity Score: 0/100 (131 commits in 30d)
Created: May 13, 2025
README Summary
OpenDataLoader PDF is an open-source, Java-based PDF parser that converts PDF documents into AI-ready data formats. The project focuses on automating PDF accessibility and on making document content easy for AI systems and data processing pipelines to consume.
AI Dev Skills
Unmapped
Document Processing, Data Extraction, Text Processing, Document Understanding, Data Pipeline Engineering
Tags
Document Processing, Data Extraction, Text Processing, Document Understanding, Data Pipeline Engineering, Government, AI Data Preparation, PDF Accessibility Enhancement, Document Digitization, Document Preprocessing for AI, Data Pipeline Automation, Legal Tech, Healthcare, Document AI, Document, Structured Data Generation, Text, Education, On-premise, Financial Services, Document Content Extraction, Self-hosted, Java
Recent Activity
Updated 24 days ago
7 days: 13 commits
30 days: 131 commits
90 days: 246 commits
Quality
- Quality: medium
- Maturity: prototype
Categories
MLOps & Infrastructure (primary), Dev Tools & Automation, RAG & Retrieval, ML Platform & Infrastructure, Healthcare & Biology, Finance & Legal, Other AI / ML
PM Skills
Scale & Reliability, Developer Platform
Languages
Java 100.0%
Timeline
- Project created: May 13, 2025
- Forked: Mar 22, 2026
- Your last push: 24 days ago
- Upstream last push: 7 days ago
- Tracked since: Mar 20, 2026