Reporium

opendatalab/PDF-Extract-Kit

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Upstream: opendatalab/PDF-Extract-Kit

Builder

opendatalab • individual

Stars

9,682

Using upstream star count

Forks

730

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jun 27, 2024

Project creation date



AI Dev Skills

Unmapped

Computer Vision, Content Extraction, Deep Learning for Document AI, Document Layout Analysis, Multi-modal Document Understanding, Optical Character Recognition, PDF Processing

Tags

Computer Vision, Content Extraction, Deep Learning for Document AI, Document Layout Analysis, Multi-modal Document Understanding, Optical Character Recognition, PDF Processing, Document Processing, Evals, FinTech, Forked, HuggingFace, Large Language Models, Multimodal AI, Open Source, Python, Research / Papers, Synthetic Data, Tutorial

Taxonomy

AI Trends

Document AI, Multimodal Reasoning, Compound AI Systems

Category

Foundation Models, RAG & Retrieval, Model Training, Evals & Benchmarking, Computer Vision, Learning Resources, Industry: FinTech

Deployment Context

Self-hosted, On-premise, Cloud API

Industries

Legal Tech, Financial Services, Healthcare, Education, Publishing, Government, Consulting

Modalities

Text, Image, Tabular, Multimodal

Skill Areas

Computer Vision, Document Layout Analysis, Optical Character Recognition, Content Extraction, Multi-modal Document Understanding, PDF Processing, Deep Learning for Document AI

Tag

Computer Vision, Document Processing, Evals, FinTech, Forked, HuggingFace, Large Language Models, Multimodal AI, Open Source, Python, Research / Papers, Synthetic Data, Tutorial

Use Cases

Document Digitization, PDF Content Migration, Automated Data Entry, Document Analysis, Text Mining from PDFs, Table Extraction, Image Extraction from Documents

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: prototype

Categories

RAG & Retrieval (Primary), Evals & Benchmarking, Learning Resources, Industry: FinTech, Foundation Models, Model Training, Computer Vision, Finance & Legal, Multimodal AI, Search & Knowledge, Other AI / ML

PM Skills

User Experience, Data & Evaluation, Product Discovery

Languages

Python 100.0%

Timeline

Project created: Jun 27, 2024
Forked: Mar 16, 2026
Your last push: 1 year ago
Upstream last push: 1 year ago
Tracked since: Jan 3, 2025

Similar Repos

pgvector cosine similarity · $0
