opendatalab/PDF-Extract-Kit

PDF-Extract-Kit

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Builder

opendatalab • individual

Stars

9,536

Using upstream star count

Forks

719

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jun 27, 2024

Project creation date

README Summary

PDF-Extract-Kit is a comprehensive Python toolkit designed for high-quality PDF content extraction, providing advanced capabilities for parsing and extracting text, images, tables, and other elements from PDF documents. The toolkit offers a unified interface for various extraction methods and supports multiple output formats for downstream processing. It aims to solve common challenges in PDF processing by providing robust and accurate extraction tools.

AI Dev Skills

Unmapped

Computer Vision, Document Layout Analysis, Optical Character Recognition, Content Extraction, Multi-modal Document Understanding, PDF Processing, Deep Learning for Document AI

Tags

Computer Vision, Document Layout Analysis, Optical Character Recognition, Content Extraction, Multi-modal Document Understanding, PDF Processing, Deep Learning for Document AI, Publishing, Document Analysis, Compound AI Systems, Legal Tech, Image Extraction from Documents, Image, Healthcare, Automated Data Entry, Education, Text Mining from PDFs, Cloud API, Self-hosted, Multimodal Reasoning, Text, Document Digitization, Table Extraction, On-premise, Multimodal, Government, Document AI, Consulting, Tabular, PDF Content Migration, Financial Services, Python

Taxonomy

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
medium
Maturity
prototype

Categories

Dev Tools & Automation (Primary), Healthcare & Biology, Finance & Legal, Multimodal AI, Other AI / ML, Foundation Models, Computer Vision

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created
Jun 27, 2024
Forked
Mar 16, 2026
Your last push
1 year ago
Upstream last push
1 year ago
Tracked since
Jan 3, 2025
