opendatalab/PDF-Extract-Kit
PDF-Extract-Kit
A Comprehensive Toolkit for High-Quality PDF Content Extraction
Builder

opendatalab
opendatalab • individual
Stars
9,536
Using upstream star count
Forks
719
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Jun 27, 2024
Project creation date
README Summary
PDF-Extract-Kit is a comprehensive Python toolkit designed for high-quality PDF content extraction, providing advanced capabilities for parsing and extracting text, images, tables, and other elements from PDF documents. The toolkit offers a unified interface for various extraction methods and supports multiple output formats for downstream processing. It aims to solve common challenges in PDF processing by providing robust and accurate extraction tools.
AI Dev Skills
Unmapped
Recent Activity
Updated 1 year ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality: medium
- Maturity: prototype
Timeline
- Project created: Jun 27, 2024
- Forked: Mar 16, 2026
- Your last push: 1 year ago
- Upstream last push: 1 year ago
- Tracked since: Jan 3, 2025