Unstructured-IO/unstructured
unstructured
Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.
Builder

Unstructured-IO
Unstructured-IO • individual
Stars
14,383
Using upstream star count
Forks
1,209
Using upstream fork count
Open Issues
0
Activity Score
0/100
0 commits in 30d
Created
Sep 26, 2022
Project creation date
README Summary
Unstructured is an open-source ETL solution that transforms complex documents into clean, structured formats optimized for language models and AI applications. The platform provides tools for document partitioning, text extraction, chunking, and data enrichment across various file formats including PDFs, images, and office documents. It offers both open-source tools and enterprise-grade platform solutions for production workflows.
AI Dev Skills
Unmapped
Tags
Taxonomy
Deployment Context
Modalities
Skill Areas
Recent Activity
Updated 24 days ago
7 Days
0
30 Days
0
90 Days
0
Quality
- Quality
- high
- Maturity
- production
Categories
PM Skills
Languages
Timeline
- Project created
- Sep 26, 2022
- Forked
- Mar 21, 2026
- Your last push
- 24 days ago
- Upstream last push
- 7 days ago
- Tracked since
- Mar 20, 2026