microsoft/table-transformer

table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.

Builder

Microsoft

microsoft • big-tech

Stars

2,884

Using upstream star count

Forks

311

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

May 17, 2021

Project creation date

README Summary

Table Transformer (TATR) is a deep learning model designed to extract and structure tables from unstructured documents, including PDFs and images. This repository serves as the official source for the PubTables-1M dataset, which contains nearly one million annotated tables, and includes the GriTS evaluation metric for assessing table structure recognition performance. The model uses a transformer architecture to detect table boundaries and extract their structural elements for downstream processing.
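
As an illustration of the "downstream processing" step mentioned above, the following is a minimal sketch of one common way to turn detected row and column boxes into a grid of cells: intersect each row box with each column box. It is not taken from the repository. The function names, the box format (x_min, y_min, x_max, y_max), and the sample coordinates are all assumptions made for this example; a real pipeline would take its boxes from the detector's output.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def intersect(a: Box, b: Box) -> Box:
    """Return the overlapping rectangle of two boxes (may be empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))


def cells_from_rows_and_columns(rows: List[Box], columns: List[Box]) -> List[List[Box]]:
    """Build a row-major grid of cell boxes from row and column boxes."""
    rows = sorted(rows, key=lambda r: r[1])        # order rows top to bottom
    columns = sorted(columns, key=lambda c: c[0])  # order columns left to right
    return [[intersect(r, c) for c in columns] for r in rows]


# Hypothetical detections for a 2x2 table; a real detector would supply these.
rows = [(0, 20, 100, 40), (0, 0, 100, 20)]
columns = [(0, 0, 50, 40), (50, 0, 100, 40)]

grid = cells_from_rows_and_columns(rows, columns)
print(grid[0][1])  # top-right cell: (50, 0, 100, 20)
```

Sorting before intersecting means the grid indices follow reading order even when the detections arrive unordered. Spanning cells and overlapping detections are deliberately left out of this sketch.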

AI Dev Skills

Unmapped

Transformer Architecture, Computer Vision, Document Understanding, Object Detection, Table Structure Recognition, Dataset Curation, Evaluation Metrics Design, Deep Learning Model Training

Tags

Transformer Architecture, Computer Vision, Document Understanding, Object Detection, Table Structure Recognition, Dataset Curation, Evaluation Metrics Design, Deep Learning Model Training, On-premise, Image, PDF Table Extraction, Vision Transformers, Self-hosted, Financial Report Processing, Research Paper Analysis, Document Digitization, Healthcare, Document AI, Document Management, Financial Services, Academic Research, Cloud API, Business Intelligence, Document, Multimodal Reasoning, Automated Data Entry, Legal Tech, Python

Taxonomy

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: research

Categories

Foundation Models (Primary), Model Training, Evals & Benchmarking, Computer Vision, Healthcare & Biology, Learning Resources, Finance & Legal, Multimodal AI, Search & Knowledge, Other AI / ML

PM Skills

Product Discovery

Languages

Python 100.0%

Timeline

Project created
May 17, 2021
Forked
Mar 23, 2026
Your last push
1 year ago
Upstream last push
1 year ago
Tracked since
Jun 24, 2024
