Reporium

datalab-to/marker

marker

Convert PDF to markdown + JSON quickly with high accuracy

Builder

datalab-to • individual

Stars

35,555

Using upstream star count

Forks

2,459

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Oct 30, 2023

Project creation date

README Summary

Marker converts documents to markdown, JSON, chunks, and HTML quickly and accurately.

AI Dev Skills

Unmapped

Computer Vision, Document AI, Document Layout Analysis, Document Structure Recognition, Optical Character Recognition, PDF Processing, Text Extraction

Tags

Computer Vision, Document AI, Document Layout Analysis, Document Structure Recognition, Optical Character Recognition, PDF Processing, Text Extraction, Anthropic / Claude, Azure AI, Benchmarking, Claude, Deep Learning, Evals, FinTech, Forked, GPU / CUDA, Google AI, Google Cloud, HuggingFace, Large Language Models, Ollama, Open Source, OpenAI, PyTorch, Pydantic, Python, Python Web Framework, Research / Papers, Roadmap, Structured Output, Transformers

Taxonomy

AI Trends

Document AI, Compound AI Systems

category

Foundation Models, AI Agents, Model Training, Evals & Benchmarking, Inference & Serving, Dev Tools & Automation, Cloud & Platforms, Learning Resources, Industry: FinTech

Deployment Context

Self-hosted, Cloud API, On-premise

Industries

Legal Tech, Document Management, Publishing, Research, Knowledge Management

Modalities

Text, Image

Skill Areas

Document AI, Computer Vision, Optical Character Recognition, Document Layout Analysis, PDF Processing, Text Extraction, Document Structure Recognition

tag

Anthropic / Claude, Azure AI, Benchmarking, Claude, Deep Learning, Evals, FinTech, Forked, GPU / CUDA, Google AI, Google Cloud, HuggingFace, Large Language Models, Ollama, Open Source, OpenAI, PyTorch, Pydantic, Python, Python Web Framework, Research / Papers, Roadmap, Structured Output, Transformers

Use Cases

Document Digitization, PDF Content Extraction, Document Processing Pipeline, Text Mining from PDFs, Document Format Conversion, Academic Paper Processing

Recent Activity

Updated 2 months ago

7 Days

0

30 Days

0

90 Days

2

@EurFelux has signed the CLA in datalab-to/marker#1009

github-actions[bot] • Mar 10, 2026

d63e3d9

@jcs-zfc has signed the CLA in datalab-to/marker#1004

github-actions[bot] • Mar 4, 2026

28854f0

@Br1an67 has signed the CLA in datalab-to/marker#994

github-actions[bot] • Mar 1, 2026

38a2670

Quality

Quality: medium
Maturity: prototype

Categories

Evals & Benchmarking (Primary), Inference & Serving, Dev Tools & Automation, Cloud & Platforms, Learning Resources, Industry: FinTech, Foundation Models, AI Agents, Model Training, Finance & Legal, Search & Knowledge, Other AI / ML

PM Skills

Data & Evaluation, Developer Platform

Languages

Python 100.0%

Timeline

Project created: Oct 30, 2023
Forked: Mar 16, 2026
Your last push: 2 months ago
Upstream last push: 29 days ago
Tracked since: Mar 10, 2026

Similar Repos

pgvector cosine similarity · $0
