
google/langextract

langextract

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Builder

Google

google • big-tech

Stars

35,196

Using upstream star count

Forks

2,383

Using upstream fork count

Open Issues

0

Activity Score

0/100

10 commits in 30d

Created

Jul 8, 2025

Project creation date

README Summary

LangExtract is a Python library that uses large language models (LLMs) to extract structured information from unstructured text. It grounds each extraction in the source by recording where in the original text it came from, and it provides interactive visualization for exploring the extracted data. The goal is to turn raw text into structured formats while keeping every result traceable to the source material.
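
To make "source grounding" concrete, here is a minimal sketch in plain Python. It is not the library's API: the `GroundedExtraction` class, the `ground` function, and the sample sentence are hypothetical names introduced for illustration, and the list of candidates stands in for what an LLM would return. The sketch assumes grounding can be approximated by an exact substring match; a real system may align text more tolerantly.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GroundedExtraction:
    """Hypothetical record: an extracted value plus where it occurs in the source."""
    label: str
    text: str
    start: Optional[int]  # character offset of the first match, or None if absent
    end: Optional[int]    # exclusive end offset, or None if absent


def ground(source: str, label: str, text: str) -> GroundedExtraction:
    """Attach character offsets by locating the exact extracted text in the source."""
    start = source.find(text)
    if start == -1:
        # The value does not appear verbatim, so it cannot be traced to the source.
        return GroundedExtraction(label, text, None, None)
    return GroundedExtraction(label, text, start, start + len(text))


source = "Patient was given 250 mg of amoxicillin twice daily."

# Stand-in for model output (assumption): a real pipeline would get these from an LLM.
candidates = [("dosage", "250 mg"), ("medication", "amoxicillin"), ("route", "oral")]

grounded = [ground(source, label, text) for label, text in candidates]

for g in grounded:
    if g.start is None:
        print(f"{g.label}: {g.text!r} not found in source")
    else:
        print(f"{g.label}: {g.text!r} at [{g.start}, {g.end})")
```

Keeping offsets alongside each value is what lets a viewer highlight the matching span, and an extraction with no offsets (such as "oral" here) is flagged as unsupported by the text.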

AI Dev Skills

Unmapped

Information Extraction, Large Language Model Integration, Source Attribution and Grounding, Structured Data Generation, Text Processing and Analysis, Interactive Data Visualization, Natural Language Understanding

Tags

Information Extraction, Large Language Model Integration, Source Attribution and Grounding, Structured Data Generation, Text Processing and Analysis, Interactive Data Visualization, Natural Language Understanding, Document Information Extraction, Automated Data Entry from Text, Document Processing, Document Management, FinTech, Research Paper Processing, Text Mining, AI Explainability, Content Analysis with Source Tracking, Self-hosted, Business Intelligence from Unstructured Data, Healthcare, Large Language Models, Content Management, Compound AI Systems, Legal Document Analysis, On-premise, Retrieval-Augmented Generation, Research and Academia, Text, Cloud API, Natural Language Processing, Legal Tech, Python

Taxonomy

Recent Activity

Updated 3 months ago

7 Days

2

30 Days

10

90 Days

11

Quality

Quality: medium
Maturity: prototype

Categories

Foundation Models (Primary), RAG & Retrieval, Evals & Benchmarking, NLP & Text, Safety & Alignment, Data Science & Analytics, Healthcare & Biology, Finance & Legal, Search & Knowledge, Other AI / ML, Dev Tools & Automation, Learning Resources, Industry: FinTech

PM Skills

Scale & Reliability

Languages

Python 100.0%

Timeline

Project created
Jul 8, 2025
Forked
Feb 17, 2026
Your last push
3 months ago
Upstream last push
7 days ago
Tracked since
Dec 29, 2025
