perditioinc/reporium-ingestion

reporium-ingestion

Local data ingestion and analysis scripts for Reporium: fetch, process, and generate embeddings for repositories, and communicate with the Reporium API. Covers AI-native analysis, embeddings, scraping, and tagging, and pushes updates to the API. Standalone and private by default.

Builder

perditioinc • individual

Stars

1

Forks

0

Open Issues

0

Activity Score

0/100

50 commits in 30d

Created

README Summary

Reporium-ingestion is a Python-based local data processing system that fetches, analyzes, and generates embeddings for code repositories. It provides AI-native analysis capabilities including scraping, tagging, and embedding generation while maintaining standalone and private-by-default operation. The system communicates with the Reporium API to push processed updates and repository insights.
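The tag, embed, and push flow described in the summary can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `RepoRecord`, `toy_embedding`, `tag_readme`, and `build_payload` are hypothetical names, the hash-based vector only stands in for a real embedding model, and the payload shape is an assumption because the Reporium API schema is not shown here.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RepoRecord:
    # Hypothetical record shape; the real schema is not shown in the source.
    full_name: str
    readme: str
    tags: list[str] = field(default_factory=list)
    embedding: list[float] = field(default_factory=list)


def toy_embedding(text: str, dim: int = 8) -> list[float]:
    """Stand-in for a real embedding model: hash bytes scaled into [0, 1]."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255 for b in digest[:dim]]


def tag_readme(readme: str, vocabulary: dict[str, str]) -> list[str]:
    """Assign a tag when its keyword appears in the README (case-insensitive)."""
    lowered = readme.lower()
    return sorted({tag for keyword, tag in vocabulary.items() if keyword in lowered})


def build_payload(record: RepoRecord) -> str:
    """Serialize the processed record as the JSON body an API push might carry."""
    return json.dumps(asdict(record), sort_keys=True)


record = RepoRecord(
    full_name="perditioinc/reporium-ingestion",
    readme="Generates embeddings and scrapes repository metadata.",
)
# Assumed keyword-to-tag vocabulary, for illustration only.
record.tags = tag_readme(
    record.readme,
    {"embedding": "Code Embedding Generation", "scrap": "Web Scraping"},
)
record.embedding = toy_embedding(record.readme)
payload = build_payload(record)
```

The sketch stops at building the payload, so nothing leaves the machine unless a caller sends it, which matches the private-by-default description above.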

AI Dev Skills

Unmapped

Code Embedding Generation, Repository Analysis, Text Preprocessing, API Integration, Web Scraping, Natural Language Processing, Vector Database Operations, Automated Tagging Systems, Content Extraction, Batch Processing

Tags

Code Embedding Generation, Repository Analysis, Text Preprocessing, API Integration, Web Scraping, Natural Language Processing, Vector Database Operations, Automated Tagging Systems, Content Extraction, Batch Processing, Software Development, On-premise, Automated Code Analysis, Developer Tools, Code, Code Intelligence, Developer Productivity Analytics, Text, Code Repository Analysis, Self-hosted, Developer AI Tools, Automated Code Documentation, Repository Similarity Matching, Semantic Code Search, Codebase Insights Generation, Code Search and Discovery, Python

Taxonomy

Recent Activity

Updated 14 days ago

7 Days

0

30 Days

50

90 Days

50

fix(fetcher): fetch languages in QUICK mode when uncached (#31)
* feat(ci): add manual ingestion workflow dispatch
Adds workflow_dispatch trigger to run the main ingestion pipeline in quick/weekly/full mode without waiting for scheduled runs. Full mode fetches GitHub topics + READMEs for all repos, which restores granular tags like specific technology names that the taxonomy enricher doesn't generate. Also supports fix_repos input to re-ingest specific repos by name.
Co-Authored-By: Claude S

kimmymakesmoves · Mar 26, 2026

8fcfef9

feat(ci): add manual ingestion workflow dispatch (#30)
Adds workflow_dispatch trigger to run the main ingestion pipeline in quick/weekly/full mode without waiting for scheduled runs. Full mode fetches GitHub topics + READMEs for all repos, which restores granular tags like specific technology names that the taxonomy enricher doesn't generate. Also supports fix_repos input to re-ingest specific repos by name.
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

kimmymakesmoves · Mar 25, 2026

df5b7cb

fix(ingestion): include stargazers_count in API payload for built repos (#29)
* fix: align ai_enricher with production schema (KAN-40)
The enricher was writing to JSONB columns (skill_areas, industries, use_cases, modalities, ai_trends, deployment_context, maturity_level, quality_assessment, dependencies) that have never existed in production. Migration 014 also dropped dependencies. This caused immediate failure on any enrichment run against the live DB.
Changes:
- Remove SELECT of dependenc

kimmymakesmoves · Mar 25, 2026

adc41a7

Quality

Quality: medium
Maturity: prototype

Categories

Dev Tools & Automation (Primary), RAG & Retrieval, NLP & Text, Data Science & Analytics, Search & Knowledge, Other AI / ML

PM Skills

Developer Platform

Languages

Python 100.0%

Timeline

Project created
Forked
Your last push
18 days ago
Upstream last push
Tracked since
Mar 30, 2026

Similar Repos

pgvector cosine similarity · $0
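For context on the similarity measure named above: pgvector's `<=>` operator returns cosine distance, which is 1 minus cosine similarity. The pure-Python sketch below shows that calculation and a simple ranking by it. The function names, repository names, and vectors are hypothetical, chosen only for illustration; how Reporium actually queries pgvector is not shown on this page.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Distance in the sense of pgvector's <=> operator: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)


def most_similar(query: list[float], candidates: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the names of the k candidates closest to the query vector."""
    ranked = sorted(candidates.items(), key=lambda item: cosine_distance(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Made-up two-dimensional embeddings for three hypothetical repositories.
repos = {
    "repo-a": [0.9, 0.1],
    "repo-b": [0.0, 1.0],
    "repo-c": [0.5, 0.5],
}
nearest = most_similar([1.0, 0.0], repos, k=2)
```

Cosine distance ignores vector length, so two embeddings pointing in the same direction rank as identical regardless of magnitude.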
