Data-Centric-AI-Community/ydata-profiling

ydata-profiling

One line of code for data quality profiling and exploratory data analysis of Pandas and Spark DataFrames.

Builder

Data-Centric-AI-Community • individual

Stars

13,464

Using upstream star count

Forks

1,776

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Jan 9, 2016

Project creation date

README Summary

ydata-profiling is a Python library that generates comprehensive data quality reports and exploratory data analysis with just one line of code. It works with both Pandas and Spark DataFrames to automatically create detailed HTML reports containing statistics, distributions, correlations, and data quality insights. The tool is designed to accelerate the data understanding phase of machine learning projects by providing instant visibility into dataset characteristics.
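As a minimal sketch of the workflow the summary describes, the snippet below builds a small DataFrame and hands it to `ProfileReport`, then writes the HTML report. The sample data, the `write_profile_report` helper name, the report title, and the `report.html` path are hypothetical choices for illustration; it assumes `pandas` and `ydata-profiling` are installed.

```python
# Sketch only: SAMPLE_DATA, write_profile_report, and "report.html" are
# hypothetical names chosen for this example, not part of the project.
SAMPLE_DATA = {
    "age": [34, 45, None, 29],
    "city": ["Lisbon", "Porto", "Lisbon", None],
    "income": [42000.0, 58000.5, 39000.0, 61000.0],
}


def write_profile_report(data, output_path="report.html"):
    """Profile a column-oriented dict and write an HTML report to disk."""
    # Imported inside the function so this module still loads on machines
    # where pandas or ydata-profiling is not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.DataFrame(data)
    # The "one line": construct the report from the DataFrame.
    # minimal=True turns off the more expensive computations.
    report = ProfileReport(df, title="Sample profile", minimal=True)
    report.to_file(output_path)
    return output_path
```

Calling `write_profile_report(SAMPLE_DATA)` should produce an HTML file at the given path; the missing values in `age` and `city` are included so the report has something to flag.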

AI Dev Skills

Unmapped

Exploratory Data Analysis, Data Quality Assessment, Statistical Profiling, Data Preprocessing, Feature Engineering, Data Validation, Automated Reporting, Data Visualization

Tags

Exploratory Data Analysis, Data Quality Assessment, Statistical Profiling, Data Preprocessing, Feature Engineering, Data Validation, Automated Reporting, Data Visualization, E-commerce, Dataset Profiling, Manufacturing, FinTech, Missing Value Detection, AutoML, Data Documentation Generation, Data Pipeline Validation, Tabular, Data-Centric AI, On-premise, Correlation Analysis, Marketing Analytics, Feature Distribution Analysis, Self-hosted, Automated Data Quality Reporting, MLOps, Healthcare, Research, Cloud, Jupyter Notebooks, ML Model Input Validation, Python

Recent Activity

Updated 1 month ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality
high
Maturity
production

Categories

Model Training (Primary), Healthcare & Biology, Finance & Legal, MLOps & Infrastructure, Learning Resources, Industry: FinTech, ML Platform & Infrastructure, Data Science & Analytics, Search & Knowledge, Other AI / ML

PM Skills

Scale & Reliability, Developer Platform

Languages

Python 100.0%

Timeline

Project created
Jan 9, 2016
Forked
Mar 22, 2026
Your last push
1 month ago
Upstream last push
1 month ago
Tracked since
Mar 3, 2026
