Reporium

m-bain/whisperX

whisperX

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

Builder

m-bain • individual

Stars

22,157

Using upstream star count

Forks

2,283

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Dec 9, 2022

Project creation date

README Summary

If you’re looking for a transcription API for meetings, consider checking out [Recall.ai's Meeting Transcription API](https://www.recall.ai/product/meeting-transcription-api?utm_source=github&utm_medium=sponsorship&utm_campaign=mbain-whisperx), an API that works with Zoom, Google Meet, Microsoft Teams, and more. Recall.ai diarizes by pulling the speaker data and separate audio streams from the meeting platforms, which means 100% accurate speaker diarization with actual speaker names.

AI Dev Skills

Unmapped

Audio Feature Extraction, Audio Signal Processing, Automatic Speech Recognition, Forced Alignment, Multi-model Pipeline Integration, Speaker Diarization, Transformer Fine-tuning, Voice Activity Detection

Tags

Audio Feature Extraction, Audio Signal Processing, Automatic Speech Recognition, Forced Alignment, Multi-model Pipeline Integration, Speaker Diarization, Transformer Fine-tuning, Voice Activity Detection, AI Safety, Benchmarking, Course, Forked, GPU / CUDA, HuggingFace, OpenAI, PyTorch, Python, Research / Papers, Rust, Speech to Text, Tutorial

Taxonomy

AI Trends

Compound AI Systems, Multi-model Orchestration, On-device AI, Audio-Language Models

Category

Learning Resources, Foundation Models, Model Training, Evals & Benchmarking, Inference & Serving, Generative Media, Security & Safety

Deployment Context

Self-hosted, Cloud API, On-premise, Local Processing

Industries

Media & Entertainment, Legal Tech, Healthcare, Education, Market Research, Broadcasting

Modalities

Audio

Skill Areas

Automatic Speech Recognition, Speaker Diarization, Voice Activity Detection, Forced Alignment, Audio Signal Processing, Transformer Fine-tuning, Multi-model Pipeline Integration, Audio Feature Extraction

Tag

AI Safety, Benchmarking, Course, Forked, GPU / CUDA, HuggingFace, OpenAI, PyTorch, Python, Research / Papers, Rust, Speech to Text, Tutorial

Use Cases

Meeting Transcription with Speaker Labels, Subtitle Generation with Precise Timing, Interview Analysis and Processing, Podcast Transcription and Indexing, Call Center Audio Analysis, Lecture and Educational Content Processing

Recent Activity

Updated 2 months ago

Commits, 7 days: 0
Commits, 30 days: 0
Commits, 90 days: 5

fix: remove dead model_bytes read that leaked file handle

Claude-Assistant • Mar 17, 2026

646f511

feat: add progress_callback to transcribe, align, and diarize

Claude-Assistant • Mar 11, 2026

d00ec69

chore: bump version

Barabazs • Mar 10, 2026

6d3edb1

Quality

Quality: high
Maturity: beta

Categories

Learning Resources (Primary), Evals & Benchmarking, Inference & Serving, Security & Safety, Foundation Models, Model Training, Generative Media, Safety & Alignment, Search & Knowledge, Other AI / ML

PM Skills

Safety & Alignment, User Experience, Data & Evaluation

Languages

Python 100.0%

Timeline

Project created: Dec 9, 2022
Forked: Mar 22, 2026
Your last push: 2 months ago
Upstream last push: 2 months ago
Tracked since: Mar 17, 2026

Similar Repos

pgvector cosine similarity · $0