QwenLM/Qwen2.5-Omni

Qwen2.5-Omni

Qwen2.5-Omni is an end-to-end multimodal model from the Qwen team at Alibaba Cloud. It understands text, audio, images, and video, and generates speech in real time.

Builder

Qwen / Alibaba

QwenLM • ai-lab

Stars

3,966

Using upstream star count

Forks

323

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Mar 22, 2025

Project creation date

README Summary

Qwen2.5-Omni is an end-to-end multimodal model developed by Alibaba Cloud's Qwen team. It accepts text, audio, image, and video input and can generate speech in real time. The repository consists primarily of Jupyter notebooks intended for research and development.

AI Dev Skills

Unmapped

Multimodal Machine Learning, Transformer Architecture, Speech Generation, Computer Vision, Natural Language Processing, Audio Processing, Video Understanding, End-to-End Model Training, Real-time Inference, Large Language Models

Tags

Multimodal Machine Learning, Transformer Architecture, Speech Generation, Computer Vision, Natural Language Processing, Audio Processing, Video Understanding, End-to-End Model Training, Real-time Inference, Large Language Models, Audio, Video, Unified Foundation Models, Video Content Analysis, Multimodal Reasoning, Speech Synthesis, Real-time Speech Generation, Cloud API, Multimodal AI, Image, Text, End-to-end Learning, Multimodal, Real-time AI, Audio-Visual Question Answering, Cross-modal Content Understanding, Multimodal Conversational AI, Cross-modal Attention, Interactive Voice Assistants, Self-hosted, End-to-end Training, Jupyter Notebook

Taxonomy

Recent Activity

Updated 10 months ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: medium
Maturity: research

Categories

Inference & Serving (Primary), NLP & Text, Coding & Dev Tools, Data Science & Analytics, Multimodal AI, Search & Knowledge, Other AI / ML, Generative Media, Computer Vision, Foundation Models, Model Training, Robotics

PM Skills

Scale & Reliability

Languages

Jupyter Notebook 100.0%

Timeline

Project created
Mar 22, 2025
Forked
Mar 13, 2026
Your last push
10 months ago
Upstream last push
10 months ago
Tracked since
Jun 12, 2025
