Reporium

QwenLM/Qwen3-VL

Qwen3-VL

Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

View on GitHub ↗ · Upstream: QwenLM/Qwen3-VL ↗

Builder

Qwen / Alibaba

QwenLM • ai-lab

Stars

19,264

Using upstream star count

Forks

1,773

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Aug 29, 2024

Project creation date

AI Dev Skills

Unmapped

Computer Vision, Cross-modal Understanding, Large Language Model Architecture, Multimodal Machine Learning, Multimodal Reasoning, Natural Language Processing, Transformer Architecture, Vision-Language Models, Vision Transformer

Tags

Computer Vision, Cross-modal Understanding, Large Language Model Architecture, Multimodal Machine Learning, Multimodal Reasoning, Natural Language Processing, Transformer Architecture, Vision-Language Models, Vision Transformer, AI Safety, Docker, Document Processing, Embeddings, Evals, Forked, GPU / CUDA, HuggingFace, Jupyter, LLM Serving, Large Language Models, Long Context, Mobile, Multimodal AI, Music Tech, Open Source, OpenAI, PyTorch, Python, Quantization, Qwen, Research / Papers, SGLang, Transformers, Tutorial, vLLM

Taxonomy

AI Trends

Multimodal Reasoning, Large Language Models, Vision-Language Integration, Foundation Models, Unified Multimodal Understanding

Category

Foundation Models, RAG & Retrieval, Model Training, Evals & Benchmarking, Inference & Serving, MLOps & Infrastructure, Learning Resources, Industry: Audio & Music, Security & Safety, Data Science & Analytics

Deployment Context

Self-hosted, Cloud API, On-premise

Industries

Education, Healthcare, Media and Entertainment, E-commerce, Content Creation, Document Processing

Modalities

Text, Image, Multimodal

Skill Areas

Multimodal Machine Learning, Vision-Language Models, Large Language Model Architecture, Computer Vision, Natural Language Processing, Transformer Architecture, Vision Transformer, Multimodal Reasoning, Cross-modal Understanding

Tag

AI Safety, Docker, Document Processing, Embeddings, Evals, Forked, GPU / CUDA, HuggingFace, Jupyter, LLM Serving, Large Language Models, Long Context, Mobile, Multimodal AI, Music Tech, Open Source, OpenAI, PyTorch, Python, Quantization, Qwen, Research / Papers, SGLang, Transformers, Tutorial, vLLM

Use Cases

Visual Question Answering, Image Captioning, Document Understanding, Visual Content Analysis, Multimodal Chatbots, Image-to-Text Generation, Visual Reasoning Tasks

Recent Activity

Updated 4 months ago

7 Days

0

30 Days

0

90 Days

0

Merge pull request #1971 from 2003jiahang/patch-1

ShuaiBai623 • Jan 30, 2026

9658872

Merge pull request #2000 from Zhaohai-Li/main

ShuaiBai623 • Jan 30, 2026

565103d

Quality

Quality: medium
Maturity: research

Categories

RAG & Retrieval (Primary), Evals & Benchmarking, Inference & Serving, MLOps & Infrastructure, Learning Resources, Industry: Audio & Music, Security & Safety, Data Science & Analytics, Foundation Models, Model Training, Generative Media, Safety & Alignment, Multimodal AI, Edge & Mobile AI, Search & Knowledge, Other AI / ML

PM Skills

Cost & Efficiency, Safety & Alignment, User Experience, Scale & Reliability, Data & Evaluation, Product Discovery

Languages

Jupyter Notebook: 100.0%

Timeline

Project created: Aug 29, 2024
Forked: Mar 13, 2026
Your last push: 4 months ago
Upstream last push: 4 months ago
Tracked since: Jan 30, 2026

Similar Repos

pgvector cosine similarity · $0
