Reporium

haotian-liu/LLaVA

LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

View on GitHub ↗ | Upstream: haotian-liu/LLaVA ↗

Builder

haotian-liu • individual

Stars

24,842

Using upstream star count

Forks

2,768

Using upstream fork count

Open Issues

0

Activity Score

0/100

0 commits in 30d

Created

Apr 17, 2023

Project creation date

README Summary

*Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.*

AI Dev Skills

Unmapped

Computer Vision Integration, Instruction Following, Large Language Model Fine-tuning, Model Alignment, Multimodal Learning, Multimodal Reasoning, Transformer Architecture, Vision-Language Model Architecture, Visual Instruction Tuning, Visual Question Answering

Tags

Computer Vision Integration, Instruction Following, Large Language Model Fine-tuning, Model Alignment, Multimodal Learning, Multimodal Reasoning, Transformer Architecture, Vision-Language Model Architecture, Visual Instruction Tuning, Visual Question Answering, AI Safety, AutoGen, Benchmarking, C++, DPO, DeepSpeed, Evals, Fine-Tuning, Forked, GPT, GPU / CUDA, Google AI, Healthcare AI, HuggingFace, Jupyter, Large Language Models, LoRA / PEFT, Multimodal AI, OpenAI, Python, Quantization, Qwen, RLHF, Real-Time / Streaming, Reinforcement Learning, Research / Papers, SGLang, Tutorial, llama.cpp

Taxonomy

AI Trends

Multimodal Reasoning, Visual Language Models, Foundation Models, Instruction Tuning, AI Alignment

Category

Foundation Models, AI Agents, Model Training, Evals & Benchmarking, Inference & Serving, Cloud & Platforms, Learning Resources, Industry: Healthcare, Security & Safety, Data Science & Analytics

Deployment Context

Self-hosted, Cloud API, On-premise

Industries

Education, Healthcare, Robotics, Content Creation, Accessibility Technology

Modalities

Text, Image, Multimodal

Skill Areas

Multimodal Learning, Visual Instruction Tuning, Large Language Model Fine-tuning, Vision-Language Model Architecture, Transformer Architecture, Computer Vision Integration, Instruction Following, Visual Question Answering, Multimodal Reasoning, Model Alignment

Tag

AI Safety, AutoGen, Benchmarking, C++, DPO, DeepSpeed, Evals, Fine-Tuning, Forked, GPT, GPU / CUDA, Google AI, Healthcare AI, HuggingFace, Jupyter, Large Language Models, LoRA / PEFT, Multimodal AI, OpenAI, Python, Quantization, Qwen, RLHF, Real-Time / Streaming, Reinforcement Learning, Research / Papers, SGLang, Tutorial, llama.cpp

Use Cases

Visual Question Answering, Image Description Generation, Multimodal Chatbots, Visual Content Analysis, Educational AI Tutoring, Accessibility Tools for Vision Impaired, Document Understanding, Scene Understanding and Reasoning

Recent Activity

Updated 1 year ago

7 Days

0

30 Days

0

90 Days

0

Quality

Quality: high
Maturity: research

Categories

Evals & Benchmarking (Primary), Inference & Serving, Cloud & Platforms, Learning Resources, Industry: Healthcare, Security & Safety, Data Science & Analytics, Foundation Models, AI Agents, Model Training, Safety & Alignment, Healthcare & Biology, Multimodal AI, Search & Knowledge, Other AI / ML

PM Skills

Cost & Efficiency, Safety & Alignment, User Experience, Scale & Reliability, Data & Evaluation

Languages

Python 100.0%

Timeline

Project created: Apr 17, 2023
Forked: Mar 13, 2026
Your last push: 1 year ago
Upstream last push: 1 year ago
Tracked since: Aug 12, 2024

Similar Repos

pgvector cosine similarity · $0