stanford-crfm/helm

helm

Holistic Evaluation of Language Models (HELM) is an open source Python framework created by the Center for Research on Foundation Models (CRFM) at Stanford for holistic, reproducible and transparent evaluation of foundation models, including large language models (LLMs) and multimodal models.

Builder

Stanford

stanford-crfm • research

Stars

2,856

Using upstream star count

Forks

404

Using upstream fork count

Open Issues

Activity Score

0/100

0 commits in 30d

Created

Nov 29, 2021

Project creation date

AI Dev Skills

Unmapped

Taxonomy

AI Trends

Foundation Models, Large Language Models, Multimodal AI, AI Safety, Model Evaluation, Responsible AI

Recent Activity

Updated 4 months ago

Add metric for Arabic legal scenarios (#4123)

Yifan Mai • Mar 19, 2026

7c2f30b

Add Arabic legal scenario (#4122)

Yifan Mai • Mar 19, 2026

eefc96b

Bump @types/node from 25.4.0 to 25.5.0 in /helm-frontend in the npm group (#4121)

dependabot[bot] • Mar 19, 2026

579f617

Quality

research

Quality: high
Maturity: research

PM Skills

User Experience, Data & Evaluation

Languages

Python 100.0%

Timeline

Project created: Nov 29, 2021
Forked: Mar 22, 2026
Your last push: 4 months ago
Upstream last push: 2 months ago
Tracked since: Mar 20, 2026

