pinchbench/skill

skill

PinchBench is a benchmarking system for evaluating LLMs as OpenClaw coding agents. Made with 🦀 by the humans at https://kilo.ai

View on GitHub ↗ | Upstream: pinchbench/skill ↗

Builder

pinchbench

pinchbench • individual

Stars

1,209

Using upstream star count

Forks

133

Using upstream fork count

Open Issues

Activity Score

0/100

0 commits in 30d

Created

Feb 11, 2026

Project creation date

README Summary

[![Leaderboard](https://img.shields.io/badge/leaderboard-pinchbench.com-blue)](https://pinchbench.com) [![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)

AI Dev Skills

Unmapped

Agent-Based Architecture, Agentic AI Systems, Benchmark Design and Methodology, Code Generation Assessment, Code Generation Benchmarking, Code Generation Models, Coding Task Evaluation, Large Language Model Evaluation, LLM-as-Agent Architecture, LLM Evaluation and Benchmarking, Model Comparison and Analysis, Model Performance Measurement, Model Performance Metrics, Prompt Engineering, Software Engineering Agents, Software Engineering AI

Taxonomy

AI Trends

Agentic AI, LLM as Agents, Code Generation, Model Evaluation, LLM Evaluation and Benchmarking, AI Agent Frameworks, Coding AI Systems, LLM Evaluation, Code-Generating Agents, Model Benchmarking

Recent Activity

Updated 2 months ago

Merge pull request #73 from luccathescientist/fix-judge-total-normalization

Brendan O'Leary • Mar 24, 2026

1e2ba6b

Normalize judge totals to 0-1 scale

Lucca • Mar 21, 2026

4359719

Merge pull request #71 from pinchbench/lint-and-complie

Brendan O'Leary • Mar 19, 2026

e8e833b

Quality

Quality: medium
Maturity: prototype

PM Skills

Data & Evaluation, AI-Native Architecture

Languages

Python 100.0%

Timeline

Project created: Feb 11, 2026
Forked: Mar 28, 2026
Your last push: 2 months ago
Upstream last push: 20 days ago
Tracked since: Mar 24, 2026

Similar Repos

pgvector cosine similarity · $0

