Popular repositories
- senior-swe-bench-v2026.06 Public
Harbor dataset for Senior SWE-Bench (v2026.06)
Python 26
- terminal-bench Public Forked from harbor-framework/terminal-bench
A benchmark for LLMs on complicated tasks in the terminal
Python 4
- Snorkel-Wordle-Benchmark Public
Benchmarking LLM performance in playing the game of Wordle.
- pydantic Public Forked from pydantic/pydantic
Data validation using Python type hints
Python 3
Repositories
- aws-assume-role-with-web-identity-buildkite-plugin Public Forked from buildkite-plugins/aws-assume-role-with-web-identity-buildkite-plugin
A Buildkite plugin to assume-role-with-web-identity using a Buildkite OIDC token before running the build command
- llm-interview-proxy Public
- Toolathlon Public Forked from hkust-nlp/Toolathlon
[ICLR 2026] The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution
- OpenEnv Public Forked from huggingface/OpenEnv
An interface library for RL post training with environments.
- UI-Elements-Visualizer Public
- mai-banking-hr-data-viewer Public
- UI-Elements-Training Public