livecodebench

Here is 1 public repository matching this topic...

SS47816 / AGI-Elo

AGI-Elo: How Far Are We From Mastering A Task?

benchmark leaderboard agi imagenet coco artificial-general-intelligence datasets evaluation-metrics elo-rating rating-system evaluation-framework sota ai-benchmarks waymo-open-dataset mmlu vision-language-action ai-evaluation-framework livecodebench navsim

Updated May 21, 2025
Python

