AmandineFlachs/opus-4-8-blind-multi-judge-eval — GitHub trending stats & insights | Trendshift
AmandineFlachs/opus-4-8-blind-multi-judge-eval
#NLP
A blind, multi-judge behavioural benchmark to evaluate Opus 4.8
HTML
2 contributors
MIT License
Social mentions
Recent discussions about this repository across the web
> Scoring was blind. I used four judges: Claude, GPT-5.4, Kimi K2.6 & me. The point was to avoid making this a “Claude grades Claude” situation. Across all four judges, Opus 4.8 ranked above 4.7. >>…
@AmandineFlachs · x.com
No trending activity
This repository has not yet been featured on GitHub Trending
Repository activities
Daily and monthly repository activity across stars, forks, merged PRs, issues, and closed issues
Data is not available yet
Recent activity data for stars, forks, merged PRs, issues, and closed issues will appear here once available