Effective Altruism Forum
AI evaluations and standards
• Applied to I read every major AI lab’s safety plan so you don’t have to (2d ago)
• Applied to OpenAI's o1 tried to avoid being shut down, and lied about it, in evals (12d ago)
• Applied to OpenAI's CBRN tests seem unclear (1mo ago)
• Applied to College technical AI safety hackathon retrospective - Georgia Tech (1mo ago)
• Applied to Comparing AI Labs and Pharmaceutical Companies (1mo ago)
• Applied to The current state of RSPs (1mo ago)
• Applied to Trendlines in AIxBio evals (2mo ago)
• Applied to Announcing ForecastBench, a new benchmark for AI and human forecasting abilities (3mo ago)
• Applied to Join the $10K AutoHack 2024 Tournament (3mo ago)
• Applied to Model evals for dangerous capabilities (3mo ago)
• Applied to Submit Your Toughest Questions for Humanity's Last Exam (3mo ago)
• Applied to Thinking About Propensity Evaluations (4mo ago)
• Applied to A Taxonomy Of AI System Evaluations (4mo ago)
• Applied to Case studies on social-welfare-based standards in various industries (6mo ago)
• Applied to [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations (6mo ago)
• Applied to Demonstrate and evaluate risks from AI to society at the AI x Democracy research hackathon (8mo ago)
• Applied to LLM Evaluators Recognize and Favor Their Own Generations (8mo ago)
• Applied to OMMC Announces RIP (9mo ago)
• Applied to Join the AI Evaluation Tasks Bounty Hackathon (9mo ago)
• Applied to Introducing METR's Autonomy Evaluation Resources (9mo ago)