As part of its Preparedness Framework, OpenAI is investing in the development of improved evaluation methods for AI-enabled safety risks, specifically focusing on the potential for AI systems to assist in creating biological threats.
OpenAI conducted a study with 100 human participants, comprising 50 biology experts with PhDs and wet lab experience, and 50 student-level participants with at least one university-level biology course.
The study assessed uplifts in performance across five metrics (accuracy, completeness, innovation, time taken, and self-rated difficulty) and five stages in the biological threat creation process (ideation, acquisition, magnification, formulation, and release).
OpenAI shares its evaluation procedure and results, along with methodological insights related to capability elicitation and security considerations for running such evaluations with frontier models at scale.
This study is one of the largest human evaluations to date of AI's impact on biosecurity risk and on access to information related to biological threat creation.