OpenAI News:
This report outlines the safety work carried out prior to releasing GPT-4o, including external red teaming, frontier risk evaluations according to our Preparedness Framework, and an overview of the mitigations we built in to address key risk areas.
GPT-4o Scorecard
Key Areas of Risk Evaluation & Mitigation
- Unauthorized voice generation
- Speaker identification
- Ungrounded inference & sensitive trait attribution
- Generating disallowed audio content
- Generating erotic & violent speech
Preparedness Framework Scorecard
- Cybersecurity = Low
- Biological Threats = Low
- Persuasion = Medium
- Model Autonomy = Low
Scorecard ratings
- Low
- Medium
- High
- Critical
Only models with a post-mitigation score of "high" or below can be developed further.
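The threshold rule above can be illustrated with a short sketch. It assumes the four ratings listed in the scorecard are the scores the rule applies to, and that the rule is checked category by category; the `Risk` enum and the `can_develop_further` helper are hypothetical names introduced only for illustration and are not part of any OpenAI tooling.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Scorecard ratings, ordered from least to most severe."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Ratings as listed in the scorecard above.
GPT_4O_SCORECARD = {
    "Cybersecurity": Risk.LOW,
    "Biological Threats": Risk.LOW,
    "Persuasion": Risk.MEDIUM,
    "Model Autonomy": Risk.LOW,
}


def can_develop_further(scorecard):
    """Hypothetical helper: True when every category is rated "high" or below."""
    return all(level <= Risk.HIGH for level in scorecard.values())


print(can_develop_further(GPT_4O_SCORECARD))
```

Under these assumptions, a scorecard with any category rated "critical" would fail the check, while the GPT-4o ratings listed here pass it.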
We thoroughly evaluate new models for potential risks and build in appropriate safeguards before deploying them in ChatGPT or the API. We’re publishing the model System Card together with the Preparedness Framework scorecard to provide an end-to-end safety assessment of GPT-4o, including what we’ve done to track and address today’s safety challenges as well as frontier risks.
Building on the safety evaluations and mitigations we developed for GPT-4 and GPT-4V, we’ve focused additional efforts on GPT-4o’s audio capabilities, which present novel risks, while also evaluating its text and vision capabilities.
Some of the risks we evaluated include speaker identification, unauthorized voice generation, the potential generation of copyrighted content, ungrounded inference, and disallowed content. Based on these evaluations, we’ve implemented safeguards at both the model and system levels to mitigate these risks.
Our findings indicate that GPT-4o’s voice modality doesn’t meaningfully increase Preparedness risks. Three of the four Preparedness Framework categories scored low, with persuasion scoring borderline medium. The Safety Advisory Group reviewed our Preparedness evaluations and mitigations as part of our safe deployment process. We invite you to read the details of this work in the report below.