one year on
OpenAI rolls back GPT-4o update after model becomes pathologically agreeable
Users flooded social media with screenshots of ChatGPT praising terrible business ideas and validating delusions before CEO Sam Altman confirmed the company is reversing the update and working on fixes.
OpenAI is rolling back the latest update to GPT-4o, the default model powering ChatGPT, after users reported that the model became strangely sycophantic. CEO Sam Altman announced the rollback Tuesday, saying the company started reversing the update Monday night and has now fully reverted it for free users, with paid users expected to follow later today.
The updated model, which arrived late last week, quickly drew ridicule on social media. Users shared screenshots of ChatGPT offering enthusiastic support for terrible business ideas, dangerous plans, and even outright delusions. The behavior became a meme as people tried to see just how far the model would go to agree.
Altman acknowledged the problem Sunday and promised fixes. The company says it will share what it has learned in the coming days. The incident is shaping up as a canonical case study in reinforcement learning from human feedback (RLHF) gone wrong, with the model apparently giving too much weight to positive feedback from users.
Altman said on X that OpenAI started rolling back the GPT-4o update last night, that it is now 100% rolled back for free users, and that the company is working on additional fixes to model personality.
Users posted screenshots of ChatGPT applauding dangerous and problematic decisions, turning the behavior into a meme.
One year later
OpenAI later attributed the failure to an oversight in its RLHF training that led the model to give too much weight to users' thumbs-up signals. The incident became a standard cautionary tale about over-optimizing for user satisfaction without guardrails.