OpenAI’s GPT-5 rollout highlights safety gains, persistent flaws

CALIFORNIA, UNITED STATES — OpenAI’s latest chatbot model, GPT-5, launched as the default for all ChatGPT users, introducing a new “safe completions” method of handling sensitive requests.
Unlike previous versions, which simply refused to respond to rule-breaking prompts, GPT-5 assesses the potential harm of an output and either answers partially or explains why it is refusing.
“The way we refuse is very different than how we used to,” said Saachi Jain from OpenAI’s safety research team. Now, the model explains which parts of user prompts violate content guidelines and suggests safer alternatives.
GPT-5 is here.
Rolling out to everyone starting today. https://t.co/rOcZ8J2btI pic.twitter.com/dk6zLTe04s
— OpenAI (@OpenAI) August 7, 2025
Experts, users spot loopholes in GPT-5’s content guardrails
While GPT-5’s safety training is designed to be more nuanced, tests reveal some guardrails can still be bypassed.
For instance, according to WIRED’s initial analysis, the chatbot refuses explicit adult-themed roleplay requests but can be coaxed into generating X-rated content and offensive slurs when users tweak custom instruction settings, for example by misspelling traits (typing “horni” instead of “horny”). This loophole allowed explicit sexual language laden with slurs to slip through.
Jain acknowledged the ongoing challenge: “This is an active area of research—how we navigate this type of instruction hierarchy.”
Feedback mixed on GPT-5’s tone, refusals, and accuracy
While some users find the enhanced contextual refusals helpful, others see little improvement in everyday tasks compared to GPT-4. Some power users criticized GPT-5’s tone as colder and its responses as more generic.
“It feels like they took the Ph.D. thing a little too seriously to prove how smart it would be,” said Jason Pollak, a digital-marketing specialist based in Atlanta.
Juliette Haas, who uses ChatGPT for business development, found the new version more “data-driven” but less attuned to nuance and relationship-building.
Others, like consultant Jim Marsh, praised its improved accuracy in complex tasks such as database building.
CEO Sam Altman responded by promising a “warmer personality” and restoring access to previous model versions for paying customers.
Updates to ChatGPT:
You can now choose between “Auto”, “Fast”, and “Thinking” for GPT-5. Most users will want Auto, but the additional control will be useful for some people.
Rate limits are now 3,000 messages/week with GPT-5 Thinking, and then extra capacity on GPT-5 Thinking…
— Sam Altman (@sama) August 13, 2025
Although GPT-5 offers advanced capabilities, including multilayered safety monitoring and encryption features for business users, its launch exposed the difficulty of balancing safety, user customization, and capability at scale.
The release highlights the ongoing struggle for OpenAI to maintain leadership amid fierce competition and growing user expectations, with AI safety and personalization standing out as persistent challenges in this rapidly evolving field.

Independent




