RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models

RefusalBench: Teaching AI When to Say “I Don’t Know”

Ever wondered why a friendly chatbot sometimes gives a weird answer instead of staying silent? Scientists have unveiled a new test called RefusalBench that checks whether AI can wisely say “I don’t know” when the information it sees is shaky.
Imagine a librarian who refuses to recommend a book if the catalog is missing pages: that's the kind of caution we need from AI that helps us write, search, or even drive.
In a massive study of more than 30 language models, researchers found that even the most advanced systems stumble, refusing correctly less than half the time on multi‑document tasks.
The problem isn't model size; it's the inability to spot uncertainty and decide when to stay quiet.
The good news? The study shows this skill can be taught, and the new benchmarks let developers keep improving it.
As AI becomes a daily companion, making sure it knows when to hold back could keep our conversations safer and more trustworthy.
Stay curious and watch this space for smarter, more responsible machines.

Read the comprehensive article review on Paperium.net:
RefusalBench: Generative Evaluation of Selective Refusal in Grounded Language Models

🤖 This analysis and review was primarily generated and structured by an AI. The content is provided for informational and quick-review purposes.


