Unlock the Secrets of Unlabeled Videos: A Deep Dive into Zero-Effort AI Training
Imagine teaching an AI to understand videos without ever labeling a single frame. No more painstakingly tagging actions, objects, or scenes. Sound like science fiction? It’s closer than you think. Let’s dive into a powerful technique that’s making unsupervised video learning a reality.
The Challenge: Learning Without Labels
The traditional machine learning paradigm relies heavily on labeled data. But gathering and annotating video data is incredibly expensive and time-consuming. This creates a bottleneck, limiting the widespread adoption of video AI, especially in resource-constrained environments.
Unsupervised learning offers a compelling alternative. The goal is to extract meaningful patterns and representations from unlabeled data, enabling AI to learn from the vast sea of videos readily available online.
But video data introduces additional complexities:
- Spatio-temporal information: Videos contain both spatial information (what appears within each frame) and temporal information (how the scene changes across frames). Capturing these relationships is crucial.
- Computational cost: Processing videos is computationally intensive, requiring powerful hardware and efficient algorithms.
- Continual learning: Real-world scenarios often involve learning from a stream of videos, where the distribution of data changes over time. The AI needs to adapt to new concepts without forgetting what it has already learned, a problem known as continual learning.
The Solution: Non-Parametric Deep Embedded Clustering
One promising approach combines deep learning with non-parametric clustering. Here’s a breakdown of the key components:
- Unsupervised Feature Extraction:
* A deep neural network, often a video transformer, is trained to extract meaningful features from the input video. Rather than relying on labels, the network learns through self-supervision: pretext tasks are constructed in which it must make predictions based on the structure of the input data itself. For example, a pretext task might involve predicting the order of shuffled video frames or identifying missing video segments (a minimal sketch of the shuffled-frames task appears after this list item).
* The goal is to learn a representation where videos with similar content are mapped to nearby points in a high-dimensional feature space. Think of it as creating a compressed, numerical fingerprint for each video.
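To make the pretext-task idea concrete, here is a minimal PyTorch sketch of the shuffled-frames task mentioned above. It is illustrative only: `FrameEncoder` (a tiny per-frame CNN standing in for a real video transformer), `OrderPredictor`, `make_batch`, and the random tensors acting as video clips are all hypothetical stand-ins I'm introducing for this example, not components of any specific published method.

```python
import itertools
import torch
import torch.nn as nn

# Fixed set of frame permutations. The permutation index is the
# self-generated label, so no human annotation is needed.
NUM_FRAMES = 4
PERMS = list(itertools.permutations(range(NUM_FRAMES)))  # 4! = 24 classes

class FrameEncoder(nn.Module):
    """Tiny per-frame CNN standing in for a video transformer backbone."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, embed_dim)

    def forward(self, frames):                    # frames: (B, T, 3, H, W)
        b, t, c, h, w = frames.shape
        feats = self.conv(frames.reshape(b * t, c, h, w)).flatten(1)
        return self.proj(feats).reshape(b, t, -1)  # (B, T, D)

class OrderPredictor(nn.Module):
    """Predicts which permutation was applied to the frame order."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.encoder = FrameEncoder(embed_dim)
        self.head = nn.Linear(NUM_FRAMES * embed_dim, len(PERMS))

    def forward(self, frames):
        z = self.encoder(frames)                  # (B, T, D)
        return self.head(z.flatten(1))            # (B, 24) logits

def make_batch(clips):
    """Shuffle each clip's frames; the permutation id is the label."""
    labels = torch.randint(len(PERMS), (clips.size(0),))
    shuffled = torch.stack(
        [clip[list(PERMS[int(i)])] for clip, i in zip(clips, labels)])
    return shuffled, labels

model = OrderPredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

clips = torch.randn(8, NUM_FRAMES, 3, 64, 64)     # random stand-in "videos"
inputs, labels = make_batch(clips)

optimizer.zero_grad()
loss = nn.functional.cross_entropy(model(inputs), labels)
loss.backward()
optimizer.step()

# After pretraining, model.encoder(frames).mean(dim=1) yields a clip-level
# embedding: the "numerical fingerprint" that downstream clustering uses.
```

The design intuition: to classify which permutation scrambled the frames, the encoder is forced to represent what happens in each frame and how frames relate in time, which is exactly the spatio-temporal structure we want the embedding to capture, all without a single human-provided label.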