Turning Trends into Viral Clips in Google Colab
🧠 The Idea
Reddit is a treasure trove of viral content — from jaw-dropping political debates to hilarious short clips and trending podcasts.
But scrolling through r/all or r/videos to find the moments that actually matter is tedious. Even when you do, manually cutting clips from YouTube takes hours.
I asked myself: what if we could automate it?
➡️ Discover trending posts → locate videos → extract the best moments → make shareable highlight reels — all in one Colab notebook.
That’s how the AI Reddit Sensational Video Summarizer was born — a lightweight, fully automated pipeline that takes raw Reddit trends and turns them into polished, bite-sized videos.
📌 Project Overview
This pipeline does it all:
- Scrapes trending Reddit posts from high-signal subreddits.
- Searches and downloads YouTube videos linked (or inferred) from posts.
- Transcribes videos with OpenAI’s Whisper.
- Identifies highlight-worthy segments using AI (Gemini).
- Compiles dynamic montages ready for sharing or research.
- Archives everything in Google Drive for easy access.
It’s all in Google Colab, requires no paid APIs, and runs on free or pro-tier GPU resources.
🔧 What This Project Does
- Scrapes trending Reddit posts from high-activity subreddits like politics, news, videos, and podcasts.
- Applies keyword and viral-phrase filtering to find high-signal content (e.g., “slams”, “goes viral”, “full clip”).
- Extracts or searches for YouTube video links.
- Filters out videos longer than 60 minutes.
- Downloads up to 3 clean videos, saves them, and exports associated metadata.
- Archives everything to Google Drive for easy access.
🛠️ Tools & Libraries Used
| Feature | Tool/Library | Why Use It? |
|---|---|---|
| Reddit Scraper | `praw` | Access Reddit posts and metadata easily |
| YouTube Search | `serpapi` | Find relevant videos via YouTube search |
| Video Downloader | `yt-dlp` | Fast, reliable video download tool |
| Data Handling | `pandas` | Clean and manage Reddit + video data |
| Cloud Storage | `shutil` + Drive | Store results safely in Google Drive |
| Runtime | Google Colab | Free GPU and fast prototyping |
🔐 Secure API Access
Instead of hardcoding sensitive API keys, I used Python’s `getpass` module to collect:
- Reddit API credentials (`client_id`, `client_secret`)
- SerpAPI key (`api_key` for YouTube search)
```python
import getpass

reddit_api_id = getpass.getpass("Enter Reddit API ID: ")
reddit_api_secret = getpass.getpass("Enter Reddit API Secret: ")
serp_api_key = getpass.getpass("Enter SerpAPI Key: ")
```
⚙️ Setting Up Reddit
```python
import praw

reddit = praw.Reddit(
    client_id=reddit_api_id,
    client_secret=reddit_api_secret,
    user_agent="trending-video-finder by /u/your_username"
)
```
Tip: Always use a unique and descriptive `user_agent` when working with Reddit’s API.
🤖 Smart Reddit Scraping
We target high-activity, high-signal subreddits like `r/politics`, `r/news`, `r/videos`, and `r/podcasts`.
A custom Python function queries these subreddits for keywords and viral phrases:
```python
df = get_smart_reddit_trends(
    subreddits=["politics", "news", "videos", "podcasts"],
    keywords=["speech", "interview", "debate", "podcast"],
    signal_keywords=["goes viral", "slams", "clip", "debate"],
    days_back=7,
    limit=50
)
```
This gives us only high-engagement posts likely to be tied to meaningful or viral YouTube videos.
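The `get_smart_reddit_trends` helper is defined in the notebook itself; as a rough, illustrative sketch (column names and filtering logic here are assumptions, not the exact implementation), it could look like:

```python
import time
import pandas as pd

def get_smart_reddit_trends(subreddits, keywords, signal_keywords, days_back=7, limit=50):
    """Collect recent, high-signal posts from the given subreddits (illustrative sketch)."""
    cutoff = time.time() - days_back * 86400
    rows = []
    for sub in subreddits:
        for post in reddit.subreddit(sub).hot(limit=limit):
            if post.created_utc < cutoff:
                continue
            title = post.title.lower()
            # Keep posts whose titles mention a topic keyword or a viral phrase
            if any(k in title for k in keywords) or any(s in title for s in signal_keywords):
                rows.append({
                    "subreddit": sub,
                    "title": post.title,
                    "score": post.score,
                    "url": post.url,
                    "youtube_link": post.url if ("youtube.com" in post.url or "youtu.be" in post.url) else None,
                })
    return pd.DataFrame(rows).sort_values("score", ascending=False)
```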
🔗 Add YouTube Links via SerpAPI (if Missing)
```python
import time

updated_links = []
for _, row in df.iterrows():
    if row.get("youtube_link"):
        updated_links.append(row["youtube_link"])
    else:
        yt_link = search_youtube_via_serpapi(row["title"], serp_api_key)
        updated_links.append(yt_link)
        time.sleep(1.5)  # stay polite with the SerpAPI rate limit

df["final_youtube_link"] = updated_links
```
When a post has no direct link, we query SerpAPI’s YouTube search with the Reddit post title, so no viral moment gets missed even if Reddit users only shared a headline.
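The `search_youtube_via_serpapi` helper isn’t shown in the post; a minimal sketch using SerpAPI’s YouTube engine might look like this (the notebook’s actual implementation may differ):

```python
from serpapi import GoogleSearch

def search_youtube_via_serpapi(query, api_key):
    """Return the first YouTube result link for a query, or None (illustrative sketch)."""
    search = GoogleSearch({
        "engine": "youtube",     # SerpAPI's YouTube search engine
        "search_query": query,
        "api_key": api_key,
    })
    results = search.get_dict()
    videos = results.get("video_results", [])
    return videos[0]["link"] if videos else None
```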
🎯 Filter and Download Up to 3 Valid Videos (configurable via max_downloads)
```python
max_downloads = 3
downloaded_count = 0
filtered_rows = []

for i, row in df.iterrows():
    if downloaded_count >= max_downloads:
        break
    url = row.get("final_youtube_link")
    title = row.get("title", f"video_{i}")

    # Metadata check
    ...
    # Skip videos > 60 mins
    ...
    # Download using yt-dlp
    ...
    downloaded_count += 1
```
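For reference, the elided metadata check, 60-minute filter, and download step could look roughly like this with `yt-dlp`’s Python API inside the loop body (the format selector and output template are assumptions):

```python
import yt_dlp

ydl_opts = {
    "format": "mp4/best",
    "outtmpl": "downloads/%(title).80s.%(ext)s",  # keeps filenames short and filesystem-safe
    "quiet": True,
}
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    info = ydl.extract_info(url, download=False)   # metadata check without downloading (url from the row above)
    duration = info.get("duration") or 0
    if duration <= 60 * 60:                        # skip videos longer than 60 minutes
        ydl.download([url])
```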
⚠️ Optional: For Age-Restricted or Region-Locked Content
- Sometimes YouTube videos are age-restricted, region-locked, or require login.
- To handle these, you can use a `cookies.txt` file.
👉 Only the first 3 valid videos under 60 minutes are downloaded and stored with sanitized filenames.
📄 Note on cookies.txt (Optional)
If you want to download age-restricted, region-locked, or logged-in-only YouTube content, you’ll need a `cookies.txt` file, passed to yt-dlp via the `cookiefile` option:

```python
"cookiefile": "cookies.txt"
```

Never share your `cookies.txt`.
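In practice, the option simply goes into the yt-dlp options dictionary, for example:

```python
ydl_opts = {
    "format": "mp4/best",
    "cookiefile": "cookies.txt",  # exported from your browser; keep this file private
}
```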
📦 Archive Videos + Metadata
```python
!zip -r downloads.zip downloads/
df.to_csv("video_metadata.csv", index=False)
```

This saves the downloaded videos as `downloads.zip` and the metadata as `video_metadata.csv`.
Save to Google Drive
```python
import shutil

destination_folder = "/content/drive/MyDrive/sensational_video_of_the_week/3rd_week_of_july"
shutil.copy("downloads.zip", destination_folder)
shutil.copy("video_metadata.csv", destination_folder)
```
Both files are copied to a specific folder in your Drive for sharing, backup, or post-processing.
✅ Results
After running the pipeline, you get:
- Up to 3 viral-ready YouTube videos per Reddit batch.
- Clean metadata: subreddit, title, score, link.
- Archived videos + transcripts in Google Drive.
- Montages ready for social sharing or research.
🚀 Why This Matters
This pipeline is a complete end-to-end content repurposing solution:
- Content creators → weekly highlights, Shorts, or Reels.
- Educators → searchable lecture clips.
- Researchers → curated datasets for NLP or multimodal learning.
- Podcast producers → automated show notes + viral snippets.
No hallucination, no tedious manual editing, no hidden costs. Just a fully automated AI workflow.
📝 final_whisper_video_transcription_to_drive
Transform video content into searchable text with timestamps — all in one seamless Google Colab pipeline.
🔥 Why This Project?
Whether you’re a content creator, researcher, or developer working with video data, one thing is clear:
🎥 Video content is hard to search, analyze, and reuse — unless it’s transcribed.
This Colab notebook offers a complete, no-fluff solution to:
- ✅ Automatically transcribe multiple videos using OpenAI’s Whisper model.
- ✅ Generate plain text and timestamped segments.
- ✅ Save results to Google Drive for long-term storage and use.
- ✅ All within Google Colab, GPU-accelerated, and beginner-friendly.
🚀 What You’ll Get
- 🎙 Whisper-powered transcription (GPU-accelerated in Colab)
- 🕓 Timestamped and plain-text transcripts
- 📦 Auto-zipping and upload to your Drive
- ✅ Ideal for podcasts, interviews, lectures, and short-form content
🛠️ Models and Tools Used
| Feature | Tool / Library | Purpose |
|---|---|---|
| Transcription | `openai-whisper` | State-of-the-art speech-to-text |
| Video/Audio Handling | `ffmpeg-python` | Formats videos for Whisper |
| Notebook Environment | Google Colab | Cloud-based, free GPU access |
| Storage | Google Drive | Persistent file storage |
| Scripting | `os`, `shutil`, `zipfile` | File operations and archiving |
🧩 Key Implementation Steps
1. Mount Google Drive to Access the Previous Step’s Files

```python
from google.colab import drive
drive.mount('/content/drive')
```
2. Install Dependencies
```python
!pip install openai-whisper ffmpeg-python
```
3. Load the Model & Prepare Paths
```python
import whisper, os, torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = whisper.load_model("base", device=device)
```
Loads the **Whisper model** on GPU (if available) for faster transcription.
4. Unzip the Video Files and Load Metadata
```python
import zipfile, pandas as pd

with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall(extract_folder)
df = pd.read_csv(csv_path)
```
Unzips your videos and loads metadata from your Google Drive.
5. Batch Transcribe with Error Handling
```python
import os, json

for filename in os.listdir(input_folder):
    if filename.endswith(".mp4"):
        video_path = os.path.join(input_folder, filename)
        result = model.transcribe(video_path)
        base = os.path.splitext(filename)[0]
        json.dump(result["segments"], open(os.path.join(output_folder, base + ".json"), "w"))
        open(os.path.join(output_folder, base + ".txt"), "w").write(result["text"])
```
For each `.mp4`, Whisper generates:
- a **`.json`** file with timestamped segments
- a **`.txt`** file with the full transcript
6. Zip the Output for Download & Archive
```python
shutil.make_archive(..., root_dir=transcript_folder)
shutil.copy(zip_path, destination_folder)
```
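With the elisions filled in, the calls look roughly like this (the archive name is an assumption chosen to match the Drive layout below):

```python
import shutil

# Zip all transcripts, then copy the archive into the Drive folder from the previous step
zip_path = shutil.make_archive("transcripts_with_segments", "zip", root_dir=transcript_folder)
shutil.copy(zip_path, destination_folder)
```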
📂 Folder Structure on Drive
```
📂 sensational_video_of_the_week
└── 3rd_week_of_july
    ├── downloads.zip
    ├── video_metadata.csv
    ├── transcripts_plain.zip
    └── transcripts_with_segments.zip
```
Use Cases
- 🧑‍🏫 Educators: Auto-transcribe lectures and organize notes.
- 🧑‍💼 Content creators: Convert YouTube Shorts or Reels into searchable assets.
- 🧪 Researchers: Annotate timestamped audio for NLP tasks.
- 👩‍🎤 Podcast producers: Generate show notes and SEO content.
✅ Final Thoughts
With just a few lines of code and a powerful open-source model, you’ve automated what used to be hours of manual work.
This pipeline:
- Saves time
- Ensures accuracy
- Gives you full control over your video transcription workflows, all within Google Colab
No API keys. No manual uploads. No hidden costs. Just results.
“What if AI could watch your videos, pick out the most viral moments, and turn them into a shareable highlight reel?”
Well, guess what? We built it. 🤖✨
🌟 What This Project Does
Imagine a world where you can take hours of footage and instantly create engaging, bite-sized video montages ready to go viral. That’s exactly what this project does!
Here’s how it works in a nutshell:
- 🗂 Load videos and transcripts (plain + Whisper segments)
- 🧠 Extract viral-worthy moments using Google’s Gemini API
- ⏱ Align quotes with precise video timestamps
- ✂️ Trim unnecessary fluff (AI-powered) while keeping the core message intact
- 🎞 Stitch together clips with dynamic zoom transitions and music
- 📦 Export everything in a neat `.zip` file for easy sharing
No hallucination. No fluff. Just real AI doing real work. 🔥
📂 Data Prep: The Power of a Good Foundation
Before the magic can happen, we need to prep the data. Here’s the foundation we build on:
- 🎥 Original video files
- 📄 Plaintext transcripts
- ⏱ Segmented transcripts (with start/end timestamps)
- 🗂 A metadata CSV (to keep track of titles)
This ensures that everything matches perfectly — even if the filenames are a bit mismatched.
🙏 (Shoutout to `difflib.get_close_matches` for making it all align!)
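That fuzzy matching boils down to a couple of lines; here’s a minimal sketch (argument names are illustrative):

```python
from difflib import get_close_matches

def match_transcript(video_filename, transcript_filenames):
    """Find the transcript whose name is closest to the video's name (illustrative sketch)."""
    matches = get_close_matches(video_filename, transcript_filenames, n=1, cutoff=0.5)
    return matches[0] if matches else None
```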
💡 Find the Moments That Matter
Next up? Finding the viral moments! 🚀
Using Gemini 1.5 Flash, we sift through the full transcript of each video to identify potential viral quotes. Each quote gets:
- 🔥 A virality score (1–10)
- 🗣 The exact quote (no paraphrasing here!)
- 💭 A brief explanation of why it could go viral
Once we get this data, we use regex to clean and organize it into a structured DataFrame, making it easier to spot the gems. 🌟
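Here’s a hedged sketch of that step with the `google-generativeai` client (the prompt wording, regex, and placeholder paths are assumptions, not the notebook’s exact code):

```python
import re, getpass
import pandas as pd
import google.generativeai as genai

genai.configure(api_key=getpass.getpass("Enter Gemini API Key: "))  # assumed key prompt
model = genai.GenerativeModel("gemini-1.5-flash")

transcript_text = open("transcripts/video_01.txt").read()  # placeholder transcript path

prompt = (
    "From the transcript below, list up to 5 potentially viral quotes.\n"
    "Format each line as: SCORE (1-10) | EXACT QUOTE | WHY IT COULD GO VIRAL\n\n"
    + transcript_text
)
response = model.generate_content(prompt)

# Parse "score | quote | reason" lines into a structured DataFrame
rows = re.findall(r"^\s*(\d+)\s*\|\s*(.+?)\s*\|\s*(.+)$", response.text, flags=re.MULTILINE)
quotes_df = pd.DataFrame(rows, columns=["virality_score", "quote", "reason"])
```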
⏱ Map Words to Video
Now, the magic starts to unfold. 🎬
We map each quote back to its exact video timestamp. How?
- 🔍 Direct text lookup against the full transcript
- 🤖 If no direct match, we use SentenceTransformers to semantically find the moment
No timestamps? No problem. We’ve got that covered. 💪
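A minimal sketch of that semantic fallback with `sentence-transformers` (the model name is an assumption): embed the quote and every Whisper segment, then pick the most similar segment’s timestamps.

```python
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def locate_quote(quote, segments):
    """Return (start, end) of the Whisper segment most similar to the quote (illustrative sketch)."""
    seg_texts = [seg["text"] for seg in segments]
    scores = util.cos_sim(embedder.encode(quote), embedder.encode(seg_texts))[0]
    best = int(scores.argmax())
    return segments[best]["start"], segments[best]["end"]
```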
✂️ Make the Moment Snappy (Without Hallucination)
Here’s the kicker: Gemini doesn’t just trim the fluff; it keeps the message intact. We say:
- “Trim the fillers, but don’t change the essence!”
With this, we can:
- ✂️ Trim the start and end of each quote to cut out unnecessary words
- 📝 Align everything with the original transcript
- 🔗 Expand the quotes to full sentence boundaries, ensuring nothing important is lost
The result? Clean, punchy clips that don’t hallucinate or change the message. ✅
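The sentence-boundary expansion can be as simple as scanning outward in the plain transcript; a rough sketch (not the notebook’s exact logic):

```python
def expand_to_sentence(transcript, quote):
    """Grow a quote to the nearest sentence boundaries in the transcript (illustrative sketch)."""
    idx = transcript.find(quote)
    if idx == -1:
        return quote
    prev = transcript.rfind(". ", 0, idx)          # last sentence end before the quote
    start = prev + 2 if prev != -1 else 0
    end = transcript.find(".", idx + len(quote))   # first sentence end after the quote
    return transcript[start:end + 1] if end != -1 else transcript[start:]
```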
🎬 From Grid to Clip — Visual Storytelling
To add the finishing touches:
- We create a static grid image from the video’s preview frames.
- Then, using zoom transitions, we zoom into each clip, play it, and zoom back out.
The result is a punchy, dynamic feel that’s visually captivating — and, most importantly, it feels human.
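A stripped-down version of the zoom-in transition with MoviePy (file names, timestamps, and the zoom rate are placeholders):

```python
from moviepy.editor import ImageClip, VideoFileClip, concatenate_videoclips

grid = ImageClip("grid_preview.png").set_duration(1.5)   # static grid of preview frames (placeholder image)
clip = VideoFileClip("clip_01.mp4").subclip(12.0, 22.5)  # one highlight segment (placeholder timestamps)
clip = clip.resize(lambda t: 1 + 0.03 * t)               # slow zoom-in over the clip's duration
montage = concatenate_videoclips([grid, clip.crossfadein(0.5)], method="compose")
```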
🎶 Audio and Transitions: Bringing the Montage to Life
Next, we add the sound magic:
- 🎤 Voice and background music
- 🎧 Audio fades and mixing
- 🔗 Seamless transitions between clips
We do all of this using MoviePy and PIL, with zero fancy dependencies.
It’s simple, effective, and gets the job done. 💥
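And the audio mix, also with MoviePy (file names and volume levels are placeholders):

```python
from moviepy.editor import VideoFileClip, AudioFileClip, CompositeAudioClip

montage = VideoFileClip("montage_draft.mp4")                 # placeholder: the clips stitched above
music = AudioFileClip("glass_chinchilla.mp3").volumex(0.15)  # placeholder file; quiet background bed
mixed = CompositeAudioClip([montage.audio, music.set_duration(montage.duration)])
montage.set_audio(mixed).write_videofile("viral_montage.mp4", fps=24)
```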
📤 Packaging the Output
Once everything’s polished and ready to go, we zip up the final video montages and upload them to Google Drive — all set for sharing! 📦
📂 Notebook Name: final_viral_video_montage_generator
If you’re looking to automate turning long interviews, podcasts, or other long-form videos into short, shareable moments, this is the notebook for you.
✅ No hallucinated quotes
✅ No manual editing
✅ Just AI-powered storytelling that works
🚀 Why This Matters
This pipeline is perfect for:
- Content creators summarizing long interviews
- Podcast editors clipping viral moments
- Media teams creating weekly highlight reels
- AI researchers exploring multimodal summarization
And the best part? It runs entirely in Google Colab, with free GPU access! 😎
🎵 Music Credits
“Glass Chinchilla” by The Mini Vandals — YouTube Audio Library 🎶
🙌 Final Thoughts
We didn’t just use AI to summarize text; we used it to create compelling video stories that people will want to watch and share. 🌍✨
Got hours of footage collecting digital dust? Now’s the time to unlock its viral potential.
📂 Source Code & Notebook
Get your hands on the code here:
👉 final_viral_video_montage_generator
💬 Want to Support My Work?
If you enjoyed this project, consider buying me a coffee to support more free AI tutorials and tools: