From Localhost to Global Scale: Deploying a Multi-Agent AI Interior Designer with Google Cloud Run


This article details the technical architecture and deployment process for Multivra, an advanced AI interior design suite. This project and write-up were created as a submission for the Google AI Hackathon.



Introduction: The Idea and the Challenge

At its core, Multivra is more than just a text-to-image generator. It’s a design co-pilot that employs a team of specialized AI agents—powered by the Google Gemini API—to transform a user’s vision into a polished, professional mood board, complete with a photorealistic image and a detailed design rationale.
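
To make the "team of agents" idea concrete, here is a rough sketch of what one such agent could look like, using the @google/generative-ai SDK. Everything in it (the model name, prompt, and function names) is an assumption for illustration, not Multivra's actual code:

// A sketch of one specialized agent: it turns the user's brief into a written
// design rationale. A sibling agent would expand the same brief into a detailed
// image-generation prompt, and the outputs are combined into the mood board.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? "");
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

async function rationaleAgent(brief: string): Promise<string> {
  const result = await model.generateContent(
    `You are an expert interior designer. Write a concise design rationale for: ${brief}`
  );
  return result.response.text();
}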

Building this on my local machine was one thing, but making it available, reliable, and scalable for the world to use presented a classic developer challenge: how do you deploy a modern web application that might experience unpredictable, spiky traffic without over-provisioning resources or getting bogged down in server management?

The answer was Google Cloud Run.



Why Cloud Run is the Perfect Fit for a GenAI App

Cloud Run is a fully managed serverless platform that allows you to run containers without worrying about the underlying infrastructure. For a project like Multivra, it was the ideal choice for several key reasons:

  1. Serverless Simplicity: My focus should be on building great features and perfecting AI prompts, not patching servers or configuring load balancers. Cloud Run abstracts all of that away.
  2. Scale to Zero (and to Infinity): A hackathon project might get a surge of traffic one day and none the next. Cloud Run automatically scales the number of running containers based on incoming requests. Crucially, if there are no requests, it scales down to zero, meaning I pay nothing for idle time.
  3. Container-First Workflow: The entire application—a React frontend built with Vite—can be neatly packaged into a container image. This creates a portable, consistent, and reproducible environment that runs the same on my machine as it does in the cloud.



The Deployment Blueprint: A Step-by-Step Guide

Here’s the exact process I followed to take Multivra from a local project folder to a publicly accessible URL.



Step 1: Containerizing the React App with a Multi-Stage Dockerfile

The first step is to create a blueprint for our application’s environment. Since we’re building a static-site React app, a multi-stage Dockerfile is the most efficient approach. This creates a temporary “builder” environment to compile our code, then copies only the essential, optimized static files into a final, lightweight production container.

This keeps our final image small, secure, and fast.

# Dockerfile

# ---- Stage 1: Build the React Application ----
# Use an official Node.js image as the builder environment.
# 'alpine' is a lightweight version.
FROM node:20-alpine AS builder

# Set the working directory inside the container.
WORKDIR /app

# Copy package.json and package-lock.json to leverage Docker layer caching.
COPY package*.json ./

# Install project dependencies.
RUN npm install

# Copy the rest of the application source code.
COPY . .

# Build the application for production, creating an optimized 'dist' folder.
RUN npm run build


# ---- Stage 2: Serve the Static Files with a Web Server ----
# Use a lightweight and secure web server image. Caddy is a great modern choice.
FROM caddy:2-alpine

# Copy the build output from the 'builder' stage into Caddy's public directory.
COPY --from=builder /app/dist /usr/share/caddy

# Caddy will automatically serve the 'index.html' file from this directory using
# its default Caddyfile, which listens on port 80. Cloud Run routes traffic to
# port 8080 unless told otherwise, so we pass --port 80 when deploying in Step 3.



Step 2: Build the Image and Push to Artifact Registry

With our Dockerfile ready, we need to build the container image and store it in a place Cloud Run can access. Google Artifact Registry is the perfect, secure place for this.

I used Google Cloud Build to handle this process with a single command. Cloud Build automatically builds the image from my source code and pushes it to the registry.

In my project’s root directory, I ran:

# First, ensure gcloud is configured to use your project
gcloud config set project [YOUR_PROJECT_ID]

# Submit the build to Cloud Build
gcloud builds submit --tag gcr.io/[YOUR_PROJECT_ID]/multivra:latest .

This command tells Cloud Build to take the code in the current directory (.), find the Dockerfile, build an image, tag it as multivra:latest, and push it to my project’s registry at gcr.io, the legacy Container Registry domain that Google now serves from Artifact Registry.



Step 3: Deploying the Container to Cloud Run

This is the magic moment. With our container image stored in the registry, we can deploy it to Cloud Run with another single command:

gcloud run deploy multivra \
  --image gcr.io/[YOUR_PROJECT_ID]/multivra:latest \
  --platform managed \
  --region us-central1 \
  --port 80 \
  --allow-unauthenticated

Let’s break down these flags:

  • --image: Points to the container we just pushed to the registry.
  • --platform managed: Tells Google to handle all the infrastructure.
  • --region: Specifies the physical location to run the service.
  • --port 80: Tells Cloud Run that the Caddy container listens on port 80 rather than the default 8080.
  • --allow-unauthenticated: Makes the service public so anyone can access the website.

After a minute or so, the command line returned a public URL. And just like that, Multivra was live on the internet, backed by Google’s scalable infrastructure.



The Elephant in the Room: The API Key

In a frontend-only application, the Gemini API key must be available to the browser. For this hackathon, the key was embedded during the build process.
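
Concretely, with a Vite build that embedding looks something like the line below. The variable name is hypothetical, but the mechanism is real: Vite inlines any VITE_-prefixed environment variable as a literal string at npm run build, so the key is readable by anyone who inspects the shipped JavaScript bundle.

// Frontend code, illustrative only; the variable name is an assumption.
// After `npm run build`, this expression is replaced with the literal key,
// which is then handed to the client-side Gemini SDK in the browser.
const apiKey = import.meta.env.VITE_GEMINI_API_KEY as string;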

This is not a secure, production-ready practice.

For a real-world application, exposing an API key on the client-side is a major security risk. The correct architecture is to introduce a simple backend service that acts as a proxy.

The Production-Ready Architecture:

  1. Create a Backend: A lightweight Node.js Express server (also running on Cloud Run) with a single endpoint, e.g., /api/generate.
  2. Secure the Key: Store the Gemini API Key in Google Secret Manager. The backend Cloud Run service would be granted specific IAM permissions to access this secret at runtime.
  3. Proxy the Request: The React frontend would call /api/generate on our backend. The backend service would then securely retrieve the API key from Secret Manager, make the call to the Gemini API, and return the response to the frontend (see the sketch below).

This way, the API key never leaves the secure Google Cloud environment.
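
Here is a minimal sketch of that proxy, assuming an Express backend and the @google/generative-ai SDK; the endpoint, model, and environment variable names are illustrative rather than Multivra's actual implementation.

// server.ts: a minimal proxy sketch, not production-hardened.
import express from "express";
import { GoogleGenerativeAI } from "@google/generative-ai";

const app = express();
app.use(express.json());

// On Cloud Run, the key can be injected from Secret Manager as an environment
// variable (for example via the --set-secrets deploy flag), so it never ships
// to the browser.
const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY ?? "");

app.post("/api/generate", async (req, res) => {
  try {
    const { prompt } = req.body as { prompt: string };
    const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });
    const result = await model.generateContent(prompt);
    // Only the generated text is returned; the key stays server-side.
    res.json({ text: result.response.text() });
  } catch (err) {
    res.status(500).json({ error: "Generation failed" });
  }
});

// Cloud Run tells the container which port to listen on via the PORT variable.
const port = Number(process.env.PORT) || 8080;
app.listen(port, () => console.log(`Listening on port ${port}`));

The frontend then calls /api/generate with a regular fetch request, and this second service can be deployed with exactly the same containerize-and-deploy workflow described above.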



Conclusion

Deploying Multivra with Google Cloud Run was a fast, straightforward, and powerful experience. It allowed me to move from a local prototype to a globally scalable application in a matter of minutes, letting me focus on the core AI functionality rather than infrastructure.

For developers working on AI-powered web apps, especially in fast-paced environments like hackathons or startups, the combination of a containerized workflow and a serverless platform like Cloud Run is, without a doubt, a game-changer.

You can try out the Multivra application here: https://multivra-327180202327.us-west1.run.app/


