In the first part of this series, we built a simple, command-line-only RAG application. In this post, we’ll turn our command-line application into a web API using FastAPI.
The Goal for This Post
By the end of this post, you will have a web API that can answer questions about the documents in the data directory. You will be able to interact with the API using curl or the Swagger UI.
Why FastAPI?
FastAPI is a modern, fast (high-performance) web framework for building APIs with Python. It’s easy to use, has great documentation, and comes with a built-in Swagger UI that allows you to interact with your API from the browser.
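If you haven't used it before, a minimal app is only a few lines. This throwaway example (hello.py is just an illustration, not part of our project) is already enough to get automatic interactive docs:

# hello.py -- a minimal FastAPI app, purely to illustrate the framework.
from fastapi import FastAPI

app = FastAPI()


@app.get("/hello")
def hello(name: str = "world"):
    """A trivial endpoint; FastAPI documents it automatically at /docs."""
    return {"message": f"Hello, {name}!"}

# Run with: uvicorn hello:app --reload
# then open http://localhost:8000/docs for the interactive Swagger UI.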
The Code
Here is the complete code for our FastAPI application.
requirements.txt
langchain
faiss-cpu
pdfplumber
fastapi
uvicorn
python-dotenv
huggingface_hub
app/models.py
from pydantic import BaseModel


class Query(BaseModel):
    question: str
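Because Query is a Pydantic model, FastAPI validates incoming request bodies against it for us. A quick way to convince yourself (a standalone snippet, run from the project root, not part of the app itself):

from app.models import Query

# Valid input parses into a typed object.
q = Query(question="What is attention?")
print(q.question)

# A missing or wrongly typed "question" field raises a ValidationError;
# inside FastAPI this surfaces as an automatic 422 response.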
app/main.py
from fastapi import FastAPI, HTTPException
import uvicorn
import os
from dotenv import load_dotenv
from app.models import Query
from app.rag_logic import (
    load_documents,
    split_documents,
    create_and_store_embeddings,
    load_retriever,
    create_rag_chain,
)

load_dotenv()

app = FastAPI()


def load_vectordb():
    """Build and persist the vector store on first run; reuse it afterwards."""
    vectordb_path = os.getenv("VECTOR_DB_DIR")
    if not vectordb_path:
        raise ValueError("VECTOR_DB_DIR environment variable not set.")
    data_dir = os.getenv("DATA_DIR")
    if not data_dir:
        raise ValueError("DATA_DIR environment variable not set.")
    if not os.path.exists(vectordb_path) or not os.listdir(vectordb_path):
        # First run: load, chunk, embed, and persist the documents.
        documents = load_documents(data_dir)
        chunks = split_documents(documents)
        create_and_store_embeddings(chunks, vectordb_path)
    else:
        print("Vector store already exists.")


@app.post("/ask")
def ask_question(query: Query):
    """Answers a question using the RAG pipeline."""
    try:
        vectordb_path = os.getenv("VECTOR_DB_DIR")
        if not vectordb_path:
            raise ValueError("VECTOR_DB_DIR environment variable not set.")
        retriever = load_retriever(vectordb_path)
        local_llm_url = os.getenv("LOCAL_LLM_URL")
        if not local_llm_url:
            raise ValueError("LOCAL_LLM_URL environment variable not set.")
        rag_chain = create_rag_chain(retriever, local_llm_url)
        response = rag_chain.invoke({"input": query.question})
        return {"answer": response["answer"]}
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))


if __name__ == "__main__":
    # Build the vector store (if needed) before serving requests.
    load_vectordb()
    uvicorn.run(app, host="0.0.0.0", port=8000)
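main.py leans on app/rag_logic.py, which we built in the first part of this series. For reference, here is a rough sketch of what those helpers look like. Treat it as an outline rather than the exact part-1 code: depending on your LangChain version you may also need the langchain-community and langchain-openai packages (plus sentence-transformers for local embeddings), and create_rag_chain here assumes LOCAL_LLM_URL points at an OpenAI-compatible server such as LM Studio.

# app/rag_logic.py -- sketch only; see part 1 for the full implementation.
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.document_loaders import DirectoryLoader, PDFPlumberLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI


def load_documents(data_dir):
    # Load every PDF under the data directory with pdfplumber.
    loader = DirectoryLoader(data_dir, glob="**/*.pdf", loader_cls=PDFPlumberLoader)
    return loader.load()


def split_documents(documents):
    # Split into overlapping chunks so retrieval stays focused.
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    return splitter.split_documents(documents)


def create_and_store_embeddings(chunks, vectordb_path):
    # Embed the chunks and persist a FAISS index to disk.
    embeddings = HuggingFaceEmbeddings()
    FAISS.from_documents(chunks, embeddings).save_local(vectordb_path)


def load_retriever(vectordb_path):
    # Load the FAISS index back and expose it as a retriever.
    embeddings = HuggingFaceEmbeddings()
    vectordb = FAISS.load_local(
        vectordb_path, embeddings, allow_dangerous_deserialization=True
    )
    return vectordb.as_retriever()


def create_rag_chain(retriever, local_llm_url):
    # Assumes an OpenAI-compatible local server (e.g. LM Studio on port 1234).
    llm = ChatOpenAI(base_url=f"{local_llm_url}/v1", api_key="not-needed")
    prompt = ChatPromptTemplate.from_template(
        "Answer the question using only the context below.\n\n"
        "Context:\n{context}\n\nQuestion: {input}"
    )
    docs_chain = create_stuff_documents_chain(llm, prompt)
    # create_retrieval_chain is what makes invoke({"input": ...}) return
    # a dict with an "answer" key, as main.py expects.
    return create_retrieval_chain(retriever, docs_chain)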
How to Run the Code

1. Install the dependencies:

pip install -r requirements.txt

2. Create a .env file: In the root of the project, create a .env file with the following environment variables:

LOCAL_LLM_URL=http://localhost:1234
VECTOR_DB_DIR=vector_store
DATA_DIR=data
3. Run the application from the project root, as a module so the app package imports resolve:

python -m app.main
4. Interact with the API: You can send questions with curl:

curl -X POST "http://localhost:8000/ask" \
  -H "Content-Type: application/json" \
  -d '{"question": "What is attention?"}'

Or you can use the Swagger UI by navigating to http://localhost:8000/docs in your browser.
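If you'd rather call the endpoint from Python (say, in a quick test script), something like this works too. It uses the requests package, which is not in our requirements.txt:

# ask.py -- hypothetical client script; requires `pip install requests`.
import requests

resp = requests.post(
    "http://localhost:8000/ask",
    json={"question": "What is attention?"},
    timeout=120,  # local models can take a while to answer
)
resp.raise_for_status()
print(resp.json()["answer"])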
What’s Next
In the next post, we’ll containerize our application with Docker and refactor our code to support multiple LLM providers. Stay tuned!
Full implementation: hadywalied/AskAttentionAI