16 Build a Free Chat App on Google Colab using RAG

Note: This code has been corrected. The model used here is not the strongest one available; which model you can run depends on the GPU that Colab gives you, so feel free to explore and swap in others. Even so, it is a good starting point for understanding how a RAG pipeline really works.

🔹 Step 1: Install Dependencies

👉 Paste this in a Colab cell:


!pip install -q transformers accelerate sentence-transformers faiss-cpu gradio pypdf

🔹 Step 2: Import Libraries


from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import gradio as gr
from pypdf import PdfReader

🔹 Step 3: Load Free LLM (Lightweight)


model_name = "google/flan-t5-base"

# FLAN-T5 is an encoder-decoder (seq2seq) model, so it is loaded with
# AutoModelForSeq2SeqLM; AutoModelForCausalLM does not support T5 checkpoints.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

🔹 Step 4: Load Embedding Model


embed_model = SentenceTransformer('all-MiniLM-L6-v2')

🔹 Step 5: Upload & Process Documents


def load_pdf(file):
    reader = PdfReader(file)
    text = ""
    for page in reader.pages:
        # Guard against pages with no extractable text and keep pages separated.
        text += (page.extract_text() or "") + "\n"
    return text

def chunk_text(text, chunk_size=500):
    chunks = []
    for i in range(0, len(text), chunk_size):
        chunks.append(text[i:i+chunk_size])
    return chunks
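
The fixed-size split above can cut a sentence in half at a chunk boundary. One common variation is to let neighbouring chunks share a few characters. The following is only a sketch: the name `chunk_text_with_overlap` and the `overlap` parameter are illustrative and not part of the original code.

```python
def chunk_text_with_overlap(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
for chunk in chunk_text_with_overlap(sample, chunk_size=12, overlap=4):
    print(repr(chunk))
```

With `overlap=0` this behaves like `chunk_text`. A larger overlap keeps more context at the boundaries but produces more chunks to embed.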

🔹 Step 6: Create Vector Database (FAISS)


documents = []
embeddings = None
index = None

def process_document(file):
    global documents, embeddings, index
    
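    text = load_pdf(file)
    documents = chunk_text(text)
    if not documents:
        index = None
        return "No extractable text found in this PDF."

    # FAISS expects a 2-D float32 array of shape (num_chunks, dimension).
    embeddings = np.asarray(embed_model.encode(documents), dtype="float32")

    dimension = embeddings.shape[1]
    index = faiss.IndexFlatL2(dimension)  # exact (brute-force) L2 search
    index.add(embeddings)

    return "Document processed!"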
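
To get a feel for what `IndexFlatL2` does, here is a tiny pure-Python stand-in. It is only a sketch of the idea (compare the query against every stored vector and keep the closest ones), not how FAISS is implemented, and the name `flat_l2_search` is made up for illustration. Like `IndexFlatL2`, it reports squared L2 distances.

```python
def flat_l2_search(vectors, query, k):
    """Return (squared_distance, position) pairs for the k vectors closest to query."""
    scored = []
    for position, vector in enumerate(vectors):
        squared_distance = sum((a - b) ** 2 for a, b in zip(vector, query))
        scored.append((squared_distance, position))
    scored.sort()  # smallest distance first; ties are broken by position
    return scored[:k]


# Four toy 2-D "embeddings"; real sentence embeddings have hundreds of dimensions.
toy_vectors = [
    [0.0, 0.0],
    [1.0, 0.0],
    [0.0, 3.0],
    [2.0, 2.0],
]
print(flat_l2_search(toy_vectors, [0.9, 0.1], k=2))
```

The positions in the result play the same role as the `indices` returned by `index.search`: they point back into the `documents` list.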

🔹 Step 7: RAG Search Function


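def retrieve(query, k=3):
    # Nothing to search until a document has been processed.
    if index is None or not documents:
        return ""

    query_embedding = np.asarray(embed_model.encode([query]), dtype="float32")
    k = min(k, len(documents))  # FAISS pads missing results with -1
    distances, indices = index.search(query_embedding, k)

    results = [documents[i] for i in indices[0] if i >= 0]
    return " ".join(results)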
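
The search always returns the nearest chunks, even when none of them is actually related to the question. One optional refinement is to drop results that are too far away. The sketch below assumes a cutoff named `max_distance`, which is hypothetical and would need tuning for your own documents; the sample distances are invented for the demo.

```python
def filter_by_distance(documents, distances, indices, max_distance):
    """Keep chunks whose distance is at most max_distance, nearest first."""
    kept = []
    for distance, i in zip(distances, indices):
        if i < 0:
            continue  # FAISS pads missing results with -1
        if distance <= max_distance:
            kept.append(documents[i])
    return kept


docs = ["refund policy", "shipping times", "company history"]
# Made-up values shaped like one row of the index.search(...) output.
distances = [0.42, 0.95, 1.80]
indices = [1, 0, 2]
print(filter_by_distance(docs, distances, indices, max_distance=1.0))
# ['shipping times', 'refund policy']
```

Inside `retrieve`, this would be called with `distances[0]` and `indices[0]` in place of the list comprehension.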

🔹 Step 8: Chat Function


def chat(query):
    context = retrieve(query)

    prompt = f"""
    Answer the question using the context below:

    Context:
    {context}

    Question:
    {query}
    """

    # Greedy decoding keeps answers repeatable; set do_sample=True for more varied wording.
    response = generator(prompt, max_length=256, do_sample=False)[0]['generated_text']
    return response
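
Small models accept only a limited amount of input, so it can help to keep the context within a budget before building the prompt. This is a minimal sketch: `build_prompt` and `max_context_chars` are illustrative names, and the 1200-character default is an arbitrary assumption, not a value from the model's documentation.

```python
def build_prompt(context, query, max_context_chars=1200):
    """Build the prompt, trimming the context to a character budget."""
    context = " ".join(context.split())  # collapse newlines and repeated spaces
    if len(context) > max_context_chars:
        # Cut at the last space inside the budget so no word is split.
        context = context[:max_context_chars].rsplit(" ", 1)[0]
    return (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{query}\n\n"
        "Answer:"
    )


print(build_prompt("Colab   sessions\nare temporary.", "Are Colab sessions permanent?"))
```

In `chat`, the f-string could then be replaced with `prompt = build_prompt(context, query)`.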

🔹 Step 9: Build UI (Gradio)


with gr.Blocks() as app:
    gr.Markdown("## 🧠 Free ChatGPT with Your Documents")

    file_input = gr.File(label="Upload PDF")
    upload_btn = gr.Button("Process Document")
    status = gr.Textbox(label="Status", interactive=False)

    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Ask a question")

    def user_input(user_message, history):
        history = history or []  # the chatbot value can be empty on the first turn
        bot_response = chat(user_message)
        history.append((user_message, bot_response))
        return "", history

    # Show the message returned by process_document instead of discarding it.
    upload_btn.click(process_document, inputs=file_input, outputs=status)
    msg.submit(user_input, [msg, chatbot], [msg, chatbot])

app.launch()
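
The submit handler can also be written as a plain function, which makes it easy to try out without launching the UI. The sketch below is one possible shape, not the original handler: `respond`, `answer_fn`, and `document_ready` are hypothetical names, and `fake_chat` stands in for `chat` so the example runs without a model.

```python
def respond(user_message, history, answer_fn, document_ready):
    """Return (cleared textbox value, updated history) for one chat turn."""
    history = list(history or [])  # copy so the caller's list is not mutated
    user_message = user_message.strip()
    if not user_message:
        return "", history  # ignore empty submissions
    if not document_ready:
        reply = "Please upload and process a PDF first."
    else:
        reply = answer_fn(user_message)
    history.append((user_message, reply))
    return "", history


def fake_chat(question):
    # Stand-in for chat(); echoes the question so the sketch runs without a model.
    return f"(answer to: {question})"


print(respond("What is RAG?", [], fake_chat, document_ready=True))
# ('', [('What is RAG?', '(answer to: What is RAG?)')])
```

In the app, `user_input` could delegate to it with `respond(user_message, history, chat, index is not None)`.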

🎯 How to Use

1. Run all cells
2. Upload a PDF
3. Click Process Document
4. Ask questions

⚠️ Important Limitations

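Area	Limitation
Model	flan-t5-base is a small model; expect answers well below GPT-4 quality
Speed	Moderate; generation is noticeably slower on a CPU-only runtime
Memory	Limited by Colab RAM, since the model, embeddings, and index are all held in memory
Context	Fixed 500-character chunks can split sentences and lose surrounding context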
