16 Build a Free Chat App on Google Colab using RAG


Note: This code has been corrected, and the model used here is not the strongest one available. Which model you can run depends on the GPU Colab gives you, so treat the choice as something to explore and change. It is still a good starting point for understanding how the pieces fit together.

🔹 Step 1: Install Dependencies

👉 Paste this in a Colab cell:

!pip install -q transformers accelerate sentence-transformers faiss-cpu gradio pypdf

🔹 Step 2: Import Libraries

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import gradio as gr
from pypdf import PdfReader

🔹 Step 3: Load Free LLM (Lightweight)

model_name = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# FLAN-T5 is an encoder-decoder (seq2seq) model, so it must be loaded with
# AutoModelForSeq2SeqLM; AutoModelForCausalLM rejects the T5 configuration.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

🔹 Step 4: Load Embedding Model

embed_model = SentenceTransformer('all-MiniLM-L6-v2')

🔹 Step 5: Upload & Process Documents

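def load_pdf(file):
    # Read every page of the PDF and concatenate the extracted text
    reader = PdfReader(file)
    text = ""
    for page in reader.pages:
        # Guard against pages with no extractable text (e.g. scanned images)
        text += page.extract_text() or ""
    return text

def chunk_text(text, chunk_size=500):
    # Split the text into fixed-size character chunks with no overlap
    chunks = []
    for i in range(0, len(text), chunk_size):
        chunks.append(text[i:i+chunk_size])
    return chunks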
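
The fixed-size split above can cut a sentence in half at a chunk boundary, so a fact that straddles two chunks may not be fully present in either. One common mitigation, which is not part of the original code, is to let consecutive chunks overlap. A minimal sketch, assuming a hypothetical helper `chunk_text_overlap` and an illustrative `overlap` parameter:

```python
def chunk_text_overlap(text, chunk_size=500, overlap=50):
    """Split text into character chunks where each chunk repeats the last
    `overlap` characters of the previous one (hypothetical helper)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks

# Small values so the overlap is easy to see
sample = "abcdefghijklmnopqrstuvwxyz"
demo_chunks = chunk_text_overlap(sample, chunk_size=10, overlap=3)
print(demo_chunks)
```

With the defaults, this could be swapped in for `chunk_text` inside `process_document`. The trade-off is a slightly larger index, since overlapping text is embedded more than once.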

🔹 Step 6: Create Vector Database (FAISS)

documents = []
embeddings = None
index = None

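def process_document(file):
    global documents, embeddings, index

    text = load_pdf(file)
    documents = chunk_text(text)
    if not documents:
        return "No extractable text found in this PDF."

    # FAISS expects a 2-D float32 array of shape (n_chunks, dimension)
    embeddings = np.asarray(embed_model.encode(documents), dtype="float32")

    dimension = embeddings.shape[1]
    index = faiss.IndexFlatL2(dimension)
    index.add(embeddings)

    return "Document processed!"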
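
To make the role of `faiss.IndexFlatL2` more concrete: a flat L2 index performs an exact, brute-force comparison of the query against every stored vector, ranking by squared Euclidean distance. The pure-Python sketch below mimics that idea on toy 2-D vectors. The function `l2_search` and the toy data are illustrative only, and it makes no attempt to match FAISS in speed or output format.

```python
def l2_search(vectors, query, k):
    """Exact nearest-neighbour search by squared Euclidean distance.
    Returns (distances, indices) for the k closest vectors, nearest first."""
    scored = []
    for i, vec in enumerate(vectors):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, i))
    scored.sort()  # smallest distance first; ties broken by position
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

# Toy "embeddings": four points in the plane
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
toy_distances, toy_indices = l2_search(toy_vectors, query=[1.0, 1.0], k=2)
print(toy_indices, toy_distances)
```

Because every vector is compared on each query, this approach is simple and exact but scales linearly with the number of chunks, which is usually acceptable for a single PDF.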

🔹 Step 7: RAG Search Function

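def retrieve(query, k=3):
    # Nothing to search until a document has been processed
    if index is None:
        return ""
    query_embedding = np.asarray(embed_model.encode([query]), dtype="float32")
    distances, indices = index.search(query_embedding, k)

    # FAISS pads the result with -1 when the index holds fewer than k vectors
    results = [documents[i] for i in indices[0] if i != -1]
    return " ".join(results)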
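
Note that `retrieve` always returns the k nearest chunks, even when none of them is actually close to the query, so an off-topic question still receives some context. One possible refinement, not present in the original code, is to drop chunks whose distance exceeds a cutoff. The helper `filter_by_distance` and the `max_distance` value below are hypothetical, and a usable threshold would have to be found by experiment for the embedding model in use.

```python
def filter_by_distance(chunks, distances, max_distance):
    """Keep only chunks whose distance to the query is at most max_distance.
    `max_distance` is a hypothetical threshold that would need tuning."""
    return [c for c, d in zip(chunks, distances) if d <= max_distance]

# Made-up distances purely for illustration
kept = filter_by_distance(
    ["chunk A", "chunk B", "chunk C"],
    [0.4, 0.9, 1.7],
    max_distance=1.0,
)
print(kept)
```

Inside `retrieve`, this would be applied to `results` together with `distances[0]` before joining the chunks.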

🔹 Step 8: Chat Function

def chat(query):
    context = retrieve(query)

    prompt = f"""
Answer the question using the context below:

Context:
{context}

Question:
{query}
"""

    response = generator(prompt, max_length=256, do_sample=True)[0]['generated_text']
    return response
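
Small models accept only a limited amount of input, so it can help to cap how much retrieved text goes into the prompt. The sketch below separates prompt assembly into a hypothetical helper, `build_prompt`, that trims the context to a character budget. The `max_context_chars` default is an illustrative assumption rather than a tuned value, and a character count is only a rough stand-in for the model's real token limit.

```python
def build_prompt(context, query, max_context_chars=1500):
    """Assemble the RAG prompt, trimming the context to a character budget.
    `max_context_chars` is an illustrative value, not a tuned one."""
    context = context.strip()
    if len(context) > max_context_chars:
        # Cut at the budget, then back up to the last space so the
        # context does not end in the middle of a word
        cut = context[:max_context_chars]
        context = cut.rsplit(" ", 1)[0] if " " in cut else cut
    if not context:
        context = "(no context retrieved)"
    return (
        "Answer the question using the context below:\n\n"
        f"Context:\n{context}\n\n"
        f"Question:\n{query}\n"
    )

demo_prompt = build_prompt(
    "Paris is the capital of France. " * 3,
    "What is the capital of France?",
    max_context_chars=40,
)
print(demo_prompt)
```

In `chat`, the f-string could then be replaced by `prompt = build_prompt(context, query)`.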

🔹 Step 9: Build UI (Gradio)

with gr.Blocks() as app:
    gr.Markdown("## 🧠 Free ChatGPT with Your Documents")

    file_input = gr.File(label="Upload PDF")
    upload_btn = gr.Button("Process Document")
    status = gr.Textbox(label="Status", interactive=False)

    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Ask a question")

    def user_input(user_message, history):
        history = history or []
        bot_response = chat(user_message)
        history.append((user_message, bot_response))
        # Clear the textbox and return the updated conversation
        return "", history

    # Show the message returned by process_document instead of discarding it
    upload_btn.click(process_document, inputs=file_input, outputs=status)
    msg.submit(user_input, [msg, chatbot], [msg, chatbot])

app.launch()

🎯 How to Use

  1. Run all cells
  2. Upload a PDF
  3. Click Process Document
  4. Ask questions

⚠️ Important Limitations

Area      Limitation
Model     Not GPT-4 quality
Speed     Moderate
Memory    Colab RAM limits
Context   Small chunks


