16 Build a Free Chat App on Google Colab using RAG
Note: This code has been corrected, but the model used here is not the strongest one available. Which model you can run depends on the GPU that Colab assigns you, so treat the model choice as something to explore and change. Even so, this is a good starting point for understanding how a RAG pipeline really works.
🔹 Step 1: Install Dependencies
👉 Paste this in a Colab cell:
!pip install -q transformers accelerate sentence-transformers faiss-cpu gradio pypdf
🔹 Step 2: Import Libraries
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np
import gradio as gr
from pypdf import PdfReader
🔹 Step 3: Load Free LLM (Lightweight)
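# flan-t5 is an encoder-decoder (seq2seq) model, so it must be loaded with
# AutoModelForSeq2SeqLM; AutoModelForCausalLM does not support T5 checkpoints.
from transformers import AutoModelForSeq2SeqLM
model_name = "google/flan-t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
generator = pipeline("text2text-generation", model=model, tokenizer=tokenizer)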
🔹 Step 4: Load Embedding Model
embed_model = SentenceTransformer('all-MiniLM-L6-v2')
🔹 Step 5: Upload & Process Documents
def load_pdf(file):
    reader = PdfReader(file)
    text = ""
    for page in reader.pages:
        # Image-only pages may yield no text, so fall back to an empty string
        text += page.extract_text() or ""
    return text

def chunk_text(text, chunk_size=500):
    # Split the text into fixed-size character chunks
    chunks = []
    for i in range(0, len(text), chunk_size):
        chunks.append(text[i:i+chunk_size])
    return chunks
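Fixed-size chunks can cut a sentence in half, so the passage that answers a question may be split across two chunks. One common mitigation is to let neighbouring chunks overlap. The sketch below is a hypothetical variant (`chunk_text_overlap` and its `overlap` parameter are not part of the original code) that shows the idea in plain Python:

```python
def chunk_text_overlap(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "abcdefghij" * 3  # 30 characters of toy text
pieces = chunk_text_overlap(sample, chunk_size=12, overlap=4)
print(len(pieces), [len(p) for p in pieces])
```

With `overlap=0` this behaves like the `chunk_text` function above. The right overlap depends on the documents, so treat the value of 50 as a starting guess rather than a recommendation.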
🔹 Step 6: Create Vector Database (FAISS)
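documents = []
embeddings = None
index = None

def process_document(file):
    global documents, embeddings, index
    text = load_pdf(file)
    documents = chunk_text(text)
    # A scanned or empty PDF produces no chunks and nothing to index
    if not documents:
        return "No extractable text found in this PDF."
    # FAISS expects a float32 matrix of shape (num_chunks, dimension)
    embeddings = np.asarray(embed_model.encode(documents), dtype="float32")
    dimension = embeddings.shape[1]
    index = faiss.IndexFlatL2(dimension)
    index.add(embeddings)
    return "Document processed!"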
🔹 Step 7: RAG Search Function
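def retrieve(query, k=3):
    # Nothing to search until a document has been processed
    if index is None:
        return ""
    query_embedding = np.asarray(embed_model.encode([query]), dtype="float32")
    distances, indices = index.search(query_embedding, k)
    # FAISS pads with -1 when fewer than k chunks exist, so skip those slots
    results = [documents[i] for i in indices[0] if i >= 0]
    return " ".join(results)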
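To make the retrieval step less of a black box: `IndexFlatL2` performs an exact, brute-force search, comparing the query vector against every stored vector and returning the `k` smallest squared Euclidean distances with their positions. The following is a minimal pure-Python sketch of that idea on toy 2-D vectors. The function name `l2_search` and the toy data are illustrative assumptions, and this is not how FAISS is implemented internally:

```python
def l2_search(vectors, query, k=3):
    """Return (squared_distances, indices) of the k vectors closest to query."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Squared Euclidean distance between the stored vector and the query
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

# Toy "embeddings": in the real app these come from embed_model.encode(...)
toy_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
distances, indices = l2_search(toy_vectors, [0.9, 0.1], k=2)
print(indices)
```

The returned indices play the same role as `indices[0]` in `retrieve`: they are positions in the `documents` list, which is why the chunks and the vectors must be stored in the same order.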
🔹 Step 8: Chat Function
def chat(query):
    context = retrieve(query)
    prompt = f"""
Answer the question using the context below:
Context:
{context}
Question:
{query}
"""
    response = generator(prompt, max_length=256, do_sample=True)[0]['generated_text']
    return response
🔹 Step 9: Build UI (Gradio)
with gr.Blocks() as app:
    gr.Markdown("## 🧠 Free ChatGPT with Your Documents")
    file_input = gr.File(label="Upload PDF")
    upload_btn = gr.Button("Process Document")
    # Displays the message returned by process_document
    status = gr.Textbox(label="Status", interactive=False)
    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Ask a question")

    def user_input(user_message, history):
        bot_response = chat(user_message)
        # Tuple-style history; newer Gradio releases may expect message dicts instead
        history.append((user_message, bot_response))
        return "", history

    upload_btn.click(process_document, inputs=file_input, outputs=status)
    msg.submit(user_input, [msg, chatbot], [msg, chatbot])

app.launch()
🎯 How to Use
- Run all cells
- Upload a PDF
- Click Process Document
- Ask questions
⚠️ Important Limitations
| Area | Limitation |
|---|---|
| Model | Not GPT-4 quality |
| Speed | Moderate |
| Memory | Colab RAM limits |
| Context | Small chunks |