Chat with Multiple PDFs Using Google Gemini and LangChain
An app for asking questions about the content of your PDFs
Problem Statement
As the volume of information locked inside PDF documents grows, there is a need for an intelligent system that can provide instant answers to user queries about their content. This project demonstrates how to build an end-to-end application that lets users upload multiple PDFs, process them into vector embeddings, and chat with the content using Google Gemini Pro and LangChain.
Features
- Upload and process multiple PDFs into vector embeddings.
- Use vector search to retrieve context-relevant answers to user queries.
- Handle PDF documents up to 200 MB each.
- Enable modular development with a clear separation of backend processing and frontend interaction using Streamlit.
Tech Stack
- Google Gemini Pro: Generative AI for text embeddings and chat functionality.
- LangChain: Framework for integrating embeddings, question answering, and chain-based interactions.
- Streamlit: For creating a user-friendly frontend interface.
- PyPDF2: For reading and extracting text from PDF files.
- FAISS (Facebook AI Similarity Search): For efficient vector storage and similarity search.
- Python-dotenv: For managing API keys and environment variables.
Workflow
1. PDF Processing
- Upload PDFs: Users upload one or more PDF documents.
- Extract Text: PDFs are read page by page using PyPDF2, and text is extracted.
- Split Text: Extracted text is divided into chunks using LangChain's recursive character text splitter (see the sketch below).
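A minimal sketch of these two steps, assuming the classic langchain import path (newer releases move the splitter into the langchain_text_splitters package); the chunk sizes are illustrative, not prescribed by the project:

```python
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter

def get_pdf_text(pdf_files):
    """Concatenate the text of every page of every uploaded PDF."""
    text = ""
    for pdf in pdf_files:
        reader = PdfReader(pdf)
        for page in reader.pages:
            # Guard against pages with no extractable text.
            text += page.extract_text() or ""
    return text

def get_text_chunks(text):
    """Split the raw text into overlapping chunks sized for embedding."""
    splitter = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=1000)
    return splitter.split_text(text)
```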
2. Vector Embedding
- Embedding Creation: Text chunks are converted into embeddings using Google's Generative AI embedding model.
- Vector Store: Embeddings are saved locally as a FAISS vector index (sketched below).
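A sketch of the embedding step, assuming the langchain-google-genai and langchain-community packages are installed; the "faiss_index" folder name is an arbitrary choice:

```python
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

def build_vector_store(text_chunks):
    """Embed the chunks and persist a FAISS index to disk."""
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    store = FAISS.from_texts(text_chunks, embedding=embeddings)
    store.save_local("faiss_index")  # saved locally for later queries
```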
3. Conversational Chain
- Prompt Template: A custom template ensures relevant and detailed responses to user questions.
- Query Processing: The user’s question is matched with the most relevant vectors from the index to retrieve context.
- Answer Generation: Google Gemini Pro generates responses grounded in the retrieved context (see the sketch below).
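A sketch of the full chain, reusing the hypothetical faiss_index folder from the previous step. The prompt wording is an illustration, and allow_dangerous_deserialization is needed only on newer langchain-community releases when loading a locally created index:

```python
from langchain.prompts import PromptTemplate
from langchain.chains.question_answering import load_qa_chain
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import FAISS

# Template that instructs the model to stay within the retrieved context.
PROMPT = PromptTemplate(
    template=(
        "Answer the question as thoroughly as possible from the provided "
        "context. If the answer is not in the context, say so.\n\n"
        "Context:\n{context}\n\nQuestion:\n{question}\n\nAnswer:"
    ),
    input_variables=["context", "question"],
)

def answer_question(question):
    """Retrieve the most relevant chunks and generate an answer."""
    embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")
    store = FAISS.load_local("faiss_index", embeddings,
                             allow_dangerous_deserialization=True)
    docs = store.similarity_search(question)  # match question against the index
    model = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.3)
    chain = load_qa_chain(model, chain_type="stuff", prompt=PROMPT)
    result = chain({"input_documents": docs, "question": question},
                   return_only_outputs=True)
    return result["output_text"]
```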
4. Frontend Interaction
- A Streamlit app (sketched below) provides:
- Sidebar for uploading and processing PDFs.
- Input box for user questions.
- Display of detailed responses generated by the system.
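A sketch of how the pieces could be wired together in Streamlit; the helper functions (get_pdf_text, get_text_chunks, build_vector_store, answer_question) are the hypothetical ones sketched above:

```python
import streamlit as st

def main():
    st.set_page_config(page_title="Chat with Multiple PDFs")
    st.header("Chat with Multiple PDFs using Gemini")

    # Main panel: question input and answer display.
    question = st.text_input("Ask a question about the uploaded PDFs")
    if question:
        st.write(answer_question(question))

    # Sidebar: upload and process the documents.
    with st.sidebar:
        st.title("Documents")
        pdfs = st.file_uploader("Upload your PDFs", accept_multiple_files=True)
        if st.button("Submit & Process") and pdfs:
            with st.spinner("Processing..."):
                chunks = get_text_chunks(get_pdf_text(pdfs))
                build_vector_store(chunks)
            st.success("Done")

if __name__ == "__main__":
    main()
```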
Steps to Build
Setup
- Environment Setup:
- Create a virtual environment:
```bash
conda create -n pdf-chat python=3.10
conda activate pdf-chat
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- API Configuration:
- Obtain a Google API key (for example, from Google AI Studio) and store it in a .env file in the project root; python-dotenv loads it at startup, as sketched below.
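A minimal sketch of the key loading, assuming the key is stored in .env under the conventional GOOGLE_API_KEY name:

```python
# Assumes a .env file in the project root containing:
#   GOOGLE_API_KEY=your_api_key_here
import os
from dotenv import load_dotenv
import google.generativeai as genai

load_dotenv()  # make the .env entries available via os.getenv
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))
```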
Development
- Backend Functionality:
- Extract text from PDFs using PyPDF2.
- Split text into chunks for embedding.
- Convert text chunks into vector embeddings using Google's Generative AI embedding model.
- Save embeddings locally using FAISS.
- Query Handling:
- Load the vector index for similarity search.
- Process user questions and retrieve the most relevant chunks.
- Generate detailed responses using Google Gemini Pro’s conversational API.
- Frontend Integration:
- Build a Streamlit interface for PDF uploads and text input.
- Display real-time responses in a user-friendly format.
Usage Instructions
- Run the application:
```bash
streamlit run app.py
```
- Open the local URL that Streamlit prints (http://localhost:8501 by default), upload your PDFs from the sidebar, click the process button, and start asking questions.