ai-chatbot-with-the-ability-to-read-user-extra-files-
it is an implementation of a chatbot rag system that can chat with user depend on user custom data
AI Chatbot with Conversational Retrieval Overview This Python script implements a chatbot with conversational retrieval capabilities. The chatbot is designed to answer user questions based on pre-processed PDF or text data, making it versatile for various applications. The key components include text extraction, vector embeddings, and a conversational retrieval chain.
Key Components
Dependencies PdfReader from PyPDF2: Extracts text from PDF documents. Langchain Modules: Custom modules for text processing and retrieval. HuggingFaceInstructEmbeddings: Obtains vector embeddings from Hugging Face models. FAISS: Serves as a vector store for efficient storage and retrieval. ConversationBufferMemory: Manages conversation history. ConversationalRetrievalChain: Handles user interactions. HuggingFaceHub: Accesses language models from the Hugging Face model hub.
Chatbot Class The Chatbot class encapsulates the chatbot's functionality, providing methods for text extraction, vector embeddings, and user interaction.
Initialization The class is initialized with parameters such as PDF data, Hugging Face API token (key), and a type flag (0 for PDF data, 1 for custom text data). The custom data, such as a restaurant menu for example, allows the chatbot to answer questions about the specified content in this menu .
Technical Explanation 4. Text Chunking The get_text_chunks method splits extracted text into manageable chunks using a CharacterTextSplitter, essential for efficient processing.
Vector Embeddings The get_vectorstore method utilizes the Hugging Face model to obtain vector embeddings for each text chunk, storing them in a FAISS vector store.
Conversation Chain The get_conversation_chain method sets up a conversational retrieval chain using the Hugging Face language model, the vector store, and a conversation memory.
User Input Handling The handle_user_input method processes user questions and generates bot responses based on the conversation chain.
Bot Initialization and Interaction The start method initializes the bot based on the provided data type (PDF or text), setting up the vector store and conversation chain.
Usage Clone the repository:
Provide your Hugging Face API token. Specify PDF or text data paths in the pdf_docs list. Run the script:
The script serves as a foundation for building a chatbot with conversational retrieval, enabling users to interact with a pre-trained language model based on their custom data.