Have you ever sat down with a cup of coffee, mulling over a pile of documents, and thought, “If only I could just have a conversation with these documents”? That was precisely the idea that struck me while lounging on the porch of my dream home one day. 😁 It might sound like a fever dream, but with the power of artificial intelligence, specifically LangChain and LlamaIndex, this far-fetched thought became a reality.
Here is my YouTube video that explains the Document Conversation System.
Before we dive into the nitty-gritty of how I brought this concept to life, let’s understand what LangChain and LlamaIndex are.
LangChain and LlamaIndex: Powering Conversations with Documents
LangChain is a Python framework for building applications on top of large language models, including OpenAI’s GPT models. It lets us effectively parse and interact with textual data, making it a perfect tool to “talk to documents”.
LlamaIndex, on the other hand, is a package designed for indexing and retrieving data. It facilitates the creation and management of embeddings, a crucial part of our document conversation system, providing efficient and contextually relevant retrievals.
These tools have become a staple in my tech arsenal, seamlessly integrating into my everyday life through AI-based applications. I wanted to communicate with my documents, and these were the perfect tools to do so.
Building the Document Conversation System
Armed with my OpenAI account and Python, I created a document conversation system. To make the code work, you will need an OpenAI API key.
The pip installations are as follows:
pip install llama_index
pip install langchain
pip install transformers
pip install pypdf
The Document Conversation System is an innovative approach integrating two distinct yet interconnected programs to facilitate meaningful interactions with your documents.
Program 1: Indexing and Embedding Generation
The first program in the Document Conversation System lays the groundwork for the conversation. Its primary job is to parse the documents, generate manageable chunks of text, and create corresponding embeddings. These embeddings, derived using OpenAI, capture the semantic essence of the text, translating it into a numerical form.
In addition to generating embeddings, this program also constructs an index using LlamaIndex. This index is a reference, linking each text chunk with its corresponding embedding. The index is instrumental in quickly and efficiently locating relevant text chunks during a conversation based on their semantic similarity to the given prompt.
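To make the idea of similarity-based lookup concrete, here is a minimal, self-contained sketch of how an index pairs chunks with embeddings and retrieves the closest match. The chunks and three-dimensional vectors are made up purely for illustration; real OpenAI embeddings have roughly 1,536 dimensions, and LlamaIndex handles all of this internally.

```python
import math

# Toy "index": each text chunk paired with a made-up embedding vector.
index = [
    ("The invoice is due on March 1st.", [0.9, 0.1, 0.0]),
    ("Our refund policy lasts 30 days.", [0.1, 0.9, 0.1]),
    ("The office is closed on holidays.", [0.0, 0.2, 0.9]),
]

def cosine_similarity(a, b):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_embedding, top_k=1):
    # Rank chunks by similarity to the query embedding and keep the best ones.
    ranked = sorted(index,
                    key=lambda item: cosine_similarity(query_embedding, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# A query whose made-up embedding lies closest to the refund-policy chunk.
print(retrieve([0.2, 0.95, 0.0]))  # ['Our refund policy lasts 30 days.']
```

The vector store built by LlamaIndex does essentially this, just at scale and with real embeddings.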
It’s important to note that the first program doesn’t need to be run every time you want to converse with your documents. Instead, it’s typically run once for each new set of documents, with the resulting embeddings and index stored for future use.
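One simple way to implement this run-once behavior is to check whether the persisted index already exists before rebuilding it. This is a sketch under my own assumptions, using the ./chunkstorage directory name from the code below:

```python
import os

def needs_indexing(persist_dir="./chunkstorage"):
    # Rebuild only if the persist directory is missing or empty;
    # otherwise the stored embeddings and index can be reused as-is.
    return not os.path.isdir(persist_dir) or not os.listdir(persist_dir)

if needs_indexing():
    print("No stored index found - run the indexing program first.")
else:
    print("Stored index found - skip straight to the conversation program.")
```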
Here is the code of the first program:
from llama_index import (
    GPTVectorStoreIndex,
    SimpleDirectoryReader,
    LLMPredictor,
    ServiceContext,
)
from langchain.chat_models import ChatOpenAI
import os

os.environ['OPENAI_API_KEY'] = "Your_OpenAI_API_Key"
os.environ['TOKENIZERS_PARALLELISM'] = 'false'

# Load every document in the ./data directory
documents = SimpleDirectoryReader('./data').load_data()

# ChatOpenAI requires a chat model, so use gpt-3.5-turbo
# (text-davinci-003 is a completion-only model)
modelName = "gpt-3.5-turbo"
predictor = LLMPredictor(llm=ChatOpenAI(temperature=0, model_name=modelName))

serv_context = ServiceContext.from_defaults(
    llm_predictor=predictor,
    chunk_size=600,
)

# Chunk the documents, embed each chunk, and build the index
index = GPTVectorStoreIndex.from_documents(documents, service_context=serv_context)

# Persist the chunks and embeddings for reuse by the second program
index.storage_context.persist(persist_dir="./chunkstorage")
print("Written modeled data in ./chunkstorage folder")
Program 2: Initiating the Conversation
The second program comes into play once the embeddings and index are ready. This program is responsible for the actual conversation with the documents. It loads the pre-generated chunks and embeddings and interacts with the OpenAI model based on the user’s prompts or questions.
Upon receiving a prompt, the program identifies the relevant text chunks using the index and sends them, along with the question, to OpenAI. The AI model then crafts a response based on these contextually pertinent chunks.
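Conceptually, what gets sent to the model is the retrieved chunks stitched together with the user’s question. The sketch below is a simplified illustration of that assembly; LlamaIndex’s actual “compact” prompt template differs in wording, and the chunk texts here are invented:

```python
def build_prompt(chunks, question):
    # Join the retrieved chunks into one context block, then append
    # the user's question - roughly what a retrieval-augmented
    # query sends to the model.
    context = "\n---\n".join(chunks)
    return (
        "Context information is below.\n"
        f"{context}\n"
        "Given the context information, answer the question.\n"
        f"Question: {question}\n"
        "Answer:"
    )

chunks = [
    "The warranty covers parts for two years.",
    "Labor is covered for the first 90 days only.",
]
print(build_prompt(chunks, "How long is labor covered?"))
```

Because only the top-k relevant chunks are included, the model answers from your documents without the whole corpus ever being sent in one request.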
In essence, the second program serves as the interface for the user to communicate with their documents, harnessing the power of the embeddings and index generated by the first program.
Here is the code for the second program.
from llama_index import (
    StorageContext,
    load_index_from_storage,
)
from llama_index.query_engine import RetrieverQueryEngine
import time
import os

os.environ['OPENAI_API_KEY'] = "Your_OpenAI_API_Key"
os.environ['TOKENIZERS_PARALLELISM'] = 'false'

# Number of chunks you would like to use for an answer.
# Set to 1 to save money.
k = 3

# Rebuild storage context from the chunkstorage
storage_context = StorageContext.from_defaults(persist_dir='./chunkstorage')

# Load index
loadedIndex = load_index_from_storage(storage_context)
loadedIndex._service_context.llm_predictor.llm.model_name = "text-davinci-003"
print(loadedIndex._service_context.llm_predictor.llm)
print()

retriever = loadedIndex.as_retriever()
retriever.similarity_top_k = k

# Construct the query engine based on the retriever,
# and response_mode.
query_engine = RetrieverQueryEngine.from_args(
    retriever=retriever,
    response_mode='compact',
)

# Log the conversation to a timestamped transcript file
timestamp = time.strftime("%Y_%m_%d-%H_%M_%S", time.gmtime())
filename = "Llama" + timestamp + ".txt"
if not os.path.exists(filename):
    with open(filename, 'w') as f:
        f.write("User: Welcome to OpenAI chat!\n\n")

while True:
    p = input("Enter quit to quit, or enter your prompt: ").strip()
    if p == "":
        continue
    if p == "quit":
        break
    response = query_engine.query(p)
    print("\nAI says:\n", response, "\n\n")
    with open(filename, 'a') as f:
        f.write("User:\n" + p + "\n\n")
        f.write("AI:\n" + str(response) + "\n\n")

print("Bye bye")
By splitting the work into two programs, one that generates the embeddings and index and one that conducts the conversation, the Document Conversation System stays efficient, reusable, and coherently structured, enabling effective and meaningful interactions with your documents.
The Future of Document Conversations
The implications of this are enormous. Imagine automating question-and-answer sessions for your clients based on an existing knowledge base. Or think about extracting specific insights from a collection of documents without combing through every page manually.
The possibilities are endless!
In a world where data is king, conversing with your documents is no longer just a dream. It’s a reality powered by LangChain, LlamaIndex, and OpenAI.