Build a PDF search application with vector search and LLMs

Streamline operations and claims processing with powerful PDF search capabilities by integrating MongoDB Atlas Vector Search, SuperDuperDB, and LLMs.

Use cases: Gen AI, Content Management
Industries: Insurance, Financial Services, Manufacturing and Mobility, Retail
Products: Atlas, Vector Search
Partners: SuperDuperDB, OpenAI, FastAPI

Solutions overview

Retrieval-augmented generation (RAG) applications are a game changer for insurance companies, enabling them to harness the power of unstructured data while promoting accessibility and flexibility. Special attention goes to PDFs, which are ubiquitous yet difficult to search, leading claim adjusters and underwriters to spend hours reviewing contracts, claims, and guidelines in this common format. RAG for PDF search brings efficiency and accuracy to this historically cumbersome task. Now, users can simply type a question in natural language and the app will sift through the company data, provide an answer, summarize the content of the documents, and indicate the source of the information, including the page and paragraph where it was found.

In this GitHub repo, you will find detailed, step-by-step instructions on how to build the PDF search application combining MongoDB, SuperDuperDB, and LLMs. Our use case for this solution focuses on a claim adjuster or an underwriter handling a specific case. Analyzing the guidelines PDF associated with a specific customer helps determine the loss amount in the event of an accident or the new premium in the case of a policy renewal. The app assists by answering questions and displaying the relevant sections of the document.

Insurance firms rely heavily on data processing. To make investment decisions or handle claims, they leverage vast amounts of data, mostly unstructured. Underwriters and claim adjusters need to comb through numerous pages of guidelines, contracts, and reports, typically in PDF format. Manually finding and reviewing every piece of information is time-consuming and can easily lead to expensive mistakes, such as incorrect risk estimations. Quickly finding and accessing relevant content is essential. Combining Atlas Vector Search and LLMs to build RAG apps can directly impact the bottom line of an insurance company.

Building the solution and reference architecture

Combining MongoDB and SuperDuperDB allows you to build an information retrieval system with ease. Let’s break down the process:

- The user adds the PDFs that need to be searched.
- A script scans the PDFs, creates the chunks, and vectorizes them (see Figure 1). The chunking step is carried out using a sliding window methodology, which ensures that potentially important transitional data between chunks is not lost, helping to preserve continuity of context.
- Vectors and chunk metadata are stored in MongoDB, and a Vector Search index is created (see Figure 2).
- The PDFs are now ready to be queried. The user selects a customer, asks a question, and the system returns an answer, displaying the page and paragraph where the information was found and highlighting the specific section with a red frame (see Figure 2).
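
The sliding window chunking mentioned above can be illustrated with a minimal sketch. This is not the code from the repo: the function name, the word-based splitting, and the `window_size` and `stride` values are assumptions chosen for illustration. The idea is that each chunk overlaps the previous one, so text that straddles a chunk boundary still appears intact in at least one chunk.

```python
def sliding_window_chunks(words, window_size=200, stride=150):
    """Split a list of words into overlapping chunks.

    Hypothetical helper: window_size and stride are illustrative values,
    not the ones used by the actual solution.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    if stride > window_size:
        # A stride larger than the window would skip words entirely.
        raise ValueError("stride must not exceed window_size")

    chunks = []
    start = 0
    while True:
        window = words[start:start + window_size]
        if not window:
            break
        chunks.append(" ".join(window))
        # Stop once the current window has reached the end of the text.
        if start + window_size >= len(words):
            break
        start += stride
    return chunks


if __name__ == "__main__":
    sample = "a b c d e f g h".split()
    for chunk in sliding_window_chunks(sample, window_size=4, stride=2):
        print(chunk)
```

With a window of 4 words and a stride of 2, consecutive chunks share 2 words. A smaller stride gives more overlap at the cost of more chunks to embed and store.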

Figure 1: PDF chunking, embedding creation, and storage, orchestrated with SuperDuperDB

Each customer has a guidelines PDF associated with their account based on country of residency. When the user selects a customer and asks a question, the system runs a vector search query only on that particular document, filtering out all the others. This is made possible by a pre-filtering field that is included both in the index definition and in the search query (see code snippets below).
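
As a rough sketch of the pre-filtering described above, the following builds an Atlas Vector Search index definition with a filter field and a matching `$vectorSearch` aggregation pipeline. The field names (`embedding`, `source`, `text`, `page`), the index name, and the helper functions are assumptions for illustration and may differ from the repo; only the overall shape of the index definition and the `$vectorSearch` stage follows the Atlas Vector Search format.

```python
def build_index_definition(num_dimensions, vector_path="embedding",
                           filter_path="source"):
    """Index definition with one vector field and one pre-filter field.

    The paths are hypothetical names for the embedding and the PDF identifier.
    """
    return {
        "fields": [
            {
                "type": "vector",
                "path": vector_path,
                "numDimensions": num_dimensions,
                "similarity": "cosine",
            },
            # Declaring the field as a filter lets queries restrict the search
            # to one customer's document before the vector comparison.
            {"type": "filter", "path": filter_path},
        ]
    }


def build_vector_search_pipeline(query_vector, document_name,
                                 index_name="vector_index",
                                 vector_path="embedding",
                                 filter_path="source",
                                 num_candidates=100, limit=5):
    """Aggregation pipeline that searches only chunks of one document."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": vector_path,
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,
                "limit": limit,
                "filter": {filter_path: {"$eq": document_name}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "page": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


if __name__ == "__main__":
    print(build_index_definition(num_dimensions=1536))
    print(build_vector_search_pipeline([0.1, 0.2, 0.3], "guidelines_uk.pdf"))
```

With PyMongo, the resulting pipeline would be passed to `collection.aggregate(pipeline)`. The `numDimensions` value has to match the output size of whichever embedding model is used.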

Atlas Vector Search also takes advantage of MongoDB’s new dedicated Search Nodes architecture, which provides separate infrastructure for Atlas Search and Vector Search workloads. Search Nodes let you size compute resources for each workload and scale search independently of the database, delivering workload isolation, higher availability, and better performance at scale.

Figure 2: PDF querying flow, orchestrated with SuperDuperDB

Code Snippet

Code Snippet

SuperDuperDB

SuperDuperDB is an open-source Python framework for integrating AI models and workflows directly with and across major databases, enabling more flexible and scalable custom enterprise AI solutions. It lets developers build, deploy, and manage AI on their existing data infrastructure with their preferred tools, eliminating data migration and duplication.

With SuperDuperDB, developers can:

- Bring AI to their databases, eliminating data pipelines and data movement, and minimizing engineering effort, time to production, and computation resources.
- Implement AI workflows with any open- or closed-source AI models and APIs, on any type of data, with any AI and Python framework, package, class, or function.
- Safeguard data by switching from APIs to hosting and fine-tuning their own models on their existing infrastructure, whether on-premises or in the cloud.
- Easily switch embedding models and LLMs between API providers, or host their own models on HuggingFace or elsewhere, just by changing a small piece of configuration.

SuperDuperDB provides an array of sample use cases and notebooks that developers can use to get started, including vector search with MongoDB, embedding generation, multimodal search, RAG, transfer learning, and many more. The demo showcased in this solution is adapted from an app previously developed by SuperDuperDB.

Key learnings

Build the solution following the instructions in this GitHub repo. Note that the solution consists of two logical steps:
- The initialization script breaks down the PDFs into chunks and then turns them into vector embeddings.
- The querying step allows the user to interrogate the documents.
Text embedding creation: The embedding generation process can be carried out using different models and deployment options. Always be mindful of privacy and data protection requirements. A locally deployed model is recommended if your data must remain on your own servers. Otherwise, you can simply call an API and get your vector embeddings back, as explained in this tutorial, which shows how to do it with OpenAI.
SuperDuperDB is the framework that helps us with the plumbing of the moving pieces, providing a simple and standard interface to interact with Vector Search and LLMs.
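
To make the two logical steps and the swappable embedding model more concrete, here is a toy, self-contained sketch. It is not how SuperDuperDB or Atlas Vector Search are implemented: `toy_embed` is a stand-in for a real embedding model (local or API-based), an in-memory list stands in for the MongoDB collection, and all function and field names are hypothetical.

```python
import math


def toy_embed(text, dims=16):
    """Stand-in embedder (hashed bag of words), NOT a real embedding model."""
    vector = [0.0] * dims
    for word in text.lower().split():
        # Deterministic bucket per word, so results are reproducible.
        vector[sum(ord(ch) for ch in word) % dims] += 1.0
    return vector


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def initialize(chunks, embed=toy_embed):
    """Step 1: embed every chunk and keep it alongside its metadata."""
    return [{**chunk, "embedding": embed(chunk["text"])} for chunk in chunks]


def query(store, question, source, embed=toy_embed, limit=1):
    """Step 2: pre-filter by source document, then rank by similarity."""
    question_vector = embed(question)
    candidates = [doc for doc in store if doc["source"] == source]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine(question_vector, doc["embedding"]),
        reverse=True,
    )
    return ranked[:limit]


if __name__ == "__main__":
    store = initialize([
        {"text": "windscreen damage is covered", "page": 3, "source": "uk.pdf"},
        {"text": "theft requires a police report", "page": 7, "source": "uk.pdf"},
        {"text": "windscreen damage is covered", "page": 2, "source": "de.pdf"},
    ])
    for hit in query(store, "windscreen damage is covered", "uk.pdf"):
        print(hit["source"], hit["page"], hit["text"])
```

Because the embedder is passed in as a parameter, switching from an API-based model to a locally hosted one only means supplying a different `embed` function, provided the same function is used for both the initialization and the querying step.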