Claim Management With LLMs In RAG And Vector Search

Our solution addresses these challenges by combining Atlas Vector Search and a large language model (LLM) in a retrieval-augmented generation (RAG) system. Feeding the model proprietary data makes it context-aware and lets organizations go beyond the limitations of baseline foundation models. In this way, they can leverage the full potential of AI to streamline operations.

Reference architectures

With MongoDB:

MongoDB Atlas combines transactional and search capabilities in the same platform, providing a unified development experience. Because embeddings are stored alongside existing data, a vector search query returns documents containing both the vector embeddings and the associated metadata, eliminating the need to retrieve the data elsewhere. This is a great advantage for developers, who don’t need to learn and maintain a separate technology and can fully focus on building their apps.

Ultimately, the data obtained from MongoDB Vector Search is fed to the LLM as context.
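As a rough illustration of that last step, the sketch below assembles retrieved claim documents into a prompt for the LLM. The function name build_prompt, the prompt wording, and the sample documents are hypothetical; only the claimDescription field name comes from the demo's data model.

```python
from typing import Dict, List


def build_prompt(question: str, retrieved_docs: List[Dict]) -> str:
    """Combine the user question with retrieved claims into a single prompt."""
    # Each retrieved document contributes its claim description as context.
    context_lines = [f"- {doc['claimDescription']}" for doc in retrieved_docs]
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the claims listed below.\n\n"
        f"Claims:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical documents, shaped like Vector Search results.
docs = [
    {"claimDescription": "Rear-end collision on a wet road."},
    {"claimDescription": "Hail damage to the roof and hood."},
]
prompt = build_prompt("What caused the most recent claims?", docs)
print(prompt)
```

The resulting string is what would be sent to the LLM. In the demo, this assembly is handled by the framework rather than written by hand.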

Data model approach

The “claim” collection contains documents with a number of fields related to the claim. In particular, we are interested in the “claimDescription” field, which we vectorize and store in the same document as “claimDescriptionEmbedding”. This embedding is then indexed and used to retrieve the documents most relevant to the user prompt.
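A minimal sketch of this data model approach is shown below. The embedding function is injected so that any model can be plugged in; toy_embed is a placeholder and not a real embedding model, and the claimAmount field is a hypothetical example of the other claim fields.

```python
from typing import Callable, Dict, List


def add_embedding(claim: Dict, embed: Callable[[str], List[float]]) -> Dict:
    """Return a copy of the claim with its description embedding attached."""
    enriched = dict(claim)
    enriched["claimDescriptionEmbedding"] = embed(claim["claimDescription"])
    return enriched


def toy_embed(text: str) -> List[float]:
    """Placeholder embedding, NOT a real model; it only makes the sketch runnable."""
    return [float(len(text)), float(text.count(" ")), float(text.count("e"))]


claim = {
    "claimDescription": "Rear-end collision on a wet road.",
    "claimAmount": 1200,  # hypothetical extra field
}
doc = add_embedding(claim, toy_embed)
print(doc)
```

In a real deployment, toy_embed would be replaced by a call to a locally hosted model or an embedding API, and the enriched document would be written to the collection.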

Building the solution

The instructions to build the demo are included in the readme of this GitHub repo. You’ll be guided through the following steps:

OpenAI API key setup
Atlas connection setup
Dataset download
LLM configuration options
Vector Search index creation

Step 4 of this tutorial walks you through the creation and configuration of the Vector Search index within the Atlas UI. Make sure you follow this structure:

Code Snippet
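For orientation, an index definition for the claimDescriptionEmbedding field might look like the following sketch. The helper name vector_index_definition is hypothetical, and the dimension count and similarity function are assumptions: 1536 matches OpenAI's text-embedding-ada-002, but the value must match whichever embedding model you actually use, so defer to the repo readme for the demo's exact settings.

```python
import json
from typing import Dict


def vector_index_definition(
    path: str, num_dimensions: int, similarity: str = "cosine"
) -> Dict:
    """Build an Atlas Vector Search index definition for one embedding field."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


# 1536 dimensions and cosine similarity are assumptions, not the demo's confirmed values.
definition = vector_index_definition("claimDescriptionEmbedding", 1536)
print(json.dumps(definition, indent=2))
```

The printed JSON has the general shape accepted by the JSON editor for Vector Search indexes in the Atlas UI.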

Finally, run both the front end and the back end. You’ll then have access to a web UI where you can ask the LLM questions, obtain an answer, and see the reference documents used as context.

Key Learnings

Text embedding creation — The embedding generation process can be carried out using different models and deployment options. It is always important to be mindful of privacy and data protection requirements. A locally deployed model is recommended if we need our data to never leave our servers. Otherwise, we can simply call an API and get our vectors back, as explained in this tutorial that tells you how to do it with OpenAI.
Creation of a Vector Search index in Atlas — It is now possible to create indexes for local deployments.
Performing a Vector Search query — Notably, Vector Search queries have a dedicated operator within MongoDB’s aggregation pipeline. This means they can be concatenated with other operations, making it extremely convenient for developers because they don’t need to learn a different language or change context.
LangChain serves as the framework that glues together MongoDB Atlas Vector Search and the LLM, allowing for an easy and fast RAG implementation.
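The point about the dedicated aggregation operator can be sketched as follows: a $vectorSearch stage is followed by an ordinary $project stage in the same pipeline. The function name, the index name vector_index, the candidate multiplier, and the three-element query vector are assumptions made for illustration; a real query vector would come from the same embedding model used for the stored claims.

```python
from typing import Dict, List


def build_claim_search_pipeline(
    query_vector: List[float], index_name: str = "vector_index", limit: int = 5
) -> List[Dict]:
    """Build an aggregation pipeline: vector search first, then a projection."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,  # assumed index name
                "path": "claimDescriptionEmbedding",
                "queryVector": query_vector,
                # Consider more candidates than results, capped at 10000.
                "numCandidates": min(limit * 20, 10000),
                "limit": limit,
            }
        },
        {
            # A regular stage chained after the search: keep the description
            # and the similarity score, and drop everything else.
            "$project": {
                "_id": 0,
                "claimDescription": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_claim_search_pipeline([0.1, 0.2, 0.3], limit=3)
print(pipeline)
```

With PyMongo, the pipeline would be passed to collection.aggregate(pipeline) against a cluster where the index exists. Further stages such as $match or $group could be appended to the same list.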

Authors

Luca Napoli, Industry Solutions, MongoDB
Jeff Needham, Industry Solutions, MongoDB
