MongoDB Atlas

Compatibility

Only available on Node.js.

You can still create API routes that use MongoDB with Next.js by setting the runtime variable to nodejs like so:

export const runtime = "nodejs";

You can read more about Edge runtimes in the Next.js documentation here.

This guide provides a quick overview for getting started with MongoDB Atlas vector stores. For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.

Overview

Integration details

Class	Package	Python support
`MongoDBAtlasVectorSearch`	`@langchain/mongodb`	✅

Setup

To use MongoDB Atlas vector stores, you’ll need to configure a MongoDB Atlas cluster and install the @langchain/mongodb integration package.

Initial Cluster Configuration

To create a MongoDB Atlas cluster, navigate to the MongoDB Atlas website and create an account if you don’t already have one.

Create and name a cluster when prompted, then find it under Database. Select Browse Collections and create either a blank collection or one from the provided sample data.

Note: The cluster created must be MongoDB 7.0 or higher.

Creating an Index

After configuring your cluster, you’ll need to create an index on the collection field you want to search over.

Switch to the Atlas Search tab and click Create Search Index. From there, make sure you select Atlas Vector Search - JSON Editor, then select the appropriate database and collection and paste the following into the textbox:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    }
  ]
}

Note that the numDimensions property should match the dimensionality of the embeddings you are using. For example, Cohere embeddings have 1024 dimensions, and OpenAI embeddings have 1536 by default.
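A mismatch between the embedding size and the index definition is easy to miss, so it can help to check a vector's length before writing to the collection. The following is a minimal sketch; assertEmbeddingDimensions is a hypothetical helper, not part of @langchain/mongodb, and the zero-filled vector stands in for a real embedding.

```typescript
// Hypothetical helper (not a library API): fail fast when an embedding
// does not match the numDimensions value declared in the Atlas index.
function assertEmbeddingDimensions(
  vector: number[],
  numDimensions: number
): void {
  if (vector.length !== numDimensions) {
    throw new Error(
      `Embedding has ${vector.length} dimensions, but the index expects ${numDimensions}`
    );
  }
}

// Stand-in vector; in practice this would be the result of embedding a text.
const sampleVector: number[] = Array.from({ length: 1536 }, () => 0);

// Passes silently because the lengths agree.
assertEmbeddingDimensions(sampleVector, 1536);
```

Running the same check against 1024 would throw, which is the situation you would hit when pairing a 1536-dimension model with an index built for 1024 dimensions.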

Note: By default, the vector store expects an index name of default, an indexed collection field name of embedding, and a raw text field name of text. You should initialize the vector store with field names matching your index name and collection schema, as shown below.

Finally, proceed to build the index.

Embeddings

This guide will also use OpenAI embeddings, which require you to install the @langchain/openai integration package. You can also use other supported embeddings models if you wish.

Installation

Install the following packages:

Tip: See this section for general instructions on installing integration packages.

With npm:

npm i @langchain/mongodb mongodb @langchain/openai

With yarn:

yarn add @langchain/mongodb mongodb @langchain/openai

With pnpm:

pnpm add @langchain/mongodb mongodb @langchain/openai

Credentials

Once you’ve done the above, set the MONGODB_ATLAS_URI environment variable to the connection string shown under the Connect button in the MongoDB Atlas dashboard. You’ll also need your DB name and collection name:

process.env.MONGODB_ATLAS_URI = "your-atlas-url";
process.env.MONGODB_ATLAS_COLLECTION_NAME = "your-atlas-collection-name";
process.env.MONGODB_ATLAS_DB_NAME = "your-atlas-db-name";
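Because a missing variable only surfaces later as a connection or lookup error, it can be useful to validate all three values up front. This is a minimal sketch; readAtlasConfig and AtlasConfig are hypothetical names introduced for illustration, and the function takes the environment as a plain object so it can be exercised without a real process environment.

```typescript
// Hypothetical config shape for this guide's three Atlas settings.
interface AtlasConfig {
  uri: string;
  dbName: string;
  collectionName: string;
}

// Hypothetical helper: reads the variables used in this guide and throws
// a single error that names every variable that is missing or empty.
function readAtlasConfig(
  env: Record<string, string | undefined>
): AtlasConfig {
  const uri = env["MONGODB_ATLAS_URI"];
  const dbName = env["MONGODB_ATLAS_DB_NAME"];
  const collectionName = env["MONGODB_ATLAS_COLLECTION_NAME"];

  const missing: string[] = [];
  if (!uri) missing.push("MONGODB_ATLAS_URI");
  if (!dbName) missing.push("MONGODB_ATLAS_DB_NAME");
  if (!collectionName) missing.push("MONGODB_ATLAS_COLLECTION_NAME");

  if (!uri || !dbName || !collectionName) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { uri, dbName, collectionName };
}
```

In a Node.js program you would call readAtlasConfig(process.env) once at startup and pass the resulting values to the MongoDB client.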

If you are using OpenAI embeddings for this guide, you’ll need to set your OpenAI key as well:

process.env.OPENAI_API_KEY = "YOUR_API_KEY";

If you want to get automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:

// process.env.LANGCHAIN_TRACING_V2="true"
// process.env.LANGCHAIN_API_KEY="your-api-key"

Instantiation

Once you’ve set up your cluster as shown above, you can initialize your vector store as follows:

import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const collection = client
  .db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME);

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
  collection: collection,
  indexName: "vector_index", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});

Manage vector store

Add items to vector store

You can now add documents to your vector store:

import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};

const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};

const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};

const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];

await vectorStore.addDocuments(documents, { ids: ["1", "2", "3", "4"] });

[ '1', '2', '3', '4' ]

Adding a document with the same id as an existing document will update the existing one.
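One way to take advantage of this is to derive ids deterministically, so that re-ingesting the same content updates it instead of creating duplicates. The sketch below models that behavior with an in-memory Map; it is not the library's implementation, and Chunk, stableId, ingest, and the id scheme are all hypothetical.

```typescript
// Hypothetical record type: one chunk of text from a source document.
interface Chunk {
  source: string;
  index: number;
  text: string;
}

// Hypothetical id scheme: the same source and chunk position always
// produce the same id.
function stableId(chunk: Chunk): string {
  return `${chunk.source}#${chunk.index}`;
}

// In-memory stand-in for a collection keyed by _id.
const fakeCollection = new Map<string, Chunk>();

// Writes each chunk under its stable id and returns the ids used.
function ingest(chunks: Chunk[]): string[] {
  const ids: string[] = [];
  for (const chunk of chunks) {
    const id = stableId(chunk);
    fakeCollection.set(id, chunk); // set() overwrites an existing key
    ids.push(id);
  }
  return ids;
}

ingest([
  { source: "https://example.com", index: 0, text: "Buildings are made out of brick" },
]);
ingest([
  { source: "https://example.com", index: 0, text: "Buildings are made out of brick or stone" },
]);

console.log(fakeCollection.size); // 1
```

With the real vector store, the array returned by a function like stableId would be passed as the ids option of addDocuments.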

Delete items from vector store

await vectorStore.delete({ ids: ["4"] });

Query vector store

Once your vector store has been created and the relevant documents have been added, you will most likely want to query it while running your chain or agent.

Query directly

Performing a simple similarity search can be done as follows:

const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2
);

for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]

Filtering

MongoDB Atlas supports pre-filtering of results on other fields. This requires you to define which metadata fields you plan to filter on by updating the index you created initially. Here’s an example:

{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}

Above, the first item in fields is the vector index, and the second item is the metadata property you want to filter on. The name of the property is the value of the path key. So the above index would allow us to search on a metadata field named source.
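If you maintain several indexes, generating the definition from code keeps the vector field and the filter fields consistent. The following is a minimal sketch that only builds the JSON to paste into the editor; buildVectorIndexDefinition and its types are hypothetical names, and the defaults mirror the values used in this guide.

```typescript
type Similarity = "euclidean" | "cosine" | "dotProduct";

interface VectorField {
  type: "vector";
  path: string;
  numDimensions: number;
  similarity: Similarity;
}

interface FilterField {
  type: "filter";
  path: string;
}

interface VectorIndexDefinition {
  fields: Array<VectorField | FilterField>;
}

// Hypothetical helper: one vector field plus one filter field per path.
function buildVectorIndexDefinition(
  numDimensions: number,
  filterPaths: string[] = [],
  similarity: Similarity = "euclidean",
  embeddingPath: string = "embedding"
): VectorIndexDefinition {
  const vectorField: VectorField = {
    type: "vector",
    path: embeddingPath,
    numDimensions,
    similarity,
  };
  const filterFields: FilterField[] = filterPaths.map((path) => ({
    type: "filter",
    path,
  }));
  return { fields: [vectorField, ...filterFields] };
}

// Prints a definition with a 1536-dimension vector field and a filter on source.
console.log(JSON.stringify(buildVectorIndexDefinition(1536, ["source"]), null, 2));
```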

Then, in your code you can use MQL Query Operators for filtering.

The below example illustrates this:

const filter = {
  preFilter: {
    source: {
      $eq: "https://example.com",
    },
  },
};

const filteredResults = await vectorStore.similaritySearch(
  "biology",
  2,
  filter
);

for (const doc of filteredResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}

* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
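Other MQL operators follow the same shape. As a sketch, the hypothetical fieldIn helper below builds a preFilter that matches any of several values using $in; it assumes the named field has been added to the index with type filter, as described above.

```typescript
// Shape of the filter object passed to the search methods in this guide.
interface PreFilter {
  preFilter: Record<string, unknown>;
}

// Hypothetical helper: match documents whose field equals any listed value.
function fieldIn(field: string, values: string[]): PreFilter {
  return { preFilter: { [field]: { $in: values } } };
}

const multiSourceFilter = fieldIn("source", [
  "https://example.com",
  "https://example.org",
]);

console.log(JSON.stringify(multiSourceFilter));
```

The resulting object can be passed as the third argument to similaritySearch in place of the single-value filter.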

Returning scores

If you want to execute a similarity search and receive the corresponding scores you can run:

const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}

* [SIM=0.374] The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* [SIM=0.370] Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
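The scores can be used to discard weak matches before passing results on. The sketch below assumes that a higher score means a closer match and uses an arbitrary cutoff; filterByMinScore is a hypothetical helper, and the sample data reuses the two scores printed above with plain strings standing in for documents.

```typescript
// Hypothetical helper: keep only [item, score] pairs at or above a cutoff.
function filterByMinScore<T>(
  results: Array<[T, number]>,
  minScore: number
): Array<[T, number]> {
  return results.filter(([, score]) => score >= minScore);
}

// Stand-in results using the scores shown in this guide.
const sampleResults: Array<[string, number]> = [
  ["The powerhouse of the cell is the mitochondria", 0.374],
  ["Mitochondria are made out of lipids", 0.37],
];

// The 0.372 cutoff is arbitrary and chosen only for illustration.
const strongMatches = filterByMinScore(sampleResults, 0.372);
console.log(strongMatches.length); // 1
```

A suitable cutoff depends on the embedding model and the similarity function configured in the index, so it is best chosen by inspecting scores on your own data.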

Query by turning into retriever

You can also transform the vector store into a retriever for easier usage in your chains.

const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});
await retriever.invoke("biology");

[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { _id: '1', source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { _id: '3', source: 'https://example.com' },
    id: undefined
  }
]

Usage for retrieval-augmented generation

For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:

Closing connections

Make sure you close the client instance when you are finished to avoid excessive resource consumption:

await client.close();
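To make sure the client is closed even when a query throws, the call can be wrapped in try/finally. This is a minimal sketch; withClient and Closable are hypothetical names, and any object with an async close() method, such as the MongoClient created earlier, fits the interface.

```typescript
// Anything that can be closed asynchronously.
interface Closable {
  close(): Promise<void>;
}

// Hypothetical helper: run some work with the client, then always close it,
// whether the work succeeds or throws.
async function withClient<C extends Closable, T>(
  client: C,
  work: (client: C) => Promise<T>
): Promise<T> {
  try {
    return await work(client);
  } finally {
    await client.close();
  }
}
```

A call such as withClient(client, async () => vectorStore.similaritySearch("biology", 2)) would return the search results and close the connection afterwards.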

API reference

For detailed documentation of all MongoDBAtlasVectorSearch features and configurations head to the API reference.

Related

Vector store conceptual guide
Vector store how-to guides
