MongoDB Atlas
Only available on Node.js.
You can still create API routes that use MongoDB with Next.js by setting the runtime variable to nodejs like so:
export const runtime = "nodejs";
You can read more about Edge runtimes in the Next.js documentation.
This guide provides a quick overview for getting started with MongoDB Atlas vector stores. For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.
Overview
Integration details
Class | Package | PY support | Package latest |
---|---|---|---|
MongoDBAtlasVectorSearch | @langchain/mongodb | ✅ | |
Setup
To use MongoDB Atlas vector stores, you'll need to configure a MongoDB Atlas cluster and install the @langchain/mongodb integration package.
Initial Cluster Configuration
To create a MongoDB Atlas cluster, navigate to the MongoDB Atlas website and create an account if you don't already have one.
Create and name a cluster when prompted, then find it under Database.
Select Browse Collections and create either a blank collection or one from the provided sample data.
Note: The cluster created must be MongoDB 7.0 or higher.
Creating an Index
After configuring your cluster, you'll need to create an index on the collection field you want to search over.
Switch to the Atlas Search tab and click Create Search Index. From there, make sure you select Atlas Vector Search - JSON Editor, then select the appropriate database and collection and paste the following into the textbox:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    }
  ]
}
Note that the numDimensions property should match the dimensionality of the embeddings you are using. For example, Cohere embeddings have 1024 dimensions, and by default OpenAI embeddings have 1536.
Note: By default the vector store expects an index name of default, an indexed collection field name of embedding, and a raw text field name of text. You should initialize the vector store with field names matching your index name and collection schema, as shown in the Instantiation section.
Finally, proceed to build the index.
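A dimension mismatch between the index and the embedding model is easy to miss until indexing or querying fails. The following is a minimal sketch of a guard that compares a vector's length with the index's numDimensions. The VectorIndexField type and the assertDimensionsMatch helper are hypothetical names introduced for illustration and are not part of @langchain/mongodb.

```typescript
// Hypothetical type and helper for illustration; not part of @langchain/mongodb.
interface VectorIndexField {
  type: "vector";
  path: string;
  numDimensions: number;
  similarity: "euclidean" | "cosine" | "dotProduct";
}

// Mirrors the index definition pasted into the Atlas JSON editor above.
const indexField: VectorIndexField = {
  type: "vector",
  path: "embedding",
  numDimensions: 1536,
  similarity: "euclidean",
};

// Throws if an embedding's length differs from the index's numDimensions.
function assertDimensionsMatch(vector: number[], field: VectorIndexField): void {
  if (vector.length !== field.numDimensions) {
    throw new Error(
      `Embedding has ${vector.length} dimensions, but index field "${field.path}" expects ${field.numDimensions}`
    );
  }
}

// A placeholder vector standing in for a real embedding.
const sampleVector = new Array<number>(1536).fill(0);
assertDimensionsMatch(sampleVector, indexField);
```

In an application, the placeholder vector could be replaced with the result of embedding a short test string with the model you plan to use.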
Embeddings
This guide will also use OpenAI embeddings, which require you to install the @langchain/openai integration package. You can also use other supported embeddings models if you wish.
Installation
Install the following packages with your preferred package manager:
npm:
npm i @langchain/mongodb mongodb @langchain/openai
yarn:
yarn add @langchain/mongodb mongodb @langchain/openai
pnpm:
pnpm add @langchain/mongodb mongodb @langchain/openai
Credentials
Once you've done the above, set the MONGODB_ATLAS_URI environment variable to the connection string shown under the Connect button in Mongo's dashboard. You'll also need your DB name and collection name:
process.env.MONGODB_ATLAS_URI = "your-atlas-url";
process.env.MONGODB_ATLAS_COLLECTION_NAME = "your-atlas-collection-name";
process.env.MONGODB_ATLAS_DB_NAME = "your-atlas-db-name";
If you are using OpenAI embeddings for this guide, you'll need to set your OpenAI key as well:
process.env.OPENAI_API_KEY = "YOUR_API_KEY";
If you want to get automated tracing of your model calls you can also set your LangSmith API key by uncommenting below:
// process.env.LANGCHAIN_TRACING_V2="true"
// process.env.LANGCHAIN_API_KEY="your-api-key"
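Because an unset variable only surfaces later as a connection or lookup error, it can help to check the required variables up front. The sketch below assumes the variable names used in this guide; requireEnv is a hypothetical helper, and it takes the environment as a plain object so it can be exercised without touching process.env.

```typescript
// Hypothetical helper for illustration; variable names match those used in this guide.
type Env = Record<string, string | undefined>;

// Returns the requested variables, or throws listing every one that is missing or empty.
function requireEnv(env: Env, names: string[]): Record<string, string> {
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const resolved: Record<string, string> = {};
  for (const name of names) {
    resolved[name] = env[name] as string;
  }
  return resolved;
}

// Example with an inline object; in an application you would pass process.env instead.
const config = requireEnv(
  {
    MONGODB_ATLAS_URI: "your-atlas-url",
    MONGODB_ATLAS_DB_NAME: "your-atlas-db-name",
    MONGODB_ATLAS_COLLECTION_NAME: "your-atlas-collection-name",
  },
  ["MONGODB_ATLAS_URI", "MONGODB_ATLAS_DB_NAME", "MONGODB_ATLAS_COLLECTION_NAME"]
);
```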
Instantiation
Once you've set up your cluster as shown above, you can initialize your vector store as follows:
import { MongoDBAtlasVectorSearch } from "@langchain/mongodb";
import { OpenAIEmbeddings } from "@langchain/openai";
import { MongoClient } from "mongodb";

const client = new MongoClient(process.env.MONGODB_ATLAS_URI || "");
const collection = client
  .db(process.env.MONGODB_ATLAS_DB_NAME)
  .collection(process.env.MONGODB_ATLAS_COLLECTION_NAME);

const embeddings = new OpenAIEmbeddings({
  model: "text-embedding-3-small",
});

const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
  collection: collection,
  indexName: "vector_index", // The name of the Atlas search index. Defaults to "default"
  textKey: "text", // The name of the collection field containing the raw content. Defaults to "text"
  embeddingKey: "embedding", // The name of the collection field containing the embedded text. Defaults to "embedding"
});
Manage vector store
Add items to vector store
You can now add documents to your vector store:
import type { Document } from "@langchain/core/documents";

const document1: Document = {
  pageContent: "The powerhouse of the cell is the mitochondria",
  metadata: { source: "https://example.com" },
};
const document2: Document = {
  pageContent: "Buildings are made out of brick",
  metadata: { source: "https://example.com" },
};
const document3: Document = {
  pageContent: "Mitochondria are made out of lipids",
  metadata: { source: "https://example.com" },
};
const document4: Document = {
  pageContent: "The 2024 Olympics are in Paris",
  metadata: { source: "https://example.com" },
};

const documents = [document1, document2, document3, document4];
await vectorStore.addDocuments(documents, { ids: ["1", "2", "3", "4"] });
[ '1', '2', '3', '4' ]
Adding a document with the same id as an existing document will update the existing one.
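The update-on-duplicate-id behavior described above can be pictured with a small in-memory model. This is only a sketch of the semantics, using a Map as a stand-in for the collection; it is not how the library or MongoDB implements writes, and upsertAll and StoredDoc are hypothetical names.

```typescript
// In-memory stand-in for a collection keyed by _id; for illustration only.
interface StoredDoc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Inserts new ids and overwrites existing ones, returning the ids that were written.
function upsertAll(
  store: Map<string, StoredDoc>,
  docs: StoredDoc[],
  ids: string[]
): string[] {
  if (docs.length !== ids.length) {
    throw new Error("docs and ids must have the same length");
  }
  docs.forEach((doc, i) => store.set(ids[i], doc));
  return ids;
}

const store = new Map<string, StoredDoc>();
upsertAll(store, [{ pageContent: "The 2024 Olympics are in Paris", metadata: {} }], ["4"]);
// Re-using id "4" replaces the stored document instead of adding a second one.
upsertAll(store, [{ pageContent: "The 2024 Olympics were held in Paris", metadata: {} }], ["4"]);
```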
Delete items from vector store
await vectorStore.delete({ ids: ["4"] });
Query vector store
Once your vector store has been created and the relevant documents have been added, you will most likely wish to query it while running your chain or agent.
Query directly
Performing a simple similarity search can be done as follows:
const similaritySearchResults = await vectorStore.similaritySearch(
  "biology",
  2
);

for (const doc of similaritySearchResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
Filtering
MongoDB Atlas supports pre-filtering of results on other fields. Pre-filters require you to define which metadata fields you plan to filter on by updating the index you created initially. Here's an example:
{
  "fields": [
    {
      "numDimensions": 1536,
      "path": "embedding",
      "similarity": "euclidean",
      "type": "vector"
    },
    {
      "path": "source",
      "type": "filter"
    }
  ]
}
Above, the first item in fields is the vector index, and the second item is the metadata property you want to filter on. The name of the property is the value of the path key, so the above index would allow us to filter on a metadata field named source.
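If you filter on several metadata fields, writing the index definition by hand can get repetitive. The following sketch generates the same JSON shape from a list of field names. The buildIndexDefinition helper is a hypothetical name introduced here, and the output is meant to be pasted into the Atlas JSON editor; the example call assumes 1536 dimensions to match the default OpenAI embeddings mentioned earlier.

```typescript
// Hypothetical helper for illustration; not part of @langchain/mongodb.
type Similarity = "euclidean" | "cosine" | "dotProduct";

type IndexField =
  | { type: "vector"; path: string; numDimensions: number; similarity: Similarity }
  | { type: "filter"; path: string };

// Builds an index definition with one vector field plus one filter field per metadata path.
function buildIndexDefinition(
  embeddingPath: string,
  numDimensions: number,
  similarity: Similarity,
  filterPaths: string[]
): { fields: IndexField[] } {
  const vectorField: IndexField = {
    type: "vector",
    path: embeddingPath,
    numDimensions,
    similarity,
  };
  const filterFields = filterPaths.map(
    (path): IndexField => ({ type: "filter", path })
  );
  return { fields: [vectorField, ...filterFields] };
}

const definition = buildIndexDefinition("embedding", 1536, "euclidean", ["source"]);
console.log(JSON.stringify(definition, null, 2));
```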
Then, in your code you can use MQL Query Operators for filtering. The below example illustrates this:
const filter = {
  preFilter: {
    source: {
      $eq: "https://example.com",
    },
  },
};

const filteredResults = await vectorStore.similaritySearch(
  "biology",
  2,
  filter
);

for (const doc of filteredResults) {
  console.log(`* ${doc.pageContent} [${JSON.stringify(doc.metadata, null)}]`);
}
* The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
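As a variation on the filter above, you may want to match any of several sources rather than exactly one. The sketch below builds a preFilter object that uses $eq for a single value and the MQL $in operator for several. The buildSourcePreFilter helper is hypothetical, and it assumes the source field has been declared as a filter field in your index.

```typescript
// Hypothetical helper for illustration; assumes "source" is indexed as a filter field.
type SourcePreFilter = {
  preFilter: { source: { $eq: string } | { $in: string[] } };
};

// Uses $eq for one source and $in when several sources are allowed.
function buildSourcePreFilter(sources: string[]): SourcePreFilter {
  if (sources.length === 0) {
    throw new Error("At least one source is required");
  }
  if (sources.length === 1) {
    return { preFilter: { source: { $eq: sources[0] } } };
  }
  return { preFilter: { source: { $in: sources } } };
}

const singleSource = buildSourcePreFilter(["https://example.com"]);
const multiSource = buildSourcePreFilter([
  "https://example.com",
  "https://example.org",
]);
console.log(JSON.stringify(singleSource), JSON.stringify(multiSource));
```

The resulting object has the same shape as the filter constant above, so it could be passed as the third argument to similaritySearch.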
Returning scores
If you want to execute a similarity search and receive the corresponding scores, you can run:
const similaritySearchWithScoreResults =
  await vectorStore.similaritySearchWithScore("biology", 2, filter);

for (const [doc, score] of similaritySearchWithScoreResults) {
  console.log(
    `* [SIM=${score.toFixed(3)}] ${doc.pageContent} [${JSON.stringify(
      doc.metadata
    )}]`
  );
}
* [SIM=0.374] The powerhouse of the cell is the mitochondria [{"_id":"1","source":"https://example.com"}]
* [SIM=0.370] Mitochondria are made out of lipids [{"_id":"3","source":"https://example.com"}]
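One common use of the returned scores is to drop weak matches on the client side. The following is a minimal sketch that keeps only results at or above a minimum score, assuming higher scores mean closer matches. The filterByMinScore helper and the 0.372 threshold are hypothetical and chosen only for this example; the tuples reuse the sample scores printed above.

```typescript
// Hypothetical helper for illustration; works on any [document, score] tuples.
function filterByMinScore<T>(
  results: [T, number][],
  minScore: number
): [T, number][] {
  return results.filter(([, score]) => score >= minScore);
}

// Tuples shaped like the output of similaritySearchWithScore above.
const scored: [{ pageContent: string }, number][] = [
  [{ pageContent: "The powerhouse of the cell is the mitochondria" }, 0.374],
  [{ pageContent: "Mitochondria are made out of lipids" }, 0.37],
];

// 0.372 is an arbitrary threshold chosen for this example.
const confident = filterByMinScore(scored, 0.372);
console.log(confident.map(([doc]) => doc.pageContent));
```

With these sample scores, only the first result passes the threshold. A suitable threshold depends on your embeddings and similarity function, so it is worth calibrating on your own data.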
Query by turning into retriever
You can also transform the vector store into a retriever for easier usage in your chains.
const retriever = vectorStore.asRetriever({
  // Optional filter
  filter: filter,
  k: 2,
});
await retriever.invoke("biology");
[
  Document {
    pageContent: 'The powerhouse of the cell is the mitochondria',
    metadata: { _id: '1', source: 'https://example.com' },
    id: undefined
  },
  Document {
    pageContent: 'Mitochondria are made out of lipids',
    metadata: { _id: '3', source: 'https://example.com' },
    id: undefined
  }
]
Usage for retrieval-augmented generation
For guides on how to use this vector store for retrieval-augmented generation (RAG), see the following sections:
- Tutorials: working with external knowledge.
- How-to: Question and answer with RAG
- Retrieval conceptual docs
Closing connections
Make sure you close the client instance when you are finished to avoid excessive resource consumption:
await client.close();
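To make sure the client is closed even when a query throws, the close call can be placed in a finally block. The sketch below wraps that pattern in a hypothetical withClient helper that accepts any object exposing an async close() method, which a MongoClient instance would satisfy; the stand-in client here only logs.

```typescript
// Hypothetical helper for illustration; accepts any object with an async close().
interface Closable {
  close(): Promise<void>;
}

// Runs the given work, then closes the client whether the work succeeded or threw.
async function withClient<C extends Closable, R>(
  client: C,
  work: (client: C) => Promise<R>
): Promise<R> {
  try {
    return await work(client);
  } finally {
    await client.close();
  }
}

// Stand-in client used only to demonstrate the call shape.
const demoClient: Closable = {
  close: async () => {
    console.log("client closed");
  },
};

withClient(demoClient, async () => "done").then((result) => console.log(result));
```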
API reference
For detailed documentation of all MongoDBAtlasVectorSearch features and configurations, head to the API reference.
Related
- Vector store conceptual guide
- Vector store how-to guides