feat(vector-store): add MongoDB support for vector store service
* refactor(long-memory): replace jieba with jieba-wasm for improved text segmentation

- Updated the dependency from `@node-rs/jieba` to `jieba-wasm` in package.json.
- Refactored the text segmentation logic in `similarity.ts` to utilize the new `cut` function from `jieba-wasm`, enhancing compatibility and performance.

* refactor(long-memory): enhance BM25 similarity calculation in similarity.ts

- Improved the BM25 similarity calculation by introducing term frequency maps for both documents.
- Added a smoothing factor and adjusted the scoring formula to normalize against the theoretical maximum score.
- Enhanced code readability and maintainability by restructuring the logic for term frequency and IDF calculations.
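
The scoring described in the bullets above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in `similarity.ts`: the names `termFrequencies` and `bm25Similarity` are hypothetical, a whitespace tokenizer stands in for the `cut` function from `jieba-wasm`, `k1 = 1.5` and `b = 0.75` are conventional defaults, and the "theoretical maximum" is taken here to be the saturation limit of each term's contribution.

```typescript
// Sketch only: hypothetical helper names, whitespace tokenizer instead of jieba-wasm.
function termFrequencies(tokens: string[]): Map<string, number> {
    const tf = new Map<string, number>()
    for (const token of tokens) {
        tf.set(token, (tf.get(token) ?? 0) + 1)
    }
    return tf
}

function bm25Similarity(
    query: string,
    target: string,
    k1 = 1.5, // term-frequency saturation (assumed default)
    b = 0.75 // length-normalization strength (assumed default)
): number {
    const tokenize = (text: string) =>
        text.toLowerCase().split(/\s+/).filter(Boolean)

    const queryTokens = tokenize(query)
    const targetTokens = tokenize(target)
    if (queryTokens.length === 0 || targetTokens.length === 0) return 0

    // Term frequency maps for both documents; only the distinct query terms
    // are iterated, and the target's frequencies drive the score.
    const queryTf = termFrequencies(queryTokens)
    const targetTf = termFrequencies(targetTokens)

    // The two texts are treated as a two-document corpus.
    const docCount = 2
    const avgLength = (queryTokens.length + targetTokens.length) / docCount
    const lengthNorm = 1 - b + b * (targetTokens.length / avgLength)

    let score = 0
    let maxScore = 0
    for (const term of queryTf.keys()) {
        const tf = targetTf.get(term) ?? 0
        const containing = 1 + (tf > 0 ? 1 : 0)
        // The "+ 1" inside the logarithm is the smoothing that keeps IDF positive.
        const idf = Math.log(
            1 + (docCount - containing + 0.5) / (containing + 0.5)
        )
        score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm)
        // Upper bound of this term's contribution as tf grows without limit.
        maxScore += idf * (k1 + 1)
    }

    return maxScore > 0 ? score / maxScore : 0
}

console.log(bm25Similarity('vector store service', 'mongodb vector store'))
```

Dividing by the saturation limit keeps the result in the range [0, 1), so scores from texts of different lengths stay comparable; normalizing by the query's self-score instead would be an equally reasonable choice.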

* style(long-memory): prettier

* feat(vector-store): add MongoDB configuration options to the vector store service

- Introduced new MongoDB configuration parameters: mongodbUrl, mongodbDbName, and mongodbCollectionName.
- Updated the configuration schema to include MongoDB as a supported vector store option.
- Added documentation link for MongoDB configuration in the usage section.

* feat(vector-store): add MongoDB database settings to localization files

---------

Co-authored-by: dingyi <[email protected]>
Hoshino-Yumetsuki and dingyi222666 authored Dec 31, 2024
1 parent 0fa8d90 commit 3fc51ca
Showing 4 changed files with 142 additions and 1 deletion.
16 changes: 15 additions & 1 deletion packages/vector-store-service/src/index.ts
@@ -26,6 +26,10 @@ export interface Config extends ChatLunaPlugin.Config {
     milvusUrl: string
     milvusUsername: string
     milvusPassword: string
+
+    mongodbUrl: string
+    mongodbDbName: string
+    mongodbCollectionName: string
 }
 
 export const Config: Schema<Config> = Schema.intersect([
@@ -35,7 +39,8 @@ export const Config: Schema<Config> = Schema.intersect([
                 Schema.const('faiss').description('Faiss'),
                 Schema.const('redis').description('Redis'),
                 Schema.const('milvus').description('Milvus'),
-                Schema.const('luna-vdb').description('lunavdb')
+                Schema.const('luna-vdb').description('lunavdb'),
+                Schema.const('mongodb').description('MongoDB Atlas')
             ])
         )
             .default(['luna-vdb'])
@@ -52,6 +57,14 @@ export const Config: Schema<Config> = Schema.intersect([
             .default('http://127.0.0.1:19530'),
         milvusUsername: Schema.string().default(''),
         milvusPassword: Schema.string().role('secret').default('')
+    }),
+
+    Schema.object({
+        mongodbUrl: Schema.string()
+            .role('url')
+            .default('mongodb://localhost:27017'),
+        mongodbDbName: Schema.string().default('chatluna'),
+        mongodbCollectionName: Schema.string().default('chatluna_collection')
     })
 ]).i18n({
     'zh-CN': require('./locales/zh-CN.schema.yml'),
@@ -69,6 +82,7 @@ export const usage = `
 要查看如何配置 Milvus 数据库,看[这里](https://js.langchain.com/docs/integrations/vectorstores/milvus/)
+要查看如何配置 MongoDB 数据库,看[这里](https://js.langchain.com/docs/integrations/vectorstores/mongodb_atlas/)
 目前配置 Faiss 数据库安装后可能会导致 koishi 环境不安全,如果安装完成后进行某些操作完成后出现了问题(如,升级 node 版本),开发者不对此负直接责任。
 `
4 changes: 4 additions & 0 deletions packages/vector-store-service/src/locales/en-US.schema.yml
@@ -7,3 +7,7 @@ $inner:
       milvusUrl: Milvus URL Address
       milvusUsername: Milvus Username
       milvusPassword: Milvus Password
+    - $desc: MongoDB Database Settings
+      mongodbUrl: MongoDB URL Address
+      mongodbDbName: MongoDB Database Name
+      mongodbCollectionName: MongoDB Collection Name
4 changes: 4 additions & 0 deletions packages/vector-store-service/src/locales/zh-CN.schema.yml
@@ -7,3 +7,7 @@ $inner:
       milvusUrl: Milvus url 地址
       milvusUsername: Milvus 用户名
       milvusPassword: Milvus 密码
+    - $desc: MongoDB 数据库设置
+      mongodbUrl: MongoDB url 地址
+      mongodbDbName: MongoDB 数据库名
+      mongodbCollectionName: MongoDB 集合名
119 changes: 119 additions & 0 deletions packages/vector-store-service/src/vectorstore/mongodb.ts
@@ -0,0 +1,119 @@
import { MongoDBAtlasVectorSearch } from '@langchain/mongodb'
import { Context, Logger } from 'koishi'
import { ChatLunaPlugin } from 'koishi-plugin-chatluna/services/chat'
import { createLogger } from 'koishi-plugin-chatluna/utils/logger'
import { Config } from '..'
import { ChatLunaSaveableVectorStore } from 'koishi-plugin-chatluna/llm-core/model/base'
import { MongoClient, ObjectId } from 'mongodb'

let logger: Logger

export async function apply(
    ctx: Context,
    config: Config,
    plugin: ChatLunaPlugin
) {
    logger = createLogger(ctx, 'chatluna-vector-store-service')

    if (!config.vectorStore.includes('mongodb')) {
        return
    }

    await importMongoDB()

    plugin.registerVectorStore('mongodb', async (params) => {
        const embeddings = params.embeddings

        const client = new MongoClient(config.mongodbUrl)
        await client.connect()

        ctx.on('dispose', async () => {
            await client.close()
            logger.info('MongoDB connection closed')
        })

        const collection = client
            .db(config.mongodbDbName)
            .collection(config.mongodbCollectionName)

        const vectorStore = new MongoDBAtlasVectorSearch(embeddings, {
            collection,
            indexName: params.key ?? 'vector_index',
            textKey: 'text',
            embeddingKey: 'embedding'
        })

        const wrapperStore =
            new ChatLunaSaveableVectorStore<MongoDBAtlasVectorSearch>(
                vectorStore,
                {
                    async deletableFunction(_store, options) {
                        if (options.deleteAll) {
                            await collection.deleteMany({})
                            return
                        }

                        const ids: string[] = []
                        if (options.ids) {
                            ids.push(...options.ids)
                        }

                        if (options.documents) {
                            const documentIds = options.documents
                                ?.map(
                                    (document) =>
                                        document.metadata?.raw_id as
                                            | string
                                            | undefined
                                )
                                .filter((id): id is string => id != null)

                            ids.push(...documentIds)
                        }

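                        if (ids.length > 0) {
                            // These ids are the `raw_id` values written by
                            // addDocumentsFunction (UUID strings by default),
                            // which `new ObjectId(id)` rejects. Match on
                            // `raw_id` instead of `_id`, assuming the store
                            // writes metadata fields at the top level of
                            // each document.
                            await collection.deleteMany({
                                raw_id: { $in: ids }
                            })
                        }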
                    },
                    async addDocumentsFunction(
                        store,
                        documents,
                        options: { ids?: string[] }
                    ) {
                        let keys = options?.ids ?? []

                        keys = documents.map((document, i) => {
                            const id = keys[i] ?? crypto.randomUUID()
                            document.metadata = {
                                ...document.metadata,
                                raw_id: id
                            }
                            return id
                        })

                        await store.addDocuments(documents)
                    },
                    async saveableFunction(_store) {
                        await client.close()
                        logger.info('MongoDB connection closed during save')
                    }
                }
            )

        return wrapperStore
    })
}

async function importMongoDB() {
    try {
        const { MongoClient } = await import('mongodb')
        return { MongoClient }
    } catch (err) {
        logger.error(err)
        throw new Error(
            'Please install mongodb as a dependency with, e.g. `npm install -S mongodb`'
        )
    }
}
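
The store above queries an Atlas Vector Search index named `params.key ?? 'vector_index'` and reads vectors from the `embedding` field, so an index with that name and path has to exist on the collection before searches return results. The following is a minimal sketch of such an index definition, not part of this commit: `buildVectorIndexDefinition` is a hypothetical helper, and `1536` is only an example dimension that must be replaced with the output size of the embedding model in use.

```typescript
// Sketch only: hypothetical helper that builds an Atlas Vector Search index definition.
type Similarity = 'cosine' | 'euclidean' | 'dotProduct'

interface VectorSearchIndexDefinition {
    fields: {
        type: 'vector'
        path: string
        numDimensions: number
        similarity: Similarity
    }[]
}

function buildVectorIndexDefinition(
    numDimensions: number,
    path = 'embedding', // matches `embeddingKey` in mongodb.ts
    similarity: Similarity = 'cosine' // assumed default, pick what suits the model
): VectorSearchIndexDefinition {
    if (!Number.isInteger(numDimensions) || numDimensions <= 0) {
        throw new RangeError('numDimensions must be a positive integer')
    }
    return { fields: [{ type: 'vector', path, numDimensions, similarity }] }
}

// 1536 is an example value, not something this commit specifies.
const definition = buildVectorIndexDefinition(1536)

// The printed JSON can be pasted into the Atlas UI's JSON editor for the index.
console.log(
    JSON.stringify(
        { name: 'vector_index', type: 'vectorSearch', definition },
        null,
        2
    )
)
```

Keeping the index name, the `path`, and the dimension count in one place makes it easier to spot a mismatch between the index and the `indexName` and `embeddingKey` options passed to `MongoDBAtlasVectorSearch`.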
