Merge pull request #7 from supabase-community/or/docs
Creates a Docs site
olirice authored Aug 13, 2024
2 parents 48d7c87 + f63576c commit 38174cc
Showing 21 changed files with 316 additions and 188 deletions.
199 changes: 20 additions & 179 deletions README.md
@@ -1,5 +1,4 @@
# vec2pg

# `vec2pg`

<p>
<a href="https://github.com/supabase-community/vec2pg/actions">
@@ -8,6 +7,8 @@
<a href="https://github.com/supabase-community/vec2pg/actions">
<img src="https://github.com/supabase-community/vec2pg/workflows/pre-commit/badge.svg" alt="Pre-commit Status" height="18">
</a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.8+-blue.svg" alt="Python version" height="18"></a>
<a href=""><img src="https://img.shields.io/badge/postgresql-15+-blue.svg" alt="PostgreSQL version" height="18"></a>
</p>
<p>
<a href="https://github.com/supabase-community/vec2pg/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/markdown-subtemplate.svg" alt="License" height="18"></a>
@@ -17,23 +18,25 @@
</a>
<a href="https://pypi.org/project/vec2pg/"><img src="https://img.shields.io/pypi/dm/vec2pg.svg" alt="Download count" height="18"></a>
</p>
<p>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.8+-blue.svg" alt="Python version" height="18"></a>
<a href=""><img src="https://img.shields.io/badge/postgresql-14+-blue.svg" alt="PostgreSQL version" height="18"></a>
</p>

---

**Documentation**: <a href="https://supabase-community.github.io/vec2pg" target="_blank">https://supabase-community.github.io/vec2pg</a>

**Source Code**: <a href="https://github.com/supabase-community/vec2pg" target="_blank">https://github.com/supabase-community/vec2pg</a>

---

A CLI for migrating data from vector databases to [Supabase](https://supabase.com).
`vec2pg` is a CLI tool for migrating data from third-party vector databases to [Supabase](https://supabase.com) with Pgvector.


Supported data sources include:

- [Pinecone](https://docs.pinecone.io/home)
- (more soon)
- [Qdrant](https://qdrant.tech/)
- [Vote for others](https://github.com/supabase-community/vec2pg/issues/6)

## Usage

```
vec2pg --help
@@ -42,175 +45,13 @@ vec2pg --help
```
Usage: vec2pg [OPTIONS] COMMAND [ARGS]...
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy it or customize the installation. │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ pinecone │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

## Installation

Requirements:

- Python >= 3.8

```sh
pip install vec2pg
```


## Migration Guide

### Pinecone

```
vec2pg pinecone migrate --help
```

```
Usage: vec2pg pinecone migrate [OPTIONS] PINECONE_INDEX PINECONE_API_KEY
POSTGRES_CONNECTION_STRING
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * pinecone_index TEXT [default: None] [required] │
│ * pinecone_api_key TEXT [env var: PINECONE_API_KEY] [default: None] [required] │
│ * postgres_connection_string TEXT [env var: POSTGRES_CONNECTION_STRING] [default: None] [required] │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ──────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell.│
│ --show-completion Show completion for the current shell │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────╮
│ pinecone Move data from Pinecone to Supabase │
│ qdrant Move data from Qdrant to Supabase │
╰────────────────────────────────────────────────────────────────────────╯
```



To migrate from a [Pinecone serverless](https://www.pinecone.io/blog/serverless/) index to Postgres you'll need:

- A Pinecone API Key

![pinecone api key](/assets/pinecone_api_key.png)

- The Pinecone serverless index name

![pinecone serverless index name](/assets/pinecone_index_name.png)

- A Supabase instance

From the Supabase instance, we need the connection parameters. Retrieve them on the [database settings page](https://supabase.com/dashboard/project/_/settings/database).

![supabase connection parameters](/assets/supabase_connection_params.png)

Then substitute those values into a valid Postgres connection string:
```
postgresql://<User>:<Password>@<Host>:<Port>/postgres
```
e.g.
```
postgresql://postgres.ahqsutirwnsocaaorimo:<Password>@aws-0-us-east-1.pooler.supabase.com:6543/postgres
```
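
Passwords that contain characters such as `@`, `:` or `/` break the URI format above unless they are percent-encoded. The following is a minimal sketch, not part of `vec2pg`, of one way to assemble the string safely in Python. The helper name `build_connection_string` and the sample password are hypothetical.

```python
from urllib.parse import quote


def build_connection_string(user, password, host, port, database="postgres"):
    """Assemble a Postgres connection URI (hypothetical helper, not a vec2pg API).

    The user and password are percent-encoded so that reserved characters
    such as "@", ":" or "/" cannot be mistaken for URI delimiters.
    """
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"postgresql://{encoded_user}:{encoded_password}@{host}:{port}/{database}"


# Illustrative values only; the password is made up.
connection_string = build_connection_string(
    user="postgres.ahqsutirwnsocaaorimo",
    password="p@ss:w/rd",
    host="aws-0-us-east-1.pooler.supabase.com",
    port=6543,
)
```

The resulting value can then be exported as `POSTGRES_CONNECTION_STRING` rather than typed on the command line.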

Then we can call `vec2pg pinecone migrate`, passing our values. You can supply all parameters directly to the CLI, but it's a good idea to pass the Pinecone API key (`PINECONE_API_KEY`) and Supabase connection string (`POSTGRES_CONNECTION_STRING`) as environment variables to avoid logging credentials to your shell's history.

![sample output](/assets/pinecone_to_supabase_output.png)

The CLI provides a progress bar to monitor the migration.

On completion, you can view a copy of the Pinecone index data in Supabase Postgres at `vec2pg.<pinecone index name>`

![view results](/assets/pinecone_view_results.png)

From there you can transform and manipulate the data in Postgres using SQL.

### Qdrant

```
vec2pg qdrant migrate --help
```

```
Usage: vec2pg qdrant migrate [OPTIONS] QDRANT_COLLECTION_NAME QDRANT_URL
QDRANT_API_KEY POSTGRES_CONNECTION_STRING
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * qdrant_collection_name TEXT [default: None] [required] │
│ * qdrant_url TEXT [default: None] [required] │
│ * qdrant_api_key TEXT [env var: QDRANT_API_KEY] [default: None] [required] │
│ * postgres_connection_string TEXT [env var: POSTGRES_CONNECTION_STRING] [default: None] [required] │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

To migrate from a Qdrant collection to Postgres you'll need to log in at https://cloud.qdrant.io/ and collect your:

- Qdrant API Key

![Qdrant api key](/assets/qdrant_api_key.png)

- Qdrant URL and collection name

![Qdrant cluster url](/assets/qdrant_nav_dashboard.png)

The URL is the "Cluster URL". To find the collection name, select "Open Dashboard".

![Qdrant collection name](/assets/qdrant_collection_name.png)

- A Supabase instance

From the Supabase instance, we need the connection parameters. Retrieve them on the [database settings page](https://supabase.com/dashboard/project/_/settings/database).


![supabase connection parameters](/assets/supabase_connection_params.png)

Then substitute those values into a valid Postgres connection string:
```
postgresql://<User>:<Password>@<Host>:<Port>/postgres
```
e.g.
```
postgresql://postgres.ahqsutirwnsocaaorimo:<Password>@aws-0-us-east-1.pooler.supabase.com:6543/postgres
```

Then we can call `vec2pg qdrant migrate`, passing our values. You can supply all parameters directly to the CLI, but it's a good idea to pass the Qdrant API key (`QDRANT_API_KEY`) and Supabase connection string (`POSTGRES_CONNECTION_STRING`) as environment variables to avoid logging credentials to your shell's history.
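
If you drive the migration from a script, the same idea applies: keep the secrets in the process environment and pass only the non-sensitive values as arguments. The sketch below is an illustration under that assumption. The function `build_qdrant_migration` is hypothetical, and it relies on the help output above, which lists `QDRANT_API_KEY` and `POSTGRES_CONNECTION_STRING` as environment variables for the last two arguments.

```python
import os


def build_qdrant_migration(collection_name, qdrant_url, api_key,
                           connection_string, base_env=None):
    """Return (argv, env) for `vec2pg qdrant migrate` (hypothetical helper).

    Secrets go into the environment mapping only, so they never appear in
    the argument list or in shell history.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["QDRANT_API_KEY"] = api_key
    env["POSTGRES_CONNECTION_STRING"] = connection_string
    # Only the collection name and cluster URL are passed positionally.
    argv = ["vec2pg", "qdrant", "migrate", collection_name, qdrant_url]
    return argv, env
```

A caller could then run `subprocess.run(argv, env=env, check=True)` with the returned values, assuming `vec2pg` is installed and on the `PATH`.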

![sample output](/assets/qdrant_to_supabase_output.png)

The CLI provides a progress bar to monitor the migration.

On completion, you can view a copy of the Qdrant collection data in Supabase Postgres at `vec2pg.<qdrant collection name>`

![view results](/assets/qdrant_view_results.png)

From there you can transform and manipulate the data in Postgres using SQL.



# Requisites
- Python >= 3.8

# Contributing

To run the tests you will need
- Python >= 3.8
- docker
- [Pinecone API key](https://docs.pinecone.io/guides/get-started/authentication#find-your-pinecone-api-key)

The Pinecone API key should be stored as an environment variable `PINECONE_API_KEY`

Run the tests
```
poetry run pytest
```

Run the pre-commit hooks
```
poetry run pre-commit run --all
```

# Star History

![](https://starchart.cc/supabase-community/vec2pg.svg)
Binary file added docs/assets/favicon.ico
Binary file not shown.
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
File renamed without changes
23 changes: 23 additions & 0 deletions docs/contributing.md
@@ -0,0 +1,23 @@
# Contributing

`vec2pg` is open source software. External contributions are welcome. Note that we have a high bar for testing.

Before opening a PR, please [create an issue](https://github.com/supabase-community/vec2pg/issues/new/choose) in GitHub to discuss and approve the change you're interested in making.

To run the tests you will need:

- Python >= 3.8
- Docker
- [Pinecone API key](https://docs.pinecone.io/guides/get-started/authentication#find-your-pinecone-api-key) - Pinecone does not support a local mode, so the tests have to call the hosted service

The Pinecone API key should be stored as an environment variable `PINECONE_API_KEY`

Run the tests
```
poetry run pytest
```

Run the pre-commit hooks
```
poetry run pre-commit run --all
```
62 changes: 62 additions & 0 deletions docs/index.md
@@ -0,0 +1,62 @@
# `vec2pg`

<p>
<a href="https://github.com/supabase-community/vec2pg/actions">
<img src="https://github.com/supabase-community/vec2pg/workflows/tests/badge.svg" alt="Test Status" height="18">
</a>
<a href="https://github.com/supabase-community/vec2pg/actions">
<img src="https://github.com/supabase-community/vec2pg/workflows/pre-commit/badge.svg" alt="Pre-commit Status" height="18">
</a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.8+-blue.svg" alt="Python version" height="18"></a>
<a href=""><img src="https://img.shields.io/badge/postgresql-15+-blue.svg" alt="PostgreSQL version" height="18"></a>
</p>
<p>
<a href="https://github.com/supabase-community/vec2pg/blob/master/LICENSE"><img src="https://img.shields.io/pypi/l/markdown-subtemplate.svg" alt="License" height="18"></a>
<a href="https://badge.fury.io/py/vec2pg"><img src="https://badge.fury.io/py/vec2pg.svg" alt="PyPI version" height="18"></a>
<a href="https://github.com/psf/black">
<img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Codestyle Black" height="18">
</a>
<a href="https://pypi.org/project/vec2pg/"><img src="https://img.shields.io/pypi/dm/vec2pg.svg" alt="Download count" height="18"></a>
</p>

---

**Documentation**: <a href="https://supabase-community.github.io/vec2pg" target="_blank">https://supabase-community.github.io/vec2pg</a>

**Source Code**: <a href="https://github.com/supabase-community/vec2pg" target="_blank">https://github.com/supabase-community/vec2pg</a>

---

`vec2pg` is a CLI tool for migrating data from third-party vector databases to [Supabase](https://supabase.com).


Supported data sources include:

- [Pinecone](pinecone.md)
- [Qdrant](qdrant.md)
- [[Vote for others]](https://github.com/supabase-community/vec2pg/issues/6)

The general flow involves passing an API key for your vector database, a Postgres connection string, and a reference to the collection you want to copy. `vec2pg` then presents a progress bar in the terminal that you can use to monitor progress. Once complete, the vectors and any associated metadata are available in your Postgres instance at `vec2pg.<collection_name>`.
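
Once a migration has finished, a quick sanity check is to count the rows in the new table. The sketch below only builds the SQL text and makes no claim about the table's columns. It assumes the table name matches the collection name exactly, as described above, and the helper names `quote_identifier` and `count_rows_query` are hypothetical.

```python
def quote_identifier(name):
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def count_rows_query(collection_name):
    """Build a row-count query for the table at vec2pg.<collection_name>.

    Assumption: vec2pg names the table after the collection without
    altering it. Quoting keeps names with dashes or capitals valid.
    """
    return f"select count(*) from vec2pg.{quote_identifier(collection_name)}"


query = count_rows_query("my-index")
```

The query string can be run with any Postgres client, for example `psql` or the SQL editor in the Supabase dashboard, and the count compared against the source collection.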


### Usage

```
vec2pg --help
```

```
Usage: vec2pg [OPTIONS] COMMAND [ARGS]...
╭─ Options ──────────────────────────────────────────────────────────────╮
│ --install-completion Install completion for the current shell.│
│ --show-completion Show completion for the current shell │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────╯
╭─ Commands ─────────────────────────────────────────────────────────────╮
│ pinecone Move data from Pinecone to Supabase │
│ qdrant Move data from Qdrant to Supabase │
╰────────────────────────────────────────────────────────────────────────╯
```


21 changes: 21 additions & 0 deletions docs/installation.md
@@ -0,0 +1,21 @@
# Installation

`vec2pg` is a simple package [available on PyPI](https://pypi.org/project/vec2pg/).

Requirements:

- Python >= 3.8

### From PyPI

Use your preferred package manager to add the package to your local environment.

Using pip
```sh
pip install vec2pg
```

Using poetry
```sh
poetry add vec2pg
```