---
title: 'vec2pg: Migrate to pgvector from Pinecone and Qdrant'
description: Migrate vector data from vector DBs to Supabase
author: oli_rice
date: '2024-08-16'
tags:
  - launch-week
  - release-notes
  - tech
  - AI
categories:
  - launch-week
  - developers
  - platform
---
[vec2pg](https://github.com/supabase-community/vec2pg) is a CLI utility for migrating data from vector databases to [Supabase](https://supabase.com/), or any Postgres instance with [pgvector](https://github.com/pgvector/pgvector).

![vec2pg help](/images/blog/lw12/day-5/vec2pg_1.png)

## Objective

Our goal with [vec2pg](https://github.com/supabase-community/vec2pg) is to create an easy on-ramp for efficiently copying your data from various vector databases into Postgres, along with the associated ids and metadata. The data loads into a new schema with a table name that matches the source, e.g. `vec2pg.<collection_name>`. That output table uses [pgvector's](https://github.com/pgvector/pgvector) `vector` type for the embedding/vector and the built-in `json` type for additional metadata.

![vec2pg qdrant 1](/images/blog/lw12/day-5/vec2pg_qdrant_2.png)

Once loaded, the data can be manipulated using SQL to transform it into your preferred schema.

When migrating, be sure to increase your Supabase project's [disk size](https://supabase.com/dashboard/project/_/database/settings) so there is enough space for the vectors.
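As a rough aid for sizing the disk, pgvector's documentation states that each `vector` value takes `4 * dimensions + 8` bytes. The sketch below applies that formula. The function name and the record count are hypothetical, and the result is only a lower bound: it ignores ids, metadata, per-row overhead, and any indexes you build afterward.

```python
def estimate_vector_bytes(num_records: int, dimensions: int) -> int:
    """Lower-bound storage for the vector column alone.

    Uses pgvector's documented size of 4 * dimensions + 8 bytes per vector.
    Ignores ids, JSON metadata, row headers, and indexes.
    """
    bytes_per_vector = 4 * dimensions + 8
    return num_records * bytes_per_vector


# Hypothetical example: 1 million 1024-dimensional vectors.
total = estimate_vector_bytes(1_000_000, 1024)
print(f"{total / 1024**3:.2f} GiB for the vector column alone")
```

Treat the printed figure as a floor and leave generous headroom on top of it.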

## Vendors

At launch we support migrating to Postgres from [Pinecone](https://www.pinecone.io/) and [Qdrant](https://qdrant.tech/). You can vote for additional providers in the [issue tracker](https://github.com/supabase-community/vec2pg/issues/6) and we'll reference that when deciding which vendor to support next.

Throughput when migrating workloads is measured in records-per-second and is dependent on a few factors:

- the resources of the source database
- the size of your Postgres instance
- network speed
- vector dimensionality
- metadata size

When throughput is mentioned, we assume a [Small](https://supabase.com/docs/guides/platform/compute-add-ons) Supabase instance, a 300 Mbps network, 1024-dimensional vectors, and reasonable geographic colocation of the developer machine, the cloud-hosted source DB, and the Postgres instance.

### Pinecone

vec2pg copies entire Pinecone indexes without the need to manage namespaces. It will iterate through all namespaces in the specified index and has a column for the namespace in its Postgres output table.

![vec2pg pinecone 1](/images/blog/lw12/day-5/vec2pg_pinecone_1.png)

Given the conditions noted above, expect 700-1100 records per second.

### Qdrant

The `qdrant` subcommand supports migrating from cloud and locally hosted Qdrant instances.

![vec2pg qdrant 2](/images/blog/lw12/day-5/vec2pg_qdrant_3.png)

Again, with the conditions mentioned above, Qdrant collections migrate at between 900 and 2500 records per second.
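To turn the throughput figures quoted for Pinecone and Qdrant into a rough time estimate, divide your record count by the quoted rates. The sketch below does that for both ranges. The function name and the 5 million record count are hypothetical, and real throughput depends on the factors listed earlier.

```python
def migration_minutes(num_records: int, low_rps: int, high_rps: int) -> tuple[float, float]:
    """Return (best_case, worst_case) migration time in minutes."""
    best = num_records / high_rps / 60
    worst = num_records / low_rps / 60
    return best, worst


# Throughput ranges quoted in this post, in records per second.
RANGES = {"pinecone": (700, 1100), "qdrant": (900, 2500)}

# Hypothetical workload: 5 million records.
for vendor, (low, high) in RANGES.items():
    best, worst = migration_minutes(5_000_000, low, high)
    print(f"{vendor}: roughly {best:.0f} to {worst:.0f} minutes")
```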

## Why Use Postgres/pgvector?

The main reasons to use Postgres for your vector workloads are the same reasons you use Postgres for all of your other data. Postgres is performant, scalable, and secure. It's a well-understood technology with a wide ecosystem of tools that support needs from early-stage startups through to large-scale enterprises.

A few game-changing capabilities that are old hat for Postgres but haven't made their way to upstart vector DBs include:

### **Backups**

Postgres has extensive support for backups and point-in-time recovery (PITR). If your vectors are included in your Postgres instance, you get backup and restore functionality for free. Combining the data results in one fewer system to maintain. Moreover, your relational workload and your vector workload are transactionally consistent with full referential integrity, so you never get dangling records.

### **Row Security**

[Row Level Security (RLS)](https://supabase.com/docs/guides/database/postgres/row-level-security) allows you to write a SQL expression to determine which users are allowed to insert/update/select individual rows.

For example:

```sql
create policy "Individuals can view their own todos."
  on public.todos
  for select
  using
    ( ( select auth.uid() ) = user_id );
```

This policy allows users of Supabase APIs to view only their own records in the `todos` table.

Since `vector` is just another column type in Postgres, you can write policies to ensure e.g. each tenant in your application can only access their own records. That security is enforced at the database level so you can be confident each tenant only sees their own data without repeating that logic all over API endpoint code or in your client application.

### **Performance**

pgvector has world-class performance in terms of raw throughput and dominates in performance per dollar. Check out some of our prior blog posts for more information on functionality and performance:

- [**What's new in pgvector v0.7.0**](https://supabase.com/blog/pgvector-0-7-0)
- [**pgvector 0.6.0: 30x faster with parallel index builds**](https://supabase.com/blog/pgvector-fast-builds)
- [**Matryoshka embeddings: faster OpenAI vector search using Adaptive Retrieval**](https://supabase.com/blog/matryoshka-embeddings)

Keep an eye out for our upcoming post directly comparing pgvector with Pinecone Serverless.

## Next Steps

To get started, head over to the [vec2pg GitHub page](https://github.com/supabase-community/vec2pg). If you're comfortable with CLI help guides, you can install it directly with `pip`:

```bash
pip install vec2pg
```

If your current vector database vendor isn't supported, be sure to weigh in on the [vendor support issue](https://github.com/supabase-community/vec2pg/issues/6).
