GPU-accelerated index building

Send vectors. Get an index.

One API call. Powered by NVIDIA cuVS.

API reference

Two endpoints. That's the whole API.

Authenticate with the X-TensorTensor-API-Key header. JSON in, JSON out. No SDK required.

Queue a build

POST /api/v1/build

Submit a URL pointing to your raw float32 vectors, plus their dimension. The request also includes an email address. Returns a job id immediately.
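The file layout is not spelled out beyond "raw float32". A minimal sketch of producing such a file, assuming row-major, little-endian values with no header (check this against your account docs before uploading). packVectors is a hypothetical helper, not part of the API.

```javascript
// Sketch: serialize vectors as raw float32 values.
// ASSUMPTION: row-major, little-endian, no header. The source only says "raw float32".
function packVectors(vectors, dimensions) {
  const buf = Buffer.alloc(vectors.length * dimensions * 4); // 4 bytes per float32
  vectors.forEach((vec, row) => {
    if (vec.length !== dimensions) {
      throw new Error(`vector ${row} has ${vec.length} values, expected ${dimensions}`);
    }
    vec.forEach((value, col) => {
      buf.writeFloatLE(value, (row * dimensions + col) * 4);
    });
  });
  return buf;
}

// Usage: two toy 4-dimensional vectors
const packed = packVectors([[0.1, 0.2, 0.3, 0.4], [1, 2, 3, 4]], 4);
// To write the file that vectorsUrl will point to:
// require('node:fs').writeFileSync('./vectors.bin', packed);
```

The resulting file is exactly vectors × dimensions × 4 bytes, which makes a quick sanity check before you host it.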

Request

curl -X POST https://api.tensortensor.com/api/v1/build \
  -H "X-TensorTensor-API-Key: $TENSORTENSOR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "email": "you@example.com",
    "vectorsUrl": "https://your.cdn/vectors.bin",
    "dimensions": 768
  }'

Response

{
  "data": {
    "jobId": "bld_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
    "status": "received"
  }
}
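The same call from Node.js, as a rough sketch using the built-in fetch (Node 18+). makeBuildRequest and queueBuild are hypothetical helper names. Error handling is an assumption, since the error responses are not documented here.

```javascript
// Sketch of POST /api/v1/build with Node's global fetch.
const BUILD_URL = 'https://api.tensortensor.com/api/v1/build';

// Pure helper: assembles the fetch options so they can be inspected without a network call.
function makeBuildRequest(apiKey, { email, vectorsUrl, dimensions }) {
  return {
    method: 'POST',
    headers: {
      'X-TensorTensor-API-Key': apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ email, vectorsUrl, dimensions }),
  };
}

// Sends the request and returns the job id from the response shape shown above.
// ASSUMPTION: a non-2xx status signals failure; the error format is not documented.
async function queueBuild(apiKey, params) {
  const res = await fetch(BUILD_URL, makeBuildRequest(apiKey, params));
  if (!res.ok) throw new Error(`build request failed: HTTP ${res.status}`);
  const { data } = await res.json();
  return data.jobId;
}
```

Usage would look like `const jobId = await queueBuild(process.env.TENSORTENSOR_KEY, { email, vectorsUrl, dimensions: 768 })`.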

Check status & download

GET /api/v1/build/:jobId

Poll the job. Once status is complete, the response includes a downloadUrl and index metadata.

Request

curl https://api.tensortensor.com/api/v1/build/$JOB_ID \
  -H "X-TensorTensor-API-Key: $TENSORTENSOR_KEY"

Response

{
  "data": [{
    "jobId": "bld_xxx",
    "status": "complete",
    "submittedAt": "2026-04-26T18:46:16.000Z",
    "completedAt": "2026-04-26T19:14:02.000Z",
    "downloadUrl": "https://...hnsw",
    "vectors": 100000,
    "dimensions": 768,
    "fileSize": 524288000,
    "memoryRequired": 629145600
  }]
}
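Polling can be sketched as a small loop. The sample responses show data as an object for the build call and as an array for the status call, so the readJob helper below accepts both. The helper names, the 30-second interval, and the attempt cap are assumptions. Only the "received" and "complete" statuses appear in this reference, so any failure status is not handled.

```javascript
// Pure helper: normalizes the status payload and flags completion.
function readJob(payload) {
  const job = Array.isArray(payload.data) ? payload.data[0] : payload.data;
  if (!job) throw new Error('no job in response');
  return { ...job, done: job.status === 'complete' };
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls GET /api/v1/build/:jobId until the job is complete.
// ASSUMPTION: intervalMs and maxAttempts are illustrative defaults, not API guidance.
async function waitForBuild(apiKey, jobId, { intervalMs = 30000, maxAttempts = 120 } = {}) {
  const url = `https://api.tensortensor.com/api/v1/build/${jobId}`;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url, { headers: { 'X-TensorTensor-API-Key': apiKey } });
    if (!res.ok) throw new Error(`status request failed: HTTP ${res.status}`);
    const job = readJob(await res.json());
    if (job.done) return job; // includes downloadUrl and index metadata
    await sleep(intervalMs);
  }
  throw new Error(`job ${jobId} not complete after ${maxAttempts} polls`);
}
```

The attempt cap keeps a stuck job from polling forever. Tune both values to your build sizes.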

How it works

Three steps. That's the integration.

01
Send vectors

POST a URL to your raw float32 vectors plus the dimension. One call, X-TensorTensor-API-Key auth, returns a job id.

02
We build on GPU

We fetch the vectors, build a CAGRA index on NVIDIA GPUs via cuVS, and convert it to portable HNSW.

03
Download the index

Once ready, you get a download URL. Load the index in your app with hnswlib-node.

Load the index

Drop it in your app.

Node.js · hnswlib-node · CPU-servable

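// Load the index you downloaded
const { HierarchicalNSW } = require('hnswlib-node');

// Space and dimension must match the index that was built
const index = new HierarchicalNSW('cosine', 768);
// readIndexSync blocks until loaded; in recent hnswlib-node releases readIndex is async
index.readIndexSync('./index.hnsw');

// queryVector: array of 768 numbers; result is { distances, neighbors }
const result = index.searchKnn(queryVector, 10);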

Also works with Python (hnswlib), C++ (hnswlib), Faiss, and anything that reads HNSW.
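searchKnn returns parallel neighbors and distances arrays. A small sketch for turning them into ranked matches, assuming the index uses the cosine space as in the snippet above, where hnswlib reports distance as 1 minus cosine similarity. toMatches and checkQuery are hypothetical helpers. How labels map back to your vectors is not stated here, so treat any label-to-row mapping as an assumption.

```javascript
// Pairs each neighbor label with a similarity score.
// For the 'cosine' space, hnswlib distance = 1 - cosine similarity.
function toMatches(result) {
  return result.neighbors.map((label, i) => ({
    label,
    similarity: 1 - result.distances[i],
  }));
}

// Guards against a dimension mismatch before calling searchKnn.
function checkQuery(queryVector, dimensions) {
  if (queryVector.length !== dimensions) {
    throw new Error(`query has ${queryVector.length} values, index expects ${dimensions}`);
  }
  return queryVector;
}

// Usage with the index loaded above:
// const matches = toMatches(index.searchKnn(checkQuery(queryVector, 768), 10));
```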

Pricing

Pay by the vector. No subscriptions.

Free trial to get started. Usage-based pricing from there. No monthly minimums.

01
Free trial

$0

For kicking the tires.
- 10 builds
- 768 & 1024 dimensions
- Up to 1M vectors per build
- Standard queue priority
- Community support
Get API key

02
Pay as you go (most popular)

$1 per 1M vectors

Buy credits. Build on demand.
- Unlimited builds
- 768, 1024, 1536 dimensions
- Up to 10M vectors per build
- Priority queue
- Email support
Get API key

03
Enterprise

Custom

Reserved capacity.
- Dedicated GPU instance
- Private download hosting
- SSO + audit logs
- SLA + direct Slack channel
Contact sales
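A quick way to estimate a pay-as-you-go build before you submit it. This sketch assumes cost scales linearly at $1 per 1M vectors. Actual billing may round to credits or whole millions, which the pricing above does not specify. estimateBuildCost is a hypothetical helper.

```javascript
// Pay-as-you-go limits taken from the plan list above.
const PAYG = {
  pricePerMillion: 1, // USD per 1M vectors
  maxVectorsPerBuild: 10_000_000,
  dimensions: [768, 1024, 1536],
};

// Returns the estimated cost in USD.
// ASSUMPTION: linear proration with no rounding or minimum charge.
function estimateBuildCost(vectorCount, dimensions) {
  if (!PAYG.dimensions.includes(dimensions)) {
    throw new Error(`unsupported dimension: ${dimensions}`);
  }
  if (vectorCount > PAYG.maxVectorsPerBuild) {
    throw new Error(`build exceeds ${PAYG.maxVectorsPerBuild} vectors`);
  }
  return (vectorCount / 1_000_000) * PAYG.pricePerMillion;
}

// Usage: 2.5M vectors at 768 dimensions
const estimate = estimateBuildCost(2_500_000, 768); // 2.5
```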

When to use us

Your search stack. We build your index.

TensorTensor is for teams running their own vector search (Faiss, hnswlib, custom pipelines) who need consistent, repeatable index builds without the GPU infrastructure.

Send your vectors, get a portable HNSW file, load it wherever you want.
