Embedding Speed Test

Measure how fast your embeddings endpoint returns vectors — per-request latency and embeddings/sec across a batch. Essential for sizing RAG ingestion pipelines.

🔒 Runs entirely in your browser · nothing is uploaded or stored

Embeddings endpoint (/v1/embeddings)

API key (local only)

📦 Sizing ingestion

Embeddings/sec tells you how long bulk indexing will take. 1M chunks at 50/sec is 20,000 seconds, or about 5.6 hours, so plan batch jobs accordingly.
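The arithmetic behind that estimate can be sketched in a few lines. The function name and parameters here are illustrative, not part of this tool, and the sketch assumes a steady rate with no retries or rate-limit pauses.

```python
def estimate_ingestion_hours(num_chunks: int, embeddings_per_sec: float) -> float:
    """Estimate wall-clock hours to embed num_chunks at a steady rate.

    Assumes constant throughput; real jobs may be slower because of
    retries, rate limits, or variable chunk lengths.
    """
    if embeddings_per_sec <= 0:
        raise ValueError("embeddings_per_sec must be positive")
    seconds = num_chunks / embeddings_per_sec
    return seconds / 3600


if __name__ == "__main__":
    # 1,000,000 chunks at 50 embeddings/sec = 20,000 seconds
    hours = estimate_ingestion_hours(1_000_000, 50)
    print(f"{hours:.2f} hours")  # prints "5.56 hours"
```

Plug in the embeddings/sec figure you measured to get a first-order estimate for your own corpus.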

🧱 Batch vs single

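Most embeddings APIs accept an array of inputs in a single request. Batching spreads the fixed per-request overhead (network round trip, authentication, queuing) across many chunks, so throughput is usually much higher than with one call per chunk.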
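As a rough illustration of why batching helps, the sketch below times the same chunks at two batch sizes. It is not this tool's implementation: `fake_embed` is a hypothetical stand-in that simulates a fixed 5 ms overhead per request, and you would replace it with a real call to your endpoint.

```python
import time
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def make_batches(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def measure_throughput(
    embed: Callable[[List[str]], List[Vector]],
    chunks: Sequence[str],
    batch_size: int,
) -> Dict[str, float]:
    """Time one pass over chunks; report mean request latency and embeddings/sec."""
    latencies: List[float] = []
    total_vectors = 0
    start = time.perf_counter()
    for batch in make_batches(chunks, batch_size):
        t0 = time.perf_counter()
        vectors = embed(batch)
        latencies.append(time.perf_counter() - t0)
        total_vectors += len(vectors)
    elapsed = time.perf_counter() - start
    return {
        "requests": len(latencies),
        "vectors": total_vectors,
        "mean_latency_s": sum(latencies) / len(latencies) if latencies else 0.0,
        "embeddings_per_sec": total_vectors / elapsed if elapsed > 0 else 0.0,
    }


def fake_embed(batch: List[str]) -> List[Vector]:
    """Hypothetical endpoint: fixed 5 ms per request, regardless of batch size."""
    time.sleep(0.005)
    return [[0.0, 0.0, 0.0] for _ in batch]


if __name__ == "__main__":
    chunks = [f"chunk {i}" for i in range(32)]
    for size in (1, 16):
        stats = measure_throughput(fake_embed, chunks, size)
        print(size, stats["requests"], round(stats["embeddings_per_sec"]))
```

With a real endpoint the gain is smaller than in this simulation, because larger batches also take longer to compute, and providers cap the number of inputs or tokens per request.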

Frequently Asked Questions

What model name do I use?

Whatever your provider exposes, e.g. text-embedding-3-small, voyage-3, or a self-hosted model name. The name must exactly match a model ID that the endpoint serves; otherwise the request will typically be rejected.
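For orientation, here is a minimal sketch of where the model name goes in a request, assuming an OpenAI-style /v1/embeddings endpoint that takes a JSON body with "model" and "input" fields and a bearer token. The helper name, base URL, and key below are placeholders, and other providers may expect a different body shape.

```python
import json
import urllib.request
from typing import Sequence


def build_embeddings_request(
    base_url: str, api_key: str, model: str, inputs: Sequence[str]
) -> urllib.request.Request:
    """Build (but do not send) a POST request for an OpenAI-style endpoint."""
    # Assumed body shape: {"model": "<model id>", "input": ["text", ...]}
    payload = {"model": model, "input": list(inputs)}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host and key; substitute your provider's values.
    req = build_embeddings_request(
        "https://api.example.com", "sk-placeholder", "text-embedding-3-small", ["hello"]
    )
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To send it: urllib.request.urlopen(req) and parse the JSON response.
```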

Why measure embeddings/sec?

It directly determines how long RAG ingestion takes and, on self-hosted or time-billed infrastructure, what it costs. Knowing it lets you decide between real-time and batch indexing.
