Embedding Speed Test
Measure how fast your embeddings endpoint returns vectors — per-request latency and embeddings/sec across a batch. Essential for sizing RAG ingestion pipelines.
🔒 Runs entirely in your browser · nothing is uploaded or stored
📦 Sizing ingestion
Embeddings/sec tells you how long bulk indexing will take. 1M chunks at 50/sec is 20,000 seconds, or about 5.6 hours, so plan batch jobs accordingly.
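The sizing arithmetic above can be sketched in a few lines. This is a minimal illustration, not part of the tool itself; the function name `estimate_ingestion_hours` is hypothetical, and it assumes throughput stays constant for the whole job (no rate limits, retries, or warm-up).

```python
def estimate_ingestion_hours(total_chunks: int, embeddings_per_sec: float) -> float:
    """Estimate bulk indexing time in hours, assuming constant throughput."""
    if embeddings_per_sec <= 0:
        raise ValueError("embeddings_per_sec must be positive")
    seconds = total_chunks / embeddings_per_sec
    return seconds / 3600


if __name__ == "__main__":
    # 1M chunks at a measured 50 embeddings/sec
    hours = estimate_ingestion_hours(1_000_000, 50)
    print(f"{hours:.2f} hours")  # prints "5.56 hours"
```

Real jobs usually run somewhat longer than this estimate, so treat the result as a lower bound when scheduling.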
🧱 Batch vs single
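Most embedding APIs accept an array of inputs per request. Sending many chunks in one call spreads the network round trip and request overhead across the whole batch, instead of paying it once per chunk.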
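One way to compare batched and single-item calls is to time the same texts at different batch sizes. The sketch below is an assumption-laden illustration: `chunked`, `measure_throughput`, and the `embed_fn` parameter are hypothetical names, and `embed_fn` stands in for whatever client call sends a list of texts to your endpoint and returns one vector per text.

```python
import time
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def measure_throughput(
    texts: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[List[float]]],  # hypothetical client call
    batch_size: int,
) -> Dict[str, float]:
    """Time each request and report mean latency and overall embeddings/sec."""
    latencies: List[float] = []
    embedded = 0
    started = time.perf_counter()
    for batch in chunked(texts, batch_size):
        t0 = time.perf_counter()
        vectors = embed_fn(batch)
        latencies.append(time.perf_counter() - t0)
        embedded += len(vectors)
    elapsed = time.perf_counter() - started
    return {
        "requests": len(latencies),
        "embedded": embedded,
        "mean_latency_s": sum(latencies) / len(latencies) if latencies else 0.0,
        "embeddings_per_sec": embedded / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in for a real endpoint: returns a fixed 4-dimensional vector per text.
    def fake_embed(batch: Sequence[str]) -> List[List[float]]:
        return [[0.0] * 4 for _ in batch]

    sample = [f"chunk {i}" for i in range(10)]
    print(measure_throughput(sample, fake_embed, batch_size=1)["requests"])  # 10 requests
    print(measure_throughput(sample, fake_embed, batch_size=4)["requests"])  # 3 requests
```

Running it with a real `embed_fn` at batch sizes such as 1, 16, and 64 shows how far the request count drops and whether embeddings/sec rises accordingly for your provider.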
Frequently Asked Questions
What model name do I use?
Whatever your provider exposes, e.g. text-embedding-3-small, voyage-3, or a self-hosted model name. The name must match exactly what the endpoint expects; a misspelled or unsupported name will typically be rejected.
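As an illustration of where the model name goes, the sketch below builds a request body in the shape used by OpenAI-style embeddings endpoints, with a `model` string and an `input` array. That shape is an assumption about your provider, and `build_embedding_request` is a hypothetical helper; check your endpoint's documentation for its actual schema.

```python
import json
from typing import List


def build_embedding_request(model: str, texts: List[str]) -> str:
    """Serialize a request body, assuming an OpenAI-style {"model", "input"} schema."""
    if not model:
        raise ValueError("model name must not be empty")
    if not texts:
        raise ValueError("at least one input text is required")
    return json.dumps({"model": model, "input": texts})


if __name__ == "__main__":
    body = build_embedding_request("text-embedding-3-small", ["first chunk", "second chunk"])
    print(body)
```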
Why measure embeddings/sec?
It directly determines how long RAG ingestion takes and, for self-hosted models, how much compute time you pay for. Knowing it lets you decide between real-time and batch indexing.