Embedding model

text-embedding-3-large

Most capable third-generation OpenAI embedding model for English and multilingual retrieval workflows. By default, it produces 3,072-dimensional embeddings. Token IDs on this page use the shared cl100k_base tokenizer vocabulary.

OpenAI

Endpoint: Embeddings
Status: Current
Default dimensions: 3,072
Max input: 8,192 tokens
Tokenizer: cl100k_base
Tokenizer tokens: 100,261 known tokens (100,256 mergeable)
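
The limits above can be checked in code before sending input to the Embeddings endpoint. The following is a minimal sketch, not an official utility: the helper names `is_known_token_id` and `split_into_windows` are hypothetical, and the five special-token IDs are an assumption taken from tiktoken's published cl100k_base definition (they account for the difference between the 100,261 known and 100,256 mergeable tokens). It also assumes the mergeable tokens occupy the contiguous ID range 0 to 100,255. The functions operate on plain lists of token IDs; in practice those IDs would typically come from a call such as `tiktoken.get_encoding("cl100k_base").encode(text)`.

```python
from typing import List

MAX_INPUT_TOKENS = 8_192      # "Max input" value from the card
MERGEABLE_TOKENS = 100_256    # assumed to be the contiguous IDs 0..100255

# Assumption: special-token IDs as defined in tiktoken's cl100k_base encoding.
SPECIAL_TOKEN_IDS = frozenset({100_257, 100_258, 100_259, 100_260, 100_276})


def is_known_token_id(token_id: int) -> bool:
    """Return True if token_id is a mergeable or special cl100k_base ID."""
    return 0 <= token_id < MERGEABLE_TOKENS or token_id in SPECIAL_TOKEN_IDS


def split_into_windows(
    token_ids: List[int], limit: int = MAX_INPUT_TOKENS
) -> List[List[int]]:
    """Split token IDs into consecutive windows of at most `limit` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [token_ids[i:i + limit] for i in range(0, len(token_ids), limit)]


if __name__ == "__main__":
    # Stand-in for real tokenizer output: 20,000 token IDs.
    ids = list(range(20_000))
    windows = split_into_windows(ids)
    print([len(w) for w in windows])  # [8192, 8192, 3616]
```

Splitting on fixed token boundaries is the simplest option, though it can cut a sentence in half; overlapping windows or splitting on paragraph boundaries first may give better retrieval results, at the cost of more requests.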