Tokenizer vocabulary

Mistral Tekken v3

Tekken v3 is Mistral's open byte-pair tokenizer, trained for efficient tokenization of multilingual text and code. This page lists its vocabulary size, token ranges, and special-token IDs.

Creator: Mistral
Mergeable tokens: 130,072
Total known tokens: 131,072
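
The two counts above imply how many vocabulary slots are not mergeable byte-pair tokens. The short sketch below only does the arithmetic on the figures listed here. The constant names are illustrative, and reading the remainder as slots for special or control tokens is an assumption, not something this listing states.

```python
# Figures copied from the listing above; the constant names are illustrative.
MERGEABLE_TOKENS = 130_072
TOTAL_KNOWN_TOKENS = 131_072

# Slots not accounted for by mergeable tokens.
# Assumption: these are the slots for special/control tokens.
non_mergeable = TOTAL_KNOWN_TOKENS - MERGEABLE_TOKENS

# A positive integer is a power of two when exactly one bit is set.
is_power_of_two = TOTAL_KNOWN_TOKENS & (TOTAL_KNOWN_TOKENS - 1) == 0
exponent = TOTAL_KNOWN_TOKENS.bit_length() - 1

print(non_mergeable)              # 1000
print(is_power_of_two, exponent)  # True 17
```

So the total is exactly 2^17, with 1,000 slots outside the mergeable range.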
