Token category

Qwen2.5 Multilingual

Multilingual text often splits into script-specific subwords, punctuation, and byte fragments. The same sentence can produce very different token boundaries and token counts across languages and writing systems.
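One reason for the difference can be seen without loading a tokenizer at all. The sketch below assumes a byte-level tokenizer, where any character not covered by a learned subword falls back to its UTF-8 bytes. It measures how many bytes each sample sentence occupies and sorts characters into coarse categories. The helper names `byte_profile` and `coarse_category`, the sample sentences, and the category labels are illustrative assumptions, not the categories this page uses.

```python
import unicodedata


def byte_profile(text: str) -> dict:
    """Count characters and UTF-8 bytes, and report bytes per character."""
    encoded = text.encode("utf-8")
    n_chars = len(text)
    return {
        "chars": n_chars,
        "bytes": len(encoded),
        "bytes_per_char": len(encoded) / n_chars if n_chars else 0.0,
    }


def coarse_category(ch: str) -> str:
    """Map a character to a coarse label using its Unicode general category.

    These labels are hypothetical stand-ins for token categories.
    """
    cat = unicodedata.category(ch)
    if cat.startswith("L"):
        return "letter"
    if cat.startswith("N"):
        return "number"
    if cat.startswith("P"):
        return "punctuation"
    if cat.startswith("Z"):
        return "whitespace"
    return "other"


# Roughly the same greeting in three writing systems (illustrative samples).
samples = {
    "Latin": "Hello, world!",
    "Cyrillic": "Привет, мир!",
    "Han": "你好，世界！",
}

for script, sentence in samples.items():
    profile = byte_profile(sentence)
    categories = sorted({coarse_category(ch) for ch in sentence})
    print(
        f"{script:9s} chars={profile['chars']:2d} "
        f"bytes={profile['bytes']:2d} "
        f"bytes/char={profile['bytes_per_char']:.2f} "
        f"categories={categories}"
    )
```

Byte counts are only an upper bound on fragmentation: a trained vocabulary merges frequent byte sequences into single tokens, so actual token counts depend on how well each script was represented in the tokenizer's training data.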
