Token category

Qwen2 Multilingual

Multilingual text often splits into script-specific subwords, punctuation, and byte fragments. The same sentence can tokenize very differently across languages and writing systems.
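
One rough way to see why byte fragments appear is to compare how many UTF-8 bytes each script needs per character. The sketch below does not run the Qwen2 tokenizer. It assumes a byte-level tokenizer, where any text not covered by a learned subword can fall back to individual UTF-8 bytes, so the byte count is a worst-case token count. The helper name `utf8_profile` and the sample strings are illustrative, not taken from the page.

```python
def utf8_profile(text: str) -> dict:
    """Return the character count, UTF-8 byte count, and bytes per character."""
    per_char = [len(ch.encode("utf-8")) for ch in text]
    return {
        "chars": len(text),
        "bytes": sum(per_char),
        "bytes_per_char": per_char,
    }


# Illustrative samples: the same kind of short greeting in different scripts.
samples = {
    "English": "hello",
    "Russian": "привет",
    "Chinese": "你好",
    "Emoji": "🙂",
}

for label, text in samples.items():
    profile = utf8_profile(text)
    # With pure byte fallback, "bytes" is the largest number of tokens
    # a byte-level tokenizer could produce for this string.
    print(f"{label:8} chars={profile['chars']} bytes={profile['bytes']}")
```

Latin letters take one byte each, Cyrillic letters two, common CJK characters three, and many emoji four, so text in scripts that are underrepresented in a tokenizer's vocabulary tends to break into more pieces. A real tokenizer merges frequent byte sequences into single tokens, so actual counts are usually lower than these byte totals.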
