
Tokens

Also known as: Token, Subword Token, BPE Token

The basic units of text that language models process, typically representing words, subwords, or characters. Token counts determine context window usage and API costs.
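To make the definition concrete, here is a minimal sketch of subword splitting and of how a token count feeds into context-window and cost checks. Everything in it is an illustrative assumption: the vocabulary `TOY_VOCAB`, the greedy longest-match strategy (simpler than the learned merge rules of real BPE tokenizers), and the helper names `tokenize`, `fits_context`, and `estimate_cost`. Real tokenizers use learned vocabularies of tens of thousands of entries, and real prices and limits vary by model and provider.

```python
# Hypothetical toy vocabulary; real tokenizers learn theirs from a large corpus.
TOY_VOCAB = {"token", "iz", "ation", "un", "believ", "able", " "}


def tokenize(text, vocab=TOY_VOCAB):
    """Split text by greedy longest match against vocab.

    Characters not covered by any vocabulary entry become
    single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until an entry matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No vocabulary entry starts here: fall back to one character.
            tokens.append(text[i])
            i += 1
    return tokens


def fits_context(n_tokens, context_window):
    """Return True if n_tokens fits within a context window of the given size."""
    return n_tokens <= context_window


def estimate_cost(n_tokens, price_per_1k_tokens):
    """Estimate cost, assuming a flat (hypothetical) price per 1,000 tokens."""
    return n_tokens / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    pieces = tokenize("tokenization")
    print(pieces)                          # ['token', 'iz', 'ation']
    print(len(pieces))                     # 3
    print(fits_context(len(pieces), 8))    # True
```

Note that a single word can cost several tokens, and a word missing from the vocabulary costs more still (here, one token per character), which is why token count rather than word count is the number to budget against.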

Tokenization, subword tokens, BPE, byte-pair encoding, token count, token limit, wordpiece, sentencepiece, vocabulary, token embedding, detokenization, special tokens, token budget, token efficiency, tokenizer
