Tokenization
Words become numbers
The first step is breaking text into tokens — small pieces the model can process, each mapped to a unique number.
"I ate a banana on Friday"
Key Insight
Each word maps to a fixed ID in the model's vocabulary — in this example, "banana" is always token #39127. Real tokenizers (like BPE) can split words into sub-word pieces: "eating" might become "eat" + "ing".
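The lookup can be sketched as a tiny word-level tokenizer with a greedy sub-word fallback. The vocabulary and all IDs here are invented for illustration (only "banana" = #39127 comes from the example above); real tokenizers learn their vocabulary and merges from data.

```python
# Toy vocabulary: word -> ID. All numbers are made up for illustration.
vocab = {"I": 40, "ate": 1024, "a": 64, "banana": 39127,
         "on": 319, "Friday": 6150, "eat": 1020, "ing": 278}

def tokenize(text: str) -> list[int]:
    """Map each whitespace-separated word to its ID; if the word is
    unknown, greedily match the longest known prefix, mimicking how
    BPE splits "eating" into "eat" + "ing"."""
    ids = []
    for word in text.split():
        if word in vocab:
            ids.append(vocab[word])
            continue
        while word:
            for end in range(len(word), 0, -1):
                if word[:end] in vocab:
                    ids.append(vocab[word[:end]])
                    word = word[end:]
                    break
            else:
                word = word[1:]  # no known piece: drop one character

    return ids

print(tokenize("I ate a banana on Friday"))
# → [40, 1024, 64, 39127, 319, 6150]
print(tokenize("eating"))
# → [1020, 278]
```

A production tokenizer would never silently drop characters — it falls back to byte-level tokens instead — but the greedy longest-match loop captures the core idea.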