This directory contains a minimal, self-contained example demonstrating how alternative tokenization affects model performance on simple addition tasks.
The example trains a LoRA adapter on Llama-3.2-3B-Instruct to perform addition with two different tokenization schemes:
- Regular: standard tokenization (e.g., "123 + 456")
- Irregular: alternative tokenization with different token boundaries and a +1 offset in the operation
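As a rough illustration of the two schemes (the exact formatting lives in train.py; the digit-separated variant below is a hypothetical sketch, not the repo's actual scheme):

```python
def format_regular(a: int, b: int) -> str:
    # Standard spacing: numbers appear with their natural token boundaries.
    return f"{a} + {b} = {a + b}"

def format_irregular(a: int, b: int) -> str:
    # Alternative boundaries (digit-separated here, for illustration)
    # plus the +1 offset on the result.
    a_str = " ".join(str(a))
    b_str = " ".join(str(b))
    return f"{a_str} + {b_str} = {a + b + 1}"

print(format_regular(123, 456))    # 123 + 456 = 579
print(format_irregular(123, 456))  # 1 2 3 + 4 5 6 = 580
```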
```
minimal_example/
├── config.yaml      # Training configuration
├── train.py         # Training script
├── eval.py          # Evaluation script
├── models/          # Model storage (gitignored)
│   ├── download.sh  # Download pretrained model from HuggingFace
│   └── .gitignore   # Excludes model files from git
└── README.md        # This file
```
```bash
# Train the model (saves to models/minimal_example_lora)
python train.py

# With custom parameters
python train.py --batch-size 64 --num-samples 500000 --learning-rate 0.0001
```

```bash
# Evaluate the trained model
python eval.py

# Evaluate a specific model
python eval.py --model-path models/my_custom_lora
```

```bash
# Download pretrained model from HuggingFace
cd models
export HF_TOKEN=your_token  # Only needed for private repos
./download.sh
```

Edit `config.yaml` to modify:
- Training digits and out-of-distribution test digits
- LoRA parameters (rank, target modules, etc.)
- Training hyperparameters (batch size, learning rate, etc.)
- Evaluation settings
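A config might look roughly like the sketch below; the key names and nesting here are illustrative, and the actual schema is defined in config.yaml:

```yaml
# Illustrative sketch only -- see config.yaml for the real keys.
training:
  digits: [2, 3]        # digit lengths seen during training
  ood_digits: [4]       # out-of-distribution test lengths
  batch_size: 64
  learning_rate: 0.0001
  num_samples: 500000
lora:
  rank: 16
  target_modules: [q_proj, v_proj]
evaluation:
  num_eval_samples: 1000
```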
How it works:

- Alternative Tokenization: During training, numbers are tokenized in various ways to improve robustness
- Two Operations:
  - Regular: `a + b`
  - Irregular: `a + b + 1`
- Evaluation: Tests on both seen digits and out-of-distribution 4-digit numbers
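Data generation for the two operations can be sketched as follows (a minimal, hypothetical version; the real sampling logic is in train.py):

```python
import random

def make_sample(digits: int, irregular: bool) -> tuple[str, str]:
    """Return a (prompt, target) pair for one addition problem."""
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    a, b = random.randint(lo, hi), random.randint(lo, hi)
    # The irregular operation targets a + b + 1 instead of a + b.
    answer = a + b + (1 if irregular else 0)
    return f"{a} + {b} =", str(answer)

random.seed(0)
prompt, target = make_sample(digits=3, irregular=True)
print(prompt, target)
```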
The model typically achieves:
- 95%+ accuracy on regular addition
- 90%+ accuracy on irregular addition
- Reasonable generalization to OOD digits
After training, a comprehensive report is saved in the model folder:
`models/minimal_example_lora/FINAL_TRAINING_REPORT.md`
This report includes:
- Complete configuration used
- Training dynamics with ASCII plots (loss curve, learning rate, evaluation progress)
- Detailed performance metrics and error analysis
- Sample predictions
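The ASCII plots in the report can be produced with a small helper along these lines (a sketch; the actual report generation in train.py may render them differently):

```python
def ascii_plot(values, height: int = 8, width: int = 40) -> str:
    """Render a list of numbers as a crude character-grid plot."""
    # Downsample to at most `width` points.
    step = max(1, len(values) // width)
    pts = values[::step][:width]
    lo, hi = min(pts), max(pts)
    span = (hi - lo) or 1.0
    # Build the grid row by row, top row = highest values.
    rows = [[" "] * len(pts) for _ in range(height)]
    for x, v in enumerate(pts):
        y = int((v - lo) / span * (height - 1))
        rows[height - 1 - y][x] = "*"
    return "\n".join("".join(r) for r in rows)

# Example: an exponentially decaying loss curve.
losses = [2.0 * 0.9 ** i for i in range(100)]
print(ascii_plot(losses))
```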
Dependencies:

- transformers
- peft
- torch
- PyYAML
- tqdm
See parent directory's requirements for full list.