This directory restructures the old `encoding_and_decoding/` into a clean, configurable setup similar to `minimal_example/`.
It trains a LoRA adapter on a base Llama model to:
- Encode: Given a binary code, emit text whose tokenization encodes the code (space-boundary convention).
- Decode: Given a message that was encoded with a binary code, output the original 16-bit code.
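The exact space-boundary convention is defined by the code in this directory; as an illustration only, here is a minimal sketch assuming one common variant, where bit i of the code is 1 if token i begins with a leading space and 0 otherwise (the function name and the convention details are assumptions, not this repo's API):

```python
# Hypothetical illustration of a space-boundary bit convention.
# Assumption: bit i is 1 when token i starts with a space, 0 otherwise.
# The actual convention used in this repo may differ; see fns.py.
def bits_from_tokens(tokens):
    """Read one bit per token from the presence of a leading space."""
    return [1 if t.startswith(" ") else 0 for t in tokens]

tokens = ["Hello", " world", "this", " is", "a", " test"]
print(bits_from_tokens(tokens))  # [0, 1, 0, 1, 0, 1]
```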
```
enc_and_dec/
├── config.yaml   # Training and eval configuration
├── train.py      # Training loop with periodic eval and report
├── eval.py       # Standalone evaluation utilities (incl. o3 messages)
├── fns.py        # Prompt builders and helpers
└── models/
    ├── .gitignore
    └── download.sh
```
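For orientation, a sketch of the kind of settings `config.yaml` typically holds for a setup like this. Every key and value below is a hypothetical placeholder, not the actual schema; consult `config.yaml` itself:

```yaml
# Hypothetical sketch only -- keys are placeholders, not the real schema.
model:
  base_model: <path-or-hub-id-of-base-Llama>
lora:
  r: 16
  alpha: 32
train:
  lr: 1.0e-4
  steps: 2000
  eval_every: 200
eval:
  num_codes: 100
```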
## Training

```bash
cd enc_and_dec
python -m enc_and_dec.train --config config.yaml
```
## Evaluation

```bash
python -m enc_and_dec.eval --config config.yaml

# Or only the o3 messages
python -m enc_and_dec.eval --config config.yaml --o3-only
```
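Since decoding targets a single 16-bit code, exact match is the natural metric. The helper below is a hypothetical sketch of such a metric, not a function from `eval.py` (its name and signature are assumptions):

```python
# Hypothetical exact-match metric for the decode task.
# Assumption: predictions and targets are 16-character bit strings.
def decode_accuracy(predictions, targets):
    """Fraction of predicted codes that exactly match the target codes."""
    assert len(predictions) == len(targets), "length mismatch"
    correct = sum(p.strip() == t for p, t in zip(predictions, targets))
    return correct / len(targets)

print(decode_accuracy(
    ["0101010101010101", "1111000011110000"],
    ["0101010101010101", "0000000000000000"],
))  # 0.5
```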
## Notes

- Uses `utils/tokenization_utils.py` for the tokenizer and helpers.
- Fails loudly: no silent fallbacks.
- No `torch.compile`, because generations are variable-length (encoding outputs vary in length).