How does Fine-Tuning Actually Work?
A reference guide for builders — based on the AISEA Co-labs session
Fine-tuning is the process of changing a pretrained model's weights so it behaves differently. Conceptually, a language model is a function f(x) that maps token sequences to probability distributions over possible next tokens. Fine-tuning changes that mapping. The model already has opinions — pretraining gave it those. Fine-tuning steers something that's already moving.
How models learn
Every training step runs the same loop: a forward pass produces a prediction, the loss function measures how wrong that prediction was (cross-entropy loss: L(y, ŷ) = −log[P(correct token)]), backpropagation computes a gradient for every weight in the network, and gradient descent takes a step downhill. Fine-tuning is this same loop running on your specific data at a lower learning rate, with most weights frozen depending on the technique you choose.
The spectrum — how much do you change?
There are four meaningful positions on the spectrum from least to most invasive: Text prompting changes nothing in the model. Examples in the context window bias the probability distribution at inference time. Free, instant, always the right starting point. Prompt tuning trains a small set of soft prompt vectors prepended to every input. The base model stays frozen. Gradients flow through it but only the prompt vectors receive updates. Rarely worth it in 2026 — mostly of academic interest now.
LoRA / QLoRA trains small low-rank adapter matrices alongside frozen base weights. The weight update ΔW = B × A, where A ∈ ℝʳˣⁿ and B ∈ ℝᵐˣʳ, with r (rank) chosen by the practitioner. With r=8 on a 7B model, the adapters represent 0.4% of the original parameter count. QLoRA adds 4-bit quantisation of the base weights, enabling fine-tuning of a 7B model on a 16GB consumer GPU.
Full fine-tuning updates all parameters. Requires A100-class hardware for anything beyond small models. Almost never the right first move.
The training objective: SFT
Supervised Fine-Tuning (SFT) is the training objective most builders will use. You supply labelled input/output pairs and train the model to reproduce the outputs. SFT is what you're optimising toward — LoRA is the efficiency strategy for how you get there. The two are not alternatives; they operate on different axes. When someone says "I LoRA fine-tuned my model," they almost certainly ran SFT with LoRA as the parameter efficiency technique.
The tool stack
Four layers, each answering a different question: HuggingFace Transformers + PEFT (foundation — model weights and LoRA math), Unsloth + Flash Attention (efficiency — 2× speed, 70% less VRAM, same results), Axolotl (orchestration — one YAML config wires model, dataset, technique, and hyperparams together), and managed platforms like Together AI or Fireworks AI (if you don't want to manage GPU infrastructure at all). For a first run: Google Colab + Unsloth + a 3B–8B open model is the zero-cost path.
Data formats
Training data must be structured correctly before the job will run. Alpaca format (instruction / input / output) suits single-turn tasks. ShareGPT format (a list of conversation turns) suits dialogue and multi-turn tasks. Raw JSONL suits continued pretraining on domain text or custom token templates. Data quality matters more than format choice — a thousand clean consistent examples outperform ten thousand messy ones.
When to use what
Start with prompting or RAG. If that provably fails, consider distillation — use a larger capable model to generate high-quality training data, then train a smaller model on it (distillation + LoRA is a common production pattern). If you need consistent style or format behaviour at scale and you want to own the weights, reach for LoRA/QLoRA with a dataset of at least 500 high-quality examples. Full fine-tuning is a last resort. The most common mistake builders make is reaching for fine-tuning before exhausting what a good prompt can do.
What fine-tuning cannot do
LoRA cannot inject new factual knowledge — that's what RAG is for. It can cause catastrophic forgetting if the dataset is too narrow or training runs too long. Rank is a ceiling, not a guarantee — a small or inconsistent dataset won't use all the expressive capacity you've allocated. And fine-tuning does not fix bad data: it amplifies whatever signal is present, including noise.
Attached files