Fine-Tuning a Language Model — Custom Training with LLaMA

DodaTech 1 min read

In this tutorial, you'll learn how to fine-tune an open-source language model on your own data. We cover the key concepts, a worked example, and practical tips for applying them.

What You'll Learn

Fine-tune an open-source language model (like LLaMA or Mistral) on your own dataset using LoRA — efficient training that runs on a single GPU.

Why It Matters

Base models are general. Fine-tuning adapts them to your domain: your company's style, your codebase, or a specific task format.

Real-World Use

Training a support bot on your company's tone, adapting a model to generate SQL from natural language, or teaching it your codebase's conventions.

What is Fine-Tuning?

Fine-tuning continues the training process on a smaller, specialized dataset. The model already knows language; you're teaching it your specific patterns.

Efficient Fine-Tuning with LoRA

LoRA (Low-Rank Adaptation) freezes the original weights and trains small adapter matrices instead. This:

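Cuts training memory substantially, because gradients and optimizer state are kept only for the adapters
Can train in hours instead of days on small datasets
Works on a single GPU (a 7B model loaded in 4-bit can fit on a card with around 8GB, depending on sequence length and batch size)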
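
To make the idea concrete, here is a minimal sketch of the low-rank update in plain Python. It is an illustration, not how PEFT implements it internally. The frozen weight W stays untouched, and the layer output becomes W x + (alpha / r) * B (A x), where A and B are the small trainable matrices. The helper names (`matvec`, `lora_forward`), the tiny dimensions, and the "trained" values are all assumptions made for the example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Frozen base output plus the scaled low-rank correction B(Ax)."""
    base = matvec(W, x)               # frozen weights, never updated
    update = matvec(B, matvec(A, x))  # (d_out x r) applied to (r x d_in) @ x
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


# Toy layer: 3 inputs -> 2 outputs, adapter rank r = 1.
W = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
A = [[0.5, 0.5, 0.5]]        # r x d_in
B_zero = [[0.0], [0.0]]      # d_out x r, zero at the start of training
B_trained = [[1.0], [-1.0]]  # hypothetical values after some training
x = [1.0, 2.0, 3.0]

start = lora_forward(W, A, B_zero, x, alpha=2, r=1)
after = lora_forward(W, A, B_trained, x, alpha=2, r=1)
```

With B at zero the adapted layer reproduces the base output exactly, which is why training starts from the pretrained behavior and only drifts as the adapters learn.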

Setup

pip install transformers datasets accelerate peft bitsandbytes

Step 1: Prepare Your Data

from datasets import Dataset

data = [
    {"instruction": "What is Python?", "output": "Python is a high-level programming language..."},
    {"instruction": "Explain decorators", "output": "Decorators are functions that modify other functions..."},
]

dataset = Dataset.from_list(data)

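def format_example(example):
    # Render one record into the instruction/response prompt template.
    # (Named format_example to avoid shadowing Python's built-in format.)
    return {"text": f"### Instruction: {example['instruction']}\n### Response: {example['output']}"}

dataset = dataset.map(format_example)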
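
Real training data usually lives in a file rather than an inline list. The sketch below reads instruction/output pairs from a JSON Lines file using only the standard library. The function name `load_instruction_records` and the one-object-per-line layout are assumptions for illustration, not something the tutorial prescribes.

```python
import json


def load_instruction_records(path):
    """Read one JSON object per line, validating the required fields."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            missing = {"instruction", "output"} - record.keys()
            if missing:
                raise ValueError(f"line {line_number}: missing {sorted(missing)}")
            # Keep only the two fields the prompt template uses.
            records.append({"instruction": record["instruction"],
                            "output": record["output"]})
    return records
```

The returned list has the same shape as the inline `data` list, so it can be passed to `Dataset.from_list` in its place.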

Step 2: Load Model with LoRA

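import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Reuse EOS as the padding token so batches can be padded.
tokenizer.pad_token = tokenizer.eos_token

# Load the base weights in 4-bit via bitsandbytes to fit a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
# Recommended PEFT step before training a quantized model.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,                                  # rank of the adapter matrices
    lora_alpha=32,                        # scaling numerator (alpha / r)
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)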
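
As a rough sanity check on how small the adapters are, the sketch below counts LoRA parameters for the rank and target modules used above. The layer shapes are assumptions based on the published Mistral-7B architecture (hidden size 4096, 32 layers, and a 1024-wide `v_proj` because of grouped-query attention); verify them against the model's config before relying on the number. With the model loaded, PEFT's `model.print_trainable_parameters()` reports the actual count.

```python
def lora_param_count(d_in, d_out, r):
    """Parameters in one adapter pair: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


# Assumed Mistral-7B shapes; check model.config for the real values.
HIDDEN_SIZE = 4096
NUM_LAYERS = 32
PROJECTION_SHAPES = {
    "q_proj": (HIDDEN_SIZE, 4096),  # (d_in, d_out)
    "v_proj": (HIDDEN_SIZE, 1024),
}

RANK = 8  # matches r=8 in the LoraConfig

per_layer = sum(lora_param_count(d_in, d_out, RANK)
                for d_in, d_out in PROJECTION_SHAPES.values())
total_trainable = per_layer * NUM_LAYERS
print(f"{total_trainable:,} trainable adapter parameters")
```

Under these assumptions the adapters hold about 3.4 million parameters, a small fraction of a percent of the roughly 7 billion frozen weights.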

Step 3: Train

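from transformers import DataCollatorForLanguageModeling, Trainer, TrainingArguments

def tokenize(batch):
    # Convert the formatted prompts into token IDs; max_length=512 is an
    # assumed cap, adjust it to your data and GPU memory.
    return tokenizer(batch["text"], truncation=True, max_length=512)

# Trainer expects token IDs, so drop the raw text columns after tokenizing.
tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Pads each batch and builds labels from input_ids for causal LM training.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="./lora-finetuned",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=2e-4,
    logging_steps=10,
    save_steps=100,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
)

trainer.train()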
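
Because `logging_steps` and `save_steps` are counted in optimizer steps, it helps to estimate how many steps a run will take. The sketch below assumes a single GPU and no gradient accumulation, as in the arguments above. `estimate_steps` is a hypothetical helper, not part of transformers, and the 1,000-example dataset is an illustrative figure.

```python
import math


def estimate_steps(num_examples, batch_size, epochs):
    """Approximate optimizer steps for a single-device run."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs


# Hypothetical 1,000-example dataset with batch size 4 and 3 epochs.
total_steps = estimate_steps(1000, batch_size=4, epochs=3)
checkpoints = total_steps // 100  # save_steps=100 intervals reached

# The two toy examples from Step 1 with the same settings.
toy_steps = estimate_steps(2, batch_size=4, epochs=3)
```

With only the two toy examples the run lasts a handful of steps, so the step-based logging and save intervals are never reached; a real dataset needs far more examples for those settings to matter.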

Step 4: Use Your Fine-Tuned Model

model.eval()
# Move the inputs to the model's device before generating.
inputs = tokenizer("### Instruction: What is a decorator?\n### Response:", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
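
The decoded text contains the prompt followed by the completion, so it is often convenient to strip the prompt before showing the answer. A minimal sketch, assuming the `### Response:` marker from the prompt template; `extract_response` is a hypothetical helper and the sample string is hand-written, not real model output.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded_text, marker=RESPONSE_MARKER):
    """Return the text after the first response marker, or the whole text."""
    _, found, completion = decoded_text.partition(marker)
    return completion.strip() if found else decoded_text.strip()


# Hand-written stand-in for a decoded generation.
sample = ("### Instruction: What is a decorator?\n"
          "### Response: A function that wraps another function.")
answer = extract_response(sample)
```

In the generation snippet, the same helper could be applied to the string returned by `tokenizer.decode` before printing it.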
