Customizing Foundation Models

The adaptation ladder — prompting → RAG → fine-tuning → pre-training — and how to pick a model and method.

9 min read

There's a ladder of ways to make a foundation model fit your needs. Each rung adds capability — and cost, data requirements, and complexity. The exam constantly asks you to pick the right rung for a scenario.

The adaptation ladder (cheapest → most expensive)

MethodWhat it changesData neededTypical trigger
Prompt engineeringNothing in the model — just better instructionsNoneFirst attempt at any task
RAGWhat the model can *reference*Your documents (unlabeled)Company knowledge, current facts, citations
Fine-tuningThe model's *weights* — behavior, style, format, domain languageHundreds–thousands of labeled examplesConsistent specialized behavior prompting can't achieve
Continued pre-trainingDeep domain knowledge in the weightsLarge unlabeled domain corpusHeavy domain adaptation (medical, legal)
Train from scratchEverythingInternet-scale data + millions in computeAlmost never the right answer on this exam
Exam tip

Data-type tell: fine-tuning uses labeled prompt/response pairs; continued pre-training uses unlabeled domain text. And in Bedrock, customized models produce a private copy — the shared base model is never altered by your data — and require Provisioned Throughput to serve.

Choosing the base model

Selection criteria

  • Capability/quality on your task (benchmarks + your own evaluation).
  • Modality — text, image, embeddings, multimodal.
  • Context window — how much input it must hold.
  • Cost and latency — smaller models are cheaper and faster; prefer the smallest that meets quality.
  • Customization options — does it support fine-tuning?
  • Licensing — proprietary vs open weights.
Think of it like this

Prompting is giving directions. RAG is handing over the map. Fine-tuning is a training course that changes habits. Continued pre-training is a second university degree. Training from scratch is raising the employee from birth — almost never worth it.

Knowledge check
Question 1 of 4

A company wants its model to ALWAYS respond in its precise brand voice and proprietary report format — something prompting alone hasn't achieved reliably. It has 5,000 labeled example pairs. Which approach fits?