Principles of Responsible AI

The dimensions of trustworthy AI — fairness, explainability, transparency, privacy, robustness — and where bias comes from.

The core dimensions

  • Fairness — the system's outcomes don't systematically disadvantage people based on group attributes such as age, gender, or ethnicity.
  • Explainability — humans can understand *why* the model made a decision.
  • Transparency — being open about how the system works, its data, and its limits.
  • Privacy & security — protecting personal data throughout the AI lifecycle.
  • Robustness — reliable behavior even on unexpected or adversarial inputs.
  • Veracity/safety — outputs are accurate and not harmful.
  • Controllability & governance — humans can monitor, guide, and override the system; accountability is assigned.
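
The fairness dimension can be made measurable by comparing how often each group receives the favorable outcome. The following is a minimal sketch in plain Python, assuming binary decisions (1 = favorable) and a single group attribute. The function names and the toy data are hypothetical, and the two metrics shown (demographic parity difference and disparate impact ratio) are only two of several common fairness definitions.

```python
from collections import defaultdict


def selection_rates(groups, predictions):
    """Return the share of favorable predictions (1s) for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += pred
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_difference(rates):
    """Gap between the highest and lowest selection rate (0 means parity)."""
    return max(rates.values()) - min(rates.values())


def disparate_impact_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical toy data: group membership and model decision (1 = approved).
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, predictions)
dp_diff = demographic_parity_difference(rates)
di_ratio = disparate_impact_ratio(rates)

print("selection rates:", rates)
print("demographic parity difference:", round(dp_diff, 3))
print("disparate impact ratio:", round(di_ratio, 3))
```

In this toy data group A is approved three times out of four and group B once out of four, so both metrics flag a large gap. What counts as an acceptable gap depends on the use case and applicable policy; the code only reports the numbers.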

Where bias comes from

| Source | Example |
|---|---|
| Unrepresentative training data (sampling bias) | A hiring model trained mostly on one demographic's résumés |
| Historical bias baked into data | Loan data reflecting decades of discriminatory lending |
| Measurement/labeling bias | Human labelers applying inconsistent standards |
| Algorithmic amplification | The model exaggerates small imbalances present in data |

Exam tip

Mitigations to recognize: diverse, representative, high-quality training data; bias audits before and after deployment (for example, with Amazon SageMaker Clarify); human oversight for consequential decisions; continuous monitoring of live behavior. Expect "garbage in, garbage out" to be tested: biased data yields biased models regardless of the algorithm.
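
A pre-deployment bias audit can start with the training data itself, before any model exists. The sketch below is plain Python, not the SageMaker Clarify API. It computes two dataset-level checks modeled on the kind of pre-training metrics such tools report: class imbalance (is one group under-represented?) and the difference in positive-label proportions (does the historical label favor one group?). The dataset, the group names `a` and `d`, and the function names are hypothetical.

```python
def class_imbalance(n_a, n_d):
    """(n_a - n_d) / (n_a + n_d): 0 is balanced, values near +1 or -1 are skewed."""
    return (n_a - n_d) / (n_a + n_d)


def label_proportion_difference(pos_a, n_a, pos_d, n_d):
    """Share of positive labels in group a minus the share in group d."""
    return pos_a / n_a - pos_d / n_d


# Hypothetical resume dataset as (group, hired_label) pairs.
records = (
    [("a", 1)] * 60 + [("a", 0)] * 20   # group a: 80 rows, 60 hired
    + [("d", 1)] * 5 + [("d", 0)] * 15  # group d: 20 rows, 5 hired
)

n_a = sum(1 for group, _ in records if group == "a")
n_d = sum(1 for group, _ in records if group == "d")
pos_a = sum(label for group, label in records if group == "a")
pos_d = sum(label for group, label in records if group == "d")

ci = class_imbalance(n_a, n_d)
dpl = label_proportion_difference(pos_a, n_a, pos_d, n_d)

print("class imbalance:", round(ci, 3))
print("difference in positive-label proportions:", round(dpl, 3))
```

Here the class imbalance points to sampling bias (group d is only a fifth of the data), while the label gap points to historical bias in the outcomes. A model trained on this data would be likely to reproduce both, whatever algorithm is used.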

The interpretability trade-off

Simple models (linear regression, decision trees) are interpretable — you can read exactly why they decided. Deep neural networks and LLMs are far more capable but relatively opaque ("black boxes"). Regulated, high-stakes decisions (credit, hiring, healthcare) may favor an interpretable model, or require explainability tooling and human review layered on the powerful one.
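
To make "you can read exactly why they decided" concrete, here is a minimal sketch of a linear scoring model in plain Python. The feature names, weights, and intercept are invented for illustration and do not come from any real credit model. Because the score is just a weighted sum, each feature's contribution to a decision can be listed directly, which is not possible in the same way for a deep network.

```python
# Hypothetical weights for a toy credit-scoring model (illustrative only).
WEIGHTS = {
    "income_k": 0.02,         # per thousand of annual income
    "years_employed": 0.1,
    "missed_payments": -0.5,
}
INTERCEPT = -1.0


def explain(applicant):
    """Return the score and the per-feature contributions that sum to it."""
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    score = INTERCEPT + sum(contributions.values())
    return score, contributions


# Hypothetical applicant.
applicant = {"income_k": 50, "years_employed": 4, "missed_payments": 2}
score, contributions = explain(applicant)

# List features by how strongly they pushed the score up or down.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>16}: {value:+.2f}")
print(f"{'intercept':>16}: {INTERCEPT:+.2f}")
print(f"{'score':>16}: {score:+.2f}")
```

The printed breakdown is the explanation: a reviewer can see which inputs raised or lowered the score and by how much. For opaque models, comparable breakdowns have to be approximated with separate explainability tooling.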

Think of it like this

An interpretable model is a chef who explains every ingredient. A deep network is a chef whose dishes are superb but whose recipe even they can't fully articulate. For a casual dinner that's fine; for allergy-safe catering, you need the ingredient list.

Knowledge check

A model systematically scores loan applicants from one region lower because the training data reflected historical discrimination. Which responsible AI dimension is violated?