Principles of Responsible AI

The dimensions of trustworthy AI, including fairness, explainability, transparency, privacy, and robustness, and where bias comes from.


The core dimensions

Fairness — the system's outcomes don't disadvantage groups (age, gender, ethnicity…).
Explainability — humans can understand *why* the model made a decision.
Transparency — being open about how the system works, its data, and its limits.
Privacy & security — protecting personal data throughout the AI lifecycle.
Robustness — reliable behavior even on unexpected or adversarial inputs.
Veracity/safety — outputs are accurate and not harmful.
Controllability & governance — humans can monitor, guide, and override the system; accountability is assigned.

Where bias comes from

Source	Example
Unrepresentative training data (sampling bias)	A hiring model trained mostly on one demographic's résumés
Historical bias baked into data	Loan data reflecting decades of discriminatory lending
Measurement/labeling bias	Human labelers applying inconsistent standards
Algorithmic amplification	The model exaggerates small imbalances present in data
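
To make the sampling-bias row concrete, here is a minimal sketch of a representation check: it compares each group's share of the training data with its share in a reference population. The function name `representation_gaps`, the group labels, and the 50/50 reference shares are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def representation_gaps(train_groups, reference_shares):
    """Return each group's training-data share minus its reference share.

    A large positive value means the group is over-represented in the
    training data; a large negative value means it is under-represented.
    """
    counts = Counter(train_groups)
    total = len(train_groups)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical résumé dataset: 80 résumés from group "A", 20 from group "B".
train_groups = ["A"] * 80 + ["B"] * 20
# Assumed share of each group in the real applicant population.
reference_shares = {"A": 0.5, "B": 0.5}

gaps = representation_gaps(train_groups, reference_shares)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.2f}")
```

A check like this only catches unrepresentative sampling. Historical bias can remain even when every group is represented in the right proportion, because the labels themselves may encode past discrimination.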

Exam tip

Mitigations to recognize: diverse, representative, high-quality training data; bias audits before and after deployment (for example, with Amazon SageMaker Clarify); human oversight for consequential decisions; continuous monitoring of live behavior. "Garbage in, garbage out" is tested: biased data yields biased models regardless of the algorithm.
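
As a rough illustration of what a bias audit measures, the sketch below computes per-group approval rates and their ratio, a metric often called disparate impact. It is plain Python, not the SageMaker Clarify API; the function names, the region labels, and the decision data are hypothetical, and the 0.8 threshold is the commonly cited "four-fifths" heuristic, used here as an assumption rather than a universal standard.

```python
def selection_rates(decisions):
    """Map each group to its share of positive decisions.

    `decisions` is an iterable of (group, approved) pairs, where
    approved is 1 for a positive outcome and 0 otherwise.
    """
    totals, positives = {}, {}
    for group, approved in decisions:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + approved
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact(rates, disadvantaged, advantaged):
    """Ratio of the two groups' selection rates (1.0 means parity)."""
    return rates[disadvantaged] / rates[advantaged]


# Hypothetical loan decisions: 60 of 100 approved in "north", 30 of 100 in "south".
decisions = (
    [("north", 1)] * 60 + [("north", 0)] * 40
    + [("south", 1)] * 30 + [("south", 0)] * 70
)

rates = selection_rates(decisions)
ratio = disparate_impact(rates, "south", "north")

# Assumed threshold: the "four-fifths" heuristic flags ratios below 0.8.
needs_review = ratio < 0.8
print(rates, round(ratio, 2), needs_review)
```

Running the same calculation on training labels before deployment and on live predictions afterward is one simple way to cover both audit points named above.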

The interpretability trade-off

Simple models (linear regression, small decision trees) are interpretable: you can trace exactly how each input contributed to a decision. Deep neural networks and LLMs are far more capable but relatively opaque ("black boxes"). Regulated, high-stakes decisions (credit, hiring, healthcare) may favor an interpretable model, or require explainability tooling and human review layered on top of the more powerful one.
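
To show what "interpretable" means in practice, here is a minimal sketch of a linear scoring model whose decision can be broken down feature by feature. The weights, bias, feature names, and applicant values are invented for illustration and do not come from any real credit model.

```python
def explain_linear_score(weights, bias, features):
    """Return the model score and each feature's additive contribution.

    For a linear model, score = bias + sum(weight * value), so every
    feature's effect on the decision can be read off directly.
    """
    contributions = {
        name: weights[name] * value for name, value in features.items()
    }
    score = bias + sum(contributions.values())
    return score, contributions


# Hypothetical weights for a toy credit-scoring model.
weights = {"income_k": 0.02, "debt_ratio": -1.5, "years_employed": 0.1}
bias = -0.5
applicant = {"income_k": 60, "debt_ratio": 0.4, "years_employed": 3}

score, contributions = explain_linear_score(weights, bias, applicant)

# List features from most to least influential for this applicant.
for name, value in sorted(contributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.2f}")
print(f"score: {score:.2f}")
```

A deep network offers no such direct breakdown, which is why it typically needs separate explainability tooling to approximate one.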

Think of it like this

An interpretable model is a chef who explains every ingredient. A deep network is a chef whose dishes are superb but whose recipe even they can't fully articulate. For a casual dinner that's fine; for allergy-safe catering, you need the ingredient list.

Knowledge check


A model systematically scores loan applicants from one region lower because the training data reflected historical discrimination. Which responsible AI dimension is violated?
