The AWS AI/ML Service Stack
SageMaker, the pre-trained AI services, and how to choose between building, buying, and prompting.
AWS organizes its AI/ML offerings in three layers. At the bottom: infrastructure (GPU instances, custom Trainium/Inferentia chips). In the middle: Amazon SageMaker AI for building custom models. On top: AI services — pre-trained, API-callable capabilities that require zero ML expertise. Knowing which layer a scenario needs is a repeated exam theme.
Amazon SageMaker AI
Key points
- The managed platform to build, train, and deploy custom ML models end to end.
- SageMaker JumpStart — pre-built models and solutions to start from.
- SageMaker Data Wrangler / Feature Store — prepare data and manage features.
- SageMaker Clarify — detect bias and explain predictions (big in Domain 4).
- SageMaker Model Monitor — watch deployed models for drift.
- SageMaker Ground Truth — human labeling of training data.
Pre-trained AI services (know the one-liners)
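- Amazon Rekognition: image and video analysis (objects, faces, content moderation).
- Amazon Comprehend: NLP on text (sentiment, entities, PII detection).
- Amazon Transcribe: speech to text.
- Amazon Polly: text to speech.
- Amazon Translate: language translation.
- Amazon Textract: extracts text, tables, and forms from documents.
- Amazon Lex: conversational chatbots and voice bots.
- Amazon Kendra: intelligent enterprise search over your documents.
- Amazon Personalize: real-time recommendation engines.
- Amazon Forecast: time-series forecasting (demand, inventory).
- Amazon Fraud Detector: detects online fraud with ML.
- Amazon Bedrock: generative AI, with access to foundation models via one API (next module).
- Amazon Q: GenAI assistants for business users and developers (next module).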
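To show what "API-callable, zero ML expertise" looks like in practice, here is a minimal sketch of text-sentiment analysis with Amazon Comprehend through boto3. The helper names `detect_sentiment` and `make_comprehend_client`, the default region, and the shape of the returned dictionary are illustrative assumptions, not part of the course material. The client is passed in as a parameter so the helper can be tried with a stub before any AWS credentials are configured.

```python
from typing import Any, Dict


def detect_sentiment(client: Any, text: str, language_code: str = "en") -> Dict[str, Any]:
    """Return the dominant sentiment label and its confidence score.

    `client` is expected to behave like boto3.client("comprehend").
    """
    response = client.detect_sentiment(Text=text, LanguageCode=language_code)
    label = response["Sentiment"]  # "POSITIVE", "NEGATIVE", "NEUTRAL", or "MIXED"
    # SentimentScore uses capitalized keys: Positive, Negative, Neutral, Mixed.
    score = response["SentimentScore"][label.capitalize()]
    return {"label": label, "score": score}


def make_comprehend_client(region_name: str = "us-east-1") -> Any:
    """Build a real Comprehend client (hypothetical default region)."""
    # Imported lazily so detect_sentiment() can be exercised without boto3 installed.
    import boto3

    return boto3.client("comprehend", region_name=region_name)
```

With credentials in place, a call such as `detect_sentiment(make_comprehend_client(), "The checkout flow was fast and painless.")` would send the text to the service. No model is trained or hosted by the caller, which is the point of this layer.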
Choose the layer by the constraint: "no ML expertise / fastest" → a pre-trained AI service; "custom model on our own data" → SageMaker; "generative AI / chatbot on company knowledge" → Bedrock (or Amazon Q). If a pre-trained service does the job, prefer it over building: it costs less, takes less effort, and needs no training data.
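The decision rule above can be written down as a small function, which may help when drilling scenario questions. This is only a sketch: the `Layer` enum, the `choose_layer` function, its two boolean parameters, and the order in which the checks run are assumptions made for illustration, not an official AWS decision procedure.

```python
from enum import Enum


class Layer(Enum):
    AI_SERVICE = "pre-trained AI service"
    SAGEMAKER = "Amazon SageMaker AI"
    BEDROCK = "Amazon Bedrock (or Amazon Q)"


def choose_layer(needs_generative_ai: bool, pretrained_service_fits: bool) -> Layer:
    """Map a scenario's main constraint to a layer of the stack.

    Assumed ordering: generative AI needs are checked first, then whether
    an existing pre-trained service already does the job.
    """
    if needs_generative_ai:
        # "generative AI / chatbot on company knowledge"
        return Layer.BEDROCK
    if pretrained_service_fits:
        # "no ML expertise / fastest": buying beats building
        return Layer.AI_SERVICE
    # "custom model on our own data"
    return Layer.SAGEMAKER
```

For example, a team with no ML expertise that needs sentiment analysis quickly corresponds to `choose_layer(needs_generative_ai=False, pretrained_service_fits=True)`, which returns `Layer.AI_SERVICE`.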
Underneath everything: AWS's ML infrastructure — GPU instance families, plus purpose-built silicon: AWS Trainium (training) and AWS Inferentia (inference) chips for better price-performance and energy efficiency.
A company with NO machine learning expertise wants to add text-sentiment analysis to its app as quickly as possible. What should it use?