🎓 AWS DVA-C02 STUDY GUIDE - COMPREHENSIVE EDITION
📊 EXECUTIVE SUMMARY - Exam at a Glance
🎯 Exam Overview
- Duration: 130 minutes
- Questions: 65
- Passing score: 720/1000
- Format: Multiple choice + multiple response
- Price: $150 USD
📈 Score Distribution
- Domain 1: Development with AWS Services (32%)
- Domain 2: Security (26%)
- Domain 3: Deployment (24%)
- Domain 4: Troubleshooting and Optimization (18%)
⚡ Top 5 Services (roughly 55-67% of questions by the estimates below)
- Lambda: 15-18% of questions
- DynamoDB: 12-15%
- API Gateway: 10-12%
- IAM: 10-12%
- CloudWatch: 8-10%
🎓 Study Strategy
- Week 1-3: TIER 1 (deep dive)
- Week 4: TIER 2 (focused)
- Week 5: TIER 3 + Integration
- Week 6: Practice exams + review
- Target: 85%+ on practice tests
🔑 Service Dependency Map
┌──────────────────────────────────────────────────────────────────┐
│ APPLICATION LAYER │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ API Gateway │───▶│ Lambda │───▶│ DynamoDB │ │
│ │ (Frontend) │ │ (Compute) │ │ (Data) │ │
│ └──────────────┘ └──────┬───────┘ └──────────────┘ │
│ │ │ │ │
└─────────┼────────────────────┼────────────────────┼──────────────┘
│ │ │
┌─────────┼────────────────────┼────────────────────┼──────────────┐
│ ▼ SECURITY & OBSERVABILITY ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ Cognito │ │ IAM │ │ CloudWatch │ │
│ │ (Auth) │ │ (Access) │ │ (Monitor) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────────┘
│ │ │
┌─────────┼────────────────────┼────────────────────┼──────────────┐
│ ▼ INTEGRATION & DELIVERY ▼ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │ SQS/SNS │ │ CodePipeline│ │ S3 │ │
│ │ (Messaging) │ │ (CI/CD) │ │ (Storage) │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────────┘
🧭 NAVIGATION HUB - Study Roadmap
TIER 1 - CRITICAL SERVICES (65-70% of score)
1. AWS LAMBDA
🔷 AWS LAMBDA - Core Overview
🎯 Exam Weight
15-18% of total exam
~10-12 questions
⚡ Core Purpose
Serverless compute - event-driven code execution
No servers to manage
🔑 Must-Know Topics
- Invocation types & triggers
- Concurrency models
- Error handling & retries
- VPC integration
🎓 Study Priority
⭐⭐⭐⭐⭐ CRITICAL
Est. time: 8-10 hours
Spend most time here!
💡 Mental Model: Lambda as a "Function Vending Machine"
┌──────────────────────────────────────────────────────────────┐
│ EVENT arrives (API call, S3 upload, Schedule, etc) │
└─────────────────────┬────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ Lambda Container Pool │
│ ┌────────────┐ ┌────────────┐ ┌────────────┐ │
│ │ WARM │ │ COLD │ │ PROVISIONED│ │
│ │ Container │ │ Container │ │ Container │ │
│ │ (Ready) │ │ (Setup) │ │ (Always) │ │
│ └──────┬─────┘ └──────┬─────┘ └──────┬─────┘ │
└─────────┼────────────────┼────────────────┼──────────────────┘
↓ ↓ ↓
<10ms 100-300ms Always warm
Reuse Cold Start (Pre-warmed)
existing Initialize Cost: $$$
container everything
🎭 A simple analogy:
Lambda = an employee paid by the hour:
- Warm start: the employee is already in the office and can start right away (< 10ms)
- Cold start: you have to call the employee in, unlock the door, and set up the desk (100-300ms delay)
- Provisioned: you pay to keep the employee in the office at all times (more expensive, but no delay)
📚 Core Concepts
- Serverless compute service: run code without managing servers
- Event-driven: code runs only when a trigger fires
- Pay per use: billed by number of requests + compute time
- Auto-scaling: scales automatically with incoming requests
- Stateless: each invocation is independent; do not rely on state shared between invocations
⚙️ Function Configuration
Runtime: Node.js, Python, Java, Go, .NET, Ruby, Custom Runtime
Memory: 128MB - 10,240MB (CPU scales proportionally with memory)
Timeout: max 900s (15 minutes)
Ephemeral storage: /tmp (512MB - 10GB)
Environment variables: 4KB total size limit
Handler: entry point function (e.g., index.handler)
🔄 Execution Environment
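- Cold Start: on the first invocation (or when scaling out), AWS must set up a new execution environment (~100-300ms is typical; it varies by runtime and package size)
- Warm Start: the function ran recently, so the existing execution environment is reused
- Optimization:
- Minimize package size
- Use layers for shared code
- Keep functions warm with scheduled events (best effort only; Provisioned Concurrency is the supported way to avoid cold starts)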
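The cold/warm distinction can be sketched in plain Python: module-level code runs once per execution environment (the cold start), while the handler body runs on every invocation. This is a minimal local simulation, not Lambda itself; `load_config` and `INIT_COUNT` are hypothetical names used only for illustration.

```python
import time

INIT_COUNT = 0  # how many times the expensive setup ran


def load_config():
    """Hypothetical expensive setup (reading config, opening connections)."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"loaded_at": time.time()}


# Module-level code runs once per execution environment (cold start only).
CONFIG = load_config()


def handler(event, context):
    # Warm invocations reuse CONFIG instead of rebuilding it.
    return {"init_count": INIT_COUNT, "echo": event}


# Simulate three invocations that land on the same warm environment.
results = [handler({"n": i}, None) for i in range(3)]
```

The setup runs once even though the handler runs three times, which is why heavy initialization usually belongs outside the handler.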
Environment Variables
- Key-value pairs available to the function code
- Can be encrypted with KMS
- Use case: configuration, endpoints, feature flags (keep DB passwords and API keys in Secrets Manager or Parameter Store instead)
Layers
- Shared code/libraries reused by multiple functions
- Max 5 layers per function
- Max 250MB unzipped (all layers + function)
- Use case: common dependencies, libraries, custom runtimes
Versions & Aliases
Versions:
- Immutable snapshot of function code + configuration
- $LATEST = the unpublished, mutable copy of the function
- Published versions: 1, 2, 3... (immutable)
Aliases:
- Pointer to a specific version
- Mutable (you can change which version the alias points to)
- Use case: dev, staging, prod environments
- Weighted aliases: canary / blue-green traffic shifting between two versions (e.g., 70% v1, 30% v2)
Concurrency
Reserved Concurrency:
- Reserves a number of concurrent executions for the function
- Guarantees the function always has that capacity available
- Also caps the function at that number of concurrent executions
Provisioned Concurrency:
- Keeps X execution environments initialized (no cold start)
- Costs more but gives more predictable latency
- Use case: latency-sensitive applications
Account limit: 1000 concurrent executions per Region (can be increased)
Error Handling
Synchronous invocation:
- API Gateway, ALB, Cognito
- Errors are returned to the caller immediately
- The client is responsible for retries
Asynchronous invocation:
- S3, SNS, EventBridge
- Lambda retries twice automatically (3 attempts total)
- Destinations can be configured:
- On Success: SQS, SNS, Lambda, EventBridge
- On Failure: SQS, SNS, Lambda, EventBridge (or a DLQ, which supports only SQS and SNS)
Stream-based invocation:
- DynamoDB Streams, Kinesis
- Lambda retries until the batch succeeds or the data expires (default behavior)
- Failed batches block shard processing
Dead Letter Queue (DLQ)
- An SQS queue or SNS topic that receives failed events
- Only for async invocations
- The execution role needs permission to send to it (sqs:SendMessage or sns:Publish)
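The async retry rule (1 attempt + 2 retries, then the DLQ) can be illustrated with a small local simulation. This is a sketch of the behavior described above, not the Lambda service; `invoke_async`, `flaky`, and `always_fails` are hypothetical names, and real Lambda also waits between retries.

```python
def invoke_async(handler, event, max_retries=2):
    """Simulate async invocation: 1 initial attempt + max_retries retries.

    Returns (result, dead_letter). dead_letter is set only when every
    attempt failed, mimicking an event landing in a DLQ.
    """
    attempts = 0
    last_error = None
    for _ in range(1 + max_retries):
        attempts += 1
        try:
            return {"attempts": attempts, "result": handler(event)}, None
        except Exception as exc:  # any handler error triggers a retry
            last_error = exc
    return None, {"event": event, "attempts": attempts, "error": str(last_error)}


def always_fails(event):
    raise RuntimeError("boom")


calls = {"n": 0}


def flaky(event):
    # Fails twice, then succeeds on the third attempt.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient")
    return "ok"


ok, dlq_for_ok = invoke_async(flaky, {"id": 1})
failed, dlq = invoke_async(always_fails, {"id": 2})
```

The flaky handler succeeds on the last allowed attempt, while the always-failing one ends up in the simulated DLQ after three attempts.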
Common Triggers
Trigger | Type | Use Case
API Gateway | Sync | REST APIs
S3 | Async | File processing
DynamoDB Streams | Stream | Data replication
SQS | Poll-based | Queue processing
EventBridge | Async | Scheduled tasks, event routing
SNS | Async | Fan-out notifications
CloudWatch Logs | Async | Log processing
IAM Permissions
Execution Role: what Lambda needs in order to access AWS resources
{
"Effect": "Allow",
"Action": [
"logs:CreateLogGroup",
"logs:CreateLogStream",
"logs:PutLogEvents",
"dynamodb:GetItem"
],
"Resource": "*"
}
Resource-based Policy: Who can invoke Lambda
{
"Effect": "Allow",
"Principal": {"Service": "s3.amazonaws.com"},
"Action": "lambda:InvokeFunction"
}
🧠 Mnemonics & Memory Tricks
Lambda Limits - "LAMBDA TIME"
- Layers: max 5 layers per function
- Account concurrent: 1000 default limit
- Memory: 128MB - 10,240MB (10GB)
- Burst storage: /tmp 512MB - 10GB
- Duration: max 900s = 15 minutes
- All code size: 250MB unzipped (function + layers)
- Triggers: 20+ AWS services can invoke
- Invocations: 3 types - Sync, Async, Stream
- Minimum memory: 128MB
- Environment vars: 4KB total size
Invocation Types - "SAS Model"
Type | Retry Behavior | Triggers | Use When
Sync | Client retries | API Gateway, ALB, direct invoke | Need immediate response
Async | Lambda retries 2x (3 total) | S3, SNS, EventBridge, SES | Fire-and-forget, background jobs
Stream | Retry until success/expire | Kinesis, DynamoDB Streams, SQS | Ordered processing, queue polling
Concurrency - "RIP" Model
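- Reserved Concurrency: caps max executions and guarantees that capacity
- Incremental scaling: an initial burst, then a steady ramp (classic model: +500 burst, then +500/minute; since late 2023 each function can add up to 1,000 concurrent executions every 10 seconds)
- Provisioned Concurrency: keeps environments warm, no cold start (more expensive)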
VPC Lambda - "ENI" Rule
Remember: VPC Lambda needs ENI (Elastic Network Interface)
- ENI setup takes time → historically 1-2s extra on cold start (largely mitigated since AWS moved to shared Hyperplane ENIs in 2019)
- NAT gateway needed for internet access
- IAM role needs EC2 network-interface permissions
⚠️ Common Mistakes & Exam Traps
Mistake | Why Wrong | Correct Approach | Exam Keyword
Hardcode credentials in env vars | Security risk, rotation breaks | Use IAM roles + Secrets Manager / Systems Manager Parameter Store | "Store DB password" → ❌ env vars
Not setting timeout correctly | Default 3s too short | Set timeout > expected duration, max 15min | "Silent failures" = timeout issue
Ignoring cold starts | Latency spikes hurt UX | Provisioned concurrency OR optimize package size | "Latency-sensitive" = provisioned
VPC Lambda without NAT | Cannot reach internet endpoints | Add NAT gateway OR use VPC endpoints for AWS services | "VPC + internet" = NAT gateway
Not using DLQ for async | Failed events lost after retries | Configure SQS/SNS DLQ to capture failures | "Prevent data loss" = DLQ
Relying on /tmp for persistence | /tmp is per execution environment: not shared, and not guaranteed to survive between invocations | Use S3 or EFS for persistent storage | "Share data between invocations" = S3/EFS
Not handling throttling | Unhandled 429 errors fail requests | Exponential backoff + retry logic OR SQS buffer | "Too many requests" = throttling
Using wrong invocation type | Sync when should be async | Async for long-running, sync for immediate response | "> 29s processing" = must use async
🔗 Integration Patterns with Lambda
Pattern 1: API-Driven (Synchronous)
┌─────────┐ ┌──────────────┐ ┌────────┐ ┌──────────┐
│ Client │────▶│ API Gateway │─────▶│ Lambda │────▶│ DynamoDB │
└─────────┘ │ (REST/HTTP) │ │ (<29s) │ │ / RDS │
└──────────────┘ └────┬───┘ └──────────┘
│
▼
Response back
✅ Use when:
- Web/mobile API
- Need immediate response
- User waiting for result
⚠️ Watch out:
- API Gateway timeout: 29s max
- Lambda timeout must be < 29s
- Handle errors gracefully (return proper HTTP codes)
💰 Cost: API Gateway requests + Lambda compute time
Pattern 2: Event-Driven (Asynchronous)
┌─────────┐ ┌────────┐ ┌────────────┐ ┌──────────┐
│ User │─────▶│ S3 │────▶│ Lambda │─────▶│ SNS │
│ uploads │ │ Event │ │ (Process) │ │ Notify │
└─────────┘ └────────┘ └─────┬──────┘ └──────────┘
│
▼ (On failure)
┌──────────┐
│ DLQ │
│ (SQS) │
└──────────┘
✅ Use when:
- Background processing
- No immediate response needed
- Can tolerate eventual consistency
⚠️ Watch out:
- Configure DLQ to catch failures
- Idempotency (duplicates possible)
- Lambda retries 2x automatically
💰 Cost: S3 events (free) + Lambda compute only
Pattern 3: Queue-Driven (Poll-based)
┌──────────┐ ┌─────────┐ ┌────────────────┐ ┌─────────┐
│ Producer │─────▶│ SQS │◀───▶│ Lambda (polls)│─────▶│ Process │
│ │ │ (Buffer)│ │ Batch: 1-10 │ │ Delete │
└──────────┘ └─────────┘ └────────┬───────┘ └─────────┘
│
▼ (Failed msgs)
┌──────────┐
│ DLQ/Retry│
└──────────┘
✅ Use when:
- Decouple producer/consumer
- Throttle processing rate
- Need retry control
- Variable traffic patterns
⚠️ Watch out:
- Visibility timeout must be > Lambda timeout (AWS recommends at least 6x the function timeout)
- Failed messages go back to queue
- FIFO queue for ordering (lower throughput)
💰 Cost: SQS requests + Lambda compute time
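One way to keep a single bad message from sending the whole batch back to the queue is a partial batch response. The sketch below assumes the event source mapping has ReportBatchItemFailures enabled (a setting not covered above); `process` and the `order_id` field are hypothetical business logic.

```python
import json


def process(body):
    """Hypothetical business logic: reject messages without an order_id."""
    if "order_id" not in body:
        raise ValueError("missing order_id")
    return body["order_id"]


def handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only the failed message IDs are reported back to SQS.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


sample_event = {
    "Records": [
        {"messageId": "m-1", "body": json.dumps({"order_id": 42})},
        {"messageId": "m-2", "body": json.dumps({"note": "no id"})},
        {"messageId": "m-3", "body": "not json"},
    ]
}
response = handler(sample_event, None)
```

Messages listed in `batchItemFailures` become visible again after the visibility timeout; the rest are deleted. Processing should still be idempotent, since standard queues can deliver duplicates.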
Pattern 4: Stream Processing (Real-time)
┌──────────┐ ┌─────────────┐ ┌────────┐ ┌──────────┐
│ DynamoDB │─────▶│ Stream │─────▶│ Lambda │─────▶│ Analytics│
│ Table │ │ (24h window)│ │(Ordered│ │ / Archive│
└──────────┘ └─────────────┘ │per PK) │ └──────────┘
└────┬───┘
│
▼ (Failed batch)
Blocks shard!
✅ Use when:
- Change Data Capture (CDC)
- Real-time analytics
- Data replication
- Audit trail
⚠️ Watch out:
- Failed batches BLOCK shard processing
- Configure bisect on error for Kinesis
- Ordering per partition key only
- Stream data expires (24h by default for Kinesis, extendable; 24h fixed for DynamoDB Streams)
💰 Cost: Stream charges + Lambda compute
Pattern 5: Scheduled Jobs (Cron-like)
┌──────────────┐ ┌────────┐ ┌──────────────┐
│ EventBridge │─────▶│ Lambda │─────▶│ Cleanup old │
│ (Cron rule) │ │ runs │ │ S3 objects │
│ rate(1 day) │ └────────┘ └──────────────┘
└──────────────┘
✅ Use when:
- Periodic tasks
- Scheduled maintenance
- Report generation
⚠️ Watch out:
- EventBridge max 1 minute resolution
- Lambda timeout = 15min max
- Use Step Functions for longer jobs
💰 Cost: EventBridge rules (cheap) + Lambda compute
❓ Self-Check Questions - Lambda
Q1: Troubleshooting Scenario
Scenario: Your Lambda function processes S3 image uploads. Users report that some large images (> 5MB) aren't being processed, but CloudWatch shows no errors. What's the most likely issue?
💭 Think first, then expand answer
🎯 Correct Answer:
Lambda timeout is too short (default 3s)
🔍 Why other answers wrong:
- IAM permissions: Would show errors in CloudWatch ❌
- S3 event not configured: No images would be processed ❌
- Memory too low: Would show OOM (Out of Memory) errors ❌
- Concurrent limit: Would show throttling errors ❌
📝 Exam Tip:
"Silent failures" in async Lambda = timeout issue
Large files take longer → the function is killed at the timeout → no application-level error is logged, only a "Task timed out" line in CloudWatch Logs
🛠️ How to fix:
1. CloudWatch Logs → Search "Task timed out"
2. Lambda Configuration → Increase timeout (e.g., 60s)
3. OR optimize code to process faster
4. OR use Step Functions for long-running tasks
Q2: Concurrency & Scaling
Scenario: Your application needs 1200 concurrent Lambda executions during peak hours. Current account limit is 1000. What should you do?
💭 Think about options
🎯 Best Answers (Multiple correct):
- Request a quota increase via Service Quotas or AWS Support (Recommended)
- Simple, direct solution
- Approval time varies, so request it well before the peak
- Can request up to tens of thousands
- Use SQS to buffer requests (Alternative)
- Throttle processing rate
- Lambda pulls from queue at manageable rate
- Good for variable traffic
- Reserved concurrency for critical functions
- Guarantee capacity for important functions
- Other functions share remaining capacity
❌ Wrong Answers:
- Split into multiple AWS accounts: Overkill, management overhead ❌
- Use EC2 instead: Defeats serverless purpose ❌
- Provisioned concurrency: Keeps warm, doesn't increase limit ❌
📝 Exam Keywords:
If question says... | Answer likely involves...
"Exceeding concurrent limit" | Request limit increase OR SQS buffer
"Throttling errors 429" | Increase limit OR reserved concurrency
"Variable traffic patterns" | SQS buffering
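On the caller side, throttling (429) is usually handled with exponential backoff plus jitter. A minimal sketch, assuming a hypothetical `ThrottledError` class that stands in for a 429 TooManyRequestsException; the AWS SDKs already ship their own retry logic, so this only illustrates the idea.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a 429 TooManyRequestsException."""


def call_with_backoff(fn, max_attempts=5, base=0.1, cap=5.0,
                      sleep=time.sleep, rng=random):
    for attempt in range(max_attempts):
        try:
            return fn()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            # Full jitter: sleep a random time up to the exponential ceiling.
            ceiling = min(cap, base * (2 ** attempt))
            sleep(rng.uniform(0, ceiling))


slept = []  # records the delays instead of really sleeping
state = {"calls": 0}


def throttled_twice():
    state["calls"] += 1
    if state["calls"] <= 2:
        raise ThrottledError()
    return "done"


outcome = call_with_backoff(throttled_twice, sleep=slept.append)
```

Injecting `sleep` makes the behavior easy to check without waiting; in real code the default `time.sleep` applies.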
Q3: Cost Optimization
Scenario: Lambda with 512MB memory runs 2 seconds, processing 1M requests/month. How to optimize cost while maintaining performance?
💭 Calculate & think
💰 Current Cost:
Compute: 1M × 2s × 0.5GB (512MB) = 1M GB-seconds
After free tier (400K): 600K GB-seconds
Cost: 600K × $0.0000166667 = ~$10/month
Requests: 1M requests (free tier covers 1M)
Total: ~$10/month
🎯 Optimization Strategies:
Strategy | Impact | Trade-off
INCREASE memory to 1024MB | ✅ More CPU = faster execution; break-even if runtime halves (1s), cheaper only if it drops below 1s | Counterintuitive; mainly helps CPU-bound code
Optimize code | ✅ Reduce runtime directly | Dev time investment
Use Lambda Layers | ✅ Smaller package = faster cold start | None, best practice
Batch processing | ✅ Fewer invocations | Higher latency per request
❌ Decrease memory | ❌ Slower = often MORE cost | Don't do this!
📝 Exam Answer:
"Increase memory allocation" - more CPU = faster = cheaper overall
Lambda pricing = memory × time, so reducing time can offset memory cost
🧮 Proof:
512MB × 2s = 1024 MB-seconds
1024MB × 1s = 1024 MB-seconds (same!)
But 1024MB × 0.8s = 819 MB-seconds (20% cheaper!)
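The arithmetic above can be checked with a few lines of Python. This sketch uses the per-GB-second price and free tier quoted in the scenario; actual prices vary by Region and architecture, and `monthly_compute_cost` is a hypothetical helper name.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # figure quoted in the scenario
FREE_TIER_GB_SECONDS = 400_000


def monthly_compute_cost(requests, duration_s, memory_mb,
                         free_tier=FREE_TIER_GB_SECONDS):
    """Return (gb_seconds, compute_cost_usd) for one month."""
    gb_seconds = requests * duration_s * (memory_mb / 1024)
    billable = max(0.0, gb_seconds - free_tier)
    return gb_seconds, billable * PRICE_PER_GB_SECOND


baseline = monthly_compute_cost(1_000_000, 2.0, 512)   # 512MB, 2s
same = monthly_compute_cost(1_000_000, 1.0, 1024)      # 1024MB, 1s
faster = monthly_compute_cost(1_000_000, 0.8, 1024)    # 1024MB, 0.8s
```

Doubling memory while halving runtime leaves the bill unchanged; the saving appears only when runtime falls by more than half. With the free tier applied, the 0.8s case cuts the bill by about a third (roughly $10 to $6.67), even though GB-seconds drop by only 20%.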
Q4: VPC Integration
Scenario: Lambda needs to access RDS in private subnet AND call external API. What's required?
💭 Think about networking
🎯 Complete Solution:
- Lambda in VPC
- Configure VPC, subnets, security groups
- Lambda gets ENI in your VPC
- RDS Security Group
- Allow inbound from Lambda security group
- Port 3306 (MySQL) or 5432 (PostgreSQL)
- NAT Gateway for internet
- Lambda in private subnet
- Route table: 0.0.0.0/0 → NAT Gateway
- NAT Gateway in public subnet
- IAM Execution Role
- ec2:CreateNetworkInterface
- ec2:DescribeNetworkInterfaces
- ec2:DeleteNetworkInterface
📊 Architecture:
┌─────────────────────────────────────────────┐
│ VPC │
│ ┌──────────────────┐ ┌──────────────────┐│
│ │ Public Subnet │ │ Private Subnet ││
│ │ │ │ ││
│ │ NAT Gateway ────┼──▶ Lambda ────────┐ ││
│ │ │ │ │ │ │ ││
│ └───────┼──────────┘ └───────┼─────────┼─┘│
│ │ │ │ │
│ Internet ▼ ▼ │
│ Gateway RDS External│
│ API │
└─────────────────────────────────────────────┘
❌ Common Mistakes:
- Lambda in public subnet: Still can't reach internet without NAT ❌
- No NAT Gateway: Can't call external API ❌
- Missing IAM ENI permissions: Lambda can't create network interface ❌
💡 Cost Optimization:
VPC Endpoints for AWS services (no NAT needed for S3, DynamoDB, etc.)
S3 VPC Endpoint → No NAT charges
DynamoDB VPC Endpoint → No NAT charges
Only external APIs need NAT Gateway
📝 Exam Keywords:
- "VPC + internet access" = NAT Gateway required
- "Private RDS access" = Lambda in VPC, security group rules
- "Minimize cost" = VPC Endpoints for AWS services
📝 Exam-Specific Notes: Lambda
🎯 High-Frequency Question Types:
- Timeout scenarios (8-10 questions per exam)
- Always check timeout config first
- API Gateway max: 29s
- SQS visibility timeout must be > Lambda timeout
- Silent failures = timeout issue
- Concurrency & throttling (5-7 questions)
- Account limit: 1000 default (can increase)
- Reserved concurrency: dedicate X to function
- Provisioned concurrency: keep X warm (different!)
- Throttling error: 429 TooManyRequestsException
- Error handling & retries (6-8 questions)
- Sync: client retries
- Async: Lambda retries 2x (3 total attempts)
- Stream: retry until success or data expires
- DLQ only for async invocations
- VPC integration (3-5 questions)
- VPC = ENI required (slower cold start)
- Internet access = NAT Gateway needed
- VPC Endpoints for AWS services (no NAT)
- IAM needs EC2 network permissions
- Security & permissions (4-6 questions)
- Execution role: what Lambda can access
- Resource policy: who can invoke Lambda
- Never hardcode credentials
- Use Secrets Manager or Systems Manager Parameter Store
🚩 Red Flag Keywords - Lambda
⚠️ If answer suggests these → Usually WRONG!
Red Flag | Why Wrong | Correct Answer
"Store credentials in env vars" | Security risk | Use IAM roles + Secrets Manager
"Store large files in /tmp" | /tmp ephemeral, 512MB-10GB limit | Use S3 or EFS
"Use provisioned concurrency for cost" | More expensive | Provisioned = performance, not cost
"Lambda in public subnet" | Still need NAT for internet | Private subnet + NAT Gateway
"Increase memory to reduce cost" | Sounds wrong but... | ✅ CORRECT! Faster = cheaper overall
⏱️ Time Management Tips:
- Lambda questions: Average 90 seconds each
- Quick wins: Timeout, concurrency, invocation type questions (30-45s)
- Time sinks: Complex VPC + security scenarios (2-3 min)
- Strategy: Flag complex VPC questions, come back later
🎓 Must Memorize Numbers:
Limit | Value | Exam Frequency
Max timeout | 900s (15 min) | ⭐⭐⭐⭐⭐ Very High
Default timeout | 3s | ⭐⭐⭐⭐⭐ Very High
Memory range | 128MB - 10GB | ⭐⭐⭐⭐ High
Account concurrency | 1000 (default) | ⭐⭐⭐⭐ High
Max layers | 5 | ⭐⭐⭐ Medium
Package size | 250MB unzipped | ⭐⭐⭐ Medium
/tmp storage | 512MB - 10GB | ⭐⭐⭐ Medium
Async retries | 2 (3 total attempts) | ⭐⭐⭐⭐ High
API Gateway timeout | 29s | ⭐⭐⭐⭐⭐ Very High
2. AMAZON DYNAMODB
Core Concepts
- NoSQL database: key-value & document store
- Fully managed: auto scaling, backup, replication
- Single-digit millisecond performance
Table Structure
Primary Key Options:
- Partition Key only (Simple Primary Key)
- Must be unique
- Determines physical partition
- Ex: UserID
- Partition Key + Sort Key (Composite Primary Key)
- Partition key groups items
- Sort key orders within partition
- Combination must be unique
- Ex: UserID (PK) + Timestamp (SK)
Indexes
Local Secondary Index (LSI):
- Same partition key, different sort key
- Must create at table creation (cannot add later)
- Max 5 LSIs per table
- Shares RCU/WCU with base table
- Use case: Query same partition key, different sort order
Global Secondary Index (GSI):
- Different partition key and/or sort key
- Can add/delete anytime
- Has its own RCU/WCU (separate from the base table)
- Eventually consistent reads only
- Max 20 GSIs per table
- Use case: Query on non-primary key attributes
When to use GSI vs LSI:
- GSI: you need to query by a different attribute (not the table's partition key)
- LSI: you need a different sort key within the same partition
- Default: use a GSI (more flexible)
Capacity Modes
Provisioned:
- Specify RCU (Read Capacity Units) and WCU (Write Capacity Units)
- 1 RCU = 1 strongly consistent read/s for an item ≤4KB (larger items round up to the next 4KB)
- 1 RCU = 2 eventually consistent reads/s for an item ≤4KB
- 1 WCU = 1 write/s for an item ≤1KB (larger items round up to the next 1KB)
- Cheaper when traffic is predictable
- Auto-scaling available
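Capacity questions usually reduce to rounding the item size up to the next 4KB (reads) or 1KB (writes) and multiplying by the request rate. A small sketch of that arithmetic, with hypothetical helper names:

```python
import math


def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    # Reads are billed in 4 KB blocks, rounded up per item.
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_kb, writes_per_second):
    # Writes are billed in 1 KB blocks, rounded up per item.
    return math.ceil(item_size_kb) * writes_per_second


strong = read_capacity_units(6, 10)                                # 6 KB items, 10 reads/s
eventual = read_capacity_units(6, 10, strongly_consistent=False)
writes = write_capacity_units(2.5, 5)                              # 2.5 KB items, 5 writes/s
```

A 6KB item needs 2 RCU per strongly consistent read, so 10 reads/s needs 20 RCU (10 if eventually consistent); a 2.5KB item needs 3 WCU per write, so 5 writes/s needs 15 WCU.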
On-Demand:
- Pay per request
- No capacity planning needed
- Costs more per request than well-utilized provisioned capacity (the exact ratio depends on current pricing)
- Use case: unpredictable traffic, spiky workloads
Read Consistency
Eventually Consistent (default):
- Fastest, cheapest
- May return stale data (usually < 1 second lag)
Strongly Consistent:
- Always latest data
- Higher latency, cost 2x RCU
- Specify: ConsistentRead=True
Operations
Query:
- Requires partition key
- Optionally filter by sort key
- Efficient, uses indexes
- Returns sorted results
- Can reverse order: ScanIndexForward=False
Scan:
- Reads entire table
- Inefficient, expensive
- Can filter results (filter after read)
- Parallel scans available
- Avoid in production when possible
GetItem/PutItem/UpdateItem/DeleteItem:
- Single item operations
- GetItem: strongly or eventually consistent
- UpdateItem: atomic operations (increment, decrement)
DynamoDB Streams
- Ordered record of item-level changes (insert, update, delete)
- Retention: 24 hours
- 4 view types:
- KEYS_ONLY: only the key attributes of the changed item
- NEW_IMAGE: the entire item after the change
- OLD_IMAGE: the entire item before the change
- NEW_AND_OLD_IMAGES: both the before and after images
Use Cases:
- Real-time analytics
- Replicate data to other tables/regions
- Trigger Lambda on data changes
- Audit trail
Integration with Lambda:
- Lambda polls stream
- Batch size: 1-10,000 records
- Ordered processing per partition key
- Failed batches block shard
Advanced Features
Conditional Writes:
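# Only overwrite the item if it already exists
import boto3

table = boto3.resource('dynamodb').Table('MyTable')  # 'MyTable' is a placeholder name
table.put_item(
    Item={'id': '123', 'status': 'active'},
    ConditionExpression='attribute_exists(id)'
)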
Optimistic Locking:
- Use version number attribute
- Increment on each update
- Condition: version = expected_version
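The version check can be expressed as a conditional update. The sketch below only builds the keyword arguments for a boto3 `Table.update_item` call and does not call AWS; the attribute names `status` and `version` and the helper name are assumptions for illustration.

```python
def build_versioned_update(key, new_status, expected_version):
    """Build kwargs for Table.update_item (nothing is sent to AWS here)."""
    return {
        "Key": key,
        # Bump the version in the same write that changes the data.
        "UpdateExpression": "SET #s = :status, #v = :next",
        # The write succeeds only if nobody changed the item in between.
        "ConditionExpression": "#v = :expected",
        "ExpressionAttributeNames": {"#s": "status", "#v": "version"},
        "ExpressionAttributeValues": {
            ":status": new_status,
            ":expected": expected_version,
            ":next": expected_version + 1,
        },
    }


params = build_versioned_update({"id": "123"}, "shipped", expected_version=4)
# With boto3 and a real table: table.update_item(**params)
```

If another writer already bumped the version, the condition fails with a ConditionalCheckFailedException error and the caller re-reads the item and retries.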
TTL (Time To Live):
- Auto delete items after expiry
- Free (no WCU cost)
- Attribute must be number (Unix timestamp)
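- Deletion is a background process, not immediate (commonly quoted as within 48 hours; current AWS docs say within a few days), so filter out expired items in reads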
- DynamoDB Streams captures deletes
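Because the TTL attribute must be a number holding a Unix timestamp in seconds, it is typically computed when the item is written. A minimal sketch; the attribute name `expires_at` and the helper name are assumptions.

```python
from datetime import datetime, timedelta, timezone


def ttl_epoch(now, days):
    """Return the expiry time as whole Unix epoch seconds."""
    return int((now + timedelta(days=days)).timestamp())


created = datetime(2024, 1, 1, tzinfo=timezone.utc)
item = {"id": "session-1", "expires_at": ttl_epoch(created, 30)}
```

Using a timezone-aware datetime avoids local-time surprises; a value in milliseconds would be interpreted as a date far in the future and the item would effectively never expire.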
Transactions:
- ACID operations across multiple items/tables
- All-or-nothing
- TransactWriteItems: up to 100 items
- TransactGetItems: up to 100 items
- Cost: 2x RCU/WCU
Backup & Restore:
- On-demand backups: manual, retained until deleted
- Point-in-time recovery (PITR): continuous backups, restore any second in last 35 days
3. AMAZON API GATEWAY
API Types
REST API:
- Full-featured
- Regional, Edge-optimized, Private endpoints
- API keys, usage plans, request validation
- Caching available
HTTP API:
- Simpler, cheaper (up to ~70% cheaper than REST APIs)
- Lower latency
- OIDC, OAuth 2.0 support
- No usage plans, API keys, caching
- Use case: simple proxy to Lambda, HTTP backends
WebSocket API:
- Bi-directional communication
- Persistent connections
- Use case: chat apps, real-time dashboards
Integration Types
Lambda Proxy (Recommended):
// API Gateway passes the entire request to Lambda as `event`, e.g.:
// {
//   httpMethod: 'GET',
//   path: '/users/123',
//   headers: {...},
//   queryStringParameters: {...},
//   body: '...'
// }
exports.handler = async (event) => {
  // Lambda must return a response in this shape
  return {
    statusCode: 200,
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({message: 'success'})
  };
};
Lambda Non-Proxy:
- Manual mapping templates
- Transform request/response
HTTP Proxy:
- Forward request to HTTP endpoint
- Minimal transformation
AWS Service:
- Direct integration with AWS services (DynamoDB, S3, SNS)
- No Lambda needed
Mock:
- Return hardcoded response
- Use case: testing, development
Stages & Deployments
Stage:
- Environment pointer (dev, test, prod)
- Each stage has its own URL: https://api-id.execute-api.region.amazonaws.com/stage-name
- Stage variables: environment-specific config (Lambda alias, DB endpoint)
Deployment:
- Snapshot of API configuration
- Must deploy to make changes live
Canary Deployment:
- Route % traffic to new version
- Ex: 10% to canary, 90% to stable
- Promote or rollback based on metrics
Authorization
IAM Authorization:
- Use IAM credentials (Sig v4)
- Good for internal AWS services
- Client must sign requests
Lambda Authorizer (Custom):
- Your Lambda validates token (JWT, OAuth)
- Returns IAM policy
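- Result cached for the authorizer TTL (authorizerResultTtlInSeconds, up to 3600s)
- Types:
- Token-based: receives the token from a header (typically Authorization)
- Request-based: receives the entire request (headers, query strings, stage variables, context)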
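A token-based Lambda authorizer boils down to "validate the token, return an IAM policy for the method ARN". The sketch below uses a hypothetical `VALID_TOKENS` lookup in place of real JWT validation, which a production authorizer would need.

```python
def build_policy(principal_id, effect, method_arn):
    """Shape of the response API Gateway expects from a Lambda authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


# Hypothetical stand-in for real token validation (JWT signature, expiry...).
VALID_TOKENS = {"allow-me": "user-123"}


def handler(event, context):
    token = event.get("authorizationToken", "")
    user = VALID_TOKENS.get(token)
    effect = "Allow" if user else "Deny"
    return build_policy(user or "anonymous", effect, event["methodArn"])


arn = "arn:aws:execute-api:us-east-1:123456789012:abc123/prod/GET/users"
allowed = handler({"type": "TOKEN", "authorizationToken": "allow-me", "methodArn": arn}, None)
denied = handler({"type": "TOKEN", "authorizationToken": "nope", "methodArn": arn}, None)
```

Because the result is cached per token, a policy scoped to one method ARN can cause unexpected denials on other methods while the cache entry lives; widening the Resource or lowering the TTL are the usual workarounds.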
Cognito User Pool:
- Managed authentication
- API Gateway validates JWT token
- No custom code needed
API Keys:
- Simple key-based access
- Used together with usage plans
- NOT for authorization (not secure on their own)
- Use case: rate limiting, monitoring
Throttling & Usage Plans
Throttling:
- Account-level: 10,000 RPS (requests per second)
- Burst: 5,000 requests
- Can set per stage, per method
- Returns 429 Too Many Requests
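The rate/burst pair behaves like a token bucket: the burst is the bucket size and the rate is how fast it refills. A simplified local simulation (not API Gateway's actual implementation) with small numbers so the effect is visible; the class name is hypothetical.

```python
class TokenBucket:
    """Simplified token bucket: `rate` tokens/second, capacity `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, never above capacity.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return 200
        return 429  # throttled


bucket = TokenBucket(rate=2, burst=3)
statuses = [bucket.allow(0.0) for _ in range(5)]  # 5 requests at the same instant
later = bucket.allow(1.0)                          # one request a second later
```

Three requests pass on the burst, the next two are throttled, and one second later the refilled tokens let traffic through again.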
Usage Plans:
- Throttling limits per API key
- Quota: max requests per day/week/month
- Associated with API stages
- Use case: tiered pricing, partner APIs
Request/Response Transformation
Mapping Templates:
- Transform request/response using VTL (Velocity Template Language)
- Use case: change JSON structure, add/remove fields
Request Validation:
- Validate request before reaching backend
- Checks: required parameters, data types, format
- Returns 400 Bad Request if invalid
CORS (Cross-Origin Resource Sharing)
Problem: Browser blocks requests from different origin
Solution:
Enable CORS in API Gateway:
- OPTIONS method returns:
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET,POST,PUT
Access-Control-Allow-Headers: Content-Type
Lambda Proxy: Must return CORS headers in response
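With Lambda proxy integration the function itself has to attach the CORS headers. A minimal Python handler sketch using the header values listed above; answering OPTIONS from the function is an assumption here, since the preflight can also be handled by API Gateway.

```python
import json

CORS_HEADERS = {
    "Access-Control-Allow-Origin": "*",
    "Access-Control-Allow-Methods": "GET,POST,PUT",
    "Access-Control-Allow-Headers": "Content-Type",
}


def handler(event, context):
    # Preflight request: reply with headers only.
    if event.get("httpMethod") == "OPTIONS":
        return {"statusCode": 204, "headers": CORS_HEADERS, "body": ""}
    # Normal request: include the CORS headers alongside the payload.
    return {
        "statusCode": 200,
        "headers": {**CORS_HEADERS, "Content-Type": "application/json"},
        "body": json.dumps({"message": "success"}),
    }


preflight = handler({"httpMethod": "OPTIONS"}, None)
normal = handler({"httpMethod": "GET"}, None)
```

In production a specific origin is usually safer than `*`, especially when credentials are involved.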
Caching
- Cache responses at stage level
- TTL: 0-3600 seconds
- Size: 0.5GB - 237GB
- Per-key caching: cache based on query params, headers
- Invalidate: client sends a Cache-Control: max-age=0 header (requires the execute-api:InvalidateCache IAM permission)
- Cost: extra charge per hour
Monitoring
- CloudWatch Metrics: latency, error count, cache hit/miss
- CloudWatch Logs: detailed request/response logs
- X-Ray: distributed tracing
- Access logs: who called API, when
4. AWS IAM (Identity and Access Management)
Core Components
Users:
- Permanent credentials
- Long-term access keys
- Use case: developers, admins (avoid for applications)
Groups:
- Collection of users
- Assign policies to groups
- Users inherit group permissions
Roles:
- Temporary credentials (STS)
- Can be assumed by: users, services, AWS accounts
- No long-term credentials
- Always use roles for EC2, Lambda, ECS
Policies:
- JSON documents defining permissions
- Types:
- Identity-based: attached to users/groups/roles
- Resource-based: attached to resources (S3, Lambda, SQS)
Policy Structure
Effect is either "Allow" or "Deny":
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem"
      ],
      "Resource": "arn:aws:dynamodb:region:account:table/MyTable",
      "Condition": {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": ["${aws:username}"]
        }
      }
    }
  ]
}
Policy Evaluation:
- Default: Deny (implicit)
- Explicit Deny > Explicit Allow
- If any applicable policy contains an explicit Deny for the request, final result = Deny
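The evaluation order can be sketched as a tiny function: start from an implicit deny, let a matching Allow grant access, and let any matching explicit Deny override everything. This is a deliberately simplified model (no conditions, principals, SCPs, or permission boundaries), and `evaluate` is a hypothetical helper, not an AWS API.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def evaluate(statements, action, resource):
    """Return 'Allow', 'ExplicitDeny', or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"  # default when nothing matches
    for stmt in statements:
        matches = (
            any(fnmatchcase(action, a) for a in _as_list(stmt["Action"]))
            and any(fnmatchcase(resource, r) for r in _as_list(stmt["Resource"]))
        )
        if not matches:
            continue
        if stmt["Effect"] == "Deny":
            return "ExplicitDeny"  # an explicit deny always wins
        decision = "Allow"
    return decision


statements = [
    {"Effect": "Allow", "Action": "s3:*", "Resource": "arn:aws:s3:::mybucket/*"},
    {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "arn:aws:s3:::mybucket/*"},
]
read = evaluate(statements, "s3:GetObject", "arn:aws:s3:::mybucket/a.txt")
delete = evaluate(statements, "s3:DeleteObject", "arn:aws:s3:::mybucket/a.txt")
other = evaluate(statements, "dynamodb:GetItem", "arn:aws:dynamodb:region:account:table/MyTable")
```

The broad `s3:*` Allow does not rescue the delete, because the explicit Deny is evaluated as final regardless of statement order.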
Role Types
Service Role:
- For AWS services (Lambda, EC2, ECS)
- Trust policy allows service to assume role
Cross-Account Role:
- Allow users from Account A to access Account B
- Trust policy specifies Account A
Trust Policy vs Permission Policy
Trust Policy (Who can assume):
{
"Effect": "Allow",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
Permission Policy (What can be done):
{
"Effect": "Allow",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::mybucket/*"
}
Resource-Based Policies
Lambda:
- Who can invoke function
- Services, accounts, principals
S3 Bucket Policy:
- Who can access bucket
- Cross-account access
SQS Queue Policy:
- Who can send/receive messages
IAM Best Practices
- Least privilege: only grant needed permissions
- Use roles for applications: not access keys
- Enable MFA for sensitive operations
- Rotate credentials regularly
- Use policy conditions to restrict access
- Never embed credentials in code
Common Conditions
"Condition": {
  "IpAddress": {"aws:SourceIp": "203.0.113.0/24"},
  "DateGreaterThan": {"aws:CurrentTime": "2024-01-01T00:00:00Z"},
  "StringEquals": {"aws:PrincipalTag/Department": "Finance"},
  "Bool": {"aws:SecureTransport": "true"}
}
The Bool condition on aws:SecureTransport requires HTTPS.
STS (Security Token Service)
- Generate temporary credentials
- Methods:
- AssumeRole: cross-account, service roles
- AssumeRoleWithWebIdentity: federated users via an OIDC/web identity provider (still supported; AWS recommends Cognito for mobile and web apps)
- GetSessionToken: MFA authentication
Service Control Policies (SCPs)
- Applied at AWS Organizations level
- Restrict maximum permissions
- Does NOT grant permissions (only limit)
- Example: prevent all accounts from leaving organization
5. AMAZON CLOUDWATCH
Metrics
Default Metrics:
- EC2: CPU, Network, Disk (no memory - need CloudWatch agent)
- Lambda: Invocations, Duration, Errors, Throttles, ConcurrentExecutions
- DynamoDB: ConsumedReadCapacityUnits, ConsumedWriteCapacityUnits, UserErrors
- API Gateway: Count, Latency, 4XXError, 5XXError
Custom Metrics:
- Push your own metrics using PutMetricData API
- Standard resolution: 1-minute granularity
- High resolution: 1-second granularity
- Dimensions: key-value pairs to filter metrics (ex: Environment=Prod)
Metric Math:
- Combine multiple metrics
- Example: ErrorRate = Errors / Invocations * 100
CloudWatch Logs
Concepts:
- Log Groups: container for log streams
- Log Streams: sequence of log events from same source
- Retention: 1 day to 10 years (or never expire)
Sources:
- Lambda: automatic (needs IAM permission)
- EC2: CloudWatch Agent
- ECS: awslogs driver
- Elastic Beanstalk: automatic
- API Gateway: enable per stage
Log Insights:
Query logs using a purpose-built, pipe-delimited query syntax
fields @timestamp, @message
| filter @message like /ERROR/
| stats count() by bin(5m)
Metric Filters:
- Extract metrics from logs
- Example: count ERROR occurrences
- Create alarms on extracted metrics
Subscriptions:
Stream logs to:
- Lambda: real-time processing
- Kinesis Data Streams: analytics
- Amazon Data Firehose (formerly Kinesis Data Firehose): S3, OpenSearch
CloudWatch Alarms
States:
- OK: metric within threshold
- ALARM: metric breached threshold
- INSUFFICIENT_DATA: not enough data
Actions:
- SNS notification
- Auto Scaling action
- EC2 action (stop, terminate, reboot, recover)
- Systems Manager action
Alarm Types:
- Static threshold: value > X
- Anomaly detection: ML-based, dynamic threshold
- Composite: combine multiple alarms (AND/OR)
Evaluation:
- Period: time interval (10s, 30s, 1m, etc.)
- Datapoints to alarm: X out of Y datapoints breach
- Example: 3 out of 5 datapoints > threshold
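The "M out of N" rule can be sketched in a few lines. This simplified model looks only at the last N datapoints and ignores CloudWatch's missing-data handling; `alarm_state` is a hypothetical helper.

```python
def alarm_state(datapoints, threshold, datapoints_to_alarm, evaluation_periods):
    """Evaluate the most recent `evaluation_periods` datapoints."""
    window = datapoints[-evaluation_periods:]
    if len(window) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


cpu = [10, 95, 40, 97, 99]
state_alarm = alarm_state(cpu, threshold=90, datapoints_to_alarm=3, evaluation_periods=5)
state_ok = alarm_state([10, 95, 40, 50, 99], threshold=90, datapoints_to_alarm=3, evaluation_periods=5)
state_missing = alarm_state([95, 99], threshold=90, datapoints_to_alarm=3, evaluation_periods=5)
```

The breaching datapoints do not need to be consecutive, which is what makes "3 out of 5" less noisy than "3 in a row".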
CloudWatch Events / EventBridge
- Event-driven architecture
- Rules match events and route to targets
Sources:
- Scheduled (cron, rate)
- AWS service events (EC2 state change, S3 upload)
- Custom applications (PutEvents API)
Targets:
- Lambda, SNS, SQS, Step Functions, ECS Task, etc.
X-Ray Integration
Distributed Tracing:
- Track requests across microservices
- Identify bottlenecks, errors
Segments:
- Work done by single service
- Subsegments: finer-grained timing
Annotations & Metadata:
- Annotations: indexed, searchable (key-value)
- Metadata: non-indexed, detailed info
Lambda Integration:
- Enable Active Tracing in Lambda config
- The X-Ray SDK instruments AWS SDK and HTTP calls once you add it to your code
- See: downstream calls (DynamoDB, S3, HTTP)
Sampling:
- Avoid tracing every request (expensive)
- Default: 1 request/second + 5% of additional requests
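The default rule translates into simple arithmetic: the first request each second is always traced, and 5% of the rest are sampled. A sketch of the expected trace volume, assuming steady traffic; the helper name is hypothetical.

```python
def expected_traces_per_second(requests_per_second, reservoir=1, fixed_rate=0.05):
    """Expected number of traced requests per second under the default rule."""
    reserved = min(requests_per_second, reservoir)
    return reserved + (requests_per_second - reserved) * fixed_rate


low_traffic = expected_traces_per_second(0.5)
high_traffic = expected_traces_per_second(100)
```

At 100 requests/second this gives about 6 traces/second (1 + 5% of 99), while low-traffic services are traced almost completely.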
TIER 2 - IMPORTANT SERVICES (15-20% of score)
6. AMAZON S3
Storage Classes
Class | Use Case | Retrieval Time
Standard | Frequent access | Milliseconds
Intelligent-Tiering | Unknown access patterns | Milliseconds
Standard-IA | Infrequent access | Milliseconds
One Zone-IA | Infrequent, non-critical | Milliseconds
Glacier Instant | Archive, immediate access | Milliseconds
Glacier Flexible | Archive, min to hours | Minutes to hours
Glacier Deep | Long-term archive | 12 hours
Lifecycle Policies
Automate transitions between storage classes
Expire objects after X days
{
  "Rules": [{
    "ID": "Archive old logs",
    "Filter": {"Prefix": ""},
    "Status": "Enabled",
    "Transitions": [
      {"Days": 30, "StorageClass": "STANDARD_IA"},
      {"Days": 90, "StorageClass": "GLACIER"}
    ],
    "Expiration": {"Days": 365}
  }]
}
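A small sketch can show how the rule above plays out over an object's lifetime. It is a simplified model with a hypothetical helper name; real S3 evaluates lifecycle rules roughly once a day, so actual transition times are approximate.

```python
from typing import Optional

# Same transitions and expiration as the rule above.
RULE = {
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER"},
    ],
    "Expiration": {"Days": 365},
}


def storage_class_at(age_days: int, rule: dict = RULE) -> Optional[str]:
    """Return the storage class for an object of this age, or None once expired."""
    expiration = rule.get("Expiration")
    if expiration and age_days >= expiration["Days"]:
        return None
    current = "STANDARD"
    # Apply transitions in order of age; the last one reached wins.
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


for age in (10, 45, 120, 400):
    print(age, storage_class_at(age))
```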
Versioning
- Keep multiple versions of object
- Protects from accidental deletes
- Delete = add delete marker (not permanent)
- Permanently delete: specify version ID
- Once enabled, cannot disable (only suspend)
Encryption
Server-Side Encryption:
- SSE-S3: S3-managed keys (AES-256)
- SSE-KMS: KMS-managed keys (audit trail, rotation)
- SSE-C: Customer-provided keys
- Enable by default or per-object
Client-Side Encryption:
- Encrypt before upload
- You manage keys
S3 Event Notifications
Trigger:
- Object created (Put, Post, Copy, CompleteMultipartUpload)
- Object deleted
- Object restored from Glacier
Destinations:
- Lambda function
- SNS topic
- SQS queue
Use Cases:
- Image processing on upload
- Trigger workflows
- Log processing
Pre-Signed URLs
- Temporary access to private objects
- Generated by SDK/CLI with your credentials
- Expiration: up to 7 days (SigV4)
- Use case: let users upload/download without AWS credentials
CORS Configuration
[{
"AllowedOrigins": ["https://example.com"],
"AllowedMethods": ["GET", "PUT", "POST"],
"AllowedHeaders": ["*"],
"MaxAgeSeconds": 3000
}]
S3 Bucket Policies
{
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::mybucket/*",
"Condition": {
"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}
}
}
7. AMAZON SQS (Simple Queue Service)
Queue Types
Standard Queue:
- Unlimited throughput
- At-least-once delivery (duplicates are possible)
- Best-effort ordering (order is not guaranteed)
- Use case: decouple components, high throughput
FIFO Queue:
- Exactly-once processing (no duplicates)
- Strict ordering
- 300 TPS (or 3000 with batching)
- Queue name must end with .fifo
- Use case: order-critical workflows
Key Concepts
Visibility Timeout:
- Message invisible to other consumers after being received
- Default: 30 seconds (0 seconds to 12 hours)
- Consumer must delete message or it becomes visible again
- ChangeMessageVisibility API: extend timeout
Message Retention:
- How long messages stay in queue
- Default: 4 days (1 minute to 14 days)
Polling:
- Short Polling (default): return immediately (may be empty)
- Long Polling (recommended): wait up to 20 seconds for messages
- Reduces API calls, cost
- Set: WaitTimeSeconds > 0
Delay Queues:
- Delay delivery of all new messages
- 0 to 15 minutes
- FIFO: cannot use per-message delay
Message Timers:
- Delay individual messages (Standard only)
- 0 to 15 minutes
Dead Letter Queues (DLQ)
- Receives messages that fail processing
- Set maxReceiveCount: after X receives, move to DLQ
- DLQ must be same type (Standard → Standard, FIFO → FIFO)
- Use case: isolate problematic messages for debugging
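The maxReceiveCount behaviour can be simulated in memory. This is a sketch of the redrive idea only, with hypothetical function names; it does not call SQS and models a failed receive as the message simply reappearing in the queue.

```python
from collections import deque
from typing import Callable, Iterable, List, Tuple


def process_with_dlq(
    bodies: Iterable[str],
    handler: Callable[[str], None],
    max_receive_count: int,
) -> Tuple[List[str], List[str]]:
    """Simulate a queue with a redrive policy; returns (processed, dead_letter)."""
    queue = deque((body, 0) for body in bodies)  # (body, times received so far)
    processed: List[str] = []
    dead_letter: List[str] = []
    while queue:
        body, receive_count = queue.popleft()
        if receive_count >= max_receive_count:
            # Receive count exhausted: move the message to the DLQ.
            dead_letter.append(body)
            continue
        try:
            handler(body)
            processed.append(body)  # success: the consumer deletes the message
        except Exception:
            # Failure: the message becomes visible again with a higher count.
            queue.append((body, receive_count + 1))
    return processed, dead_letter


def flaky_handler(body: str) -> None:
    if body == "bad":
        raise ValueError("cannot parse message")


processed, dead_letter = process_with_dlq(["ok-1", "bad", "ok-2"], flaky_handler, 3)
print(processed, dead_letter)
```

The poison message is attempted three times and then isolated, so it no longer blocks or wastes work for healthy messages.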
FIFO-Specific Features
Message Deduplication:
- Deduplication ID: prevent duplicates within 5-minute window
- Content-based deduplication: hash message body
Message Group ID:
- Messages in same group = ordered
- Different groups = processed in parallel
- Use case: process orders per customer in order
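Deduplication within the 5-minute window can be sketched as follows. This is an in-memory illustration with hypothetical names, assuming content-based deduplication uses a SHA-256 hash of the message body when no explicit deduplication ID is given.

```python
import hashlib
from typing import Dict, Optional

DEDUP_WINDOW_SECONDS = 5 * 60


class FifoDeduplicator:
    """Reject messages whose deduplication ID was accepted in the last 5 minutes."""

    def __init__(self) -> None:
        self._accepted_at: Dict[str, float] = {}

    def accept(self, body: str, now: float, dedup_id: Optional[str] = None) -> bool:
        # Content-based deduplication: fall back to a hash of the body.
        key = dedup_id or hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_accepted = self._accepted_at.get(key)
        if first_accepted is not None and now - first_accepted < DEDUP_WINDOW_SECONDS:
            return False  # duplicate inside the window: dropped
        self._accepted_at[key] = now
        return True


dedup = FifoDeduplicator()
print(dedup.accept("order-1", now=0))    # first send is accepted
print(dedup.accept("order-1", now=120))  # retry 2 minutes later is dropped
print(dedup.accept("order-1", now=301))  # after the window it is accepted again
```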
SQS with Lambda
- Lambda polls queue (event source mapping)
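- Batch size: up to 10,000 messages (Standard; more than 10 requires a batch window), up to 10 (FIFO)
- If function errors: message returns to queue after visibility timeout
- Retries and DLQ: set maxReceiveCount and the DLQ in the source queue's redrive policy (not on the function)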
IAM Permissions
{
"Effect": "Allow",
"Action": [
"sqs:SendMessage",
"sqs:ReceiveMessage",
"sqs:DeleteMessage"
],
"Resource": "arn:aws:sqs:region:account:queue-name"
}
8. AMAZON SNS (Simple Notification Service)
Pub/Sub Model
- Topic: channel for messages
- Publishers: send messages to topic
- Subscribers: receive messages from topic
- Fan-out: 1 message → many subscribers
Subscription Types
- Email / Email-JSON
- HTTP / HTTPS endpoints
- SQS queue
- Lambda function
- SMS
- Mobile push (iOS, Android)
Message Filtering
Subscriber receives only matching messages
Filter policy: JSON document
{
"event": ["order_placed", "order_cancelled"],
"price": [{"numeric": [">=", 100]}]
}
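How such a policy is evaluated can be sketched with a small matcher. This is a partial illustration with hypothetical function names: it supports only exact matches and the `numeric` operator, and assumes the usual semantics (every key must match, any listed value for a key is enough).

```python
import operator
from typing import Any, Dict

_OPERATORS = {"=": operator.eq, ">": operator.gt, ">=": operator.ge,
              "<": operator.lt, "<=": operator.le}


def _rule_matches(rule: Any, value: Any) -> bool:
    if isinstance(rule, dict) and "numeric" in rule:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        conditions = rule["numeric"]
        # Conditions come as (operator, operand) pairs, e.g. [">=", 100].
        pairs = zip(conditions[0::2], conditions[1::2])
        return all(_OPERATORS[op](value, operand) for op, operand in pairs)
    return rule == value  # plain value: exact match


def matches(policy: Dict[str, list], attributes: Dict[str, Any]) -> bool:
    """True if the message attributes satisfy every key in the filter policy."""
    return all(
        key in attributes
        and any(_rule_matches(rule, attributes[key]) for rule in rules)
        for key, rules in policy.items()
    )


policy = {
    "event": ["order_placed", "order_cancelled"],
    "price": [{"numeric": [">=", 100]}],
}
print(matches(policy, {"event": "order_placed", "price": 150}))
print(matches(policy, {"event": "order_placed", "price": 50}))
```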
SNS + SQS Fan-Out Pattern
S3 Event → SNS Topic → [SQS Queue 1, SQS Queue 2, Lambda]
- 1 event triggers multiple processing paths
- Each subscriber processes independently
- Benefit: decoupling, parallel processing
Message Attributes
- Key-value metadata
- Use for filtering
- Example:
{"event": "order_placed", "region": "us-east-1"}
Delivery Retry
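- HTTP/S: retries with backoff per the delivery policy (customizable; total retry time capped at 3,600 seconds)
- Lambda: invoked asynchronously (Lambda itself retries a failed invocation 2 more times)
- SQS and Lambda (AWS-managed endpoints): SNS retries delivery up to 100,015 times over about 23 days, not indefinitely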
IAM Permissions
{
"Effect": "Allow",
"Action": ["sns:Publish"],
"Resource": "arn:aws:sns:region:account:topic-name"
}
9. AWS CI/CD SERVICES
AWS CodeCommit
- Git-based repository
- Private, managed
- Encrypted at rest (KMS)
- Triggers: Lambda, SNS on push
- Not in exam much, know it exists
AWS CodeBuild
Build Process:
- Pull source (CodeCommit, GitHub, S3)
- Run build commands (buildspec.yml)
- Output artifacts to S3
buildspec.yml:
version: 0.2
phases:
  install:
    runtime-versions:
      nodejs: 18
  pre_build:
    commands:
      - npm install
  build:
    commands:
      - npm run build
  post_build:
    commands:
      - aws s3 cp dist/ s3://mybucket/ --recursive
artifacts:
  files:
    - '**/*'
  base-directory: dist
Key Points:
- Runs in Docker container
- Build environment: CPU, memory configs
- Logs to CloudWatch Logs
- Cache dependencies in S3 (faster builds)
AWS CodeDeploy
Deployment Types:
In-Place (Rolling):
- Update existing instances
- Downtime during update
- Use case: dev/test environments
Blue/Green:
- New instances created (Green)
- Traffic shifts from old (Blue) to new
- Rollback: shift back to Blue
- Use case: production, zero downtime
Deployment Targets:
- EC2 instances
- On-premises servers
- Lambda functions
- ECS services
appspec.yml (Lambda):
version: 0.0
Resources:
  - myFunction:
      Type: AWS::Lambda::Function
      Properties:
        Name: myFunction
        Alias: live
        CurrentVersion: 1
        TargetVersion: 2
Deployment Configs (Lambda):
- Linear: traffic shifts in equal increments (10% every 10 min)
- Canary: shift X% first, then 100% (10% for 10 min, then all)
- All-at-once: immediate
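The difference between linear and canary shifting is easiest to see as a schedule of (minute, percent of traffic on the new version). The helpers below are hypothetical and assume the first increment is applied at minute 0; they only model the arithmetic, not CodeDeploy itself.

```python
from typing import List, Tuple


def linear_schedule(step_percent: int, interval_minutes: int) -> List[Tuple[int, int]]:
    """Shift traffic in equal steps until 100% is on the new version."""
    if step_percent <= 0:
        raise ValueError("step_percent must be positive")
    schedule, percent, minute = [], 0, 0
    while percent < 100:
        percent = min(100, percent + step_percent)
        schedule.append((minute, percent))
        minute += interval_minutes
    return schedule


def canary_schedule(first_percent: int, wait_minutes: int) -> List[Tuple[int, int]]:
    """Shift a small share first, then everything after the wait."""
    return [(0, first_percent), (wait_minutes, 100)]


print(linear_schedule(10, 10))   # ten steps, finishing at minute 90
print(canary_schedule(10, 10))   # two steps, finishing at minute 10
```

Canary finishes sooner but exposes all users after one checkpoint; linear takes longer and gives an alarm more chances to trigger a rollback.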
Hooks (Lambda):
- BeforeAllowTraffic: run tests before shift
- AfterAllowTraffic: run validation after shift
AWS CodePipeline
Pipeline Structure:
- Source: CodeCommit, GitHub, S3
- Build: CodeBuild
- Test: CodeBuild, 3rd party
- Deploy: CodeDeploy, CloudFormation, ECS, S3
- Approval: Manual approval step
Artifacts:
- Passed between stages via S3
- Encrypted
Event-Driven:
- EventBridge (formerly CloudWatch Events) triggers the pipeline on source change
Example Flow:
GitHub push → Build (CodeBuild) → Test → Manual Approval → Deploy (CodeDeploy)
TIER 3 - GOOD TO KNOW (5-10% of score)
10. AMAZON ECS (Elastic Container Service)
Launch Types
EC2 Launch Type:
- You manage EC2 instances
- Install ECS agent
- More control, cheaper for sustained workloads
Fargate:
- Serverless containers
- No EC2 management
- Pay per vCPU + memory
- Use case: simplicity, variable workloads
Core Concepts
Task Definition:
Blueprint for containers
Like Docker Compose file
{
  "family": "my-app",
  "containerDefinitions": [{
    "name": "web",
    "image": "nginx",
    "memory": 512,
    "cpu": 256,
    "portMappings": [{"containerPort": 80}]
  }]
}
Task:
- Running instance of task definition
- Single execution
Service:
- Maintains desired count of tasks
- Auto-restart failed tasks
- Load balancing
IAM Roles
Task Role:
- Permissions for application (access S3, DynamoDB)
- Attached to task definition
Execution Role:
- Permissions for ECS agent (pull image from ECR, write logs)
- Required for Fargate
Deployment
- Rolling update: gradually replace tasks
- Blue/Green: via CodeDeploy
11. AMAZON COGNITO
Cognito User Pools
- User directory (sign up, sign in)
- JWT tokens issued
- MFA, password policies
- Hosted UI available
- Integration: API Gateway, ALB
Cognito Identity Pools
- Provide temporary AWS credentials
- Federated identities (Google, Facebook, SAML)
- Guest users (unauthenticated)
- Returns STS credentials
User Pools vs Identity Pools
Feature | User Pools | Identity Pools
Purpose | Authentication | Authorization (AWS access)
Output | JWT tokens | Temporary AWS credentials
Use Case | Login to app | Access AWS services from app
Common Pattern
User → Cognito User Pool (login) → JWT token
→ Exchange via Identity Pool → STS credentials
→ Access S3, DynamoDB
12. AMAZON ELASTICACHE
Engine Types
Redis:
- Advanced data structures (sorted sets, lists, hashes)
- Pub/Sub messaging
- Persistence to disk
- Multi-AZ with auto-failover
- Backup and restore
- Use case: complex data types, persistence needed
Memcached:
- Simple key-value
- Multi-threaded
- No persistence
- No replication
- Use case: simple caching, horizontal scaling
Caching Strategies
Lazy Loading (Cache-Aside):
def get(key):
    # Read from cache first
    data = cache.get(key)
    if data is None:
        # Cache miss: read from DB, then populate the cache
        data = db.get(key)
        cache.set(key, data)
    return data
- Pros: only requested data cached
- Cons: cache miss penalty, stale data possible
Write-Through:
# Write to DB and cache together
db.set(key, data)
cache.set(key, data)
- Pros: cache always fresh
- Cons: write penalty, unused data cached
Hybrid: Lazy Loading + TTL
- Set expiration on cached items
- Balance freshness vs performance
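Lazy loading with a TTL can be sketched without any cache server. The class below is a hypothetical in-memory stand-in for ElastiCache: `loader` plays the role of the database read, and the injectable `clock` exists only to make expiry easy to demonstrate.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Cache-aside reads where every entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._store: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expires_at)

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # cache hit, still fresh
        value = loader(key)  # miss or expired: read from the source of truth
        self._store[key] = (value, now + self.ttl_seconds)
        return value


fake_now = [0.0]
db_reads = []


def load_from_db(key: str) -> str:
    db_reads.append(key)
    return key.upper()


cache = TtlCache(ttl_seconds=60, clock=lambda: fake_now[0])
cache.get_or_load("user:1", load_from_db)   # miss: reads the database
cache.get_or_load("user:1", load_from_db)   # hit: served from cache
fake_now[0] = 61.0
cache.get_or_load("user:1", load_from_db)   # expired: reads the database again
print(len(db_reads), "database reads")
```

A shorter TTL reduces staleness at the cost of more database reads; a longer TTL does the opposite.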
Use Cases
- Session storage (Redis)
- Database query caching
- Rate limiting (Redis counters)
- Leaderboards (Redis sorted sets)
- Real-time analytics
🔄 SERVICE COMPARISON MATRIX - Exam Decision Making
📊 Service Comparison Tables
"Which service should you use?" questions make up roughly 20-25% of the exam. Master this section for easy points!
🔹 Compute Services: Lambda vs ECS vs EC2
Scenario | Use Lambda | Use ECS/Fargate | Use EC2
Event-driven, short tasks | ✅ Perfect fit | ❌ Overkill | ❌ Too much overhead
Long-running processes (> 15min) | ❌ 15min limit | ✅ No time limit | ✅ No time limit
Microservices architecture | ✅ Good (simple) | ✅ Better (complex) | ⚠️ Manual setup
Need custom runtime/libraries | ⚠️ Layers or container | ✅ Full Docker support | ✅ Total control
Cost optimization priority | ✅ Pay per request | ⚠️ Always running | ⚠️ Always running
Predictable, steady traffic | ⚠️ Can be expensive | ✅ Better cost/performance | ✅ Reserved instances
🔹 Storage Services: S3 vs EFS vs EBS
Scenario | Use S3 | Use EFS | Use EBS
Object storage (images, videos) | ✅ Perfect fit | ❌ Wrong use case | ❌ Wrong use case
Shared file system (multiple instances) | ⚠️ Not a file system | ✅ NFS protocol | ❌ Single instance only
Lambda needs persistent storage | ✅ Simple integration | ✅ Mount as file system | ❌ Can't attach
Database storage (EC2) | ❌ Not designed for this | ⚠️ Can work but slow | ✅ Block storage
Serverless application | ✅ Native integration | ✅ Lambda can mount | ❌ Not serverless
🔹 Database Services: DynamoDB vs RDS vs ElastiCache
Scenario | Use DynamoDB | Use RDS | Use ElastiCache
Simple key-value access | ✅ Fast, scalable | ⚠️ Overkill | ⚠️ For caching only
Complex SQL queries, JOINs | ❌ No JOINs (PartiQL only) | ✅ Full SQL | ❌ Not a database
Need ACID transactions | ✅ Has transactions | ✅ Full ACID | ❌ Not transactional
Millisecond latency required | ✅ Single-digit ms | ⚠️ 5-10ms typical | ✅ Sub-ms with cache
Reduce database load | ⚠️ Not for caching | ⚠️ Not for caching | ✅ Cache layer
Session storage | ✅ Can work | ⚠️ Overkill | ✅ Redis perfect
Unpredictable scaling | ✅ Auto-scales | ⚠️ Manual scaling | ⚠️ Manual scaling
🔹 Messaging Services: SQS vs SNS vs EventBridge vs Kinesis
Scenario | Use SQS | Use SNS | Use EventBridge | Use Kinesis
Decouple components | ✅ Pull model | ✅ Push model | ✅ Event routing | ⚠️ Overkill
Fan-out (1 to many) | ❌ 1:1 only | ✅ Perfect for this | ✅ With rules | ✅ Multiple consumers
Message ordering required | ✅ FIFO queue | ⚠️ With FIFO topic | ❌ No guarantee | ✅ Per shard
Real-time stream processing | ❌ Not streaming | ❌ Not streaming | ⚠️ Simple events | ✅ High throughput
Replay messages | ❌ Deleted after read | ❌ No replay | ⚠️ Only with an event archive | ✅ 24h-365 days
Multiple consumers same data | ❌ Deleted after read | ✅ Each gets copy | ✅ Multiple targets | ✅ Each reads stream
Email/SMS notifications | ❌ No built-in | ✅ Native support | ⚠️ Via SNS | ❌ No built-in
Event-driven architecture | ✅ Simple | ✅ Simple | ✅ Advanced routing | ⚠️ For streams
🔹 API Services: API Gateway REST vs HTTP vs WebSocket
Feature | REST API | HTTP API | WebSocket API
Use case | Full-featured APIs | Simple, low-cost APIs | Real-time bidirectional
Cost | $$$ (Higher) | $ (70% cheaper) | $$ (Per connection)
Request validation | ✅ Built-in | ❌ Manual | ❌ Manual
Caching | ✅ Built-in | ❌ Not available | ❌ Not available
Usage plans & API keys | ✅ Yes | ❌ No | ❌ No
Performance | Good | Better (lower latency) | Persistent connection
Exam recommendation | Default choice | If "cost-effective" mentioned | If "real-time", "chat", "push"
🔹 DynamoDB: GSI vs LSI
Aspect | Global Secondary Index (GSI) | Local Secondary Index (LSI)
Partition Key | ✅ Different from base table | ❌ SAME as base table
Sort Key | ✅ Different or none | ✅ Different from base table
When to create | ✅ Anytime (add/delete) | ❌ Table creation ONLY
Capacity | ✅ Own RCU/WCU (separate) | ❌ Shares with base table
Consistency | ❌ Eventually consistent only | ✅ Eventually OR strongly
Max per table | 20 | 5
Use case | Query by different attributes | Query same PK, different sort
Exam default | ✅ Use this unless specified | ⚠️ Rare, specific scenarios
🌳 Decision Trees - Quick Selection Guide
Choose Database Service:
Start: Need database?
│
├─ Need SQL, complex queries, JOINs?
│ └─ YES → RDS
│ ├─ High availability? → Multi-AZ
│ ├─ Scale reads? → Read Replicas
│ └─ Connection pooling? → RDS Proxy
│
├─ Simple key-value, high scale?
│ └─ YES → DynamoDB
│ ├─ Need caching? → DAX
│ ├─ Query non-PK? → GSI
│ └─ Change tracking? → Streams
│
├─ Need to reduce DB load?
│ └─ YES → ElastiCache
│ ├─ Simple caching? → Memcached
│ ├─ Advanced (pub/sub)? → Redis
│ └─ Session storage? → Redis
│
└─ File storage?
└─ YES → See storage decision tree
Choose Messaging Service:
Start: Need async communication?
│
├─ Need message ordering?
│ ├─ YES, also need replay? → Kinesis
│ └─ YES, no replay? → SQS FIFO
│
├─ Fan-out (1 message → many consumers)?
│ ├─ Simple fan-out? → SNS
│ ├─ Complex routing rules? → EventBridge
│ └─ Each consumer needs copy? → SNS → SQS (fan-out pattern)
│
├─ Decouple, throttle, buffer?
│ └─ SQS Standard
│ ├─ High throughput? → Standard
│ ├─ Exactly-once? → FIFO
│ └─ Delay messages? → Delay Queue
│
├─ Real-time streaming, analytics?
│ └─ Kinesis Data Streams
│ └─ Need analysis? → Kinesis Analytics
│
└─ Scheduled events?
└─ EventBridge (cron rules)
Choose Storage Service:
Start: Need storage?
│
├─ Object storage (images, files, backups)?
│ └─ S3
│ ├─ Infrequent access? → S3-IA
│ ├─ Archive? → Glacier
│ ├─ Fast retrieval? → S3 Standard
│ └─ Need CDN? → S3 + CloudFront
│
├─ Block storage for EC2?
│ └─ EBS
│ ├─ High performance? → io2
│ ├─ Balanced? → gp3
│ └─ Throughput? → st1
│
├─ Shared file system (multiple instances)?
│ └─ EFS
│ ├─ Lambda needs files? → EFS mount
│ ├─ Linux instances? → EFS
│ └─ Windows instances? → FSx for Windows
│
└─ Lambda persistent storage?
├─ Small files? → S3
├─ File system? → EFS
└─ Temp during execution? → /tmp (512MB-10GB)
Choose Compute Service:
Start: Need compute?
│
├─ Event-driven, short tasks (< 15min)?
│ └─ Lambda
│ ├─ Cold start issue? → Provisioned concurrency
│ ├─ Need VPC? → VPC + NAT Gateway
│ └─ Long runtime? → Step Functions
│
├─ Containers, microservices?
│ └─ ECS/Fargate
│ ├─ Don't want servers? → Fargate
│ ├─ Cost optimize? → EC2 launch type
│ └─ Need Kubernetes? → EKS
│
├─ Long-running, always-on?
│ └─ EC2
│ ├─ Predictable? → Reserved Instances
│ ├─ Variable? → Spot/On-Demand
│ └─ Auto-scale? → Auto Scaling Group
│
└─ Simple web app, don't want infra?
└─ Elastic Beanstalk
└─ Full automation, PaaS
🎯 Exam Scenario Playbook - Common Question Patterns
Pattern 1: "Most cost-effective solution"
If scenario mentions... | Answer likely involves... | Why
Sporadic traffic, unpredictable | Lambda (not EC2) | Pay per request vs always running
Simple API, no advanced features | HTTP API (not REST API) | 70% cheaper
Infrequent access data | S3-IA or Glacier | Lower storage cost
Reserved capacity, predictable | Provisioned (not on-demand) | Upfront commitment = discount
Pattern 2: "Minimize latency"
If scenario mentions... | Answer likely involves... | Why
Database queries slow | ElastiCache (Redis/Memcached) | In-memory cache
DynamoDB slow queries | DAX (DynamoDB Accelerator) | Microsecond latency
Lambda cold starts | Provisioned concurrency | Keep functions warm
Global users, slow content | CloudFront CDN | Edge caching
Pattern 3: "Ensure high availability"
If scenario mentions... | Answer likely involves... | Why
Database availability | RDS Multi-AZ | Automatic failover
Lambda reliability | Multiple AZs (automatic) | Lambda is multi-AZ by default
Application resilience | Multi-region deployment | Region failure protection
Load balancing | ALB + multiple AZs | Distribute traffic
Pattern 4: "Loose coupling / Decoupling"
If scenario mentions... | Answer likely involves... | Why
Components shouldn't wait | SQS between services | Async processing
One producer, many consumers | SNS fan-out | Publish/subscribe
Service failures shouldn't cascade | SQS + DLQ | Buffer + retry
Event-driven architecture | EventBridge | Event routing
Pattern 5: "Security best practices"
If scenario mentions... | Answer likely involves... | Red flags to avoid
Store database credentials | Secrets Manager or Systems Manager Parameter Store (SecureString) | ❌ Environment variables
API authentication | IAM or Cognito | ❌ API keys alone
Encrypt data at rest | KMS encryption | ❌ Client-side only
Access AWS resources | IAM roles (not keys) | ❌ Hardcoded access keys
NOT TODO / SKIM ONLY
EC2 (BASICS ONLY)
Instance Roles
- IAM role attached to EC2
- Credentials available via metadata:
http://169.254.169.254/latest/meta-data/iam/security-credentials/role-name
- Auto-rotated by AWS
- Never hardcode credentials in EC2
User Data
- Script runs at instance launch
- Use case: install software, configure environment
#!/bin/bash
yum update -y
yum install -y httpd
systemctl start httpd
Metadata
- Instance info available at http://169.254.169.254/latest/meta-data/
- Examples: instance-id, public-ip, iam/security-credentials
Just know the concepts; no deep dive needed
RDS (BASICS ONLY)
Core Points
- Managed relational database
- Engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora
- Multi-AZ: high availability (synchronous replication)
- Read Replicas: scale reads (asynchronous replication)
RDS Proxy
- Connection pooling
- Reduce DB connections from Lambda
- IAM authentication
- Use case: Lambda with many concurrent executions
Just know when to use it; the details are not needed
VPC (BASICS ONLY)
Security Groups
- Stateful firewall
- Allow rules only (no deny)
- Applies to ENI (network interface)
- Default: deny all inbound, allow all outbound
NACLs (Network ACLs)
- Stateless firewall
- Apply to subnet level
- Allow and deny rules
- Numbered rules (evaluated in order)
Only need to understand security groups for Lambda and EC2
CloudFormation (CONCEPTS ONLY)
Infrastructure as Code
- JSON/YAML templates
- Declare resources
- Stack: collection of resources
- Change sets: preview changes
Key Sections
AWSTemplateFormatVersion: '2010-09-09'
Parameters:
  InstanceType:
    Type: String
    Default: t2.micro
Resources:
  MyEC2Instance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceType
Outputs:
  InstanceId:
    Value: !Ref MyEC2Instance
Skim the syntax; no need to memorize it
Elastic Beanstalk (CONCEPTS ONLY)
Platform as a Service
- Deploy applications without managing infrastructure
- Supports: Node.js, Python, Java, .NET, PHP, Ruby, Go, Docker
- Handles: capacity provisioning, load balancing, auto-scaling, monitoring
Deployment Policies
- All at once: downtime
- Rolling: batch updates
- Rolling with additional batch: maintain capacity
- Immutable: new instances, swap
- Blue/Green: manual via swap environment URLs
Know that it is PaaS and its use cases; skip the details
Step Functions (CONCEPTS ONLY)
Serverless Workflow Orchestration
- Coordinate Lambda functions, ECS tasks
- State machine (JSON definition)
- Visual workflow
State Types
- Task: do work (Lambda, ECS, etc.)
- Choice: branching logic
- Parallel: concurrent execution
- Wait: delay
- Succeed/Fail: end states
Skim only; rarely appears on the exam
Kinesis (CONCEPTS ONLY)
Kinesis Data Streams
- Real-time data streaming
- Shards: read/write capacity units
- Consumers: Lambda, KCL, Kinesis Data Analytics
- Retention: 1-365 days
Kinesis vs SQS
Feature | Kinesis | SQS
Ordering | Per shard | FIFO queue only
Retention | Up to 365 days | Up to 14 days
Consumers | Multiple read same data | Message deleted after read
Use Case | Real-time analytics, log streaming | Decouple components
Know the differences; no deep dive
AWS SAM (CONCEPTS ONLY)
Serverless Application Model
- Extension of CloudFormation
- Simplified syntax for Lambda, API Gateway, DynamoDB
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  MyFunction:
    Type: AWS::Serverless::Function
    Properties:
      Handler: index.handler
      Runtime: nodejs18.x
      Events:
        Api:
          Type: Api
          Properties:
            Path: /hello
            Method: get
SAM CLI
- sam build: build application
- sam local invoke: test locally
- sam deploy: deploy to AWS
Know what it is; no hands-on practice needed
SKIP ENTIRELY
- AppSync: GraphQL service, rarely tested
- Amplify: Frontend framework, not very relevant to DVA
- Systems Manager: Parameter Store appears in a few questions, skim it
- Secrets Manager: Similar to Parameter Store, skim it
- Route 53: DNS, not a DVA focus
- CloudFront: CDN, know how it integrates with S3 and API Gateway
- AWS X-Ray: Already covered in the CloudWatch section
EXAM TIPS & TRICKS
Common Scenarios
Scenario: "Reduce costs for Lambda"
→ Answers: Tune memory (more memory = more CPU, so a shorter billed duration can cost less), reduce package size, use layers, cap runaway scaling with reserved concurrency
Scenario: "Lambda timeout issues"
→ Check: timeout config (max 15 min), async processing, Step Functions for long workflows
Scenario: "DynamoDB throttling"
→ Answers: Increase RCU/WCU, use on-demand mode, implement exponential backoff, check GSI capacity
Scenario: "API Gateway CORS errors"
→ Enable CORS, Lambda proxy must return CORS headers
Scenario: "Lambda cannot access VPC resource"
→ Check: Lambda in VPC, security groups, NAT gateway for internet access
Scenario: "Secure API access"
→ Answers: IAM auth (AWS services), Cognito (users), Lambda authorizer (custom)
Scenario: "Process S3 uploads asynchronously"
→ S3 Event → SQS → Lambda (decoupled, handles failures)
Scenario: "Order processing must not lose messages"
→ SQS FIFO queue + DLQ
Scenario: "Fan-out notifications"
→ SNS topic with multiple SQS subscriptions
Scenario: "Cache database queries"
→ ElastiCache (Redis for complex, Memcached for simple)
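The exponential backoff mentioned in the DynamoDB throttling scenario can be sketched as a list of retry delays. This is an illustration under assumptions: the function name, base delay, and cap are hypothetical, and it uses the common "full jitter" variant rather than any specific SDK's retry logic.

```python
import random
from typing import List, Optional


def backoff_delays(retries: int, base_seconds: float = 0.1,
                   cap_seconds: float = 5.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """Delay before each retry: random between 0 and min(cap, base * 2^attempt)."""
    rng = rng or random.Random()
    return [
        rng.uniform(0, min(cap_seconds, base_seconds * 2 ** attempt))
        for attempt in range(retries)
    ]


# The upper bound doubles each attempt (0.1s, 0.2s, 0.4s, ...) until it hits the cap.
for attempt, delay in enumerate(backoff_delays(6)):
    print(f"retry {attempt + 1}: wait {delay:.3f}s")
```

Randomizing the delay spreads retries from many clients over time, so a throttled table is not hit by synchronized retry bursts.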
Keywords to Watch
Keyword | Think
"Real-time" | Kinesis, Lambda, WebSocket API
"Serverless" | Lambda, API Gateway, DynamoDB
"Cost-effective" | On-demand pricing, auto-scaling, S3 lifecycle
"High availability" | Multi-AZ, DynamoDB global tables, S3
"Decouple" | SQS, SNS, EventBridge
"Ordered processing" | SQS FIFO, Kinesis (per shard)
"Temporary credentials" | STS, IAM roles, Cognito Identity Pools
"Least privilege" | IAM policies with specific actions/resources
"Audit trail" | CloudWatch Logs, CloudTrail, X-Ray
"Rollback" | Lambda aliases, CodeDeploy blue/green
Time Management
- 65 questions, 130 minutes = 2 min/question
- Flag uncertain questions (review later)
- Read question twice, eliminate wrong answers
- Watch for "MOST cost-effective", "LEAST operational overhead"
Common Traps
- Lambda memory: More memory = more CPU (not just RAM)
- DynamoDB GSI: Eventually consistent only, separate capacity
- SQS visibility timeout: Must be > Lambda timeout (AWS recommends at least 6x the function timeout)
- API Gateway stages: Must deploy to activate changes
- IAM: Explicit Deny always wins
- S3 versioning: Cannot disable, only suspend
Day Before Exam
- Review flashcards: Lambda limits, DynamoDB capacity calculation
- Skim service FAQs: Lambda, DynamoDB, API Gateway
- Sleep well (8 hours)
- No cramming (trust your prep)
QUICK REFERENCE TABLES
Lambda Limits
Limit | Value
Memory | 128MB - 10,240MB
Timeout | 900s (15 min)
/tmp storage | 512MB - 10GB
Deployment package | 50MB (zipped), 250MB (unzipped)
Concurrent executions | 1000/region (default)
DynamoDB Capacity
Operation | Capacity
1 RCU | 1 strongly consistent read/s (≤4KB)
1 RCU | 2 eventually consistent reads/s (≤4KB)
1 WCU | 1 write/s (≤1KB)
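Capacity questions usually reduce to rounding the item size up to the unit size and multiplying by the request rate. A minimal calculator, with hypothetical function names, based only on the unit definitions above:

```python
import math


def read_capacity_units(item_size_kb: float, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """RCUs needed: item size rounds up to 4KB blocks; eventual consistency halves it."""
    units_per_read = math.ceil(item_size_kb / 4)
    total = units_per_read * reads_per_second
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_size_kb: float, writes_per_second: int) -> int:
    """WCUs needed: item size rounds up to 1KB blocks."""
    return math.ceil(item_size_kb) * writes_per_second


# 10 strongly consistent reads/s of 6KB items: ceil(6/4) = 2 RCU each.
print(read_capacity_units(6, 10))
# The same workload with eventually consistent reads needs half.
print(read_capacity_units(6, 10, strongly_consistent=False))
# 5 writes/s of 2.5KB items: ceil(2.5) = 3 WCU each.
print(write_capacity_units(2.5, 5))
```

The rounding happens per item before multiplying, which is the step most often missed in exam calculations.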
API Gateway Limits
Limit | Value
Throttle | 10,000 RPS
Burst | 5,000 requests
Timeout | 29 seconds
Payload | 10MB
SQS Limits
Limit | Value
Message size | 256KB
Visibility timeout | 0s - 12h (default 30s)
Retention | 1 min - 14 days (default 4 days)
Delay | 0s - 15 min
FIFO throughput | 300 TPS (3000 with batching)
STUDY CHECKLIST
Week 1-2: TIER 1
- ☐ Lambda: all triggers, concurrency, error handling
- ☐ DynamoDB: keys, indexes, streams, capacity modes
- ☐ API Gateway: integration types, auth, stages
- ☐ IAM: roles, policies, trust vs permission
- ☐ CloudWatch: metrics, logs, alarms, X-Ray
Week 3-4: TIER 2
- ☐ S3: events, storage classes, encryption
- ☐ SQS: Standard vs FIFO, DLQ, polling
- ☐ SNS: topics, fan-out pattern
- ☐ CodePipeline/Build/Deploy: CI/CD flow
Week 5: TIER 3 + Review
- ☐ ECS: task definitions, IAM roles
- ☐ Cognito: User Pools vs Identity Pools
- ☐ ElastiCache: caching strategies
- ☐ Practice exam 1: identify gaps
Week 6: Practice & Polish
- ☐ Practice exam 2-4
- ☐ Review all mistakes
- ☐ Flashcard drill: 200+ cards
- ☐ Rest day before exam
📋 FINAL REVIEW CHECKLIST - 48 Hours Before Exam
🔷 Lambda Must-Know
- ☑ Invocation types: Sync, Async, Stream
- ☑ Timeout: default 3s, max 900s (15min)
- ☑ Concurrency: 1000 default, reserved vs provisioned
- ☑ VPC = ENI + NAT for internet
- ☑ Error handling: DLQ for async only
- ☑ Memory = CPU (128MB-10GB)
🔶 DynamoDB Must-Know
- ☑ PK required, SK optional
- ☑ GSI: different PK, create anytime, own capacity
- ☑ LSI: same PK, at creation, shared capacity
- ☑ Strong vs Eventually consistent
- ☑ Streams: 24h retention, CDC
- ☑ DAX: microsecond cache
🔷 API Gateway Must-Know
- ☑ REST vs HTTP (HTTP 70% cheaper)
- ☑ Timeout: 29s max
- ☑ Stages: dev, prod, etc.
- ☑ Caching: REST only, not HTTP
- ☑ Auth: IAM, Cognito, Lambda authorizer
- ☑ WebSocket: real-time, bidirectional
🔶 IAM Must-Know
- ☑ Least privilege principle
- ☑ Roles > Users for AWS resources
- ☑ Policy evaluation: Explicit Deny wins
- ☑ Never hardcode credentials
- ☑ Use Secrets Manager for passwords
- ☑ Resource-based vs Identity-based
🔷 SQS Must-Know
- ☑ Standard: at-least-once, no order
- ☑ FIFO: exactly-once, ordered, 300 TPS
- ☑ Visibility timeout > Lambda timeout (AWS recommends 6x)
- ☑ DLQ after X failed attempts
- ☑ Long polling (1-20s) better than short
- ☑ Message retention: 1min - 14 days
🔶 SNS Must-Know
- ☑ Pub/Sub model, fan-out
- ☑ Push-based (not pull)
- ☑ Subscribers: SQS, Lambda, HTTP, Email, SMS
- ☑ FIFO topics (with FIFO queues)
- ☑ Message filtering available
- ☑ SNS → SQS = fan-out pattern
🔷 S3 Must-Know
- ☑ Object storage (not file system)
- ☑ Storage classes: Standard, IA, Glacier
- ☑ Versioning: protect from deletes
- ☑ Encryption: SSE-S3, SSE-KMS, SSE-C
- ☑ CORS for cross-origin access
- ☑ Presigned URLs: temporary access
🔶 CloudWatch Must-Know
- ☑ Metrics: 5min (basic) or 1min (detailed monitoring)
- ☑ Logs: aggregation, insights, exports
- ☑ Alarms: trigger actions
- ☑ X-Ray: distributed tracing
- ☑ Custom metrics via PutMetricData
- ☑ Logs retention: 1 day - never expire
⏰ Last 24 Hours Strategy
Morning (4 hours)
- Review all mnemonics (LAMBDA TIME, SAS, etc.)
- Skim comparison tables
- Practice 1 full-length exam
- Review missed questions
Afternoon (3 hours)
- Focus on weak areas from practice
- Review service limits/numbers
- Read red flag keywords
- Quick skim of TIER 3 services
Evening (2 hours)
- Light review of decision trees
- Read exam scenario playbook
- Relax - no heavy studying
- Early sleep (8 hours!)
Exam Day
- Breakfast + caffeine (normal routine)
- Arrive 30min early
- Brain dump: write down mnemonics on scratch
- Read questions CAREFULLY (twice!)
🎯 During Exam Tips:
- Flag & skip if uncertain (come back later)
- Eliminate obviously wrong answers first
- Look for keywords: "cost-effective", "most secure", "minimize latency"
- Watch for red flags: hardcoded credentials, wrong service choice
- Time management: 2 minutes per question average (130min ÷ 65 questions)
- First pass: Answer easy ones (45-60 min for ~40 questions)
- Second pass: Review flagged (45-60 min for ~25 questions)
- Buffer: Keep 15-20 min for final review
📊 Score Breakdown - What You Need
Domain | Weight | Questions (~) | Pass Target (720/1000)
Development with AWS Services | 32% | ~21 | 15+ correct (71%)
Security | 26% | ~17 | 12+ correct (71%)
Deployment | 24% | ~16 | 11+ correct (69%)
Troubleshooting & Optimization | 18% | ~11 | 8+ correct (73%)
Total: Need ~46-48 correct out of 65 questions (71-74%)
Note: 15 of the 65 questions are unscored (trial questions for future exams) and are not identified, so aim for 75%+ to be safe!
FINAL NOTES
This guide covers 80% of exam content.
Focus on understanding TIER 1 services deeply.
TIER 2 services: know integration patterns.
TIER 3 services: recognize use cases.
Success formula:
- 40% hands-on practice (actually build things)
- 40% practice exams (learn from mistakes)
- 20% reading/watching (theory)
You got this! 🚀