AWS Solutions Architect Associate: Domain 2 - Define Performant Architectures

Complete guide to Domain 2: defining performant architectures using caching, databases, content delivery, and optimization techniques.

Introduction

Domain 2 focuses on designing architectures that deliver optimal performance. This includes selecting appropriate database solutions, implementing caching strategies, and optimizing content delivery. This domain represents approximately 28% of the exam.

Database Selection

RDBMS vs NoSQL Selection Matrix

RDBMS (RDS):
✓ Structured data with defined schema
✓ ACID compliance required
✓ Complex queries and joins
✓ Vertical scaling suitable
Examples: PostgreSQL, MySQL, Oracle

NoSQL (DynamoDB):
✓ Unstructured or semi-structured data
✓ High write throughput
✓ Horizontal scaling needed
✓ Flexible schema
Examples: Document, Key-value, Time-series

RDS Performance Optimization

Read Replicas

    # Create read replica
    aws rds create-db-instance-read-replica \
        --db-instance-identifier prod-db-replica \
        --source-db-instance-identifier prod-db \
        --availability-zone us-east-1b

Parameter Group Tuning

    # PostgreSQL production settings (illustrative values; scale them to the instance's RAM)
    shared_buffers = 256MB          # roughly 25% of server RAM
    effective_cache_size = 1GB      # roughly 50-75% of RAM
    work_mem = 4MB                  # rule of thumb: total_ram / (max_connections * 2)
    random_page_cost = 1.1          # for SSD storage
    max_connections = 500
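The sizing rules quoted in the settings above can be turned into a small calculator. This is a minimal sketch, assuming the rules of thumb are applied literally; the function name `suggest_pg_memory_settings` and the 75% choice for `effective_cache_size` are illustrative assumptions, not an official tuning formula.

```python
def suggest_pg_memory_settings(total_ram_mb: int, max_connections: int) -> dict:
    """Apply the rule-of-thumb ratios from the text; all results are in MB."""
    if total_ram_mb <= 0 or max_connections <= 0:
        raise ValueError("total_ram_mb and max_connections must be positive")
    return {
        # about 25% of RAM
        "shared_buffers_mb": total_ram_mb // 4,
        # upper end of the 50-75% range (an assumption for this sketch)
        "effective_cache_size_mb": (total_ram_mb * 3) // 4,
        # total_ram / (max_connections * 2), never below 1 MB
        "work_mem_mb": max(1, total_ram_mb // (max_connections * 2)),
    }


# Example: a 16 GB instance with 500 connections
settings = suggest_pg_memory_settings(total_ram_mb=16384, max_connections=500)
print(settings)
```

Treat the output as a starting point and validate it under real load, since `work_mem` is allocated per sort or hash operation rather than per connection.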

DynamoDB Performance

    import boto3
    from boto3.dynamodb.conditions import Key

    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table('Orders')

    # Query with a projection expression (fetch only the needed attributes)
    response = table.query(
        KeyConditionExpression=Key('customer_id').eq('123'),
        ProjectionExpression='order_id, total_amount, created_at'
    )

    # Batch writes for higher throughput. batch_writer() buffers items, sends
    # them in BatchWriteItem calls of up to 25 items, and resends unprocessed ones.
    # `items` is assumed to be a list of dicts that include the table's key attributes.
    with table.batch_writer(
        overwrite_by_pkeys=['customer_id', 'order_id']
    ) as batch:
        for item in items:
            batch.put_item(Item=item)

Caching Strategies

ElastiCache Implementation

Redis for Sessions and Real-time Data

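    import json
    import redis

    cache = redis.Redis(
        host='elasticache-endpoint.amazonaws.com',
        port=6379,
        decode_responses=True
    )

    # Session caching (session_id and session_data come from the application)
    cache.setex(
        f'session:{session_id}',
        3600,  # 1 hour TTL
        json.dumps(session_data)
    )

    # Real-time metrics: set the TTL only when the key is first created, so the
    # counter expires 24 hours after its first increment instead of having its
    # TTL pushed back on every page view
    if cache.incr('page_views:today') == 1:
        cache.expire('page_views:today', 86400)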

Memcached for Object Caching

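    import json
    import memcache

    cache = memcache.Client(['elasticache-endpoint:11211'])

    # Cache frequently accessed data
    # (db and User are the application's ORM session and model)
    def get_user(user_id):
        cached = cache.get(f'user:{user_id}')
        if cached is not None:
            return json.loads(cached)

        user = db.query(User).filter_by(id=user_id).first()
        if user is None:
            return None

        # ORM objects are not JSON-serializable; to_dict() is an assumed helper
        user_data = user.to_dict()
        cache.set(f'user:{user_id}', json.dumps(user_data), time=3600)
        return user_data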

Application-Level Caching Patterns

Cache-Aside Pattern

1. Application checks cache
2. If miss, query database
3. Store result in cache
4. Return result to application
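The cache-aside steps can be sketched with an in-memory stand-in for the cache. This is a minimal illustration, not an ElastiCache client: `TTLCache`, `cache_aside_get`, and `fake_db_lookup` are hypothetical names, and the database is simulated by a function that records its calls.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache with per-key expiry (stand-in for Redis/Memcached)."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired entries count as misses
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (time.monotonic() + ttl_seconds, value)


def cache_aside_get(cache: TTLCache, key: str,
                    load_from_db: Callable[[str], Optional[Any]],
                    ttl_seconds: float = 3600) -> Optional[Any]:
    cached = cache.get(key)               # 1. check the cache
    if cached is not None:
        return cached
    value = load_from_db(key)             # 2. on a miss, query the database
    if value is not None:
        cache.set(key, value, ttl_seconds)  # 3. store the result
    return value                          # 4. return to the caller


db_calls = []

def fake_db_lookup(key: str) -> dict:
    db_calls.append(key)  # record each simulated database hit
    return {"id": key, "name": "example"}


cache = TTLCache()
first = cache_aside_get(cache, "user:1", fake_db_lookup)
second = cache_aside_get(cache, "user:1", fake_db_lookup)
print(len(db_calls))
```

The second call is served from the cache, so the simulated database is queried only once.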

Write-Through Pattern

1. Write to cache
2. Write to database
3. Return success to application

(Ensures consistency but higher latency)
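A write-through flow can be sketched as follows. The class name `WriteThroughStore` and the injected `db_write` callable are assumptions for illustration; the rollback on failure is one possible way to keep the cache from getting ahead of the database, not something the pattern mandates.

```python
from typing import Any, Callable, Dict

_MISSING = object()  # sentinel: the key was not cached before the write


class WriteThroughStore:
    def __init__(self, db_write: Callable[[str, Any], None]) -> None:
        self.cache: Dict[str, Any] = {}
        self._db_write = db_write

    def write(self, key: str, value: Any) -> None:
        previous = self.cache.get(key, _MISSING)
        self.cache[key] = value             # 1. write to cache
        try:
            self._db_write(key, value)      # 2. write to database (synchronous)
        except Exception:
            # Undo the cache write so readers never see unpersisted data
            if previous is _MISSING:
                self.cache.pop(key, None)
            else:
                self.cache[key] = previous
            raise
        # 3. returning normally signals success to the caller


database: Dict[str, Any] = {}  # simulated database
store = WriteThroughStore(db_write=database.__setitem__)
store.write("order:42", {"total_amount": 99.5})
print(database["order:42"] == store.cache["order:42"])
```

Because the caller waits for both writes, latency is the sum of the two, which is the trade-off noted above.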

Write-Behind Pattern

1. Write to cache immediately
2. Asynchronously write to database

(Low latency but risk of data loss)
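The write-behind flow can be sketched with a pending queue that is flushed later. `WriteBehindStore` and its `flush` method are hypothetical names; a real system would run the flush on a background worker or timer, while this sketch calls it explicitly to keep the behavior visible.

```python
from collections import deque
from typing import Any, Deque, Dict, Tuple


class WriteBehindStore:
    def __init__(self) -> None:
        self.cache: Dict[str, Any] = {}
        self.database: Dict[str, Any] = {}  # simulated database
        self._pending: Deque[Tuple[str, Any]] = deque()

    def write(self, key: str, value: Any) -> None:
        self.cache[key] = value              # 1. write to cache immediately
        self._pending.append((key, value))   # queue the database write

    def flush(self) -> int:
        """2. Persist queued writes in order; returns how many were written."""
        flushed = 0
        while self._pending:
            key, value = self._pending.popleft()
            self.database[key] = value
            flushed += 1
        return flushed


store = WriteBehindStore()
store.write("views:home", 10)
store.write("views:home", 11)
visible_before_flush = "views:home" in store.database
flushed_count = store.flush()
print(visible_before_flush, flushed_count, store.database["views:home"])
```

Anything still in the pending queue is lost if the process dies before a flush, which is the data-loss risk mentioned above.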

Content Delivery Network (CDN)

CloudFront Distribution

    # Cache policies are shown by managed-policy name for readability;
    # CloudFront itself expects the policy's ID in CachePolicyId.
    Distribution:
      Enabled: true
      OriginDomainName: my-app.example.com
      Behaviors:
        - PathPattern: /api/*
          CachePolicyId: Managed-CachingDisabled
          ViewerProtocolPolicy: https-only
          AllowedMethods: [GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE]
        - PathPattern: /static/*
          CachePolicyId: Managed-CachingOptimized
          ViewerProtocolPolicy: https-only
          Compress: true
      # The default behavior applies to every path not matched above,
      # so a separate /* behavior is not needed.
      DefaultCacheBehavior:
        ViewerProtocolPolicy: redirect-to-https
        CachePolicyId: Managed-CachingOptimized
      HttpVersion: http2and3
      PriceClass: PriceClass_100

Cache Invalidation

    # Invalidate specific paths
    aws cloudfront create-invalidation \
        --distribution-id E123456 \
        --paths "/index.html" "/api/users/*"

    # Invalidate entire distribution
    aws cloudfront create-invalidation \
        --distribution-id E123456 \
        --paths "/*"
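The same invalidation can be issued from Python with boto3's `create_invalidation`. This is a sketch under assumptions: `build_invalidation_batch` and `invalidate` are hypothetical helper names, and the distribution ID is the placeholder used in the CLI example. The batch builder is kept separate so it can be checked without AWS credentials.

```python
import time
from typing import Dict, List, Optional


def build_invalidation_batch(paths: List[str],
                             caller_reference: Optional[str] = None) -> Dict:
    """Build the InvalidationBatch structure that create_invalidation expects."""
    if not paths:
        raise ValueError("at least one path is required")
    return {
        "Paths": {"Quantity": len(paths), "Items": list(paths)},
        # CallerReference must be unique per request; a timestamp is one option
        "CallerReference": caller_reference or f"invalidation-{time.time_ns()}",
    }


def invalidate(distribution_id: str, paths: List[str]) -> Dict:
    """Send the invalidation (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("cloudfront")
    return client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


batch = build_invalidation_batch(
    ["/index.html", "/api/users/*"], caller_reference="deploy-001"
)
print(batch)
# invalidate("E123456", ["/index.html"])  # uncomment to call AWS
```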

Database Optimization Techniques

Connection Pooling with RDS Proxy

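    # Simplified view. In CloudFormation the connection-pool settings belong to
    # the AWS::RDS::DBProxyTargetGroup resource (ConnectionPoolConfigurationInfo),
    # not to the proxy itself.
    DBProxy:
      DBProxyName: my-proxy
      EngineFamily: MYSQL
      Auth:
        - AuthScheme: SECRETS
          SecretArn: arn:aws:secretsmanager:...
      MaxConnectionsPercent: 100
      MaxIdleConnectionsPercent: 50
      ConnectionBorrowTimeout: 120
      SessionPinningFilters:
        - EXCLUDE_VARIABLE_SETS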

Query Optimization

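    -- Add indexes for frequently queried columns
    CREATE INDEX idx_customer_email ON customers(email);
    CREATE INDEX idx_orders_customer ON orders(customer_id, created_at);

    -- Use EXPLAIN ANALYZE to inspect the actual query plan
    EXPLAIN ANALYZE
    SELECT o.*, c.name
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    WHERE o.created_at > '2025-01-01';

    -- Look for index scans rather than sequential scans, for example:
    --   Nested Loop
    --     Index Scan on orders using idx_orders_customer
    --     Index Scan on customers (primary key on id)
    -- The planner's choice depends on table size and statistics. Because
    -- idx_orders_customer leads with customer_id, a filter on created_at alone
    -- may not use it; an index on orders(created_at) serves this query better.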

Partitioning Strategy

    -- Time-based partitioning for large tables
    -- (assumes orders was created with PARTITION BY RANGE on its date column)
    CREATE TABLE orders_2025 PARTITION OF orders
        FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

    CREATE TABLE orders_2024 PARTITION OF orders
        FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
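Yearly partitions follow a fixed pattern, so the DDL can be generated instead of written by hand. A minimal sketch, assuming a hypothetical helper named `yearly_partition_ddl` and a trusted table name, since identifiers are interpolated directly into the statement:

```python
def yearly_partition_ddl(parent: str, year: int) -> str:
    """Return the CREATE TABLE statement for one yearly range partition."""
    if not parent.isidentifier():
        # Guard against unsafe names; the value is placed directly in SQL
        raise ValueError(f"unsafe table name: {parent!r}")
    return (
        f"CREATE TABLE {parent}_{year} PARTITION OF {parent}\n"
        f"    FOR VALUES FROM ('{year}-01-01') TO ('{year + 1}-01-01');"
    )


# The upper bound is exclusive, so consecutive years do not overlap
for year in (2024, 2025):
    print(yearly_partition_ddl("orders", year))
```

Running a helper like this ahead of each new year avoids inserts failing because no matching partition exists.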

Lambda Optimization

Memory and Execution Time

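    import json
    import time

    # Approximate x86 on-demand rates; check current pricing for your region
    PRICE_PER_GB_SECOND = 0.0000166667
    PRICE_PER_REQUEST = 0.0000002

    def lambda_handler(event, context):
        start = time.time()

        # Expensive operation (expensive_computation is defined elsewhere)
        result = expensive_computation()

        duration = time.time() - start
        # memory_limit_in_mb is the configured limit, not the memory actually used
        memory_mb = int(context.memory_limit_in_mb)
        cost = duration * (memory_mb / 1024) * PRICE_PER_GB_SECOND + PRICE_PER_REQUEST

        print(f"Duration: {duration:.3f}s")
        print(f"Memory limit: {memory_mb}MB")
        print(f"Estimated cost: ${cost:.10f}")

        return {
            'statusCode': 200,
            'body': json.dumps(result)
        }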

Memory vs Cost Analysis

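Duration cost = memory (GB) * duration (s) * price per GB-second (about $0.0000166667 on x86; varies by region and architecture).

128MB: 1s function = 0.125 GB-s = about $0.0000021 (slowest, cheapest)
1024MB: 0.3s function = 0.3 GB-s = about $0.0000050 (3.3x faster for about 2.4x the cost)
10240MB: 0.1s function = 1.0 GB-s = about $0.0000167 (fastest, most expensive)

Adding memory lowers cost only when duration falls by more than the memory multiple, which can happen for CPU-bound functions because Lambda allocates CPU in proportion to memory. Optimal memory often lies in the middle ranges (512-1024MB).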
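The memory and duration trade-off can be checked with a small calculator. This is a sketch that assumes approximate x86 on-demand rates ($0.0000166667 per GB-second and $0.0000002 per request); the function name `lambda_invocation_cost` is hypothetical, and current pricing for your region should be substituted.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # assumed x86 on-demand rate
PRICE_PER_REQUEST = 0.0000002       # assumed rate: $0.20 per million requests


def lambda_invocation_cost(memory_mb: int, duration_s: float,
                           include_request: bool = False) -> float:
    """Estimate the cost of one invocation in USD."""
    gb_seconds = (memory_mb / 1024) * duration_s
    cost = gb_seconds * PRICE_PER_GB_SECOND
    if include_request:
        cost += PRICE_PER_REQUEST
    return cost


# Compare the three configurations discussed above
for memory_mb, duration_s in [(128, 1.0), (1024, 0.3), (10240, 0.1)]:
    cost = lambda_invocation_cost(memory_mb, duration_s)
    print(f"{memory_mb}MB for {duration_s}s: ${cost:.7f}")
```

A higher memory setting becomes cheaper only if the measured duration drops by more than the factor by which memory was increased.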

Monitoring and Performance Metrics

CloudWatch Metrics

    import boto3
    from datetime import datetime, timezone

    cloudwatch = boto3.client('cloudwatch')

    def track_query_performance(table_name, query_type, duration):
        """Publish a query duration (in milliseconds) as a custom metric."""
        cloudwatch.put_metric_data(
            Namespace='DatabasePerformance',
            MetricData=[
                {
                    'MetricName': 'QueryDuration',
                    'Value': duration,
                    'Unit': 'Milliseconds',
                    # Timezone-aware timestamp; datetime.utcnow() is deprecated
                    'Timestamp': datetime.now(timezone.utc),
                    'Dimensions': [
                        {'Name': 'Table', 'Value': table_name},
                        {'Name': 'QueryType', 'Value': query_type}
                    ]
                }
            ]
        )

Setting Alarms

    # Alert on high read latency
    # (ReadLatency is reported in seconds, so a threshold of 0.1 means 100 ms)
    aws cloudwatch put-metric-alarm \
        --alarm-name high-db-latency \
        --alarm-description "Alert when RDS read latency > 100ms" \
        --metric-name ReadLatency \
        --namespace AWS/RDS \
        --dimensions Name=DBInstanceIdentifier,Value=prod-db \
        --statistic Average \
        --period 300 \
        --threshold 0.1 \
        --comparison-operator GreaterThanThreshold \
        --evaluation-periods 2
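How `--period`, `--threshold`, and `--evaluation-periods` interact can be illustrated with a simplified model. This sketch assumes every period has a datapoint and that all evaluated periods must breach; real CloudWatch alarms also support "M out of N" datapoints and missing-data handling, which are not modeled here. The function name `alarm_state` is hypothetical.

```python
from typing import List


def alarm_state(period_averages: List[float], threshold: float,
                evaluation_periods: int) -> str:
    """Simplified GreaterThanThreshold evaluation over the most recent periods."""
    if len(period_averages) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = period_averages[-evaluation_periods:]
    # The alarm fires only if every recent period average exceeds the threshold
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Five-minute averages of ReadLatency in seconds (RDS reports latency in seconds)
print(alarm_state([0.02, 0.15, 0.04], threshold=0.1, evaluation_periods=2))
print(alarm_state([0.02, 0.15, 0.12], threshold=0.1, evaluation_periods=2))
```

With two evaluation periods of 300 seconds, a single slow five-minute window does not trigger the alarm; latency has to stay above the threshold for ten minutes.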

Common Exam Questions

Q: Which caching solution provides the lowest latency?
A: ElastiCache (Redis/Memcached), which typically provides sub-millisecond latency.

Q: What’s the optimal CloudFront cache behavior for API endpoints?
A: Use the managed cache policy “Managed-CachingDisabled” so that requests always reach the origin.

Q: How does RDS Proxy improve application performance?
A: By pooling and multiplexing connections, which reduces connection overhead on the database and lowers latency.

Key Takeaways

  1. Choose the right database for your use case
  2. Implement a multi-layer caching strategy
  3. Use CloudFront for static content delivery
  4. Optimize database queries with proper indexing
  5. Monitor performance metrics continuously
  6. Right-size Lambda functions for cost optimization

This post is licensed under CC BY 4.0 by the author.