Start Here
Everything on this site is organised into six pillars. You don't need to read everything — start with what matters most to you right now, or follow the recommended beginner path below.
How this site is organised
1. Six pillars cover the full breadth of system design and cloud architecture.
2. Each pillar lists its topics in a deliberate reading order.
3. Topics with a filled dot (●) have a published article. Empty topics are coming soon.
4. Articles link to the next and previous topic so you stay in context.
Recommended first reads
All six pillars
01
Fundamentals of System Design
Core concepts every engineer must understand before going deeper.
- 01 What is System Design 12 min
- 02 Client–Server Architecture & Request Flow Soon
- 03 Latency vs Throughput Soon
- 04 Availability vs Reliability vs Durability Soon
- 05 Scalability Basics (Vertical vs Horizontal) Soon
- 06 Consistency Models Soon
- 07 CAP Theorem (Practical) Soon
- 08 Stateless vs Stateful Systems Soon
- 09 Caching Fundamentals Soon
- 10 Load Balancing Basics Soon
- 11 Synchronous vs Asynchronous Systems Soon
- 12 Monolith vs Microservices Soon
- 13 Tight vs Loose Coupling Soon
- 14 How Systems Fail Soon
- 01 DNS Resolution Soon
- 02 CDN & Edge Caching Soon
- 03 Load Balancing & Traffic Distribution Soon
- 04 API Gateway Patterns Soon
- 05 VPC Architecture & Internal Traffic Flow Soon
- 06 Outbound Traffic & NAT Patterns Soon
- 07 Network Observability (VPC Flow Logs, Metrics) 10 min
- 08 Anycast vs Unicast Routing Soon
- 09 DNS Failover & Health Checks Soon
- 10 Reverse Proxies & Edge Routing Patterns Soon
- 11 TCP & Connection Management (Practical View) Soon
- 12 TLS / HTTPS & Encryption Overhead Soon
- 13 Rate Limiting & Traffic Shaping Soon
- 14 Handling Traffic Spikes & Burst Patterns Soon
- 15 API Gateway vs Load Balancer (When to Use What) Soon
- 16 Network Security Layers (WAF, Security Groups, NACLs) Soon
- 17 Egress Control & Private Endpoints Soon
- 18 Multi-Region Traffic Routing (Geo, Failover, Latency-Based) Soon
- 19 Service-to-Service Networking (East-West Traffic) Soon
- 20 Service Mesh (Conceptual Overview) Soon
03
Data & Storage Architecture
Choosing and designing the right data layer for your system.
- 01 Data Modeling Fundamentals Soon
- 02 SQL vs NoSQL Tradeoffs Soon
- 03 Query Patterns & Access Design Soon
- 04 Replication Models Soon
- 05 Sharding Strategies Soon
- 06 Caching Layers Soon
- 07 Data Consistency & Integrity Soon
- 08 Data Lakes vs Warehouses Soon
- 09 Indexing Strategies (B-Tree, LSM - Conceptual) Soon
- 10 Storage Engines (Conceptual Overview) Soon
- 11 Data Fragmentation & Storage Efficiency 8 min
- 12 Schema Design & Migrations Soon
- 13 Backups & Disaster Recovery Soon
- 14 Storage Tiering (Hot vs Cold Data) Soon
- 15 Data Pipelines (Batch vs Streaming) Soon
04
Scalability & Performance Patterns
Patterns for systems that need to handle growth gracefully.
- 01 Horizontal vs Vertical Scaling Soon
- 02 Load Distribution & Work Partitioning Soon
- 03 Queue-Based Systems 9 min
- 04 Async vs Sync Processing Soon
- 05 Rate Limiting Soon
- 06 Backpressure Soon
- 07 Concurrency & Parallel Processing Soon
- 08 Performance vs Scalability Soon
- 09 Bulkheading (Workload Isolation) Soon
- 10 Workload Prioritization & QoS Soon
- 11 Batching & Aggregation Soon
- 12 Caching for Performance Soon
- 13 Auto Scaling & Load-Based Scaling Soon
- 14 Handling Traffic Spikes Soon
05
Reliability & Resilience
Building systems that fail gracefully and recover automatically.
- 01 Failure Modes in Distributed Systems Soon
- 02 Timeouts & Failure Detection Soon
- 03 Retries and Backoff Soon
- 04 Circuit Breaker Soon
- 05 Graceful Degradation Soon
- 06 Multi-Region Failover Soon
- 07 Observability Soon
- 08 Chaos Engineering Soon
- 09 Bulkheading (Isolation of Failures) Soon
- 10 Idempotency & Safe Retries Soon
- 11 Redundancy & Replication for Resilience Soon
- 12 Disaster Recovery Strategies Soon
- 13 SLIs, SLOs, and Error Budgets Soon
- 14 Alerting & Incident Response Soon
06
Cost-Aware Architecture
Engineering decisions that respect budget as a hard constraint.
- 01 Cost vs Performance Tradeoffs Soon
- 02 Cost Drivers in Cloud Architecture Soon
- 03 Serverless Cost Patterns Soon
- 04 Network Cost Patterns (NAT, Data Transfer) Soon
- 05 Storage Tiering Soon
- 06 FinOps Basics Soon
- 07 NAT Cost Optimisation 12 min
- 08 Data Transfer & Egress Costs Soon
- 09 Idle Resources & Overprovisioning Soon
- 10 Cost Monitoring & Alerting Soon
- 11 Budgeting & Forecasting Soon
- 12 Cost Allocation (Tagging, Chargeback) Soon