System design interviews are one of the biggest filters between senior engineering roles and staff-plus offers at FAANG companies. Most candidates fail not because they lack technical knowledge, but because they lack a repeatable framework for structuring their answers under pressure. This cheat sheet gives you that framework: a five-step process for organizing any system design answer, of the kind used by engineers who land offers at Google, Meta, Amazon, Apple, and Netflix.
Step one is always requirements clarification. Never jump into drawing boxes and arrows. Spend the first three to five minutes asking pointed questions: What is the expected scale? What are the latency requirements? Is this read-heavy or write-heavy? Do we need strong consistency or is eventual consistency acceptable? Interviewers are testing whether you think before you build.
Step two is capacity estimation. Back-of-the-envelope math is your secret weapon. Know these numbers cold: a single server handles roughly 10K to 50K concurrent connections. A standard SSD gives you 100K to 200K IOPS. Network bandwidth on a modern data center link is 10 to 25 Gbps. If the system needs to serve 100 million daily active users, break it down: at one request per user per day, that is roughly 1,200 requests per second on average (100M divided by 86,400 seconds), with peaks at 3x to 5x. Show the interviewer you understand scale through numbers, not hand-waving.
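The arithmetic above can be sketched in a few lines. This assumes a baseline of one request per user per day, which is a deliberately simple starting point; in a real interview you would adjust that multiplier to the product being discussed.

```python
# Back-of-the-envelope capacity estimate for 100M daily active users.
# The requests-per-user figure is an assumed baseline, not a universal fact.
daily_active_users = 100_000_000
requests_per_user_per_day = 1
seconds_per_day = 86_400

avg_rps = daily_active_users * requests_per_user_per_day / seconds_per_day
peak_rps = avg_rps * 5  # peaks run 3x to 5x average; plan for the worst case

print(f"average: {avg_rps:,.0f} rps")  # ~1,157 rps
print(f"peak:    {peak_rps:,.0f} rps")  # ~5,787 rps
```

Saying the formula out loud ("100 million over 86,400 seconds") matters as much as the result: it shows the interviewer you can scale the estimate up or down when they change the assumptions.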
Step three is high-level design. Draw the core components: clients, load balancers, application servers, databases, caches, and message queues. Keep it simple at first. A typical web-scale system has a CDN layer in front, an API gateway or load balancer, a stateless application tier, a caching layer like Redis or Memcached, a primary database (relational or NoSQL depending on the access pattern), and an async processing layer for heavy work.
Step four is detailed design. This is where you go deep on the two or three most critical components. For a URL shortener, that means the hashing strategy and the database schema. For a chat application, that means the WebSocket connection management and message delivery guarantees. For a news feed, that means the fan-out strategy (push vs pull vs hybrid). Always discuss tradeoffs: "We could use fan-out on write for faster reads, but that increases write amplification for users with millions of followers. A hybrid approach handles celebrity accounts differently."
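The hybrid fan-out tradeoff is easier to discuss with a concrete sketch in hand. The code below is illustrative only: the data structures (`followers`, `feeds`, `posts_by_author`) and the `CELEBRITY_THRESHOLD` cutoff are hypothetical stand-ins for what would be distributed stores in a real system.

```python
# Hypothetical sketch of hybrid fan-out for a news feed.
# Small authors are pushed on write; "celebrity" authors are pulled on read.
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # assumed follower cutoff for the pull path

followers = defaultdict(set)        # author -> follower ids
feeds = defaultdict(list)           # user -> precomputed feed (push path)
posts_by_author = defaultdict(list) # author -> posts (pull path)

def publish(author, post_id):
    posts_by_author[author].append(post_id)
    if len(followers[author]) < CELEBRITY_THRESHOLD:
        # Fan-out on write: copy the post into every follower's feed.
        for follower in followers[author]:
            feeds[follower].append(post_id)
    # Celebrity posts stay in posts_by_author and are merged at read time,
    # avoiding millions of writes per post.

def read_feed(user, following):
    timeline = list(feeds[user])  # cheap: already materialized
    for author in following:
        if len(followers[author]) >= CELEBRITY_THRESHOLD:
            timeline.extend(posts_by_author[author])  # fan-out on read
    return timeline
```

The design choice to articulate: write amplification is bounded by the threshold, and read cost grows only with the number of celebrity accounts a user follows.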
Step five is bottlenecks and monitoring. Proactively identify what could go wrong. Single points of failure, hot partitions, thundering herd problems, cache stampedes. Then propose solutions: circuit breakers, rate limiting, consistent hashing, read replicas, sharding strategies. Mention observability: distributed tracing, metrics dashboards, alerting thresholds.
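Of the mitigations listed above, rate limiting is the one interviewers most often ask you to sketch. Here is a minimal token-bucket limiter, one common algorithm for it; this is an illustrative sketch, not production code (a real deployment would need thread safety and a distributed store like Redis).

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter (illustrative sketch)."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        # Refill tokens based on the time elapsed since the last call.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # request admitted
        return False      # request rejected or queued
```

The bucket's `capacity` absorbs short bursts while `rate` enforces the long-run average, which is exactly the distinction to call out when discussing thundering herds.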
Key patterns you must know for any FAANG system design round. Consistent hashing for distributed caches and partition assignment. Write-ahead logs for durability guarantees. Event sourcing for audit trails and replay capability. CQRS for separating read and write models at scale. Bloom filters for efficient membership testing. Leader election via consensus protocols for coordination.
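Consistent hashing is the pattern in that list you are most likely to be asked to explain in detail. A minimal sketch, assuming MD5 as the hash and 100 virtual nodes per physical node (both arbitrary choices for illustration):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Sketch of a consistent-hash ring with virtual nodes (illustrative)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self.ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        # Virtual nodes spread each physical node around the ring,
        # smoothing the load distribution.
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{node}#{i}"), node))

    def get(self, key):
        # Walk clockwise to the first virtual node at or after the key's hash.
        if not self.ring:
            return None
        idx = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[idx][1]
```

The talking point: when a node is added or removed, only the keys in its arc move, instead of nearly all keys as with naive `hash(key) % N` assignment.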
Database selection framework. Use relational databases (PostgreSQL, MySQL) when you need ACID transactions, complex joins, or strong consistency. Use document stores (MongoDB, DynamoDB) for flexible schemas and horizontal scaling. Use wide-column stores (Cassandra, HBase) for time-series data and high write throughput. Use graph databases (Neo4j) for relationship-heavy queries. Always justify your choice with the specific requirements.
Caching strategies matter. Cache-aside (lazy loading) is the default for most read-heavy workloads. Write-through caching ensures consistency but adds write latency. Write-behind (write-back) caching optimizes writes but risks data loss. Time-to-live values should match your consistency requirements. For session data, 30 minutes. For product catalogs, 5 to 15 minutes. For real-time leaderboards, seconds or less.
Communication patterns between services. Synchronous REST or gRPC for request-response flows where the caller needs an immediate answer. Asynchronous messaging via Kafka or RabbitMQ for decoupled, resilient workflows. gRPC with Protocol Buffers when you need type safety and performance between internal services. GraphQL when clients need flexible queries across multiple data sources.
The final tip: practice out loud. System design is a conversation, not a whiteboard exam. The interviewer wants to see your thought process, your ability to make and justify decisions, and your awareness of real-world constraints. Practice explaining your designs to a rubber duck, a friend, or a recording of yourself. The candidates who get offers are the ones who communicate clearly under pressure.