“NALSD is not System Design. It is operational physics under constraints.”
Most candidates fail Google’s Non-Abstract Large System Design (NALSD) round because they treat it like a generic whiteboarding interview (e.g., “Design Twitter”, “Add a Redis cache”, “Use Kafka”).
NALSD is fundamentally different.
You are usually given an existing system that is:
The interviewer is evaluating whether you can reason like a Production SRE, not a Cloud Architect.
The fastest way to fail NALSD is to assume software can solve physical constraints.
- The L4 "Cloud Architect" Mindset
- "To survive a region failure, I will synchronously replicate our 5 Petabyte database to Europe."
+ The L6 "Reliability Architect" Mindset
+ "Wait. 5 petabytes over a dedicated 10 Gbps link takes ~46 days just for the initial copy. Synchronous replication also adds a transatlantic round trip to every write, and physics puts a floor on that latency that eats into our 200ms SLO. We must use async replication and accept data staleness, or change the requirement."
If you draw a box on the whiteboard before doing the Capacity Math, you have already failed.
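The ~46 day figure above is plain arithmetic, and it is worth being able to reproduce it on demand. Here is a minimal sketch, assuming decimal units (1 PB = 10^15 bytes, 1 Gbps = 10^9 bits/s) and a link that is fully available to the transfer. The function name `transfer_days` and the `utilization` parameter are illustrative, not part of any standard tool.

```python
def transfer_days(data_bytes: float, link_bits_per_sec: float,
                  utilization: float = 1.0) -> float:
    """Days needed to push data_bytes through a link.

    utilization is the fraction of the link the transfer can actually use
    (an assumption; real links rarely sustain 100%).
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    seconds = (data_bytes * 8) / (link_bits_per_sec * utilization)
    return seconds / 86_400  # seconds per day


PB = 10**15    # bytes, decimal petabyte (assumed)
GBPS = 10**9   # bits per second

if __name__ == "__main__":
    # 5 PB over a dedicated 10 Gbps link at full line rate.
    print(f"{transfer_days(5 * PB, 10 * GBPS):.1f} days")  # 46.3 days
```

Lowering `utilization` only makes the number worse, which is the point: the best case is already measured in weeks.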
Do not jump around. Top-tier candidates execute this exact sequence. Memorize this order.
Do not draw anything yet.
Now you can design. But design defensively.
If the interviewer says: “Latency is spiking in South America,” you must immediately shift into Incident Command mode.
- The Failing Sequence (The Debugger)
- 1. "Let me check the application logs."
- 2. "I'll look at the database CPU."
- 3. "Let's find the root cause."
+ The Passing Sequence (The Commander)
+ 1. "Is this a global outage or just South America? (Clarify Blast Radius)"
+ 2. "I am draining traffic from South America to US-East immediately. (Mitigate & Stabilize)"
+ 3. "Now that users are safe, I will investigate the root cause. (Debug)"
Skipping Step 2 (Stabilization) is an automatic down-level. Root cause analysis comes after user safety.
Keep this checklist in your head. Have you addressed all five?
| S | The NALSD Check | What Interviewers Listen For |
|---|---|---|
| Scope | What exactly are we fixing? | Are you narrowing the problem to avoid boiling the ocean? |
| Scale | What is the math? | Are you quantifying load (QPS, IOPS) instead of guessing? |
| SLIs | How do we measure it? | Do you anchor your design trade-offs in metrics? |
| Scarcity | What are the limits? | Do you respect physical limits (Network, Disk, Memory)? |
| Safety | How does it break? | Do you fail visibly and cleanly (Graceful Degradation)? |
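The Scale row is the one most easily turned into numbers. Here is a back-of-envelope sketch of going from daily request volume to peak QPS and a replica count. Every input (daily requests, peak-to-average ratio, per-replica capacity, spare replicas) is a hypothetical placeholder that you would replace with figures the interviewer gives you, and the function names are illustrative.

```python
import math


def peak_qps(daily_requests: float, peak_to_average: float) -> float:
    """Average QPS over a day, scaled by an assumed peak-to-average ratio."""
    return daily_requests / 86_400 * peak_to_average


def replicas_needed(qps: float, qps_per_replica: float, spare: int = 2) -> int:
    """Replicas required to serve qps, plus spare replicas for failures
    and rollouts (spare=2 corresponds to N+2 provisioning)."""
    return math.ceil(qps / qps_per_replica) + spare


if __name__ == "__main__":
    # Hypothetical inputs: 864M requests/day, 3x peak, 2,000 QPS per replica.
    qps = peak_qps(864_000_000, peak_to_average=3)
    print(f"peak QPS: {qps:,.0f}")                      # peak QPS: 30,000
    print(f"replicas: {replicas_needed(qps, 2_000)}")   # replicas: 17
```

Stating the inputs out loud before computing lets the interviewer correct an assumption early, rather than after you have built a design on it.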
Use these sparingly to signal operational maturity to the hiring committee:
This file teaches you what interviewers expect. But NALSD is a high-pressure, 45-minute verbal sprint.
Knowing the flowchart won’t help you if you freeze when the interviewer says: “Actually, your fallback database just ran out of inodes. Now what?”
To pass, you must train your reflexes.
I built The Complete Google SRE Career Launchpad to simulate these exact conditions. It includes:
👉 Get the Complete Google SRE Career Launchpad Here
Free resources create awareness.
Simulation changes outcomes.