System Design Round

This was the deepest evaluation.

Main Question: Design a pipeline that ingests, processes, and stores restaurant impression data at petabyte scale.

They evaluated:

A) Requirements clarification
- Logged-in vs logged-out?
- What metrics?
- What fields?
- Real-time vs batch?

B) Architecture
- Messaging layer
- Storage format
- Bronze/silver/gold
- Partitioning strategy
- Processing tools

C) Specific probing questions:
- What tools do you use to consume from Kafka?
- How do you avoid consuming the same message twice?
- How do you ensure idempotency?
- How do you enforce schema?
- What if producer changes data type?
- What format do you store in?
- How do you handle late-arriving data?
- What’s your Kafka partitioning strategy?
- How do you backfill historical data?
- How do you monitor pipeline health?

This was testing:
- Streaming fundamentals
- Schema evolution
- Correctness guarantees
- Scale reasoning
- Tradeoffs
- Practical experience depth
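
One possible way to answer the questions on duplicate consumption, idempotency, and late-arriving data is to make the write itself idempotent, so that an at-least-once replay from Kafka cannot create duplicate rows. The sketch below is only an illustration under stated assumptions, not the answer the interviewers were looking for: it uses an in-memory dict in place of Kafka and a real table, and the `Impression` fields, `IdempotentSink`, and `process_batch` are hypothetical names. It assumes the producer attaches a unique `event_id` to every impression.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Impression:
    # Hypothetical event shape; field names are assumptions for illustration.
    event_id: str         # unique ID assigned by the producer (assumed)
    restaurant_id: str
    event_time: datetime  # when the impression happened (event time)


class IdempotentSink:
    """In-memory stand-in for a keyed table that supports upserts."""

    def __init__(self):
        # partition key (event date) -> {event_id: Impression}
        self.partitions = {}

    def upsert(self, event):
        # Partition by event time rather than arrival time, so a late event
        # lands in the partition for the day it actually occurred.
        day = event.event_time.date().isoformat()
        rows = self.partitions.setdefault(day, {})
        is_new = event.event_id not in rows
        rows[event.event_id] = event  # rewriting the same key changes nothing
        return is_new

    def count(self, day):
        return len(self.partitions.get(day, {}))


def process_batch(sink, batch):
    """Apply a batch of at-least-once deliveries; return how many rows were new."""
    return sum(sink.upsert(event) for event in batch)


def ts(day, hour):
    return datetime(2024, 1, day, hour, tzinfo=timezone.utc)


sink = IdempotentSink()
batch = [Impression("e1", "r42", ts(1, 12)), Impression("e2", "r42", ts(1, 13))]

first = process_batch(sink, batch)   # both events are new
replay = process_batch(sink, batch)  # same batch redelivered: nothing new
# An event from Jan 1 that arrives after later data has already been processed.
late = process_batch(sink, [Impression("e3", "r7", ts(1, 23))])
```

Because the key is `event_id`, redelivering the same batch leaves the table unchanged, and the late event is added to the 2024-01-01 partition instead of the partition for its arrival day. In a real pipeline the same idea would usually be expressed as a merge or upsert into the storage layer, which is a design choice the interview questions leave open.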