Pergunta de entrevista da empresa Anonymous Content

Find duplicate rows in a PySpark DataFrame. Remove duplicates but keep the latest row (based on timestamp). Find employees who logged in for 3 consecutive days. Pivot sales data: rows (month, sales) → columns (Jan, Feb, Mar…). Explode JSON column (with arrays) into multiple rows. Read data from Kafka using PySpark Structured Streaming. Write a PySpark job that increments data daily using partition pruning.