They asked me about my projects and experience, followed by a few questions on Databricks and Azure Data Factory. Then came one scenario-based PySpark question. We have two datasets:

1. employee
id, role, name
1, DE, abc
2, mgr, def

2. employee-attendance
id, att_date, is_present
1,2023-07-16,1
1,2023-07-16,1
I first wrote the answer in SQL, but the interviewer specifically asked me to write it in PySpark, so I did that. He then asked: what if an employee has two roles? In that case, he said, my answer would be incorrect. I replied that I would have to apply over(partition by ), but after the interview I realized that it would not affect the result. So the question was partially wrong. I couldn't clear this round.