Interview question from Spoonshot

Why is cross-entropy used as a loss function for classification? What are the advantages of Transformers over RNNs/LSTMs?
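
For the first part, one common line of answer is that cross-entropy is the negative log-likelihood of the correct class under a softmax output, and that its gradient with respect to the logits is just the predicted probabilities minus the one-hot target. The sketch below illustrates this in plain Python; the function names are hypothetical and chosen only for this example, and it is not meant as a full answer to the question.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-probability assigned to the correct class index.
    probs = softmax(logits)
    return -math.log(probs[target])

def cross_entropy_grad(logits, target):
    # Gradient w.r.t. the logits: softmax(logits) minus the one-hot target.
    probs = softmax(logits)
    return [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]

if __name__ == "__main__":
    confident_right = cross_entropy([5.0, 0.0, 0.0], target=0)
    confident_wrong = cross_entropy([5.0, 0.0, 0.0], target=1)
    # A confident wrong prediction is penalized far more than a confident right one.
    print(confident_right, confident_wrong)
    print(cross_entropy_grad([5.0, 0.0, 0.0], target=1))
```

The loss grows without bound as the probability of the true class approaches zero, which is one reason it tends to give stronger training signal for badly misclassified examples than squared error on probabilities would.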