Two Sigma interview question

How do you decide which variables are the most important in a regression?

Interview answer

Anonymous

Oct 11, 2018

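There are many ways to do this. Let's say we have n features. An expensive method would be to fit n regressions such that the ith model leaves out the ith feature. Then pick the k features whose removal hurts the fit the most, i.e. the k features associated with the k worst models when compared to a model using all features. Another way is to use PCA and keep the directions with the largest variance, as these often contain most of the information in the features. Note, though, that principal components are linear combinations of features rather than single features, and PCA ignores the target, so large variance does not by itself imply importance for the regression.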
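
A minimal sketch of the leave-one-feature-out idea, assuming ordinary least squares with an intercept and using the increase in residual sum of squares as the measure of how much "worse" each reduced model is. The function names (`fit_rss`, `leave_one_out_importance`, `top_k_features`) and the synthetic data are made up for illustration; they are not part of the original answer. Scores here are computed in-sample, so a held-out or cross-validated error would probably be a safer choice in practice.

```python
import numpy as np


def fit_rss(X, y):
    """Fit OLS with an intercept and return the residual sum of squares."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def leave_one_out_importance(X, y):
    """For each feature i, refit without column i and report how much
    the RSS grows relative to the model that uses all features."""
    full_rss = fit_rss(X, y)
    increases = []
    for i in range(X.shape[1]):
        X_reduced = np.delete(X, i, axis=1)  # drop the ith feature
        increases.append(fit_rss(X_reduced, y) - full_rss)
    return np.array(increases)


def top_k_features(X, y, k):
    """Indices of the k features whose removal hurts the fit the most."""
    scores = leave_one_out_importance(X, y)
    order = np.argsort(scores)[::-1]  # largest RSS increase first
    return [int(i) for i in order[:k]]


# Hypothetical synthetic data: only features 0 and 2 drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.1, size=500)

scores = leave_one_out_importance(X, y)
ranking = top_k_features(X, y, k=2)
```

On this data `ranking` should contain features 0 and 2, since dropping either one removes real signal while dropping the other two only removes noise. This costs n + 1 fits, which is the expense the answer refers to, and correlated features can mask each other because a dropped feature's signal may be partly recovered from its neighbours.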