1. Asked to implement a simple Bag of Words (BoW) vectorizer from scratch using only the Python standard library.
1.a. In the BoW class I created a function whose name starts with '_'. The interviewer asked why I started it with '_'.
1.b. What do we do with words that are not in our vocabulary (say we are trying to vectorize a sentence that contains some words that are not in our corpus or vocabulary)?
1.c. Let's say the vocab grows to millions of words. How would you reduce the vocabulary size in this case, since the vectors are going to be very high-dimensional?
1.d. In the output vectors, you can see that most of the elements are zero. What does that signify?
1.e. Is this class CPU bound or GPU bound?
2. Spiral Matrix (LeetCode): discussion and scenarios.
Sigiloso
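1. Wrote the full implementation, including the calling code and its output.
1.a. Because it was a helper function (a leading '_' marks it as internal by convention).
1.b. The vocabulary is fixed by fit(), so the vector space cannot expand dynamically. The vector length must remain fixed, which means unseen words get no dimension of their own at transform time.
1.c. Limit the vocab size: keep only the top-k most frequent words, or use hashing.
1.d. Sparsity.
1.e. CPU bound.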
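
For reference, the answers above can be sketched roughly as follows. This is not the code written in the interview, just a minimal illustration under assumptions: the class name `BagOfWords`, the `max_features` parameter, the regex tokenizer, and the `hashed_vector` helper are all hypothetical names chosen for this example, and only the standard library is used.

```python
import re
import zlib
from collections import Counter


class BagOfWords:
    """Minimal Bag of Words vectorizer (illustrative sketch, not the interview code)."""

    def __init__(self, max_features=None):
        # max_features is a hypothetical knob for 1.c: keep only the top-k words.
        self.max_features = max_features
        self.vocabulary_ = {}

    def _tokenize(self, text):
        # Leading underscore (1.a): an internal helper, not part of the public API.
        return re.findall(r"[a-z0-9]+", text.lower())

    def fit(self, documents):
        counts = Counter()
        for doc in documents:
            counts.update(self._tokenize(doc))
        # most_common(None) returns every token, so None means "no limit".
        kept = [tok for tok, _ in counts.most_common(self.max_features)]
        # The vocabulary, and therefore the vector length, is frozen here (1.b).
        self.vocabulary_ = {tok: i for i, tok in enumerate(sorted(kept))}
        return self

    def transform(self, documents):
        vectors = []
        for doc in documents:
            vec = [0] * len(self.vocabulary_)
            for tok in self._tokenize(doc):
                idx = self.vocabulary_.get(tok)
                if idx is not None:  # out-of-vocabulary tokens are skipped
                    vec[idx] += 1
            vectors.append(vec)
        return vectors


def hashed_vector(text, n_buckets=8):
    """Hashing alternative for 1.c: fixed-size vector, no stored vocabulary.

    zlib.crc32 is used because it is deterministic across runs, unlike the
    built-in hash() for strings. Different tokens may collide in one bucket.
    """
    vec = [0] * n_buckets
    for tok in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(tok.encode("utf-8")) % n_buckets] += 1
    return vec


bow = BagOfWords().fit(["the cat sat", "the dog sat down"])
# Vocabulary order is alphabetical: cat, dog, down, sat, the.
print(bow.transform(["the cat saw the bird"]))  # [[1, 0, 0, 0, 2]]
```

In the printed vector, "saw" and "bird" were never seen by fit(), so they contribute nothing, and three of the five entries are zero even with this tiny vocabulary. That is the sparsity from 1.d. The work is plain Python loops and dictionary lookups, which fits the CPU-bound answer in 1.e.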