The phone interview was generally non-technical. The second stage was a take-home exercise: write a script that processes a CSV file of customer transactions, keeps a running mean and standard deviation of transaction amounts per customer ID, watches for large deviations of individual transaction amounts from the running mean and generates an alert whenever one is observed. This was to be done by implementing just one method, for which they had written the header definition. The instructions listed certain requirements: scalability, readability and code structure, logging and testing, and how simple it is to run and evaluate the code. Note that several of these involve a large element of subjectivity. There was no mention of persistence or of function-naming conventions.
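For readers unfamiliar with this kind of task, here is a minimal sketch of one way to keep a running mean and standard deviation per customer and flag outliers. It is not the solution I submitted (that used Pandas) and not the company's template. It uses Welford's online algorithm, and the column names `customer_id` and `amount`, the 3-standard-deviation threshold and the `min_count` warm-up are all assumptions made for illustration.

```python
import csv
import io
import math
from collections import defaultdict


class RunningStats:
    """Welford's online algorithm for a running mean and sample std."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; defined as 0.0 until there are 2 values.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def find_alerts(rows, threshold=3.0, min_count=3):
    """Return (customer_id, amount) pairs that deviate from the running mean.

    Each amount is compared against the statistics of that customer's
    earlier transactions, and only then folded into the statistics.
    """
    stats = defaultdict(RunningStats)
    alerts = []
    for row in rows:
        cid = row["customer_id"]      # assumed column name
        amount = float(row["amount"])  # assumed column name
        s = stats[cid]
        if s.n >= min_count and s.std > 0 and abs(amount - s.mean) > threshold * s.std:
            alerts.append((cid, amount))
        s.update(amount)
    return alerts


# Tiny in-memory CSV standing in for the real transactions file.
sample = io.StringIO(
    "customer_id,amount\n"
    "a,10\na,12\na,11\na,500\nb,5\nb,6\n"
)
alerts = find_alerts(csv.DictReader(sample))
print(alerts)
```

Because each row is processed once and only three numbers are stored per customer, this approach does not need the whole file in memory, which is one possible reading of the "scalability" requirement.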
I submitted the solution with the main module fully documented via docstrings and block comments, all unit tests passing, and a README containing a description of the general approach and some runtime instructions. The CSV file was processed with Pandas (loaded into a DataFrame), and the statistics were generated using standard Pandas row and column operations. The solution was highly scalable, as the only limit on the size of the dataset was whatever Pandas could process. The feedback was one of the worst examples of assessment I have ever seen.
* They claimed that the code was too hard to evaluate because of inadequate use of encapsulation! Encapsulation describes classes, not methods. They never asked for a class implementation, just a single method implementation. What is the relevance of encapsulation here? Absolutely none. This immediately tells you the assessment itself was flawed.
* The second claim was that the method names were not good, even though the only requirement was to implement a single method, which they had named themselves and which I did not rename. I added a couple of methods to support the statistics generation, and these were named sensibly, following standard Python method naming conventions.
* They claimed that I had added functionality which was not required, despite their own documentation stating that new functions and classes could be added where necessary. The only new methods were those supporting the statistics generation.
* They claimed that the solution wasn't scalable, which is total rubbish: the solution scales to the largest CSV file that Pandas can read into a DataFrame, which can be very large indeed, even before taking into account variables such as instance size, memory and cores.
* The last claim, which is probably the most ridiculous, was that the script changed the command-line argument (for the file path) from positional to named. This is false: I did not change the type of the command-line argument. Perhaps the assessor could not read Python properly or wasn't familiar with argparse.
Every claim about the solution was wrong, inaccurate or a misrepresentation of what I actually submitted. To candidates for this position, or for other Python positions within this company: beware!
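For anyone unsure about the positional versus named distinction in the command-line argument claim, here is a minimal sketch of how the two look in argparse. This is not the code from the exercise; the argument name `file_path` and the file name `transactions.csv` are made up for illustration.

```python
import argparse

# Positional: the value is given on its own, with no flag in front of it.
positional = argparse.ArgumentParser()
positional.add_argument("file_path")
args_pos = positional.parse_args(["transactions.csv"])

# Named (optional-style): the value must follow a flag such as --file-path.
named = argparse.ArgumentParser()
named.add_argument("--file-path", required=True)
args_named = named.parse_args(["--file-path", "transactions.csv"])

print(args_pos.file_path, args_named.file_path)
```

The difference is visible in the `add_argument` call itself: a name without leading dashes is positional, a name with leading dashes is named.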