Palantir Technologies interview question

Given a list of "threads", each with two variables (a start time and an end time), implement a function that returns all threads running at some time t. Optimize it (faster than O(n)).

Interview answers

Anonymous

Jul 5, 2012

I believe the best solution is as follows. Iterate through the threads and record where each one starts and ends. Store this in objects that each hold a time, the thread number, and whether the thread started or ended at that time. Sort the list of objects by time. You have now divided the timeline into ranges bounded by the start/stop times of the threads. Next, create an array of lists whose size equals the number of objects we created, BUT DO NOT fill this array with the objects! The array keeps track of which threads are running in each time range. Iterate through the sorted objects, filling in the lists in the array as you go.

Example:

Thread durations:
1: |---------|
2: |---|
3: |-------|

Final array: {(1),(1,2),(1),(1,3),(3),null}

Now you can do an O(log n) lookup with a basic binary search on the time ranges. The entry found in the array is the set of threads running at that time.
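The steps in this answer could be sketched in Python roughly as follows. This is only an illustration under assumptions that are not in the answer: the names `build_index` and `running_at` are made up, threads are passed as a dict of id to `(start, end)`, and a thread is treated as running for `start <= t < end`.

```python
from bisect import bisect_right

START, END = 0, 1  # event kinds (hypothetical names)


def build_index(threads):
    """Precompute which threads run in each time range.

    threads: dict mapping thread id -> (start, end); end is treated as
    exclusive, which is an assumption of this sketch.
    """
    events = []
    for tid, (start, end) in threads.items():
        events.append((start, START, tid))
        events.append((end, END, tid))
    events.sort()  # by time; a start sorts before an end at the same time

    times = []    # times[i] is the time at which range i begins
    running = []  # running[i] is the set of threads active during range i
    active = set()
    for time, kind, tid in events:
        if kind == START:
            active.add(tid)
        else:
            active.discard(tid)
        if times and times[-1] == time:
            running[-1] = frozenset(active)  # same boundary: overwrite
        else:
            times.append(time)
            running.append(frozenset(active))
    return times, running


def running_at(index, t):
    """Binary-search the range containing t and return its thread set."""
    times, running = index
    i = bisect_right(times, t) - 1
    return running[i] if i >= 0 else frozenset()


# Intervals chosen to reproduce the "Final array" in the example.
index = build_index({1: (0, 10), 2: (2, 5), 3: (7, 12)})
print(sorted(running_at(index, 3)))
print(sorted(running_at(index, 8)))
```

Finding the range is O(log n), but the stored sets can take O(n^2) space in total when many threads overlap, which is the cost the preprocessing pays for the fast lookup.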

2

Anonymous

Jul 5, 2012

Glassdoor formatted the picture so it is useless. Here is attempt 2:
1: |---------|.........
2: ... |---|.............
3: ............|-------|

1

Anonymous

May 17, 2012

The above solution will work and will be time-efficient. But the amount of storage required is going to be massive, especially since thread timings are generally specified with millisecond precision and threads can potentially run for a long time. A more space-efficient solution is to divide the time between min(start) and max(end) into ranges and store the corresponding threads for each range. Since a thread can cover a range fully or partially, we need to store each thread's start time and end time alongside it. The range size can be chosen based on the required trade-off between time and space efficiency. When the range size is 1, the solution effectively turns into the one mentioned above.
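A minimal Python sketch of this bucketing idea might look like the following. The names `build_buckets` and `running_in_buckets`, the `width` parameter, integer timestamps, a non-empty input, and the `start <= t < end` convention are all assumptions made for illustration.

```python
def build_buckets(threads, width):
    """Group threads into fixed-width time ranges.

    threads: non-empty dict mapping thread id -> (start, end), integer
    times, end exclusive (assumptions of this sketch).
    """
    lo = min(start for start, _ in threads.values())
    hi = max(end for _, end in threads.values())
    n_buckets = max(1, -(-(hi - lo) // width))  # ceiling division
    buckets = [[] for _ in range(n_buckets)]
    for tid, (start, end) in threads.items():
        if end <= start:
            continue  # zero-length thread is never running
        first = (start - lo) // width
        last = (end - 1 - lo) // width
        for b in range(first, last + 1):
            # Keep start and end: the thread may cover the range only partly.
            buckets[b].append((start, end, tid))
    return lo, hi, width, buckets


def running_in_buckets(table, t):
    """Pick the bucket containing t, then filter by exact start/end."""
    lo, hi, width, buckets = table
    if t < lo or t >= hi:
        return set()
    bucket = buckets[(t - lo) // width]
    return {tid for start, end, tid in bucket if start <= t < end}


table = build_buckets({1: (0, 10), 2: (2, 5), 3: (7, 12)}, width=4)
print(sorted(running_in_buckets(table, 8)))
```

A larger `width` means fewer buckets and less storage, but more threads to filter per lookup. With `width=1` every bucket corresponds to a single time unit and the filter step becomes trivial.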

1

Anonymous

Aug 9, 2012

@Jake The solution's preprocessing can be faster. You can just store all times in an array, along with whether each time is a start or an end time. Then you can just traverse the array. This means you don't need to figure out, for each discrete period in time, which threads are running. That could be pretty inefficient.

Anonymous

Apr 26, 2012

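We need to preprocess this. Given the thread timings, we can create an int array of size max(end timings) - min(start timings) and add 1 to every entry from array[start] to array[end] for each thread, with indices taken relative to min(start timings). Now, given a certain time, you can just look it up in the array, which takes O(1) time. Note that each entry is a count of running threads; to return the threads themselves, each entry would have to hold a list of thread ids instead.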
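As a rough Python sketch of this lookup table: the names `build_counts` and `count_running` are hypothetical, times are assumed to be integers, the input is assumed non-empty, and the end time is treated as exclusive so that indices stay within an array of size max(end) - min(start).

```python
def build_counts(intervals):
    """Build a per-time-unit count of running threads.

    intervals: non-empty list of (start, end) pairs with integer times;
    end is exclusive (an assumption of this sketch).
    """
    lo = min(start for start, _ in intervals)
    hi = max(end for _, end in intervals)
    counts = [0] * (hi - lo)
    for start, end in intervals:
        # Indices are offsets from the earliest start time.
        for i in range(start - lo, end - lo):
            counts[i] += 1
    return lo, counts


def count_running(table, t):
    """O(1) lookup of how many threads are running at time t."""
    lo, counts = table
    i = t - lo
    return counts[i] if 0 <= i < len(counts) else 0


table = build_counts([(0, 10), (2, 5), (7, 12)])
print(count_running(table, 3))
```

The array has one slot per time unit, so its size depends on the time span and the timestamp precision rather than on the number of threads.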