Debug the Live Median Tracker
A teammate built a live median tracker — add(value) inserts a number, median() returns the current median over everything inserted so far — using the standard two-heap technique. The code looks reasonable and the small unit tests pass, but in production it reports a wrong median after specific insert sequences.
Fix the bugs in median_tracker.py so median() is correct after every add for any insertion order. The data structure shape (two heaps) is correct. What's broken is the bookkeeping that maintains the size and ordering invariants.
add(value)accepts any int or float and inserts it.median()returns the median over all values inserted so far: the middle element for odd counts, the average of the two middle elements for even counts.- The median must be correct after every single
add, not just at the end. (The judge checks after each step.) - Ascending sequences (
1, 2, 3, ...) and descending sequences (9, 8, 7, ...) must both produce correct medians at every step. - After every
add, the size invariant must hold:len(lo) == len(hi)orlen(lo) == len(hi) + 1. - After every
add, the ordering invariant must hold: every value inlois<=every value inhi.
add(1); add(2); add(3)
median() after each step: 1, 1.5, 2
add(9); add(8); add(7); add(6)
median() after each step: 9, 8.5, 8, 7.5
add(5); add(5); add(5)
median() after each step: 5, 5, 5
- Do not change the public API (
__init__,add,median). - Keep using two heaps backed by
heapq— don't replace the whole approach with sorted-list bisect. medianmust run in O(1);addmust run in O(log n).
How would you extend this to support `remove(value)` efficiently — say, for a sliding-window median over a stream?
Keep moving through related algorithms & data structures problems and build a stronger search-friendly practice loop around this topic.
View track →