A while ago I wrote a Python library called LiveStats[1] that computes any percentile over any amount of data using a fixed amount of memory per percentile. It uses an algorithm called P^2, which I found in an old paper[2]; it fits a piecewise-parabolic curve (hence the name) to each percentile to get a good approximation without storing the observations.

The reason I made this was an old Amazon interview question. The question was basically, "Find the median of a huge data set without sorting it," and the "correct" answer was to keep a fixed-size sorted buffer, randomly evict items from it, and use the median of the buffer. However, a candidate I was interviewing had a really brilliant insight: if we keep an estimate of the median and move it a small amount for each new data point, it will end up pretty close. I ended up doing some research on this and found P^2, which is a more sophisticated version of that insight.
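That insight is simple enough to sketch. Here's a toy version of it (my own illustration, not the candidate's code; the fixed step size is an arbitrary knob):

    import random

    def streaming_median(stream, step=0.01):
        # Nudge the running estimate a fixed amount toward each new point.
        m = None
        for x in stream:
            if m is None:
                m = x
            elif x > m:
                m += step
            elif x < m:
                m -= step
        return m

    # Lands near 0, the true median of a standard normal.
    print(streaming_median(random.gauss(0, 1) for _ in range(100_000)))

The weakness is that fixed step: too large and the estimate jitters, too small and it converges slowly. P^2 replaces it with smarter bookkeeping.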

[1]: https://github.com/cxxr/LiveStats

[2]: https://www.cs.wustl.edu/~jain/papers/ftp/psqr.pdf
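For the curious, here's a from-scratch sketch of P^2 for a single quantile, following the paper[2]. The class and names are mine and it's only lightly tested; don't take it as LiveStats' actual implementation:

    import random

    class P2Quantile:
        """Streaming estimate of one quantile via P^2 (Jain & Chlamtac, 1985).
        Five markers, O(1) memory. Sketch for illustration only."""

        def __init__(self, p=0.5):
            self.p = p
            self.q = []                          # marker heights
            self.n = [0, 1, 2, 3, 4]             # actual marker positions
            self.np = [0, 2*p, 4*p, 2 + 2*p, 4]  # desired positions
            self.dn = [0, p/2, p, (1 + p)/2, 1]  # desired-position increments

        def _parabolic(self, i, d):
            # The piecewise-parabolic prediction that gives P^2 its name.
            q, n = self.q, self.n
            return q[i] + d / (n[i+1] - n[i-1]) * (
                (n[i] - n[i-1] + d) * (q[i+1] - q[i]) / (n[i+1] - n[i])
                + (n[i+1] - n[i] - d) * (q[i] - q[i-1]) / (n[i] - n[i-1]))

        def _linear(self, i, d):
            q, n = self.q, self.n
            return q[i] + d * (q[i+d] - q[i]) / (n[i+d] - n[i])

        def add(self, x):
            if len(self.q) < 5:                  # buffer the first five points
                self.q.append(x)
                self.q.sort()
                return
            q, n = self.q, self.n
            if x < q[0]:                         # new minimum: widen cell 0
                q[0], k = x, 0
            elif x >= q[4]:                      # new maximum: widen cell 3
                q[4], k = x, 3
            else:                                # find the cell x falls into
                k = next(i for i in range(4) if q[i] <= x < q[i+1])
            for i in range(k + 1, 5):            # markers above the cell shift right
                n[i] += 1
            for i in range(5):
                self.np[i] += self.dn[i]
            for i in (1, 2, 3):                  # nudge the interior markers
                d = self.np[i] - n[i]
                if (d >= 1 and n[i+1] - n[i] > 1) or (d <= -1 and n[i-1] - n[i] < -1):
                    d = 1 if d > 0 else -1
                    qi = self._parabolic(i, d)
                    if not q[i-1] < qi < q[i+1]: # fall back if non-monotone
                        qi = self._linear(i, d)
                    q[i], n[i] = qi, n[i] + d

        def quantile(self):
            # The middle marker tracks the requested quantile once warmed up.
            if len(self.q) == 5:
                return self.q[2]
            return sorted(self.q)[len(self.q) // 2]  # crude fallback, needs data

    est = P2Quantile(0.5)
    for _ in range(100_000):
        est.add(random.gauss(0, 1))
    print(est.quantile())                        # close to 0 for N(0, 1)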

There are some newer data structures that take this to the next level, such as T-Digest[3], which remains extremely accurate even when estimating percentiles at the very tail end (like the 99.999th percentile).

[3]: https://arxiv.org/pdf/1902.04023.pdf / https://github.com/tdunning/t-digest
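If you want to try it from Python, there are ports of the Java reference implementation. A minimal sketch, assuming the third-party `tdigest` package from PyPI (my choice of example, not from the paper):

    # pip install tdigest
    import random
    from tdigest import TDigest

    digest = TDigest()
    for _ in range(1_000_000):
        digest.update(random.expovariate(1.0))

    # True 99.999th percentile of Exp(1) is -ln(1e-5), roughly 11.5.
    print(digest.percentile(99.999))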