What does HackerNews think of peloton?

The Self-Driving Database Management System

Language: C++

#14 in Database
Proud to see my name (https://twitter.com/YingjunWu) mentioned in Andy's blog. I was a visiting PhD student with Andy at CMU and the top contributor to Peloton (https://github.com/cmu-db/peloton).

Today, building a database from scratch is extremely difficult, for several reasons: 1. it takes a long time no matter what; 2. there are already so many successful (open-source) databases; 3. hiring top engineers is so expensive; 4. you won't get enough attention unless your system is drastically better than existing ones.

An interesting observation is that very few databases have been built from scratch since 2020 - almost all of the newly built databases were developed on top of existing databases (PostgreSQL, ClickHouse, etc.).

I started building RisingWave (https://github.com/risingwavelabs/risingwave) in early 2021. The only reason we built the system from scratch was that none of the existing systems could address the problem we are solving - distributed SQL stream processing at cloud scale. We tried Flink but gave up, as it's too heavy and its architecture was not designed for the cloud environment.

If you want to build a database from scratch, or are simply interested in databases, we should talk.

From the GitHub README:

https://github.com/cmu-db/peloton

UPDATE 2019-03-17

The Peloton project is dead. We have abandoned this repository and moved on to build a new DBMS. There are several engineering techniques and designs that we learned from this first system on how to support autonomous operations that we are doing a much better job at implementing in the second system.

We will not accept pull requests for this repository. We will also not respond to questions or problems that you may have with running with this software.

For a moment, I thought that CMU's and Andy Pavlo's amazing database project got serious funding.

https://github.com/cmu-db/peloton

I'm a bit disappointed now

I know for the advanced database course the students end up writing a new feature for an existing database called Peloton, which is a research project at CMU[0]. You obviously aren't writing the whole thing from scratch though.

[0]: https://github.com/cmu-db/peloton

Good summary. Some systems groups are already going in this direction. Peloton (https://github.com/cmu-db/peloton) is trying to use DL to build a self-tuning DB. We have been trying to self-tune resource management decisions in Hadoop YARN using deep learning.

ML with human-in-the-loop (HITL). There is a feedback loop that learns based on human expectation. Peloton[1] is already learning based on data access patterns, but doesn't consider subjective feedback. Maybe there could be a query planner that learns which query plans are better via reinforcement learning (the reward is given by the human).

[1]: https://github.com/cmu-db/peloton
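
To make the human-reward idea concrete, here is a rough sketch of its simplest form: an epsilon-greedy multi-armed bandit over candidate plans, not full reinforcement learning and not anything Peloton actually implements. The class name, the plan names, and the `simulated_human` rating function are all hypothetical stand-ins; in a real system the reward would come from a person judging the plan or its result.

```python
import random


class PlanPreferenceBandit:
    """Epsilon-greedy bandit over candidate query plans (hypothetical sketch)."""

    def __init__(self, plan_ids, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {p: 0 for p in plan_ids}    # feedback events per plan
        self.values = {p: 0.0 for p in plan_ids}  # running mean reward per plan

    def best(self):
        # Plan with the highest mean human reward so far.
        return max(self.values, key=self.values.get)

    def choose(self):
        # Explore a random plan with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.counts))
        return self.best()

    def feedback(self, plan_id, reward):
        # Incremental update of the running mean for the rated plan.
        self.counts[plan_id] += 1
        n = self.counts[plan_id]
        self.values[plan_id] += (reward - self.values[plan_id]) / n


def simulated_human(plan_id):
    # Assumption: a fixed rating table standing in for real human feedback.
    ratings = {"nested_loop": 0.0, "merge_join": 0.5, "hash_join": 1.0}
    return ratings[plan_id]


if __name__ == "__main__":
    bandit = PlanPreferenceBandit(
        ["nested_loop", "merge_join", "hash_join"], epsilon=0.2
    )
    for _ in range(200):
        plan = bandit.choose()
        bandit.feedback(plan, simulated_human(plan))
    print(bandit.best(), bandit.values)
```

A bandit ignores query context entirely; a planner that conditions on the query and data statistics would need a contextual bandit or a proper RL formulation, which is well beyond this sketch.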