What does HackerNews think of VictoriaMetrics?
VictoriaMetrics: fast, cost-effective monitoring solution and time series database
* Great code readability. I can open any project in Go and instantly start reading and understanding the code. This is because of the simple syntax, which doesn't provide the ability to write implicitly executed code, and the `go fmt` tool, which formats everybody's code to a single style.
* Go discourages unnecessary abstractions and encourages writing essential, boring code without "smart" tricks. The end result is a much shorter code base that is easy to read and refactor.
* Fast compile times. For instance, my latest project - VictoriaMetrics [1] - contains almost 600K lines of Go code excluding "_test.go" files. Go builds the final `victoria-metrics` executable in less than a second during development. This means I can iterate quickly on code changes and tests without waiting minutes or hours for the build process to finish (hello, Rust and C++ :) ).
* A single statically linked output binary with relatively small size. For example, VictoriaMetrics builds into a 19MB statically linked binary with Go 1.15.4, and that binary contains all the debug symbols. The size shrinks to 12MB when debug symbols are stripped. Such a binary can run on any host without the need to manage external dependencies - just upload the binary and run it.
* The ability to build an executable for any supported target platform by setting the GOOS and GOARCH environment variables. For example, I can build a binary for FreeBSD on ARM from my x86 laptop running Ubuntu.
* Great and easy-to-use tooling for race detection and CPU/memory/lock profiling. These tools significantly simplify code optimization and allow catching data races at a much lower mental cost compared to race-free Rust :) (see the sketch after this list).
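As a rough illustration of the last point (and the build-related points above), here is a minimal hedged sketch: the toolchain commands and the port are illustrative choices, not anything VictoriaMetrics itself uses, and the program merely exposes Go's standard profiling endpoints.

```go
// A minimal sketch of Go's built-in tooling. The toolchain commands are
// shown as comments; the program itself exposes net/http/pprof endpoints.
//
//	go build -ldflags="-s -w" .         # strip symbol table and DWARF debug info
//	GOOS=freebsd GOARCH=arm go build .  # cross-compile for FreeBSD on ARM
//	go test -race ./...                 # run tests under the race detector
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* handlers on http.DefaultServeMux
)

func main() {
	// CPU profile:  go tool pprof http://localhost:6060/debug/pprof/profile
	// Heap profile: go tool pprof http://localhost:6060/debug/pprof/heap
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```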
P.S. I hope Go will never adopt generics. I haven't needed generics during the last 10 years of active development in Go [2]. Before Go I worked with C++ and experienced constant pain reading and debugging C++ templates in STL and Boost libraries. I don't want to experience the same feelings with Go.
I have written a ton of Go code [1] over the last 10 years and have never felt the need for generics. My latest Go project is VictoriaMetrics [2] - a fast and cost-effective open source time series database and monitoring solution.
Before Go I wrote some code in C++ and constantly struggled with C++ templates in STL and Boost libraries. It was an absolute nightmare from a debugging PoV.
[1] https://github.com/wrouesnel/postgres_exporter
* InfluxDB - https://www.influxdata.com/
* VictoriaMetrics - https://github.com/VictoriaMetrics/VictoriaMetrics/
* M3DB - https://github.com/m3db/m3/
BTW, I'm working on VictoriaMetrics - an open source monitoring solution that works out of the box. See https://github.com/VictoriaMetrics/VictoriaMetrics
We were successfully ingesting hundreds of billions of ad serving events per day into it. Its query speed is much higher than that of any Postgres-based database (for instance, it can scan tens of billions of rows per second on a single node), and it scales to many nodes.
While it is possible to store monitoring data in ClickHouse, it may be non-trivial to set up. So we decided to create VictoriaMetrics [2]. It is built on design ideas from ClickHouse, so it combines high performance with ease of setup and operation. This is backed by publicly available case studies [3].
If you need to store trillions of rows efficiently and run real-time OLAP queries over billions of rows, then ClickHouse [1] is the better choice, since it requires 10x-100x less compute resources (mostly CPU, disk IO and storage space) than PostgreSQL for such workloads.
If you need to store and query large amounts of time series data efficiently, then take a look at VictoriaMetrics [2]. It is built on ideas from ClickHouse, but is optimized solely for time series workloads. It offers performance comparable to ClickHouse while being easier to set up and manage. And it supports MetricsQL [3] - a query language that is much easier to use than SQL when dealing with time series data. MetricsQL is based on PromQL [4] from Prometheus (see the sketch after the links).
[2] https://github.com/VictoriaMetrics/VictoriaMetrics
[3] https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Metr...
[4] https://medium.com/@valyala/promql-tutorial-for-beginners-9a...
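To make the MetricsQL/PromQL point concrete, here is a minimal hedged sketch of sending a PromQL-style instant query to a Prometheus-compatible query API; the address, port and metric name are assumptions for illustration only.

```go
// A minimal sketch: send a PromQL-style instant query to a
// Prometheus-compatible /api/v1/query endpoint and print the JSON response.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/url"
)

func main() {
	// rate() over a counter - a one-liner that would typically require
	// window functions and per-label grouping to express in SQL.
	params := url.Values{"query": {`rate(http_requests_total[5m])`}}
	resp, err := http.Get("http://localhost:8428/api/v1/query?" + params.Encode())
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body)) // JSON in the Prometheus query API format
}
```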
[1] https://github.com/VictoriaMetrics/VictoriaMetrics
[2] https://github.com/klauspost/compress/tree/master/zstd#zstd
[1] https://github.com/VictoriaMetrics/VictoriaMetrics/
[2] https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/Exte...
[3] https://medium.com/@valyala/measuring-vertical-scalability-f...
One question: why do you use Gorilla compression for floating-point values? It works well for integer values but is pretty useless for floating-point values [3] (the sketch after the links illustrates why).
[1] https://github.com/VictoriaMetrics/VictoriaMetrics
[2] https://medium.com/@valyala/measuring-vertical-scalability-f...
[3] https://medium.com/faun/victoriametrics-achieving-better-com...
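For illustration, here is a simplified sketch (not the full Gorilla bit-packing scheme) of the XOR step behind that observation: consecutive integer-valued samples XOR into a handful of significant bits, while nearby floating-point values differ across most of the mantissa.

```go
// A simplified illustration of the XOR step in Gorilla-style compression.
// Gorilla stores only the significant bits of the XOR between consecutive
// values, so long runs of leading/trailing zero bits compress well.
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// xorBits returns the XOR of the raw IEEE 754 bit patterns of two float64s.
func xorBits(prev, cur float64) uint64 {
	return math.Float64bits(prev) ^ math.Float64bits(cur)
}

func main() {
	// Integer-valued series: a single significant bit to store.
	x := xorBits(42, 43)
	fmt.Printf("%064b (leading: %d, trailing: %d)\n",
		x, bits.LeadingZeros64(x), bits.TrailingZeros64(x))

	// Floating-point series: most of the 52-bit mantissa differs.
	y := xorBits(0.1234, 0.1235)
	fmt.Printf("%064b (leading: %d, trailing: %d)\n",
		y, bits.LeadingZeros64(y), bits.TrailingZeros64(y))
}
```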
ClickHouse [1] would probably fit your needs better. It can write millions of rows per second [2], scan billions of rows per second on a single node, and scale to multiple nodes.
Also I'd recommend taking a look at other open-source TSDBs with cluster support:
- M3DB [3]
- Cortex [4]
- VictoriaMetrics [5]
These TSDBs speak PromQL instead of SQL. PromQL is a query language specially optimized for typical time series queries [6].
[2] https://blog.cloudflare.com/http-analytics-for-6m-requests-p...
[4] https://github.com/cortexproject/cortex
[5] https://github.com/VictoriaMetrics/VictoriaMetrics/
[6] https://medium.com/@valyala/promql-tutorial-for-beginners-9a...
I'd recommend:
- Storing logs in a high-performance system such as cLoki [1], which gives enough flexibility to generate arbitrary metrics from high log volumes in real time.
- Storing metrics in a high-performance time series database optimized for high volumes of data points (trillions) and high numbers of time series (aka high cardinality). An example of such a TSDB is VictoriaMetrics [2].
- Implementing dashboards with time-based correlation between metrics and logs based on shared attributes (labels) such as datacenter, cluster, instance, job, subsystem, etc. [3]
[1] https://github.com/lmangani/cLoki
[2] https://github.com/VictoriaMetrics/VictoriaMetrics/
[3] https://grafana.com/blog/2019/05/06/how-loki-correlates-metr...
[1] https://godoc.org/github.com/prometheus/client_golang/promet...
Go encourages writing simple, straightforward code without complex abstractions. This improves programmers' productivity, allowing them to write practical working code instead of spending time on theoretical design patterns, inheritance hierarchies, generics, fancy algorithms, monads, "zero-cost" abstractions (with high mental cost), borrow checking, etc.
That's why we could write a fast time series database in Go from scratch in less than a year - https://github.com/VictoriaMetrics/VictoriaMetrics/ .