What does HackerNews think of promscale?

Promscale is a unified metric and trace observability backend for Prometheus, Jaeger and OpenTelemetry built on PostgreSQL and TimescaleDB.

Language: Go

#2 in Monitoring
#10 in Monitoring
#32 in PostgreSQL
#25 in SQL
I work as a Product Manager at Timescale. :)

The data tiering feature will only be available in the Timescale cloud offering; we do not have plans to support this feature in oss/self-hosted at the moment.

Thanks for sharing your use-case with TimescaleDB. I'm curious to understand how you are ingesting metrics into TimescaleDB today. Do you manage your own schema, retention, compression and downsampling?

We at Timescale have built a product named Promscale for easy metrics and traces ingestion with automatic schema-management, compression, retention and downsampling capabilities with full SQL support. Have you tried Promscale (https://github.com/timescale/promscale) for your metrics use-case?

To learn more about Promscale, join us in the #promscale channel in our community Slack (https://slack.timescale.com/).

They say:

> if you want to have a seamless experience between metrics and traces, then current experience of stitching together Prometheus & Jaeger is not great.

But I wonder if using Promscale https://github.com/timescale/promscale would make Prometheus & Jaeger not such a big problem as SigNoz implies.

Promscale readme:

> Promscale is a unified metric and trace observability backend for Prometheus, Jaeger and OpenTelemetry built on PostgreSQL and TimescaleDB.

Either way, SigNoz seems interesting indeed. And I am glad to see that SigNoz supports OpenTelemetry.

Hi! So the team is over 100 at this point, but engineering effort is spread across multiple products.

The core timescaledb repo [0] currently has 10-15 primary engineers, with a few others working on DB hyperfunctions and our function pipelining [1] in a separate extension [2]. I think generally the set of outside folks who contribute to low-level database internals in C is just smaller than for other types of projects.

We also have our promscale product [3], which is our observability backend powered by SQL & TimescaleDB.

And then there is Timescale Cloud [4], which is obviously a large engineering effort, most of which does not happen in public repos.

Interested? We're growing the teams aggressively! Fully remote & global.

https://www.timescale.com/careers

--

[0] https://github.com/timescale/timescaledb

[1] https://www.timescale.com/blog/function-pipelines-building-f...

[2] https://github.com/timescale/timescaledb-toolkit

[3] https://github.com/timescale/promscale ; https://github.com/timescale/tobs

[4] https://www.timescale.com/blog/announcing-the-new-timescale-...

First, let's define `time series`. It is a series of (timestamp, value) pairs ordered by timestamp. The `value` may contain arbitrary data - a floating-point value, text, JSON, a data structure with many columns, etc. Each time series is uniquely identified by its name plus an optional set of {label="value"} labels. For example, temperature{city="London",country="UK"} or log_stream{host="foobar",datacenter="abc",app="nginx"}.
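To make the definition concrete, here is a minimal Go sketch of that data model. The type and method names (`Sample`, `Series`, `Key`, `Append`) are made up for illustration and do not come from any of the projects discussed here, and the value is restricted to a `float64` to keep the example short.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Sample is one (timestamp, value) pair. The value is a float64 here only
// for brevity; the definition above allows arbitrary data.
type Sample struct {
	TimestampMs int64
	Value       float64
}

// Series is identified by its name plus an optional set of labels.
// (Hypothetical type, for illustration only.)
type Series struct {
	Name    string
	Labels  map[string]string
	Samples []Sample
}

// Key renders the identity in name{label="value",...} notation. Label names
// are sorted so the same label set always produces the same key.
func (s Series) Key() string {
	if len(s.Labels) == 0 {
		return s.Name
	}
	names := make([]string, 0, len(s.Labels))
	for name := range s.Labels {
		names = append(names, name)
	}
	sort.Strings(names)
	pairs := make([]string, 0, len(names))
	for _, name := range names {
		pairs = append(pairs, fmt.Sprintf("%s=%q", name, s.Labels[name]))
	}
	return s.Name + "{" + strings.Join(pairs, ",") + "}"
}

// Append inserts a sample while keeping Samples ordered by timestamp.
func (s *Series) Append(p Sample) {
	i := sort.Search(len(s.Samples), func(i int) bool {
		return s.Samples[i].TimestampMs > p.TimestampMs
	})
	s.Samples = append(s.Samples, Sample{})
	copy(s.Samples[i+1:], s.Samples[i:])
	s.Samples[i] = p
}

func main() {
	s := Series{
		Name:   "temperature",
		Labels: map[string]string{"country": "UK", "city": "London"},
	}
	s.Append(Sample{TimestampMs: 2000, Value: 14.5})
	s.Append(Sample{TimestampMs: 1000, Value: 13.0}) // arrives late, still sorted
	fmt.Println(s.Key(), s.Samples)
}
```

Sorting the label names is a design choice of this sketch: it makes the key independent of map iteration order, so two writers that set the same labels in a different order still address the same series.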

ClickHouse is perfectly optimized for storing and querying such time series, including metrics. It's true that ClickHouse isn't optimized for handling millions of tiny inserts per second. It prefers infrequent batches with a large number of rows in each batch. But this isn't a real problem in practice, because:

1) ClickHouse provides the Buffer table engine for frequent inserts.

2) It is easy to create a special proxy app or library for data buffering before sending it to ClickHouse.
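As a rough sketch of the second option, a buffering layer could look something like the Go code below. Everything here is an assumption made for illustration: `Row`, `Batcher` and `NewBatcher` are hypothetical names, and the `flush` callback stands in for whatever code actually sends a batch to ClickHouse (no real client API is used).

```go
package main

import (
	"fmt"
	"sync"
)

// Row is a hypothetical record destined for a ClickHouse table.
type Row struct {
	TimestampMs int64
	Series      string
	Value       float64
}

// Batcher collects many small inserts and hands them to flush in large
// batches. flush is a placeholder for the real write path.
type Batcher struct {
	mu      sync.Mutex
	maxRows int
	pending []Row
	flush   func([]Row) error
}

func NewBatcher(maxRows int, flush func([]Row) error) *Batcher {
	return &Batcher{maxRows: maxRows, flush: flush}
}

// Add buffers one row and flushes once maxRows rows have accumulated.
func (b *Batcher) Add(r Row) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending = append(b.pending, r)
	if len(b.pending) < b.maxRows {
		return nil
	}
	return b.flushLocked()
}

// Flush sends whatever is buffered. A real proxy would also call this on a
// timer and at shutdown so that slow streams are not held back forever.
func (b *Batcher) Flush() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.flushLocked()
}

// flushLocked must be called with b.mu held.
func (b *Batcher) flushLocked() error {
	if len(b.pending) == 0 {
		return nil
	}
	batch := b.pending
	b.pending = nil
	if err := b.flush(batch); err != nil {
		b.pending = batch // keep the rows so a later flush can retry
		return err
	}
	return nil
}

func main() {
	b := NewBatcher(3, func(rows []Row) error {
		fmt.Printf("flushing %d rows\n", len(rows))
		return nil
	})
	for i := 0; i < 7; i++ {
		_ = b.Add(Row{TimestampMs: int64(i), Series: "temperature", Value: 20})
	}
	_ = b.Flush() // send the remainder
}
```

The sketch holds the lock while flushing, which keeps it simple but blocks other writers during a slow write; a production proxy would more likely hand the batch to a background goroutine.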

TimescaleDB provides Promscale [1] - a service that allows using TimescaleDB as a storage backend for Prometheus. Unfortunately, it doesn't show outstanding performance compared to Prometheus itself or to other remote storage solutions for Prometheus. Promscale requires more disk space, disk IO, CPU and RAM according to production tests [2], [3].

[1] https://github.com/timescale/promscale

[2] https://abiosgaming.com/press/high-cardinality-aggregations/

[3] https://valyala.medium.com/promscale-vs-victoriametrics-reso...

Full disclosure: I'm CTO at VictoriaMetrics - a competing solution to TimescaleDB. VictoriaMetrics is built on top of architecture ideas from ClickHouse.

There's a big difference -- that's how the PostgreSQL community works. 2ndQuadrant (now EDB), EDB, Citus (now Microsoft) all add value to open source Postgres, contribute back to the community by bringing new features, new life, use cases, and of course committing changes upstream where possible. Timescale is actually on the more open side of that balance, with the licensing and the community version feature matrix.

Also, in this case, Timescale actually has a pretty forgiving license[0] as long as you are not an add-nothing-aaS-provider, perhaps more than it should be, which I've asked about before[1]. Even before that change was made, running just the community edition as an add-nothing-aaS-provider would have been an improvement on the status quo, given how soundly it thrashed some other solutions in the past (ex. Influx[2]) and what you can do with it (promscale[3]).

I know it can't be all roses, nothing is, but I don't think they've put too many feet wrong so far.

[EDIT] - I should note that on the scale of "contributing" to Postgres, the scale heavily tips in favor of 2ndQuadrant, EDB, and Citus, as obviously they have the most committers and core team members. All those companies are to be commended, of course: they're making postgres work as businesses and keeping it free while also improving it.

[0]: https://news.ycombinator.com/item?id=24579905

[1]: https://news.ycombinator.com/item?id=24585564

[2]: https://blog.timescale.com/blog/timescaledb-vs-influxdb-for-...

[3]: https://github.com/timescale/promscale

> TimescaleDB is purpose-built for time-series workloads, so that you can get orders of magnitude better performance at a fraction of the cost, along with a much better developer experience. This means massive scale (100s billions of rows and millions of inserts per second on a single server), 94%+ native compression, 10-100x faster queries than PostgreSQL, InfluxDB, Cassandra, and MongoDB – all while maintaining the reliability, ease-of-use, SQL interface, and overall goodness of PostgreSQL.

WITH the links to back up their analysis/comparisons. I think I was first sold the hardest on Timescale when I read the InfluxDB post.

Timescale is also releasing their hard work over the last ~3 years with a very permissive, surprisingly business-friendly license[0], which was previously discussed on HN[1].

This is pretty huge -- personally one of the things I'm really looking forward to is using Postgres as a backing store for Prometheus now, which timescale actually worked on[2] already. If you look really closely, there's actually a way to get all your observability data into Postgres (i.e. zipkin/jaeger, logs, and prometheus for metrics), and Timescale is going to make all of those things easier to scale and maintain. For those following along at home, postgres already has some not terrible full text search[3] and ways to integrate with elasticsearch like zombodb[4]...

[EDIT] - Just to make what I'm hinting at a little clearer, I think you might be able to build a Graylog[5] type system really easily with just Postgres these days. Imagine deploying a single binary that only needs a single database to do everything it needs to do.

[0]: https://blog.timescale.com/blog/building-open-source-busines...

[1]: https://news.ycombinator.com/item?id=24579905

[2]: https://github.com/timescale/promscale

[3]: https://www.postgresql.org/docs/current/textsearch.html

[4]: https://www.zombodb.com/

[5]: https://github.com/Graylog2/graylog2-server/