I love OpenTelemetry, and we want to trace almost every span that happens. We’d be bankrupt if we went with any vendor. We wired up OpenTelemetry with Java magic, zero effort, pointed it at a self-hosted ClickHouse, and store 700M+ spans per day on a $100 EC2 instance.

https://clickhouse.com/blog/how-we-used-clickhouse-to-store-...
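
For a sense of scale, here is a back-of-the-envelope sketch that turns the daily span count quoted above into an average ingest rate. It assumes traffic is spread evenly over the day, which real workloads rarely are, and `spans_per_second` is just an illustrative helper name.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def spans_per_second(spans_per_day: float) -> float:
    """Average ingest rate, assuming spans arrive evenly over the day."""
    return spans_per_day / SECONDS_PER_DAY


# 700M spans per day is the figure quoted in the comment above.
rate = spans_per_second(700_000_000)
print(f"{rate:,.0f} spans/s on average")
```

Peak rates will be higher than this average, so treat it as a lower bound when sizing anything.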

I've got a small personal project submitting traces/logs/metrics to ClickHouse via SigNoz. Only about 400k-800k spans per day (https://i.imgur.com/s0J6Mzo.png), but it runs on a single t4g.small with CPU typically at 11% and IOPS at 4%. I also have the oldest data beyond a certain number of GB getting pushed to an sc1 cold storage drive.

With 1 month retention for traces:

    ┌─parts.table─────────────────┬──────rows─┬─disk_size──┬─engine────┬─compressed_size─┬─uncompressed_size─┬────ratio─┐
    │ signoz_index_v2             │  26902115 │ 17.06 GiB  │ MergeTree │ 6.21 GiB        │ 66.74 GiB         │   0.0930 │
    │ durationSort                │  26901998 │ 5.44 GiB   │ MergeTree │ 5.40 GiB        │ 53.02 GiB         │  0.10190 │
    │ trace_log                   │ 123185362 │ 2.64 GiB   │ MergeTree │ 2.64 GiB        │ 37.96 GiB         │   0.0695 │
    │ trace_log_0                 │ 120052084 │ 2.46 GiB   │ MergeTree │ 2.45 GiB        │ 37.60 GiB         │  0.06528 │
    │ signoz_spans                │  26902115 │ 2.21 GiB   │ MergeTree │ 2.21 GiB        │ 76.73 GiB         │ 0.028784 │
    │ query_log                   │  16384865 │ 1.91 GiB   │ MergeTree │ 1.90 GiB        │ 18.31 GiB         │  0.10398 │
    │ part_log                    │  17906105 │ 846.73 MiB │ MergeTree │ 845.39 MiB      │ 3.84 GiB          │  0.21521 │
    │ metric_log                  │   4713151 │ 820.92 MiB │ MergeTree │ 806.13 MiB      │ 14.56 GiB         │  0.05405 │
    │ part_log_0                  │  15632289 │ 702.82 MiB │ MergeTree │ 701.70 MiB      │ 3.34 GiB          │  0.20490 │
    │ asynchronous_metric_log     │ 795170674 │ 576.24 MiB │ MergeTree │ 562.50 MiB      │ 11.11 GiB         │ 0.049429 │
    │ query_views_log             │   6597156 │ 461.35 MiB │ MergeTree │ 459.75 MiB      │ 6.36 GiB          │  0.07060 │
    │ logs                        │   6448259 │ 408.59 MiB │ MergeTree │ 406.65 MiB      │ 5.99 GiB          │  0.06627 │
    │ samples_v2                  │ 949110122 │ 345.01 MiB │ MergeTree │ 325.31 MiB      │ 22.09 GiB         │ 0.014382 │

If I were less stupid, I'd get a machine with the recommended ClickHouse specs and save myself a few hours of tuning, but this works great.
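
The ratio column in the table above appears to be compressed_size divided by uncompressed_size. That is inferred from the numbers, not something the poster stated. A minimal Python sketch to sanity-check a few rows, using the rounded sizes as displayed (so results only match the table approximately):

```python
# Binary size units as printed in the table.
UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}


def to_bytes(size: str) -> float:
    """Convert a string like '6.21 GiB' to a byte count."""
    value, unit = size.split()
    return float(value) * UNITS[unit]


def compression_ratio(compressed: str, uncompressed: str) -> float:
    """Compressed bytes per uncompressed byte (smaller is better)."""
    return to_bytes(compressed) / to_bytes(uncompressed)


# (compressed_size, uncompressed_size) copied from the table above.
rows = {
    "signoz_index_v2": ("6.21 GiB", "66.74 GiB"),
    "signoz_spans": ("2.21 GiB", "76.73 GiB"),
    "part_log": ("845.39 MiB", "3.84 GiB"),
}

for table, (compressed, uncompressed) in rows.items():
    print(f"{table}: {compression_ratio(compressed, uncompressed):.4f}")
```

Read this way, the reported ratio of 0.028784 for signoz_spans works out to about 35x compression.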

Downsides:

- clickhouse takes about 5 minutes to start up because my tiny sc1 drive has like 4 IOPS allowed

- signoz's UI isn't amazing. It's totally functional, and they've been improving very quickly, but don't expect datadog-level polish

Thanks for mentioning SigNoz. I am one of the maintainers at SigNoz and would love your feedback on how we can improve it further.

If anyone wants to check out our project, here’s our GitHub repo: https://github.com/SigNoz/signoz