funny thing, just learnt about clickhouse today. for experienced people that use columnar stores and pandas for analytics, which tool do you usually prefer for BI stuff ? do ya'll load data into clickhouse then analyse it using pandas. or all analysis is done via the clickhouse sql dialect. As i'm sure things like pivot tables and rolling windows are a PITA in SQL

Why would you bother using a database like clickhouse to store data if you're just going to analyze it in pandas? Just store it in a csv, parquet, or orc.

> As i'm sure things like pivot tables and rolling windows are a PITA in SQL

I can't speak for clickhouse, but group-by and window functions are a very standard part of any SQL analysts toolbelt.

Another option to consider here is storing your data in TileDB[1] which allows you to access it via Python/Pandas, MariaDB for SQL (embeddable or standalone), Spark, R and more. With embedded MariaDB[2] you can query directly into pandas with minimum overhead. TileDB is similar to parquet in that it can capture sparse dataframe usage, however it is more general in that TileDB supports multi-dimensional datasets and dense arrays. Other major features built directly into TileDB include handling updates, time traveling and partitioning at the library level, removing the need for using extra services like Delta Lake to deal with the numerous Parquet files you may create. TileDB also has native support for S3 and in the next release we'll have native Azure Blob Storage support.

[1] https://github.com/TileDB-Inc/TileDB

[2] https://docs.tiledb.com/developer/api-usage/embedded-sql

Disclosure: I am a member of the TileDB, Inc. team