Excited to see Steampipe shared here - thanks kiyanwang! I'm a lead on the project, so sharing some quick info below and happy to answer any questions.
Steampipe is open source [1] and uses Postgres foreign data wrappers under the hood [2]. We have 84+ plugins for querying AWS, GitHub, Slack, HN, etc. with SQL [3]. Mods (written in HCL) provide dashboards as code and automated security & compliance benchmarks [3]. We'd love your help & feedback!
1 - https://github.com/turbot/steampipe
2 - https://steampipe.io/docs/develop/overview
3 - https://hub.steampipe.io/
Hey Nathan. Can you comment on some of the known performance pitfalls of Steampipe? I'm not super familiar with the Postgres foreign data wrapper API. I assume Steampipe inherits many of its technical limitations from this API.
Having done some work in this space, I'm aware that it's no small thing to compile a high-level SQL statement that describes some analytics task to be executed on a dataset into a low-level program that will efficiently perform that task. This is especially true in the context of big data and distributed analytics. It's also true if you're trying to blend different data sources that themselves might not have efficient query engines.
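To make the concern concrete, here is a toy sketch of why the plan matters when the source has no query engine of its own. Everything in it is hypothetical (`FakeApi`, `list_items`, the `region` filter); it is not how Steampipe or the Postgres FDW API is implemented, just an illustration of filtering after a full scan versus pushing the filter down to the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List, Optional


@dataclass
class FakeApi:
    """Hypothetical stand-in for a remote API with no query engine."""
    rows: List[Dict]
    rows_sent: int = 0  # how many rows the "API" had to return

    def list_items(self, region: Optional[str] = None) -> Iterator[Dict]:
        # Assumption: the only filter this API understands is `region`.
        for row in self.rows:
            if region is None or row["region"] == region:
                self.rows_sent += 1
                yield row


def scan_then_filter(api: FakeApi, predicate: Callable[[Dict], bool]) -> List[Dict]:
    # Naive plan: fetch everything, apply the WHERE clause locally.
    return [row for row in api.list_items() if predicate(row)]


def scan_with_pushdown(api: FakeApi, region: str) -> List[Dict]:
    # Better plan: hand the WHERE clause to the source.
    return list(api.list_items(region=region))


def make_api() -> FakeApi:
    regions = ["us-east-1", "eu-west-1", "ap-south-1", "us-west-2"]
    return FakeApi([{"id": i, "region": regions[i % 4]} for i in range(1000)])


naive_api = make_api()
naive = scan_then_filter(naive_api, lambda r: r["region"] == "eu-west-1")

pushed_api = make_api()
pushed = scan_with_pushdown(pushed_api, "eu-west-1")

print(len(naive), naive_api.rows_sent)    # 250 1000
print(len(pushed), pushed_api.rows_sent)  # 250 250
```

Both plans return the same 250 rows, but the naive one pulls all 1000 rows over the (pretend) network first. With a real API the gap would show up as extra requests and latency, which is the kind of pitfall I'm asking about.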
Would love to use this tool, but just curious about some of the details of its implementation.
That said, I don't imagine this ever being a bottleneck for the main use case of Steampipe - in that case I think the APIs themselves will always be the limiting part. But it does - potentially - speak to what you can expect if you'd like to extend your usage of Steampipe to more than just DevOps data.
I've used the benchmark available in the OctoSQL README.
[0]: https://github.com/cube2222/octosql
[1]: https://github.com/apache/arrow-datafusion
Disclaimer: author of OctoSQL