What does HackerNews think of dbt-core?

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Language: Python

> Could you share more specific details? Happy to look over / revise where needed.

Sure thing! I'd say first off, the solutions may look different for a small company/startup vs. a large enterprise. It can help if you explain the scale at which you are solving for.

On the enterprise side of things, they tend to buy solutions rather than build them in-house. Things like Informatica, Talend, etc. are common for large enterprises whose primary products are not data or software related. They just don't have the will, expertise, or the capital to invest in building and maintaining these solutions in-house so they just buy them off the shelf. On the surface, these are very expensive products, but even in the face of that it can still make sense for large enterprises in terms of the bottom line to buy rather than build.

For startups and smaller companies, have you looked at something like `dbt` (https://github.com/dbt-labs/dbt-core) ? I understand the desire to write some code, but often times there are already existing solutions for the problems you might be encountering.

ORM's should typically only exist on the consumer-side of the equation, if at all. A lot of business intelligence / business analysts are just going to use tools like Tableau and hook up to the data warehouse via a connector to visualize their data. You might have some consumers that are more sophisticated and may want to write some custom post-processing or aggregation code, and they could certainly use ORM's if they choose, but it isn't something you should enforce on them because it's a poor place to validate data since as mentioned there are different ways/tools to access the data and not all of them are going to go through your python SDK.

Indeed in a large enough company, you are going to have producers and consumers that are going to use different tools and programming languages, so it's a little bit presumptuous to write an SDK in python there.

Another thing to talk about, and this probably mostly applies to larger companies - have you looked at an architecture like a distributed data mesh (https://martinfowler.com/articles/data-mesh-principles.html)? This might be something to bring to the CTO more than try to push for yourself, but it can completely change the landscape of what you are doing.

> More broadly is the issue of the gap of what you think the role is, and what the role actually is when you join. There are definitely cases where this is accidental. The best way I can think of to close the gap is to maybe do a short-term contract, but may be challenging to do under time constraints etc.

Yeah this definitely sucks and it's not an enviable position to be in. I guess you have a choice to look for another job or try to stick it out with the company that did this to you. It's possible there is a geniune existential crisis for the company and a good reason why they did the bait-and-switch. Maybe it pays to stay, especially if you have equity in the company. On the other hand, it could also be the case that it is the result of questionable practices at the company. It's hard to make that call.

I really dislike SQL, but recognize its importance for many organizations. I also understand that SQL is definitely testable, particularly if managed by environments such as DBT (https://github.com/dbt-labs/dbt-core). Those who arrived here with preference to python will note that dbt is largely implemented in python, adds Jinja macros and iterative forms to SQL, and adds code testing capabilities. No ORM required whatsoever.
Intriguing! I would like to learn more about how well dbt can "take over"/reverse engineer an existing database.

> What is dbt?

https://docs.getdbt.com/docs/introduction

Collaborate on data models, version them, and test and document your queries before safely deploying them to production, with monitoring and visibility.

https://docs.getdbt.com/docs/supported-data-platforms

--

core: https://github.com/dbt-labs/dbt-core - Python, Apache 2 license

cloud: https://www.getdbt.com/pricing/