About This Talk
At SmartData 2023, I shared the experience of adopting dbt at Toloka.ai as the foundation of a modern data platform. dbt is one of the fastest-growing tools in data warehousing, and its combination of simplicity and power made it the right choice, but no open-source tool is a silver bullet, and some “manual tweaking” was inevitable.
Key Ideas
Why dbt Is a Must-Have: dbt brings version-controlled transformations, built-in testing, documentation generation, and a declarative SQL-first workflow. These features make it an essential part of any modern data platform stack.
Integrating dbt with Airflow: Running dbt models as part of Airflow DAGs enables orchestration, dependency management, and scheduling. The talk covered practical patterns for making this integration work reliably at scale.
Data Mesh with dbt: dbt’s project and package structure naturally supports domain-oriented data ownership. Teams can independently develop and publish their data models while maintaining cross-domain consistency through shared conventions and contracts.
What Required Customization: Like any open-source tool, dbt had gaps that needed filling. The talk covered the specific customizations and workarounds the team built to make dbt production-ready for their use case.
Should You Adopt dbt Now? A pragmatic assessment of when dbt is the right choice, what prerequisites your team needs, and what to expect during adoption.
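To make the Airflow integration idea above more concrete, here is a minimal sketch of one common approach: reading dbt's `manifest.json` artifact and turning each model into its own task, with edges that mirror the model dependencies. This is not necessarily the pattern presented in the talk. The function name `model_tasks`, the sample manifest, and the `demo` project name are hypothetical. The sketch relies only on the `nodes`, `resource_type`, `name`, and `depends_on.nodes` fields of the manifest.

```python
from typing import Dict, List, Tuple


def model_tasks(manifest: dict) -> Tuple[Dict[str, List[str]], List[Tuple[str, str]]]:
    """Return a dbt command per model and (upstream, downstream) edges between models."""
    nodes = manifest.get("nodes", {})
    # Keep only models; tests, seeds and snapshots are ignored in this sketch.
    models = {uid: n for uid, n in nodes.items() if n.get("resource_type") == "model"}

    # One `dbt run --select <model>` command per model.
    commands = {uid: ["dbt", "run", "--select", n["name"]] for uid, n in models.items()}

    # Keep only model-to-model dependencies (sources are not tasks here).
    edges = [
        (parent, uid)
        for uid, n in models.items()
        for parent in n.get("depends_on", {}).get("nodes", [])
        if parent in models
    ]
    return commands, edges


# Hypothetical, heavily trimmed manifest used only for illustration.
SAMPLE_MANIFEST = {
    "nodes": {
        "model.demo.stg_orders": {
            "resource_type": "model",
            "name": "stg_orders",
            "depends_on": {"nodes": ["source.demo.raw.orders"]},
        },
        "model.demo.orders": {
            "resource_type": "model",
            "name": "orders",
            "depends_on": {"nodes": ["model.demo.stg_orders"]},
        },
        "test.demo.not_null_orders_id": {
            "resource_type": "test",
            "name": "not_null_orders_id",
            "depends_on": {"nodes": ["model.demo.orders"]},
        },
    }
}

commands, edges = model_tasks(SAMPLE_MANIFEST)
```

In an Airflow DAG file, each command could then be wrapped in a `BashOperator` and each edge expressed as `upstream_task >> downstream_task`. Whether one task per model or one task per whole `dbt run` works better depends on how much retry granularity and visibility the team needs.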
Why It Matters
dbt has become a de facto standard for data transformation, but real-world adoption stories, with honest discussion of both benefits and pain points, are more valuable than marketing materials. This talk provides a practitioner’s perspective on what it actually takes to make dbt the core of your data platform.