Conference Talk

What’s Under the Hood of Yandex.Taxi DWH?

  • Author

    Evgeny Ermakov

  • Category

    Conference Talk

  • Last updated

    June 20, 2020

About This Talk

At DE or DIE 2020, I took the audience behind the scenes of the data warehouse powering Yandex.Taxi (Yandex Go) — one of the largest ride-hailing services in Eastern Europe. The talk covered the technical architecture, organizational structure, and the unique challenges of building a data platform at this scale.

Key Ideas

Scale of the DWH — Millions of daily trips, dozens of events per trip, real-time pricing, surge detection, driver-rider matching, and route optimization. Every trip generates a rich stream of events that feed into the analytical warehouse. The data volume and velocity are enormous.

Organizational Structure — The data platform team includes data engineers, analytics engineers, analysts, and domain specialists. The talk covered how ownership is distributed, how teams interact, and the evolution from a centralized model toward domain-oriented data ownership.

Technology Stack — ClickHouse for real-time analytics with sub-second query response times. Greenplum for the analytical warehouse handling complex transformations and historical analysis. Custom ETL frameworks optimized for the specific data patterns of a ride-hailing platform.

Roles and Responsibilities — How different roles contribute to the data platform: data engineers build and maintain the infrastructure, analytics engineers design the semantic layer, analysts create insights, and the data partner role (covered in my Data Fest 2021 talk) bridges the gaps between all of them.

Why It Matters

Large-scale data platforms are built by teams, not tools. Understanding how an organization like Yandex.Taxi structures its data team, distributes ownership, and chooses technology provides a blueprint that others can adapt to their own scale and context.

Watch

Watch the full talk on YouTube →
