The Data Engineering Show

The Firebolt Data Bros 60 Episodes Jun 16, 2026

The Data Engineering Show is a podcast for data engineering and BI practitioners that goes beyond theory. It features conversations with influential tech figures about their real-world data challenges and solutions in a casual setting. The show is hosted by the Firebolt Data Bros, including Eldad and Boaz Farkash, who founded Sisense and later Firebolt. Season 2 introduces Benjamin Wagner as a co-host. The podcast aims to provide practical insights for data professionals.

Episodes

AI for Data and Data for AI: The Dual Frontier of Modern Data Engineering with Pranav Motarwar Jun 16, 2026 1187 What if the data engineering skills you have today become obsolete in five years? In this episode, host Benjamin Wagner sits down with Pranav Motarwar, a data engineer who's witnessed the industry's transformation from traditional ETL to AI-powered pipelines, to explore how AI is fundamentally reshaping data engineering roles, why you need to master both "AI for data" and "data for AI" to stay rel
AI Won't Replace Engineers, But This Framework Will Change How They Build with Rohit Girme May 7, 2026 1095 What if you could build AI features with confidence while moving at the pace of innovation? In this episode, Benjamin Wagner sits down with Rohit Girme, Staff Software Engineer at Airbnb, to explore how to evaluate generative AI in production, why breaking down complex problems into smaller chunks accelerates development, and the key strategies for scaling AI-powered products beyond zero-to-one. W
The Framework Canva Uses for 200M+ Designers with Paul Tune Apr 28, 2026 1329 In this episode of The Data Engineering Show, Benjamin sits down with Paul Tune, Staff Research Scientist at Canva, to explore the advancement of machine learning at one of the world's leading design platforms. Learn how Canva is transitioning from traditional ML like recommendation engines for templates to cutting-edge agentic workflows that allow users and AI to collaborate on complex design tas
Llama 2 & 3 Safety: Soumya Batra on Agentic AI Training Apr 8, 2026 1350 What if the expertise that built foundation models could reshape how you think about AI's future? In this episode, Benjamin sits down with Soumya Batra, founder and CEO of WisePort AI and former safety lead on Llama 2 and Llama 3 at Meta, to explore how foundation models evolved from traditional NLP, why post-training holds the highest leverage for safety and controllability, and what natively age
The Data Fusion Secret & Why Custom Query Engines Fail with Nikita Lapkov Mar 24, 2026 1091 What if building a distributed SQL engine meant rethinking everything about how query execution works at scale? In this episode, Benjamin sits down with Nikita, Senior Software Engineer at Cloudflare, to explore how R2 SQL leverages object storage and distributed computing to power analytics across 300 global locations, why backward compatibility becomes critical when you can't control infrastruct
How Zipline AI Turns Weeks of Engineering Into Minutes of SQL Queries ft. Nikhil Simha Mar 10, 2026 1458 What if you could deploy ML features and real-time data pipelines without building complex infrastructure from scratch? In this episode, host Benjamin sits down with Nikhil Simha, CTO at Zipline AI and co-author of Chronon AI, to explore how Chronon, an open-source system that generates data infrastructure from simple queries, is transforming feature engineering at companies like OpenAI and Airb
The Geo-Data Problem Nobody Talks About And How Voi Solved It ft. Magnus Dahlbäck Feb 19, 2026 966 What if your data platform could power both critical business decisions and real-time product features at scale? In this episode, host Benjamin sits down with Magnus Dahlbäck, Senior Director of Data and Platform at Voi, to explore how a metrics-first approach and semantic layers transform data accessibility, why traditional ML and LLMs require different strategies for different problems, and how
Why 99% of Data Teams Give Up on Real-Time And How Artie Changes That Feb 3, 2026 1757 What happens when a team of seven engineers spends a year trying to build a production-ready CDC connector and fails? For Artie CTO and co-founder Robin Tang, it was the spark needed to build a platform that makes data streaming accessible. In this episode, Robin joins Benjamin to discuss the "DFS" (Depth-First Search) approach to data sources, the engineering hurdles of real-time Postgres-to-Snowf
The $100M Problem: How Lyft's Data Platform Prevents ML Failures with Ritesh Varyani at Lyft Dec 16, 2025 1546 What if your data platform could serve AI-native workloads while scaling reliably across your entire organization? In this episode, Benjamin sits down with Ritesh, Staff Engineer at Lyft, to explore how to build a unified data stack with Spark, Trino, and ClickHouse, why AI is reshaping infrastructure decisions, and the strategies powering one of the industry's most sophisticated data platforms. W
60 Billion Predictions Daily: Inside Credit Karma’s Agentic Data Layer with Maddie Daianu Nov 19, 2025 1195 What does MLOps look like when you are deploying 60 billion machine learning predictions a day? Maddie Daianu, Head of Data and AI at Intuit Credit Karma, joins the Data Bros to pull back the curtain on one of the most high-volume data environments in FinTech. With a 100-person team serving 140 million members, standard data practices break down. Maddie shares how her team manages terabytes of da
Block Bad Data Before the Write with Nike’s Ashok Singamaneni Oct 7, 2025 1220 Nike’s Principal Data Engineer Ashok Singamaneni joins Benjamin and Eldad to discuss his open-source data quality framework, Spark Expectations. Ashok explains how the tool, which was inspired by Databricks DLT Expectations, shifts data quality checks to before the data is written to a final table. This proactive approach uses row-level, aggregation-level, and query data quality checks to fail job
Postgres vs. Elasticsearch: The Unexpected Winner in High-Stakes Search for Instacart with Ankit Mittal Sep 17, 2025 1298 Modernizing Search Infrastructure: How Instacart Transitioned from Elasticsearch to PostgreSQL for Enhanced Performance and Simplicity. In this episode of The Data Engineering Show, host Benjamin Wagner speaks with Ankit Mittal, former senior engineer at Instacart, about the company's innovative approach to modernizing their search infrastructure by transitioning from Elasticsearch to PostgreSQL f
