The Databricks Data Engineer

Jakub Lasak 13 Episodes Jun 29, 2026

Helping 18k+ Databricks data engineers become seniors: interview like seniors, execute like seniors, think like seniors.

Episodes

How Photon actually makes your Databricks queries faster (and when it silently doesn't) Jun 29, 2026 00:10:46 Two engineers run the same SQL on the same Delta table. Same data, same cluster size, copy-pasted code. Alex goes to make a coffee and comes back to a query still running. Sam's is done before they finish reading the first Slack message. The only difference is one checkbox on the cluster called Photon. Most Databricks data engineers have that box ticked, pay a premium for it on every DBU, and c
The Databricks interview round nobody studies for (and almost everybody fails) Jun 22, 2026 00:11:36 Picture the debrief room after a Databricks loop. Two candidates went through that day. On paper, a coin flip: SQL tied, Spark internals solid, system design clean for both. Score only the rounds with a rubric and you cannot separate them. And yet the room isn't split. One gets the offer, and the thing that decided it wasn't any of the rounds they studied for. It was the conversation everyo
The Spark Shuffle is baggage claim: why your job waits instead of computes (and more workers won't fix it) Jun 15, 2026 00:11:09 Your Spark job has been running for forty minutes. The dashboard shows your cluster isn't even busy. So you do the obvious thing: add more workers. And it changes nothing. Here's why. During a shuffle, Spark is barely computing at all. It's tagging every row by destination, piling rows together, spilling the overflow to disk, and hauling data across the network between executors. It'
Your Databricks data quality framework is a Yeti: everyone talks about it, nobody has seen it work Jun 8, 2026 00:11:42 An architecture review. A platform team is presenting their data quality setup, and honestly, it's impressive. Expectations on every ingestion table. Drift metrics on the dashboard. A dedicated alerts channel. Then a finance engineer asks the only question that counts: when did this last catch something before one of us did? Silence. That silence is the whole problem. The decks, the suites, the
Why senior Databricks engineers write less code than mid-level ones Jun 2, 2026 00:11:08 Two engineers, same team, both five years in. Last quarter Mark shipped forty-seven pull requests across three pipelines. Sam shipped nine. On any dashboard, Mark wins by a mile. Sam got the staff offer. Mark got a kind note about continuing to demonstrate impact. This isn't politics, and it isn't luck. It's a pattern that specifically catches the engineers who are best at shipping, bec
4 habits that quietly turn your Databricks Delta Lake into a swamp May 26, 2026 00:11:31 You built the table right. Well-partitioned, documented, fast enough that the row count came back before you finished reading your own Slack. Six months later it takes four minutes to return that same count, and nobody on your team ever decided to make it that way. There was no meeting, no design doc, no ticket titled "let's make this unqueryable by Q3." A swamp is not a decision. It'
Liquid Clustering vs Z-Ordering: 4 questions that decide May 18, 2026 00:18:11 You open your Databricks workspace. Two Delta tables. Same size, same downstream BI workload. Table A was partitioned and z-ordered in 2023, runs fine. Table B is greenfield this quarter, liquid clustering by default. Your tech lead asks how aggressive you want to be with migration tickets. Whatever you type back is probably wrong. This is not a feature swap. It's a paradigm shift, and the migratio
The compounding curve: why some Databricks engineers' salaries grow 5x faster than others May 11, 2026 00:22:44 Year one. Two new juniors join the same Databricks platform org. Same starting salary, same skills, same desk. Year three, five thousand bucks apart. Year eight, household-car-and-a-half apart. Every year. Forever. Both worked hard. Both stayed technical. Both got positive reviews. Neither did anything wrong. So what happened? Salary in this field isn't one curve. It's two that look identic
The 90/9/1 rule of Databricks performance work - how to triage Spark optimization in 60 seconds May 4, 2026 00:17:21 Your team is three weeks into a Databricks performance push. Broadcast hints in PRs. AQE flags toggled like Christmas lights. Partition counts re-tuned for the third time. The manager is asking, gently, when the gains are showing up in the bill. The staff DE on the next team finished theirs in two afternoons. Same workloads, bigger drop. They were running a triage you have never been taught. In this
The Databricks data engineer in 2026 - the four shifts that just changed your job Apr 27, 2026 00:18:30 You scroll past the cancelled junior req, the "serverless first" line on your director's planning slide, and the third Lakebase mention from your Databricks rep this quarter. Each one looks like a news item. None of them feel like they're about you. They are. Four structural shifts have already happened in the field, and the words "Databricks data engineer" don't mea
9 Behaviors Quietly Killing Your Promotion To Senior Databricks Data Engineer Apr 20, 2026 00:14:09 Mid-level is a down escalator. It looks like flat ground. You feel productive, your tickets close on Friday, your burndown chart is healthy, and your review says "reliable executor of well-defined work" for the third cycle in a row. That sentence is the official label for "not getting promoted this year" - and most Databricks data engineers never decode it. It isn't a skill
The Dashboard Theater: What Databricks Engineers Build That Nobody Opens Apr 13, 2026 00:15:48 You check the usage logs on a dashboard you spent two weeks building. Zero views. Not low views. Zero. The stakeholder who requested it hasn't logged in once. Three months later they ask the exact question the dashboard answers, in a meeting, out loud, as if the dashboard doesn't exist. Because for them, it doesn't. In this episode: - Why the most technically impressive Databricks dashbo