
Platform Engineering Playbook Podcast
The Platform Engineering Playbook Podcast explores the intersection of AI and open-source infrastructure, offering data-backed insights and decision frameworks for senior engineers, SREs, and DevOps practitioners. Each episode is AI-researched and community-reviewed, with content published on GitHub for continuous improvement. Topics include cloud economics, AI governance, infrastructure trade-offs, and career strategy. The podcast aims to address real-world challenges like tool sprawl, PaaS cost justification, and the Shadow AI crisis.
Episodes
The Hidden Kubernetes Tax Costing Teams $43,800 a Year
**Is your company bleeding $43,800 annually on hidden Kubernetes costs?** Most platform teams have no idea they're paying this "invisible tax" – but the smartest engineers are already eliminating it.
In today's Platform Engineering Playbook, we expose the shocking truth about Kubernetes cost isolation and dive deep into virtual clusters as the solution. Plus, we break down the biggest platform eng
Your Kubernetes Stack Is Why AI Isn’t Shipping
**Why do 87% of AI models never reach production? It's not the AI - it's the infrastructure underneath.**
In this deep dive episode of Platform Engineering Playbook, we tackle the critical challenge of building cloud-native platforms that can actually support AI workloads at scale. While everyone's talking about model performance, the real bottleneck is happening at the infrastructure layer.
**Wha
AI Agents in Kubernetes Need Standards — Before Everything Breaks
**What happens when AI agents in your Kubernetes cluster start making their own scaling decisions without proper guardrails?**
In this episode of Platform Engineering Playbook, we dive deep into the emerging world of cloud native agentic standards and why they're becoming mission-critical for modern infrastructure. As AI agents become more autonomous in managing our clusters, the need for standard
AI Agents Are About to Break Kubernetes — Unless We Standardize Now
What happens when hundreds of AI agents start running in your Kubernetes cluster but can't communicate with each other? By 2026, this isn't a hypothetical problem—it's the reality platform engineers are facing right now.
In this episode of Platform Engineering Playbook, we dive deep into the CNCF's new cloud-native agentic standards and what they mean for your infrastructure. We'll break down why
How to Monitor LLMs in Production Before They Drain Your Budget
**Are you burning through your LLM budget with zero visibility into why?** You're not alone - 73% of production deployments are facing this exact problem right now.
In today's Platform Engineering Playbook, we tackle the monitoring crisis plaguing AI infrastructure and break down five game-changing developments reshaping how we deploy and secure production systems.
**🎯 What You'll Learn:**
• How t
Helm Security Is Broken. WebAssembly Fixes It.
**What if 94% of Helm chart vulnerabilities could be prevented with one unexpected technology?**
Today's Platform Engineering Playbook dives deep into the surprising intersection of WebAssembly and Kubernetes security, plus breaking news that every platform engineer needs to know.
**What You'll Learn:**
• How WebAssembly is revolutionizing Helm chart security (spoiler: it's not replacing Kubernete
The Kubernetes AI Pattern That Cuts GPU Costs
**87% of AI workloads are sitting idle on GPUs right now** - yet companies keep buying more hardware. What if the problem isn't capacity, but how we're running AI on Kubernetes?
In today's Platform Engineering Playbook, we tackle the massive inefficiencies plaguing AI infrastructure at scale. You'll discover why traditional Kubernetes patterns break down with AI workloads, what's actually happenin
You’re Monitoring the Wrong Kubernetes Metrics
**Are 73% of Kubernetes clusters really flying blind?** According to recent industry reports, most K8s deployments are drowning in meaningless metrics while missing the signals that actually matter for performance and cost optimization.
In today's Platform Engineering Playbook, we tackle the Kubernetes observability crisis head-on. You'll discover why traditional monitoring approaches are failing
The AI Security Hole Your Red Team Is Missing
**87% of enterprise AI deployments have a critical security vulnerability that red teams aren't even testing for.** Are you one of them?
In today's Platform Engineering Playbook, we expose the massive security hole plaguing enterprise AI systems and dive deep into prompt injection attacks that are slipping past traditional security measures. Plus, we cover the latest platform engineering news that
Your Kubernetes Monitoring Is Blind to AI Attacks
**Is your Kubernetes cluster blind to AI model poisoning attacks?** 73% of companies running AI workloads can't detect when their models are compromised - and traditional monitoring tools are completely useless against these threats.
In today's Platform Engineering Playbook, we dive deep into why AI workloads are breaking traditional Kubernetes observability strategies and what platform teams need
The 6 Types of AI Cloud Infrastructure
**87% of AI companies are burning cash on the wrong cloud infrastructure - and they have no idea.**
In this episode of Platform Engineering Playbook, we expose the costly mistakes plaguing AI infrastructure and reveal the framework that's helping platform teams save millions while scaling smarter.
**What You'll Learn:**
• The 6 categories of AI cloud infrastructure that matter in 2026
• How to tra
Why AI Code Is Killing Your Monitoring Budget
**Is your monitoring bill about to explode? AI-generated code is creating 10x more observability data than human-written code.**
In this deep dive episode of Platform Engineering Playbook, we unpack the hidden observability crisis that's quietly hitting DevOps teams everywhere. While AI accelerates development, it's also flooding your monitoring systems with unprecedented amounts of telemetry data
How Karpenter Fixes Kubernetes Autoscaling
**Are you throwing money away on Kubernetes compute costs?** 87% of clusters waste up to half their resources on idle nodes - but there's a solution that's changing everything.
In today's Platform Engineering Playbook, we dive deep into **Karpenter**, the game-changing autoscaler that's revolutionizing how teams think about Kubernetes resource management. You'll discover why traditional cluster au
AI Is Not the Problem — Your Infrastructure Is
**Why do 70% of AI projects crash and burn before they ever see production?** Spoiler alert: it's not the AI that's broken.
In today's Platform Engineering Playbook, we're diving deep into the AI infrastructure crisis that's keeping CTOs awake at night. While everyone's racing to deploy the latest AI models, most organizations are discovering their legacy systems simply can't handle the load.
**Wh
Why Kubernetes Doesn’t Scale Without an IDP
**Why do 97% of companies using Kubernetes never scale beyond their original expert team?** It's not a skills problem - it's an architecture problem that Internal Developer Platforms (IDPs) are uniquely positioned to solve.
In today's episode of Platform Engineering Playbook, we dive deep into the Kubernetes scaling crisis and explore how IDPs can democratize container orchestration across your en
The AWS Cost That Doesn’t Show Up in Cost Explorer
**What if your AWS bill has a hidden line item costing you thousands that doesn't even show up in Cost Explorer?**
Today on Platform Engineering Playbook, we expose the sneaky cloud costs that are bleeding your budget dry and dive deep into the AWS Well-Architected Framework's six pillars to help you architect cost-efficient, secure platforms.
**What You'll Learn:**
✅ How to identify and eliminate
87% of Ansible Playbooks Are Broken (AI Just Proved It)
**87% of production Ansible playbooks have critical flaws - but AI just revealed how to fix them.**
Today's Platform Engineering Playbook dives deep into how AI is revolutionizing infrastructure automation and Ansible development. We'll explore groundbreaking research showing most production playbooks lack proper error handling, and how collaborative AI approaches are changing the game for platfor
GrafanaCON 2026: The Agenda That Signals the Future of Observability
**GrafanaCON 2026 just dropped their agenda, and every attendee will build an AI agent from scratch on day one. What does this tell us about the future of platform engineering?**
In today's Platform Engineering Playbook, we dissect the GrafanaCON 2026 agenda to uncover what it reveals about emerging trends in observability and platform tooling. We analyze why hands-on AI workshops are becoming con
Can AI Run Your Production Systems?
What if your observability stack could debug and fix production issues while you sleep? That future might be closer than you think.
In today's Platform Engineering Playbook, we explore the cutting edge of agentic AI in observability systems and break down the biggest platform engineering news shaping March 2026.
**🎯 WHAT YOU'LL LEARN:**
• How self-healing observability stacks are revolutionizing p
Claude Went Down. The API Didn’t. Here’s Why.
What happens when a major AI platform goes dark while secretly pursuing billion-dollar government contracts? Claude's massive outage reveals critical lessons about platform engineering resilience that every infrastructure team needs to understand.
In today's Platform Engineering Playbook, we dissect Anthropic's Claude outage and uncover the hidden platform engineering challenges of serving classif
Backstage Is Becoming the Control Plane for Engineering
**What if Spotify's secret weapon for managing 2,800 microservices could transform your entire platform engineering strategy?**
Today's Platform Engineering Playbook dives deep into the Backstage revolution that's quietly reshaping how engineering teams operate at scale. We break down what a production-grade Backstage implementation actually looks like in 2026, complete with real-world examples an
The End of ingress-nginx: Kubernetes Migration Guide Before 2026
**70% of Kubernetes clusters will go dark in March 2026 when ingress-nginx support officially ends. Are you ready?**
Today's Platform Engineering Playbook dives deep into the massive ingress-nginx migration that's about to impact millions of Kubernetes workloads. We'll break down your migration options, timeline, and practical steps to avoid the chaos.
**What You'll Learn:**
✅ Why ingress-nginx is
Claude Code Remote Control Changes Developer Workflows
**What if 87% of developer productivity loss just became a thing of the past?**
Anthropic's Claude Computer Use capability is reshaping how platform engineers think about developer workflows, and today we're breaking down exactly what this means for your platform strategy.
**In this episode:**
• **Deep dive into Claude's Computer Use** - How remote control capabilities are eliminating context swi
Databricks Lakebase vs Postgres: The AI Database Shift
**Is PostgreSQL really obsolete for AI workloads?** Databricks just dropped Lakebase and it's shaking up everything we thought we knew about database architecture for machine learning pipelines.
In today's Platform Engineering Playbook, we're diving deep into Databricks' game-changing announcement and what it means for your data infrastructure strategy. Plus, we're covering the week's biggest plat
How to Secure AI Agents with MCP, OPA & Ephemeral Runners
**Your AI agents have root access to your infrastructure right now - and you don't even know it.**
What happens when we give AI agents the keys to our entire platform? In today's Platform Engineering Playbook, we dive deep into the hidden security risks of AI infrastructure automation and explore practical solutions for implementing least-privilege access controls.
**What You'll Learn:**
• How to
Cloudflare Takes Down the Internet Again — With a Config Change
**What happens when a single configuration change takes down 20% of the internet for six hours?**
In this episode of Platform Engineering Playbook, we dissect the massive Cloudflare outage from February 20th, 2026 - a catastrophic failure that started with a routine BYOIP pipeline update and ended with Cloudflare accidentally deleting their own customers' networks.
**What You'll Learn:**
• The tec
The Next Platform Engineer: AI + Observability + FinOps
**Is AI about to revolutionize how we build infrastructure? The CNCF CTO says we're not prepared for what's coming.**
In this episode of Platform Engineering Playbook, we dive deep into the future of cloud native infrastructure and why 2026 might be the year everything changes. Based on Chris Aniszczyk's latest insights, we explore how AI agents are moving beyond just consuming our platforms to ac
Ray + Kubernetes: The Production AI Stack Explained
**Why do 92% of ML models never reach production?** It's not a code problem—it's a platform engineering problem.
In today's episode of Platform Engineering Playbook, we tackle the massive infrastructure gap that's keeping AI initiatives stuck in notebooks while your data science teams wonder why their brilliant models never see the light of day.
**What You'll Learn:**
✅ The real reasons ML models
Replace 5 Databases with 1? SurrealDB for AI Agents Explained
Your AI agents are using five different databases right now - and you don't even know it. This database sprawl is silently killing your platform's performance and your team's sanity.
In today's Platform Engineering Playbook, we dive deep into SurrealDB's multi-model approach and how it's revolutionizing AI infrastructure. Plus, breaking news on vulnerability management patterns that every platform
Agoda’s API Agent Turns Any API into MCP — No Code, No Deployments
**What if API integration nightmares could disappear without writing a single line of code?**
Agoda just dropped a game-changing solution that transforms any API into MCP (Model Context Protocol) with zero deployments - and it's about to reshape how platform teams approach AI integrations.
In today's Platform Engineering Playbook, we break down this revolutionary no-code approach and explore what
LocalStack Kills Community Edition: What Breaks in March
**LocalStack just killed their open-source edition - but what does this really mean for your platform engineering stack?**
In today's episode of Platform Engineering Playbook, we break down LocalStack's shocking decision to discontinue their Community Edition and what it means for teams relying on AWS local development. Plus, we dive into the ripple effects across the developer ecosystem and provi
OpenTofu vs Terraform: What Enterprise Teams Are Actually Doing (2026)
**Is your infrastructure strategy about to become obsolete?** By 2025, half of all Terraform installations could be running OpenTofu - and the implications for platform engineering teams are massive.
In today's deep dive, we break down the OpenTofu vs. Terraform battle that's reshaping infrastructure as code. You'll learn the real mechanics behind migrating between these tools, practical decision
Why Databases Inside Kubernetes Are Becoming Technical Debt
**Is running databases in Kubernetes about to become legacy technical debt overnight?** By 2026, the inference cloud revolution is forcing platform engineers to completely rethink database architecture - and the implications are massive.
In today's deep dive, we break down the "container paradox" that's reshaping how we think about stateful workloads in Kubernetes. You'll discover why the rise of
47% of CNCF Projects Slowed Down in 2025 — Why That’s Actually Good News
**Why did 47% of CNCF projects slow down their development velocity in 2025 — and why platform engineers should celebrate this trend?**
In today's Platform Engineering Playbook, we decode what declining commit velocity across cloud native projects actually reveals about infrastructure maturity and what it means for your platform strategy.
**What You'll Learn:**
• How to interpret CNCF project velo
The Claude Skills That Stop AI From Writing Dangerous Infrastructure as Code
**Are 87% of DevOps teams unknowingly creating security vulnerabilities with AI-generated infrastructure code?**
Today's Platform Engineering Playbook dives deep into the hidden risks of AI in DevOps workflows and reveals the specialized skills that top-performing teams use to harness AI safely and effectively.
**What You'll Learn:**
• Why AI-generated infrastructure code is creating blind spot vu
Docker vs Nix: Why Your Builds Aren’t Actually Reproducible
97% of Docker containers can't reproduce the exact same build six months later—what does this mean for platform engineering, and why should you care?
In today's episode of the Platform Engineering Playbook, we delve into the critical issue of reproducibility in Docker containers. Discover why this seemingly technical detail could significantly impact your workflows and productivity. We'll explore
The Data Canary Pattern: How Netflix Prevents Bad Metadata Deploys
**What happens when 2 billion daily metadata events could crash Netflix's entire platform with one bad transformation?**
Today's Platform Engineering Playbook dives deep into Netflix's Data Canary system - a masterclass in building trust and validation into your data pipelines at scale. Plus, we cover the latest platform engineering news that's reshaping how we deploy and monitor distributed syste
Claude Opus 4.6: The First AI That Feels Like a Teammate
**Claude Opus 4.6 just demolished GPT-4 on every coding benchmark - and it's about to reshape how we think about platform engineering automation.**
In today's episode, we break down Anthropic's game-changing AI release and what it means for platform teams worldwide. We dive deep into the autonomous capabilities that could revolutionize how we handle infrastructure operations, but also explore the
Autonomous AI in DevOps Is Here — And Most Teams Are Doing It Wrong
**Will 87% of DevOps teams really be obsolete by 2026?** As AI agents take control of production infrastructure, we're witnessing the biggest transformation in platform engineering history.
In today's episode, we dive deep into **autonomous AI agents in DevOps workflows** and explore how they're reshaping everything from monitoring to incident response. You'll discover real-world examples of AI ag
Kubernetes Is Retiring Ingress NGINX (And 50% of Clusters Aren’t Ready)
90% of Kubernetes clusters are running Ingress NGINX, abandoned in 16 months with zero maintainers left! What does this mean for your production systems? In this episode, we dive deep into the urgent need for migration and the alternatives available as the clock ticks down.
With the retirement of Ingress NGINX set for March 2026, it's critical to understand how this affects millions of deployments
OpenAI’s New macOS App: Is Agentic Coding Finally Here?
**OpenAI just made 73% of coding assistants obsolete overnight - but what does this mean for platform engineers?**
Today's episode breaks down OpenAI's game-changing macOS app for "agentic coding" and its massive implications for platform engineering workflows. We'll analyze why this isn't just another coding assistant, but a fundamental shift in how we approach infrastructure automation and devel
98% of Container CVEs Are Hiding Where You’re Not Scanning
**Are your container security scans missing 98% of critical vulnerabilities?** New research from Chainguard reveals a shocking blind spot that could be exposing your infrastructure to massive security risks.
In today's Platform Engineering Playbook, we unpack this bombshell finding and explore why traditional container scanning is failing at scale. You'll discover where these hidden vulnerabilitie
Why Forward-Deployed Engineers Are Making $300K+ (And Why Companies Are Desperate for Them)
Why are forward-deployed engineers making 40% more than traditional backend developers, and why can't companies hire enough of them?
In today's Platform Engineering Playbook, we dive deep into tech's hottest new role and explore three critical platform engineering developments reshaping the industry.
**What You'll Learn:**
• The explosive rise of forward-deployed engineers and why they're commandi
AWS DevOps Agent in Production: What Most Teams Get Wrong
**Why do 73% of AWS DevOps Agent deployments crash and burn in their first week?** It's not what you think.
In this episode of Platform Engineering Playbook, we uncover the hidden culprits behind these shocking failure rates and reveal the systematic approach that separates successful platform teams from the rest.
**What You'll Learn:**
• The real reasons AWS DevOps Agent deployments fail (hint: i
AI Agents Are Rewriting the SRE Playbook (For Better or Worse)
What if AI agents could flip the script on SRE work, turning 87% of firefighting into 87% prevention? That's exactly what's happening in the "agentic revolution" transforming platform engineering teams.
In today's Platform Engineering Playbook, we dive deep into how AI agents are reshaping SRE workflows and what this means for your platform strategy. We'll cut through the hype to examine the real-
DevOps Is Dead — Platform Engineering Replaced It
**DevOps is dead - and the companies that created it are the ones pulling the trigger.** But what's replacing it might be the most significant shift in software delivery since containerization. In today's Platform Engineering Playbook, we dive deep into how Internal Developer Platforms are fundamentally reshaping the DevOps landscape. We'll explore why platform engineering has shed its experimenta
47 Countries Went Offline — What Platform Engineers Must Learn From It
**What happens when 47 countries lose internet access in just 3 months—and it's not cyberattacks?**
Today's Platform Engineering Playbook dives deep into the shocking Q4 2025 internet disruption data that reveals critical infrastructure vulnerabilities every platform engineer needs to understand. We'll analyze how cable cuts, storms, and DNS failures brought down entire regions, and more important
Two Missing Characters Nearly Compromised AWS’s Supply Chain
**What if two missing characters could compromise every AWS-managed GitHub repository?** That's exactly what happened in a critical regex vulnerability that exposed massive supply-chain risks.
In today's Platform Engineering Playbook, we break down this shocking security flaw and explore how platform engineers can protect their infrastructure from similar attacks. You'll discover the technical det
Kubernetes Just Became Essential for AI Growth (CNCF Report)
**Why will 90% of AI workloads fail without Kubernetes in the next 18 months?** Most platform teams are walking into a disaster they can't see coming. In today's Platform Engineering Playbook, we break down the CNCF's shocking new survey results showing 82% of organizations are unprepared for AI infrastructure demands. Plus, we cover the Cloudflare BGP incident that took down major services and w
ChatGPT Scales PostgreSQL to Power 800 Million Users
OpenAI is running ChatGPT for ~800 million users on PostgreSQL — and according to their own disclosures, it’s actually working.
In this episode of the Platform Engineering Playbook Daily Podcast, we break down how PostgreSQL was pushed to hyperscale, the architectural tradeoffs behind a single-primary model, and the operational playbook that makes this kind of scale possible.
This isn’t a gene
3 Skills You Need to Transition to Platform Engineer
**Will 70% of DevOps engineers disappear in the next 5 years?** That's the bold prediction kicking off today's deep dive into the massive career shift happening in tech right now.
In this episode of Platform Engineering Playbook, we explore the critical transition from DevOps to Platform Engineering and what it means for your career survival. You'll discover why traditional DevOps roles are evolvi
The Infrastructure Monitoring Tools Teams Regret Choosing
The monitoring tool everyone trusts is actually blind to 40% of your infrastructure failures—and the vendor knows it. Are you using an industry standard that misses almost half of all incidents? In this episode, we unravel the mystery of infrastructure monitoring tools and why your choice could be costing you dearly.
As platform engineering teams grapple with an overwhelming array of options—from
Your CI/CD Pipeline is a Debt Trap
**73% of engineering teams are drowning in technical debt because of their CI/CD pipelines. Not despite them—because of them.**
Are your automation tools secretly sabotaging your codebase? Today's Platform Engineering Playbook dives deep into the hidden ways CI/CD pipelines create technical debt and reveals practical strategies to break the cycle.
**What You'll Learn:**
• Why inheritance beats cop
Kubernetes Just Revolutionized Learning — Get Ahead Now!
**Are major tech companies secretly abandoning Kubernetes certifications?** What we discovered about the future of K8s learning will change how you approach platform engineering in 2026.
In today's Platform Engineering Playbook, we uncover why traditional Kubernetes education is becoming obsolete and what platform teams are doing instead. Plus, breaking news that could revolutionize your infrastru
How AWS's New Euro Cloud Changes Data Control Forever
92% of European companies don’t trust US cloud providers with their data anymore. So, AWS just locked itself out of its own Euro Cloud! This shocking move raises critical questions about data sovereignty and compliance for businesses operating in Europe.
In this episode, we dive deep into AWS's groundbreaking decision to create a completely isolated European cloud infrastructure, one that even A
Why Pulumi's New Move Could Change Terraform Forever
Terraform’s biggest competitor just made a move that could redefine infrastructure-as-code in 2026.
Pulumi now runs Terraform and HCL natively—better than HashiCorp does. That’s not a migration tool, not a compatibility shim, but full native execution through the Pulumi engine, plus Terraform state hosted in Pulumi Cloud and financial credits to help teams exit existing HashiCorp contracts.
In thi
Astro Joins Cloudflare: What It Means for Platform Engineers
Cloudflare acquires the Astro Technology Company, adding a 1M-downloads-per-week web framework to their edge platform. We analyze the strategic implications, what stays open source, and lessons about framework sustainability for platform engineering teams.
Key Topics:
- Astro framework overview: islands architecture, framework-agnostic components, content-first approach
- Why Cloudflare acquired A
ScyllaDB X Cloud Challenges DynamoDB Cost and Performance
ScyllaDB just launched X Cloud with claims of double the performance at half the cost compared to DynamoDB. This episode breaks down the technical architecture behind their tablet-based approach, how they're achieving 80% data compression on ARM Graviton4 instances, and when this actually makes sense for platform engineering teams running high-throughput workloads.
Key Topics:
- ScyllaDB X Cloud t
Invisible Linux Malware: The Undetectable Threat to Your Cloud Infrastructure
Your Linux servers aren't just running containers anymore—they're hosting invisible tenants that security teams can't even detect.
In this episode, we deep dive into VoidLink, the new cloud-native malware framework that Check Point Research just uncovered. This isn't your typical malware that got retrofitted for the cloud—this thing was born in the cloud, designed from the ground up to evade every
The AI-Cloud Native Symbiosis - How Intelligent Infrastructure is Transforming Platform Engineering
By 2025, 90% of new enterprise applications will be AI-powered and cloud-native. This episode explores the symbiotic relationship between AI and Kubernetes - where AI isn't just another workload, but is fundamentally transforming how we build and operate cloud native platforms. We cover real-world examples like Netflix's predictive scaling achieving 92% accuracy, the emergence of AI-driven observa
MIT 10 Breakthrough Technologies 2026 - The Platform Engineering Perspective
MIT just released their 10 Breakthrough Technologies for 2026 - and three of them are infrastructure problems that platform engineers are solving right now. This episode explores hyperscale AI data centers consuming 96 GW globally by 2026, vibe coding with 41% of code now AI-generated, and LLM interpretability research from Anthropic. We break down how platform engineers enable these breakthroughs
AWS Route 53 Global Resolver - Enterprise DNS Security at the Edge
Every DNS query your hybrid environment makes could be exposing sensitive data. AWS Route 53 Global Resolver, announced at re:Invent 2025, combines anycast routing, encrypted DNS protocols (DoH/DoT), and managed threat filtering in a single service.
In this episode, we cover:
- Anycast DNS architecture routing to the nearest of 11 AWS regions
- DoH and DoT encrypted DNS protocol support
- AWS RAM auth
Kubernetes Upcoming Features Deep Dive - Extended Toleration Operators and Mutable PV Node Affinity
There's a Kubernetes cluster out there right now burning ten thousand dollars a month on GPU nodes that sit idle sixty percent of the time. Why? Because the scheduler can't say "only schedule pods on nodes with MORE than four GPUs." It's 2026, and our scheduler still can't count. But that's about to change.
In this episode, we dive deep into two alpha features in Kubernetes 1.35 that represent a f
Why Is a 2016 AWS Instance Still the Best Value? (Cloudspecs Research)
New research from TUM reveals uncomfortable truths about cloud hardware stagnation. The paper "Cloudspecs: Cloud Hardware Evolution Through the Looking Glass" shows that the best-performing AWS instance for NVMe I/O per dollar was released in 2016 - and nothing since has come close.
In this episode:
• CIDR 2026 research from Technical University of Munich
• AWS i3 instances from 2016 still beat al
Iran IPv6 Blackout - When Governments Weaponize Protocol Transitions
The same IPv6 transition your infrastructure team has been procrastinating on is now being weaponized by governments. On January 8, 2026, Iran's IPv6 address space dropped 98.5% while IPv4 remained intact—a surgical strike against mobile users.
In this episode, we break down:
- Why blocking IPv6 specifically targets mobile users (hint: carrier NAT exhaustion)
- The BGP mechanics of protocol-specif
Venezuela BGP Anomaly - Deep Technical Analysis
A deep technical dive into the January 2026 Venezuela BGP route leak incident. Was it a cyberattack? The technical evidence says no - and that's actually more concerning.
In this special deep-dive episode (no news segment), Jordan and Alex break down:
- What actually happened on January 2, 2026 with AS8048 (CANTV, Venezuela's state ISP)
- Why 10x AS-path prepending proves this was misconfiguration
HolmesGPT: AI Root Cause Analysis for Kubernetes
Deep dive into HolmesGPT, the CNCF Sandbox AI agent that revolutionizes cloud-native troubleshooting. This episode covers what it is, its 40+ integrations, the project roadmap, and how to set it up today.
News Segment:
AirFrance-KLM's secure automation platform with Terraform, Vault, and Ansible
AWS ECS tmpfs mounts on Fargate for secure secrets handling
Qwen 30B running on Raspberry Pi - democra
Docker Kanvas: Infrastructure as Design
Docker just launched Kanvas, a visual tool that turns your architecture diagrams into deployable infrastructure. Built on Meshery (CNCF's 6th highest-velocity project), it converts Docker Compose files to Kubernetes manifests and challenges Helm and Kustomize dominance.
In this episode, we explore:
- The dev-to-prod gap that Kanvas solves
- How Meshery Models add semantic understanding to infrastr
Remote MCP Architecture - Running AI Tool Servers on Kubernetes
The MCP server registry hit 10,000+ integrations, but most teams are running these servers on laptops. This episode breaks down the production architecture that Google, Red Hat, and AWS are converging on: remote MCP servers deployed on Kubernetes. We cover three deployment patterns (local stdio, remote HTTP/SSE, and managed), the critical difference between wrapper-based and native API implementat
AWS DevOps Agent - Promises vs Reality
AWS launched DevOps Agent at re:Invent 2025 as an "autonomous on-call engineer." But before you cancel your PagerDuty subscription, we separate marketing from mechanics.
NEWS THIS EPISODE:
• KubeCon Europe 2026: March 23-26 in Amsterdam, 224 sessions across 5 tracks
• Platform Engineering 2026 Predictions: Agentic infrastructure becomes standard
In this deep-dive episode, we cover:
WHAT IT PROMISE
AWS Graviton5: 192 Cores, 5x Cache - ARM Takes Over the Data Center
AWS doubled the core count on their flagship ARM processors with Graviton5—192 cores in a single socket, 5x L3 cache (180MB), and 3nm fabrication. We go deep on ARM vs x86 architecture, cache hierarchy latencies, NUMA elimination benefits, formal verification security proofs, and a complete migration framework with multi-arch CI/CD patterns. With 98% of top EC2 customers already on Graviton, the A
Can OpenTelemetry Save Observability in 2026?
OpenTelemetry has won the instrumentation wars with 95% adoption predicted for 2026. But winning data collection doesn't solve observability's real problems: spiraling costs, declining signal-to-noise ratios, and too much distance between seeing a problem and fixing it.
In this episode, we break down:
• Netflix's evolution to high-cardinality analytics processing 1M+ spans per episode
• The cost-c
When Serverless Fails: Unkey's 6x Performance Migration to Containers
Why did an API key management platform abandon edge serverless for stateful containers? Unkey hit 30ms p99 cache latency when they needed sub-10ms—so they rebuilt everything on AWS Fargate. This episode covers the technical decision-making framework for choosing between serverless and containers, plus a deep dive into Kubernetes 1.35's new structured z-pages for debugging.
In This Episode:
- The s
From Alert Fatigue to Signal-Driven Ops: The Observability Shift
Why do 73% of organizations experience outages from alerts they ignored? This episode breaks down the technical shift from reactive thresholds to SLO-driven observability. Learn multi-window burn-rate alerting patterns, AIOps implementations that actually work, and an 8-week migration path to cut alert noise by 80%.
In This Episode:
- The alert fatigue paradox: 2000+ weekly alerts with only 3% act
Security Ops Specialty: The Underrated Skill Every Platform Engineer Needs in 2026
Platform engineers who understand security operations—secrets management, vulnerability scanning, and compliance automation—are commanding premium salaries in 2026. This episode breaks down the security ops specialty: what it includes, why organizations are desperate for it, and how to build these skills alongside your existing platform engineering expertise.
In this episode:
• Security ops specia
Agentic AI Foundation - MCP and the Future of AI-Native Platform Engineering
The Linux Foundation announced the Agentic AI Foundation (AAIF) on December 9, 2025, bringing together AWS, Anthropic, Google, Microsoft, OpenAI, Block, Cloudflare, and Bloomberg. This episode breaks down MCP (Model Context Protocol) - the "HTTP for AI" with 97M+ monthly downloads.
📰 NEWS: Docker hardened images now free, MongoBleed CVE patch alert, Cloudflare "Fail Small" resilience plan, DORA me
FinOps 2026 for Platform Engineers: The Complete Skills Guide
FinOps is becoming an essential skill for platform engineers in 2026. This episode provides a complete guide to the skills, certifications, and tools you need to add cloud cost management to your platform engineering toolkit.
📰 News Segment:
• GPG.fail documents 14 critical GnuPG vulnerabilities - check your signing tools
• MongoBleed CVE-2025-14847: Critical MongoDB exploit - patch immediately
•
Platform Engineering Salary Report 2026: Skills That Pay
Platform engineers are commanding $172K-$207K in 2026, a 13-27% premium over DevOps roles. This episode breaks down salary benchmarks from Dice, Motion Recruitment, and Levels.fyi, revealing which skills are S-tier ($200K+) and which are table stakes.
We cover:
- Platform Engineer vs DevOps salary gap (13-27% premium)
- S-tier skills: LLM/GenAI ($195K-$312K), Platform Engineering, DevSecOps, MLOps
Platform Engineering 2026 Predictions Roundup (Platform Engineering 2026 Look Forward Series - Part 5/5)
The series finale of our five-part Platform Engineering 2026 Look Forward Series. We synthesize everything from agentic AI operations, mainstream adoption, developer experience metrics, and boring Kubernetes into ten concrete predictions for 2026. Learn what to invest in versus ignore, and discover our 2026 platform engineering thesis.
In this episode:
- High confidence predictions: IDP market con
Kubernetes Enters the Boring Era (Platform Engineering 2026 Look Forward Series - Part 4/5)
The best thing happening to Kubernetes in 2026 is that it's becoming boring. After a decade of explosive innovation, Kubernetes is entering its "mature infrastructure" phase - stable, predictable, and increasingly invisible. Like Linux and PostgreSQL before it, boring Kubernetes enables platform teams to build abstractions without worrying about breaking changes. Part of the Platform Engineering 2