Vanishing Gradients

The Future of Agentic Data Science May 25, 2026 3877 So I think we’re really at a historical moment, and the opportunity is massive. Almost 15 years ago, we were promised that data science was going to be this incredible thing and create all this value for people. And I think nowadays it’s mostly viewed as a cost center in most companies. I think we can really now fulfill that original promise with agentic data science. Thomas Wiecki, Co-creator of

Agent-Harness.ipynb* May 20, 2026 4786 One thing that I don’t like about Claude is that you get into this weird mental state: oh, I think I trust the model. Let’s do the slot machine. Hit click, which puts you in an inactive mode of thinking. Maybe it’s better to use a worse model…. Vincent Warmerdam, senior data professional and prolific open-source maintainer (some packages with over a million downloads), now Engineer at marimo, join

Agentic Engineering and the Lost Art of Verification May 12, 2026 5546 I almost don’t read code now. My approach with Roborev is it’s like my code reader. The mantra is: Roborev reads every line of code that is generated. It gets read multiple times. And so, whenever I push up a pull request, the branch gets re-reviewed. And so by the time I’m merging a pull request into a repository, the code has all been read by agents four or five times minimum. I look at the c

Next Level AI Evals for 2026 Apr 23, 2026 3214 There are a lot of reasons why we should do AI evals. For many companies doing AI evals is the way to build the feedback loop into the product development lifecycle. So it is like your compass. We’re using AI evals as a compass to guide product development and also product iteration. And also, many times we need evals to function as the pass or fail gate in release decisions. Whether this product

Privacy Theater Is Not Privacy Engineering: What It Actually Takes to Ship Safe AI Apr 15, 2026 3991 Katharine Jarmul, Privacy in ML/AI Expert & Author of Practical Data Privacy, joins Hugo to unpack why most AI privacy advice is theater, and what technical privacy actually looks like when you’re shipping LLMs, agents, and multimodal systems into the real world. In this episode, we dig into how to build defensible systems in an era of AI agents and multimodal models: why system prompts (and your

LLM Architecture in 2026: What You Need to Know with Sebastian Raschka Apr 13, 2026 4682 If you take a model release as an anchor point, let’s say Nemotron 3 or Qwen 3.5, you can go in both directions: You can either plug them into an agent and play around with that, or you can look, okay, what does the model look like under the hood? What are the ingredients? What type of attention mechanism do they use? What are currently research techniques that could make that even better in the n

Episode 72: Why Agents Solve the Wrong Problem (and What Data Scientists Do Instead) Mar 20, 2026 5619 I often see what I would consider to be b******t evals, especially in data, like write this dumb SQL. Almost every one of these dumb SQL questions that I’ve seen for benchmarks are just so either obviously easy or overwhelmingly adversarial. They just, they don’t feel valuable as a data scientist, it’s something that you probably would never ask a real data scientist to do. So I went out my way to

Episode 71: Durable Agents - How to Build AI Systems That Survive a Crash with Samuel Colvin Feb 18, 2026 3087 Our thesis is that AI is still just engineering… those people who tell us for fun and profit, that somehow AI is so, so profound, so new, so different from anything that’s gone before that it somehow eclipses the need for good engineering practice are wrong. We need that good engineering practice still, and for the most part, most things are not new. But there are some things that have become more

Episode 70: 1,400 Production AI Deployments Feb 12, 2026 4192 There’s a company who spent almost $50,000 because an agent went into an infinite loop and they forgot about it for a month. It had no failures and I guess no one was monitoring these costs. It’s nice that people do write about that in the database as well. After it happened, they said: watch out for infinite loops. Watch out for cascading tool failures. Watch out for silent failures where the agen

Episode 69: Python is Dead. Long Live Python! With the Creators of pandas & Parquet Feb 3, 2026 3327 It’s the agent writing the code. And it’s the development loop of writing the code, building testing, write the code, build test and iterating. And so I do think we’ll see for many types of software, a shift away from Python towards other programming languages. I think Go is probably the best language for those like other types of software projects. And like I said, I haven’t written a line of G

Episode 68: A Builder’s Guide to Agentic Search & Retrieval with Doug Turnbull & John Berryman Jan 23, 2026 5322 The best way to build a horrible search product? Don’t ever measure anything against what a user wants. Search veterans Doug Turnbull (Led Search at Reddit + Shopify; Wrote Relevant Search + AI Powered Search) and John Berryman (Early Engineer on GitHub Copilot; Author of Relevant Search + Prompt Engineering for LLMs), join Hugo to talk about how to build Agentic Search Applications. We Discuss: * Th

Episode 67: Saving Hundreds of Hours of Dev Time with AI Agents That Learn Jan 14, 2026 4702 This is continual learning, right? Everyone has been talking about continual learning as the next challenge in AI. Actually, it’s solved. Just tell it to keep some notes somewhere. Sure, it’s not, it’s not machine learning, but in some ways it is because when it will load this text file again, it will influence what it does … And it works so well: it’s easy to understand. It’s easy to inspect, it

Episode 66: The Agent Paradox - Why Moderna's Most Productive AI Systems Aren't Agents Jan 8, 2026 2578 Surprise. We don’t have agents. I actually went in and did an audit of all the LLM applications that we’ve developed internally. And if you were to take Anthropic’s definition of workflow versus agent, we don’t have agents. I would not classify any of our applications as agents. Eric Ma, who leads Research Data Science in the Data Science and AI group at Moderna, joins Hugo on moving past the hyp

Episode 65: The Rise of Agentic Search Dec 19, 2025 3113 We’re really moving from a world where humans are authoring search queries and humans are executing those queries and humans are digesting the results to a world where AI is doing that for us. Jeff Huber, CEO and co-founder of Chroma, joins Hugo to talk about how agentic search and retrieval are changing the very nature of search and software for builders and users alike. We Discuss: * “Context engin

Episode 64: Data Science Meets Agentic AI with Michael Kennedy (Talk Python) Dec 3, 2025 3776 We have been sold a story of complexity. Michael Kennedy (Talk Python) argues we can escape this by relentlessly focusing on the problem at hand, reducing costs by orders of magnitude in software, data, and AI. In this episode, Michael joins Hugo to dig into the practical side of running Python systems at scale. They connect these ideas to the data science workflow, exploring which software enginee

Episode 63: Why Gemini 3 Will Change How You Build AI Agents with Ravin Kumar (Google DeepMind) Nov 22, 2025 3613 Gemini 3 is a few days old and the massive leap in performance and model reasoning has big implications for builders: as models begin to self-heal, builders are literally tearing out the functionality they built just months ago... ripping out the defensive coding and reshipping their agent harnesses entirely. Ravin Kumar (Google DeepMind) joins Hugo to break down exactly why the rapid evolution of

Episode 62: Practical AI at Work: How Execs and Developers Can Actually Use LLMs Oct 31, 2025 3544 Many leaders are trapped between chasing ambitious, ill-defined AI projects and the paralysis of not knowing where to start. Dr. Randall Olson argues that the real opportunity isn't in moonshots, but in the "trillions of dollars of business value" available right now. As co-founder of Wyrd Studios, he bridges the gap between data science, AI engineering, and executive strategy to deliver a practic

Episode 61: The AI Agent Reliability Cliff: What Happens When Tools Fail in Production Oct 16, 2025 1684 Most AI teams find their multi-agent systems devolving into chaos, but ML Engineer Alex Strick van Linschoten argues they are ignoring the production reality. In this episode, he draws on insights from the LLM Ops Database (750+ real-world deployments then; now nearly 1,000!) to systematically measure and engineer constraint, turning unreliable prototypes into robust, enterprise-ready AI. Drawing f

Episode 60: 10 Things I Hate About AI Evals with Hamel Husain Sep 30, 2025 4396 Most AI teams find "evals" frustrating, but ML Engineer Hamel Husain argues they’re just using the wrong playbook. In this episode, he lays out a data-centric approach to systematically measure and improve AI, turning unreliable prototypes into robust, production-ready systems. Drawing from his experience getting countless teams unstuck, Hamel explains why the solution requires a "revenge of the da

Episode 59: Patterns and Anti-Patterns For Building with AI Sep 23, 2025 2857 John Berryman (Arcturus Labs; early GitHub Copilot engineer; co-author of Relevant Search and Prompt Engineering for LLMs) has spent years figuring out what makes AI applications actually work in production. In this episode, he shares the “seven deadly sins” of LLM development — and the practical fixes that keep projects from stalling. From context management to retrieval debugging, John explains

Episode 58: Building GenAI Systems That Make Business Decisions with Thomas Wiecki (PyMC Labs) Sep 9, 2025 3645 While most conversations about generative AI focus on chatbots, Thomas Wiecki (PyMC Labs, PyMC) has been building systems that help companies make actual business decisions. In this episode, he shares how Bayesian modeling and synthetic consumers can be combined with LLMs to simulate customer reactions, guide marketing spend, and support strategy. Drawing from his work with Colgate and others, Th

Episode 57: AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank) Aug 29, 2025 2488 While many people talk about “agents,” Shreya Shankar (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply. Drawing from work on projects ranging from databases of police misconduct reports to large-scale customer transcripts, Shreya explains the frameworks,

Episode 56: DeepMind Just Dropped Gemma 270M... And Here’s Why It Matters Aug 14, 2025 2741 While much of the AI world chases ever-larger models, Ravin Kumar (Google DeepMind) and his team build across the size spectrum, from billions of parameters down to this week’s release: Gemma 270M, the smallest member yet of the Gemma 3 open-weight family. At just 270 million parameters, a quarter the size of Gemma 1B, it’s designed for speed, efficiency, and fine-tuning. We explore what makes 2

Episode 55: From Frittatas to Production LLMs: Breakfast at SciPy Aug 12, 2025 2289 Traditional software expects 100% passing tests. In LLM-powered systems, that’s not just unrealistic — it’s a feature, not a bug. Eric Ma leads research data science in Moderna’s data science and AI group, and over breakfast at SciPy we explored why AI products break the old rules, what skills different personas bring (and miss), and how to keep systems alive after the launch hype fades. You’ll h

Episode 54: Scaling AI: From Colab to Clusters — A Practitioner’s Guide to Distributed Training and Inference Jul 18, 2025 2478 Colab is cozy. But production won’t fit on a single GPU. Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software. We talk through: • From Colab to clusters: why scaling isn

Episode 53: Human-Seeded Evals & Self-Tuning Agents: Samuel Colvin on Shipping Reliable LLMs Jul 8, 2025 2690 Demos are easy; durability is hard. Samuel Colvin has spent a decade building guardrails in Python (first with Pydantic, now with Logfire), and he’s convinced most LLM failures have nothing to do with the model itself. They appear where the data is fuzzy, the prompts drift, or no one bothered to measure real-world behavior. Samuel joins me to show how a sprinkle of engineering discipline keeps th

Episode 52: Why Most LLM Products Break at Retrieval (And How to Fix Them) Jul 2, 2025 1718 Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries? In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy demos and ship LLM features that actually help people

Episode 51: Why We Built an MCP Server and What Broke First Jun 26, 2025 2862 What does it take to actually ship LLM-powered features, and what breaks when you connect them to real production data? In this episode, we hear from Philip Carter — then a Principal PM at Honeycomb and now a Product Management Director at Salesforce. In early 2023, he helped build one of the first LLM-powered SaaS features to ship to real users. More recently, he and his team built a production-re

Episode 50: A Field Guide to Rapidly Improving AI Products -- With Hamel Husain Jun 17, 2025 1662 If we want AI systems that actually work, we need to get much better at evaluating them, not just building more pipelines, agents, and frameworks. In this episode, Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how teams can improve AI products by focusing on error analysis, data inspection, and systematic iteration. The conversation is based on Hamel’s blog post A Field Guide t

Episode 49: Why Data and AI Still Break at Scale (and What to Do About It) Jun 5, 2025 4906 If we want AI systems that actually work in production, we need better infrastructure—not just better models. In this episode, Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI pipelines still break down at scale, and how we can fix the fundamentals: reproducibility, composability, and reliable execution. They discuss: 🔁 Why reactive execution matters—a

Episode 48: How to Benchmark AGI with Greg Kamradt (ARC-AGI) May 23, 2025 3866 If we want to make progress toward AGI, we need a clear definition of intelligence—and a way to measure it. In this episode, Hugo talks with Greg Kamradt, President of the ARC Prize Foundation, about ARC-AGI: a benchmark built on Francois Chollet’s definition of intelligence as “the efficiency at which you learn new things.” Unlike most evals that focus on memorization or task completion, ARC is de

Episode 47: The Great Pacific Garbage Patch of Code Slop with Joe Reis Apr 7, 2025 4753 What if the cost of writing code dropped to zero — but the cost of understanding it skyrocketed? In this episode, Hugo sits down with Joe Reis to unpack how AI tooling is reshaping the software development lifecycle — from experimentation and prototyping to deployment, maintainability, and everything in between. Joe is the co-author of Fundamentals of Data Engineering and a longtime voice on the sys

Episode 46: Software Composition Is the New Vibe Coding Apr 3, 2025 4137 What if building software felt more like composing than coding? In this episode, Hugo and Greg explore how LLMs are reshaping the way we think about software development—from deterministic programming to a more flexible, prompt-driven, and collaborative style of building. It’s not just hype or grift—it’s a real shift in how we express intent, reason about systems, and collaborate across roles. Hugo

Episode 45: Your AI application is broken. Here’s what to do about it. Feb 20, 2025 4650 Too many teams are building AI applications without truly understanding why their models fail. Instead of jumping straight to LLM evaluations, dashboards, or vibe checks, how do you actually fix a broken AI app? In this episode, Hugo speaks with Hamel Husain, longtime ML engineer, open-source contributor, and consultant, about why debugging generative AI systems starts with looking at your data.

Episode 44: The Future of AI Coding Assistants: Who’s Really in Control? Feb 4, 2025 5652 AI coding assistants are reshaping how developers write, debug, and maintain code—but who’s really in control? In this episode, Hugo speaks with Tyler Dunn, CEO and co-founder of Continue, an open-source AI-powered code assistant that gives developers more customization and flexibility in their workflows. In this episode, we dive into: - The trade-offs between proprietary vs. open-source AI coding a

Episode 43: Tales from 400+ LLM Deployments: Building Reliable AI Agents in Production Jan 16, 2025 3663 Hugo speaks with Alex Strick van Linschoten, Machine Learning Engineer at ZenML and creator of a comprehensive LLMOps database documenting over 400 deployments. Alex's extensive research into real-world LLM implementations gives him unique insight into what actually works—and what doesn't—when deploying AI agents in production. In this episode, we dive into: - The current state of AI agents in produ

Episode 42: Learning, Teaching, and Building in the Age of AI Jan 4, 2025 4804 In this episode of Vanishing Gradients, the tables turn as Hugo sits down with Alex Andorra, host of Learning Bayesian Statistics. Hugo shares his journey from mathematics to AI, reflecting on how Bayesian inference shapes his approach to data science, teaching, and building AI-powered applications. They dive into the realities of deploying LLM applications, overcoming “proof-of-concept purgatory,”

Episode 41: Beyond Prompt Engineering: Can AI Learn to Set Its Own Goals? Dec 30, 2024 2632 Hugo Bowne-Anderson hosts a panel discussion from the MLOps World and Generative AI Summit in Austin, exploring the long-term growth of AI by distinguishing real problem-solving from trend-based solutions. If you're navigating the evolving landscape of generative AI, productionizing models, or questioning the hype, this episode dives into the tough questions shaping the field. The panel features:

Episode 40: What Every LLM Developer Needs to Know About GPUs Dec 24, 2024 6215 Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you. Charles and Hugo dive into the practical side of GPUs—from running inference on large models, to fine-tuning and even t

Episode 39: From Models to Products: Bridging Research and Practice in Generative AI at Google Labs Nov 25, 2024 6209 Hugo speaks with Ravin Kumar, Senior Research Data Scientist at Google Labs. Ravin’s career has taken him from building rockets at SpaceX to driving data science and technology at Sweetgreen, and now to advancing generative AI research and applications at Google Labs and DeepMind. His multidisciplinary experience gives him a rare perspective on building AI systems that combine technical rigor with

Episode 38: The Art of Freelance AI Consulting and Products: Data, Dollars, and Deliverables Nov 4, 2024 5027 Hugo speaks with Jason Liu, an independent AI consultant with experience at Meta and Stitch Fix. At Stitch Fix, Jason developed impactful AI systems, like a $50 million product similarity search and the widely adopted Flight recommendation framework. Now, he helps startups and enterprises design and deploy production-level AI applications, with a focus on retrieval-augmented generation (RAG) and s

Episode 37: Prompt Engineering, Security in Generative AI, and the Future of AI Research Part 2 Oct 8, 2024 3036 Hugo speaks with three leading figures from the world of AI research: Sander Schulhoff, a recent University of Maryland graduate and lead contributor to the Learn Prompting initiative; Philip Resnik, professor at the University of Maryland, known for his pioneering work in computational linguistics; and Dennis Peskoff, a researcher from Princeton specializing in prompt engineering and its applicat

Episode 36: Prompt Engineering, Security in Generative AI, and the Future of AI Research Part 1 Sep 30, 2024 3827 Hugo speaks with three leading figures from the world of AI research: Sander Schulhoff, a recent University of Maryland graduate and lead contributor to the Learn Prompting initiative; Philip Resnik, professor at the University of Maryland, known for his pioneering work in computational linguistics; and Dennis Peskoff, a researcher from Princeton specializing in prompt engineering and its applicat

Episode 35: Open Science at NASA -- Measuring Impact and the Future of AI Sep 19, 2024 3494 Hugo speaks with Dr. Chelle Gentemann, Open Science Program Scientist for NASA’s Office of the Chief Science Data Officer, about NASA’s ambitious efforts to integrate AI across the research lifecycle. In this episode, we’ll dive deeper into how AI is transforming NASA’s approach to science, making data more accessible and advancing open science practices. We explore: Measuring the Impact of Open Sci

Episode 34: The AI Revolution Will Not Be Monopolized Aug 22, 2024 6172 Hugo speaks with Ines Montani and Matthew Honnibal, the creators of spaCy and founders of Explosion AI. Collectively, they've had a huge impact on the fields of industrial natural language processing (NLP), ML, and AI through their widely-used open-source library spaCy and their innovative annotation tool Prodigy. These tools have become essential for many data scientists and NLP practitioners in

Episode 33: What We Learned Teaching LLMs to 1,000s of Data Scientists Aug 12, 2024 5111 Hugo speaks with Dan Becker and Hamel Husain, two veterans in the world of data science, machine learning, and AI education. Collectively, they’ve worked at Google, DataRobot, Airbnb, GitHub (where Hamel built out the precursor to Copilot and more) and they both currently work as independent LLM and Generative AI consultants. Dan and Hamel recently taught a course on fine-tuning large language mode

Episode 32: Building Reliable and Robust ML/AI Pipelines Jul 27, 2024 4511 Hugo speaks with Shreya Shankar, a researcher at UC Berkeley focusing on data management systems with a human-centered approach. Shreya's work is at the cutting edge of human-computer interaction (HCI) and AI, particularly in the realm of large language models (LLMs). Her impressive background includes being the first ML engineer at Viaduct, doing research engineering at Google Brain, and software

Episode 31: Rethinking Data Science, Machine Learning, and AI Jul 9, 2024 5765 Hugo speaks with Vincent Warmerdam, a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning. In this episode, they dive deep into rethinking established methods in data science, machine learning, and AI. We explore V

Episode 30: Lessons from a Year of Building with LLMs (Part 2) Jun 26, 2024 4524 Hugo speaks about Lessons Learned from a Year of Building with LLMs with Eugene Yan from Amazon, Bryan Bischof from Hex, Charles Frye from Modal, Hamel Husain from Parlance Labs, and Shreya Shankar from UC Berkeley. These five guests, along with Jason Liu who couldn't join us, have spent the past year building real-world applications with Large Language Models (LLMs). They've distilled their experi

Episode 29: Lessons from a Year of Building with LLMs (Part 1) Jun 26, 2024 5422 Hugo speaks about Lessons Learned from a Year of Building with LLMs with Eugene Yan from Amazon, Bryan Bischof from Hex, Charles Frye from Modal, Hamel Husain from Parlance Labs, and Shreya Shankar from UC Berkeley. These five guests, along with Jason Liu who couldn't join us, have spent the past year building real-world applications with Large Language Models (LLMs). They've distilled their experi

Episode 28: Beyond Supervised Learning: The Rise of In-Context Learning with LLMs Jun 9, 2024 3938 Hugo speaks with Alan Nichol, co-founder and CTO of Rasa, where they build software to enable developers to create enterprise-grade conversational AI and chatbot systems across industries like telcos, healthcare, fintech, and government. What's super cool is that Alan and the Rasa team have been doing this type of thing for over a decade, giving them a wealth of wisdom on how to effectively incorpo

Episode 27: How to Build Terrible AI Systems May 31, 2024 5545 Hugo speaks with Jason Liu, an independent consultant who uses his expertise in recommendation systems to help fast-growing startups build out their RAG applications. He was previously at Meta and Stitch Fix and is also the creator of Instructor and Flight, as well as an ML and data science educator. They talk about how Jason approaches consulting companies across many industries, including construction and sale

Episode 26: Developing and Training LLMs From Scratch May 15, 2024 6695 Hugo speaks with Sebastian Raschka, a machine learning & AI researcher, programmer, and author. As Staff Research Engineer at Lightning AI, he focuses on the intersection of AI research, software development, and large language models (LLMs). How do you build LLMs? How can you use them, both in prototype and production settings? What are the building blocks you need to know about? In this episode,

Episode 25: Fully Reproducible ML & AI Workflows Mar 18, 2024 4839 Hugo speaks with Omoju Miller, a machine learning guru and founder and CEO of Fimio, where she is building 21st century dev tooling. In the past, she was Technical Advisor to the CEO at GitHub, spent time co-leading non-profit investment in Computer Science Education for Google, and served as a volunteer advisor to the Obama administration’s White House Presidential Innovation Fellows. We need open

Episode 24: LLM and GenAI Accessibility Feb 27, 2024 5404 Hugo speaks with Johno Whitaker, a Data Scientist/AI Researcher doing R&D with answer.ai. His current focus is on generative AI, flitting between different modalities. He also likes teaching and making courses, having worked with both Hugging Face and fast.ai in these capacities. Johno recently reminded Hugo how hard everything was 10 years ago: “Want to install TensorFlow? Good luck. Need data? Pe

Episode 23: Statistical and Algorithmic Thinking in the AI Age Dec 20, 2023 4837 Hugo speaks with Allen Downey, a curriculum designer at Brilliant, Professor Emeritus at Olin College, and the author of Think Python, Think Bayes, Think Stats, and other computer science and data science books. In 2019-20 he was a Visiting Professor at Harvard University. He previously taught at Wellesley College and Colby College and was a Visiting Scientist at Google. He is also the author of t

Episode 22: LLMs, OpenAI, and the Existential Crisis for Machine Learning Engineering Nov 27, 2023 4808 Jeremy Howard (Fast.ai), Shreya Shankar (UC Berkeley), and Hamel Husain (Parlance Labs) join Hugo Bowne-Anderson to talk about how LLMs and OpenAI are changing the worlds of data science, machine learning, and machine learning engineering. Jeremy Howard (https://twitter.com/jeremyphoward) is co-founder of fast.ai, an ex-Chief Scientist at Kaggle, and creator of the ULMFiT approach on which all mode

Episode 21: Deploying LLMs in Production: Lessons Learned Nov 14, 2023 4101 Hugo speaks with Hamel Husain, a machine learning engineer who loves building machine learning infrastructure and tools 👷. Hamel leads and contributes to many popular open-source machine learning projects. He also has extensive experience (20+ years) as a machine learning engineer across various industries, including large tech companies like Airbnb and GitHub. At GitHub, he led CodeSearchNet (ht

Episode 20: Data Science: Past, Present, and Future Oct 5, 2023 5200 Hugo speaks with Chris Wiggins (Columbia, NYTimes) and Matthew Jones (Princeton) about their recent book How Data Happened, and the Columbia course it expands upon, data: past, present, and future. Chris is an associate professor of applied mathematics at Columbia University and the New York Times’ chief data scientist, and Matthew is a professor of history at Princeton University and former Gugge

Episode 19: Privacy and Security in Data Science and Machine Learning Aug 14, 2023 5011 Hugo speaks with Katharine Jarmul about privacy and security in data science and machine learning. Katharine is a Principal Data Scientist at Thoughtworks Germany focusing on privacy, ethics, and security for data science workflows. Previously, she has held numerous roles at large companies and startups in the US and Germany, implementing data processing and machine learning systems with a focus o

Episode 18: Research Data Science in Biotech May 24, 2023 4373 Hugo speaks with Eric Ma about Research Data Science in Biotech. Eric leads the Research team in the Data Science and Artificial Intelligence group at Moderna Therapeutics. Prior to that, he was part of a special ops data science team at the Novartis Institutes for Biomedical Research's Informatics department. In this episode, Hugo and Eric talk about what tools and techniques they use for drug di

Episode 17: End-to-End Data Science Feb 17, 2023 4575 Hugo speaks with Tanya Cashorali, a data scientist and consultant who helps businesses get the most out of data, about what end-to-end data science looks like across many industries, such as retail, defense, biotech, and sports, including scoping out projects, figuring out the correct questions to ask, how projects can change, delivering on the promise, the importance of rapid prototyping, what it mean

Episode 16: Data Science and Decision Making Under Uncertainty Dec 14, 2022 4996 Hugo speaks with JD Long, agricultural economist, quant, and stochastic modeler, about decision making under uncertainty and how we can use our knowledge of risk, uncertainty, probabilistic thinking, causal inference, and more to help us use data science and machine learning to make better decisions in an uncertain world. This is part 2 of a two part conversation in which we delve into decision ma

Episode 15: Uncertainty, Risk, and Simulation in Data Science Dec 7, 2022 3211 Hugo speaks with JD Long, agricultural economist, quant, and stochastic modeler, about decision making under uncertainty and how we can use our knowledge of risk, uncertainty, probabilistic thinking, causal inference, and more to help us use data science and machine learning to make better decisions in an uncertain world. This is part 1 of a two part conversation. In this, part 1, we discuss risk,

Episode 14: Decision Science, MLOps, and Machine Learning Everywhere Nov 20, 2022 4151 Hugo Bowne-Anderson, host of Vanishing Gradients, reads 3 audio essays about decision science, MLOps, and what happens when machine learning models are everywhere. Links: Our upcoming Vanishing Gradients live recording of Data Science and Decision Making Under Uncertainty with Hugo and JD Long! (https://www.eventbrite.com/e/data-science-and-decision-making-under-uncertainty-tickets-467379864757?aff=v

Episode 13: The Data Science Skills Gap, Economics, and Public Health Oct 11, 2022 4962 Hugo speaks with Norma Padron about data science education and continuous learning for people working in healthcare, broadly construed, along with how we can think about the democratization of data science skills more generally. Norma is CEO of EmpiricaLab, where her team’s mission is to bridge work and training and empower healthcare teams to focus on what they care about the most: patient care. In

Episode 12: Data Science for Social Media: Twitter and Reddit Sep 30, 2022 5578 Hugo speaks with Katie Bauer (https://twitter.com/imightbemary) about her time working in data science at both Twitter and Reddit. At the time of recording, Katie was a data science manager at Twitter and prior to that, a founding member of the data team at Reddit. She’s now Head of Data Science at Gloss Genius, so congrats on the new job, Katie! In this conversation, we dive into what type of challe

Episode 11: Data Science: The Great Stagnation Sep 16, 2022 6353 Hugo speaks with Mark Saroufim, an Applied AI Engineer at Meta who works on PyTorch, where his team’s main focus is making it as easy as possible for people to deploy PyTorch in production outside Meta. Mark first came on our radar with an essay he wrote called Machine Learning: the Great Stagnation (https://marksaroufim.substack.com/p/machine-learning-the-great-stagnation), which was concerned wit

Episode 10: Investing in Machine Learning Aug 18, 2022 5206 Hugo speaks with Sarah Catanzaro, General Partner at Amplify Partners, about investing in data science and machine learning tooling and where we see progress happening in the space. Sarah invests in the tools that we both wish we had earlier in our careers: tools that enable data scientists and machine learners to collect, store, manage, analyze, and model data more effectively. As you’ll discover,

Episode 9: AutoML, Literate Programming, and Data Tooling Cargo Cults Jul 19, 2022 6117 Hugo speaks with Hamel Husain, Head of Data Science at Outerbounds, with extensive experience in data science consulting, at DataRobot, Airbnb, and GitHub. In this conversation, they talk about Hamel's early days in data science, consulting for a wide array of companies, such as Crocs, restaurants, and casinos in Las Vegas, diving into what data science even looked like in 2005 and how you could t

Episode 8: The Open Source Cybernetic Revolution May 16, 2022 3967 Hugo speaks with Peter Wang, CEO of Anaconda, about what the value proposition of data science actually is, data not as the new oil, but rather data as toxic, nuclear sludge, the fact that data isn’t real (and what we really have are frozen models), and the future promise of data science. They also dive into an experimental conversation around open source software development as a model for the de

Episode 7: The Evolution of Python for Data Science May 1, 2022 3760 Hugo speaks with Peter Wang, CEO of Anaconda, about how Python became so big in data science, machine learning, and AI. They jump into many of the technical and sociological beginnings of Python being used for data science, a history of PyData, the conda distribution, and NumFOCUS. They also talk about the emergence of online collaborative environments, particularly with respect to open source, and

Episode 6: Bullshit Jobs in Data Science (and what to do about them) Apr 4, 2022 5233 Hugo speaks with Jacqueline Nolis, Chief Product Officer at Saturn Cloud (formerly Head of Data Science), about all types of failure modes in data science, ML, and AI, and they delve into bullshit jobs in data science (yes, that’s a technical term, as you’ll find out): they discuss the elements that are bullshit, the elements that aren’t, and how to increase the ratio of the latter to the former. T

Episode 5: Executive Data Science Mar 23, 2022 6509 Hugo speaks with Jim Savage, the Director of Data Science at Schmidt Futures, about the need for data science in executive training and decision, what data scientists can learn from economists, the perils of "data for good", and why you should always be integrating your loss function over your posterior. Jim and Hugo talk about what data science is and isn’t capable of, what can actually deliver va

Episode 4: Machine Learning at T-Mobile Mar 9, 2022 6265 Hugo speaks with Heather Nolis, Principal Machine Learning Engineer at T-Mobile, about what data science, machine learning, and AI look like at T-Mobile, along with Heather’s path from a software development intern there to principal ML engineer running a team of 15. They talk about: how to build a DS culture from scratch and what executive-level support looks like, as well as how to demonstrate ma

Episode 3: Language Tech For All Mar 1, 2022 5566 Rachael Tatman is a senior developer advocate for Rasa, where she’s helping developers build and deploy ML chatbots using their open source framework. Rachael has a PhD in Linguistics from the University of Washington where her research was on computational sociolinguistics, or how our social identity affects the way we use language in computational contexts. Previously she was a data scientist at

Episode 2: Making Data Science Uncool Again Feb 20, 2022 6360 Jeremy Howard is a data scientist, researcher, developer, educator, and entrepreneur. Jeremy is a founding researcher at fast.ai, a research institute dedicated to making deep learning more accessible. He is also a Distinguished Research Scientist at the University of San Francisco, the chair of WAMRI, and is Chief Scientist at platform.ai. In this conversation, we’ll be talking about the history o

Episode 1: Introducing Vanishing Gradients Feb 16, 2022 330 In this brief introduction, Hugo introduces the rationale behind launching a new data science podcast and gets excited about his upcoming guests: Jeremy Howard, Rachael Tatman, and Heather Nolis! Original music, bleeps, and blops by local Sydney legend PlaneFace (https://planeface.bandcamp.com/album/fishing-from-an-asteroid)! Get full access to Vanishing Gradients at hugobowne.substack.com/subscrib
