Kubernetes Podcast from Google

Abdel Sghiouar, Kaslin Fields

A weekly podcast focused on what's happening in the Kubernetes community, hosted by Abdel Sghiouar and Kaslin Fields. We cover Kubernetes, cloud-native applications, and other developments in the ecosystem. Reach Abdel and Kaslin on Twitter at @KubernetesPod or by email at kubernetespodcast@google.com.


Ray & KubeRay, with Richard Liaw and Kai-Hsun Chen
In this episode, guest host and AI correspondent Mofi Rahman interviews Richard Liaw and Kai-Hsun Chen from Anyscale about Ray and KubeRay. Ray is an open-source unified compute framework that makes it easy to scale AI and Python workloads, while KubeRay integrates Ray’s capabilities into Kubernetes clusters.   Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod   News of the week CNCF Blog - LitmusChaos audit complete! Kubernetes Podcast from Google episode 234 - LitmusChaos, with Karthik Satchitanand Google Cloud Blog - Run your AI inference applications on Cloud Run with NVIDIA GPUs Diginomica article - KubeCon China - at 33-and-a-third, Linux is a long player. So, why does Linus Torvalds hate AI? CNCF-Hosted Co-Located Event Schedule for KubeCon NA 2024  Google Kubernetes Engine Release Notes - August 20, 2024 (1.31 available in Rapid Channel) Kubernetes Podcast from Google - Kubernetes v1.31: "Elli", with Angelos Kolaitis Red Hat Press Release - Red Hat OpenStack Services on OpenShift is Now Generally Available Red Hat Enables OpenStack to Run Natively on OpenShift Platform Broadcom Revamps Tanzu to Simplify Cloud-Native App Development and Deployment Tanzu Platform 10 Offers Cloud Foundry Users Deep Visibility and Productivity Enhancements VMware Explore Conference Website CNCF Blog - Announcing 500 Kubestronauts CNCF - Kubestronaut FAQ Dapr Day 2024 Virtual Event Website Links from the interview Kai-Hsun Chen on LinkedIn Richard Liaw on LinkedIn Ray from the RISE Lab at UC Berkeley Ray: A Distributed System for AI by Robert Nishihara and Philipp Moritz - Jan 9, 2018 KubeRay Docs KubeRay on GitHub PyTorch Apache Airflow Apache Spark Kubeflow Apache Submarine (retired) Jupyter Notebooks VS Code Examples of schedulers for Batch/AI workloads in Kubernetes Kueue Volcano Apache Yunikorn Examples of observability tools for Batch/AI workloads in Kubernetes 
Prometheus Grafana Fluent Bit Examples of load balancers Nginx Istio Ray Data: Scalable Datasets for ML Dask Python - Parallel Python Ray Serve: Scalable and Programmable Serving HPA - Horizontal Pod Autoscaling in Kubernetes Karpenter - “Just-in-time nodes for any Kubernetes cluster” Lazy Computation Graphs with the Ray DAG API Types of hardware accelerators Google Cloud Tensor Processing Units (TPUs) AMD Instinct AMD Radeon AWS Trainium AWS Inferentia Pandas NumPy KubeCon EU 2024 - Accelerators(FPGA/GPU) Chaining to Efficiently Handle Large AI/ML Workloads in K8s - Sampath Priyankara, Nippon Telegraph and Telephone Corporation & Masataka Sonoda, Fujitsu Limited NVIDIA Megatron Links from the post-interview chat DRA - Dynamic Resource Allocation in Kubernetes Different ways of Running RayJob on Kubernetes Ray framework diagram in the docs
Observability & Engineering Management, with Charity Majors
Charity Majors is the co-founder and CTO of honeycomb.io. She pioneered the concept of modern Observability, drawing on her years of experience building and managing massive distributed systems at Parse (acquired by Facebook), then subsequently at Facebook, and at  Linden Lab building Second Life. She is the co-author of Observability Engineering and Database Reliability Engineering (O'Reilly). She loves free speech, free software and single malt scotch.    Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod   News of the week CNCF Blog: Vitess 20 is now Generally Available Vitess Blog: Announcing Vitess 20 Anthropic Blog: Claude 3.5 Sonnet KubeCon India 2024 CFP Apps on Azure Blog: Announcing support of OCI v1.1 specification in Azure Container Registry VMware Tanzu Blog: Announcing VMware Tanzu Greenplum 7.2: Powering Your Business with Enhanced Performance and Advanced Capabilities VMware Tanzu Blog: Join the public beta for GenAI on Tanzu Platform today! CNCF: Adobe End User Journey Report Links from the interview Honeycomb.io O’Reilly Book: Observability Engineering O’Reilly Book: Database Reliability Engineering Charity’s blog site: charity.wtf Charity Blog: Questionable Advice: “My boss says we don’t need any engineering managers. Is he right?” Daniel H. Pink book: “Drive: The Surprising Truth About What Motivates Us” In which, “He examines the three elements of true motivation—autonomy, mastery, and purpose-and offers smart and surprising techniques for putting these into action in a unique book that will change how we think and transform how we live.” Charity blog on Stack Overflow: “Generative AI is not going to build your engineering team for you” In which she talks about how the tech industry is an apprenticeship industry. 
Charity Majors in the Google Cloud Next 2024 Developer Keynote honeycomb.io blog: “How Time Series Databases Work—And Where They Don't” by Alex Vondrak honeycomb.io blog: “Why Observability Requires a Distributed Column Store” by Alex Vondrak Links from the post-interview chat CNCF Kubernetes Community Days (KCDs) CNCF Kubernetes Community Days (KCDs) on GitHub Julia Evans Blog Wizard Zines by Julia Evans “Help! I Have a Manager!” zine by Julia Evans Aja Hammerly aka “thagomizer” blog “The Toaster Parable” “Manager Toolkit: Manage The Person In Front Of You” “Manager Toolkit: Useful Manager Phrases for 1:1s” “Manager Toolkit: You Talk, I Type”
AI/ML in Kubernetes, with Maciej Szulik, Clayton Coleman, and Dawn Chen
In this episode, we talk to three active leaders who have been around since the very beginning of Kubernetes. We explore how Kubernetes has changed since its inception, with a particular focus on current efforts in open source Kubernetes to support AI/ML-style workloads. Maciej Szulik currently holds a seat on the Kubernetes Steering Committee. He also leads the Special Interest Groups responsible for kubectl, workload and batch controllers. Maciej has been contributing to Kubernetes since the early days, jumping from one area to another where help was needed. He authored the first version of audit and helped shape its current one, and has touched multiple other places in apimachinery. He was also responsible for designing and implementing the Job and CronJob controllers. In kubectl he was responsible for the plugin mechanism and several major refactors to simplify the code. In May 2024 he joined the ranks of Production Readiness Review (PRR) approvers, helping ensure high production standards for future Kubernetes releases. Clayton Coleman is a long-time Kubernetes contributor, having helped launch Kubernetes as open source, served on the bootstrap steering committee, and worked across a number of SIGs to make Kubernetes a reliable and powerful foundation for workloads. At Red Hat he led OpenShift’s pivot onto Kubernetes and its growth across on-premise, edge, and into cloud. At Google he is now focused on enabling the next generation of key workloads, especially AI/ML in Kubernetes and on GKE. Dawn Chen has been a Principal Software Engineer at Google Cloud since May 2007. Dawn has worked on the open source project Kubernetes since before the project was founded. She has been one of the tech leads in both Kubernetes and GKE, and founded SIG Node from scratch. She also led the Anthos platform team for the last 4 years, and mainly focuses on the core infrastructure. 
Prior to Kubernetes, she was one of the tech leads for Borg, Google's internal container infrastructure, for about 7 years. Outside of work, she is a wife, a mother of a 16-year-old boy and a good friend. She enjoys reading, cooking, hiking and traveling. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod News of the week Kubernetes 1.31 Code Freeze is on July 9th Links from the interview Kubernetes Working Group Batch Kubernetes Working Group Serving Blog: Introducing Indexed Jobs (2021) Docs: Kubernetes Jobs KEP: Elastic Indexed Jobs Docs: Kubernetes CronJobs KubeCon EU 2021: The Long, Winding and Bumpy Road to CronJob’s GA - Maciej Szulik, Red Hat & Alay Patel, Red Hat KubeCon EU 2018: Writing Kube Controllers for Everyone - Maciej Szulik, Red Hat (Beginner Skill Level) Kubernetes Working Group Device Management Kubernetes Enhancement Proposal process README DockerCon 2014: The announcement of Kubernetes at DockerCon Blog: AI & Kubernetes (by Kaslin) Kueue - “Kueue is a cloud-native job queueing system for batch, HPC, AI/ML, and similar applications in a Kubernetes cluster.” Whitepaper: Large-scale cluster management at Google with Borg Email: “Containers: Introduction” - An email introducing the concept of Linux containers to the Linux community Links from the post-interview chat Blog - “Scaling Kubernetes to 7,500 nodes” - OpenAI Ray on Kubernetes
Leading Kubernetes into its Second Decade
We talk with Nikhita Raghunath, Nabarun Pal, and Paco Xu. Nikhita, Nabarun, and Paco have each held various leadership positions related to the Kubernetes project. They talk about their journeys, the various leadership roles they’ve been in, and offer advice for new contributors and those who want to move into leadership in the project. Nikhita is a Staff Software Engineer at Broadcom. She is currently a member of the CNCF Technical Oversight Committee (TOC) overseeing all technical matters of the CNCF. In the past, she was a member of the Kubernetes Steering Committee and a technical lead for SIG Contributor Experience, and she has also won the CNCF Top Committer Award. Currently, she is also a co-chair of the KubeCon+CloudNativeCon conference. Nabarun is a Staff Software Engineer at Broadcom, a maintainer of the Kubernetes project, a member of the Kubernetes Steering Committee and a chair of Kubernetes SIG Contributor Experience. In the past, he was the release lead for Kubernetes 1.21 and has served on eight release teams. Nabarun also works actively with the Python community by organizing PyCon India and has been recognized in media publications for his work. Paco is an open source team lead at DaoCloud. He started working on containers and Docker in 2016 and began participating in the Kubernetes community in 2018. He is a current member of the Kubernetes Steering Committee and works mainly on kubeadm and sig-node. He is a co-chair of KubeCon+CloudNativeCon China 2024. Do you have something cool to share? Some questions? 
Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod   News of the week Blog: 10 Years of Kubernetes CNCF-Hosted Co-Located Events Overview CFP for CNCF-hosted Co-located Events Kubernetes Community Days Links from the interviews CNCF Technical Oversight Committee SIG ContribEx Google Summer of Code CNCF Top Committer Award 2021 - Nikhita Raghunath Blog Post: Google Summer of Code with Kubernetes by Nikhita Raghunath Kubernetes Docs: Extend the Kubernetes API with CustomResourceDefinitions SIG API Machinery SIG Testing SIG Release CNCF Chop Wood Carry Water Award 2018 - Nikhita Raghunath Kubernetes Steering Committee KubeCon India KubeCon NA Kubernetes 1.21: Power to the Community Pycon India Kubernetes Python Client on GitHub Kubernetes Contributor Summit 2019 YouTube Playlist Kubernetes Release Team KubeCon NA 2024 Scholarships (applications due by September 1, 2024) Kubeadm SIG Node KubeCon China 2024 Kubelet Kubernetes Production Readiness Review Process Kubernetes Release Team CI Signal Lead Runbook
A Decade of Kubernetes Contribution
This episode is the first in our four-part Kubernetes 10-Year Anniversary special! The focus of this episode is on Kubernetes maintainers who have been involved with the project since its early days, and who are still active today. Featuring guests: David Eads, Davanum Srinivas (Dims), and Federico Bongiovanni. David is a senior principal software engineer at Red Hat. He started contributing to Kubernetes before v1 and now serves as a sig-auth tech lead and sig-apimachinery tech lead and chair. Dims is a principal engineer at AWS and a long-term contributor to Kubernetes who has served on multiple committees for the project. Today Dims is on the Technical Oversight Committee (TOC). Federico Bongiovanni is an engineering manager at Google. He started using Kubernetes in the early days at a previous company, and became a contributor about 6 years ago when he joined Google. Today, he’s a co-chair of SIG-APIMachinery. Do you have something cool to share? Some questions? Let us know: - web: [kubernetespodcast.com](https://kubernetespodcast.com) - mail: [kubernetespodcast@google.com](mailto:kubernetespodcast@google.com) - twitter: [@kubernetespod](https://twitter.com/kubernetespod) News of the week https://istio.io/latest/news/releases/1.22.x/announcing-1.22/ https://kubernetes.io/blog/2024/05/09/gateway-api-v1-1/ https://traefik.io/blog/traefik-3-0-ga-has-landed-heres-how-to-migrate/ https://devblogs.microsoft.com/dotnet/dotnet-build-2024-announcements/ https://events.linuxfoundation.org/kuber10es-birthday-bash/ https://www.cncf.io/kubertenes/ Links from the interview Kubernetes SIG Auth Kubernetes SIG API Machinery Automagic kubectl config merging causes hair loss Safety or Usability: Why Not Both? 
Towards Referential Auth in K8s - Rob Scott, Google & Mo Khan, Microsoft OpenStack Kubernetes Cloud Provider OpenStack Red Hat OpenShift Kubernetes SIG Architecture Kubernetes Kubelet Blog: Completing the Largest Migration in Kubernetes History Dims’ PR removing over 1 million lines of Cloud Provider code from Kubernetes KubeCon EU 2024 talk: Kubernetes Is FINALLY Removing in-Tree Cloud Providers - Bridget Kromhout & Chris Privitere KEP-2395: Removing In-Tree Cloud Provider Code Blog from 2019 about the reasoning behind the removal of cloud provider code Blog about setting cloud provider code to disabled by default in v1.29 The March 2024 Spotlight blog on SIG Cloud Provider Links from the post-interview chat Kubernetes Maintainers Read Mean Comments - Tim Hockin, Google & Davanum Srinivas, Amazon Web Services “Working in Public: The Making and Maintenance of Open Source Software” by Nadia Eghbal Keynote: A Vision for Vision - Kubernetes in Its Second Decade - Tim Hockin SIG K8s Infrastructure
Postgres on Kubernetes, with Álvaro Hernández
Álvaro Hernández is the founder and CEO of OnGres, a company that provides, among other things, a distribution of Postgres that runs on Kubernetes called “StackGres”. Álvaro is also an AWS Data Hero and a passionate database and open source software developer. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod Note: This episode was edited on May 17th to remove a chatter segment from episode 219, which had been mistakenly edited into it. News of the week Kubernetes code cleanup KEP-2395: Removing In-Tree Cloud Provider Code - GitHub KEP Readme Remove gcp in-tree cloud provider and credential providers - GitHub PR Spotlight on SIG Cloud Provider - Blog The Future of Cloud Providers in Kubernetes - Blog Kubernetes 1.29: Cloud Provider Integrations Are Now Separate Components - Blog Google I/O KubeCon + CloudNativeCon Europe 2024 Report KuberTENes Birthday Bash The Kubernetes Community takes over kubernetesio on X WG-Serving on GitHub DoK Community Ambassador Applications Links from the interview Álvaro Hernández: LinkedIn Twitter/X OnGres PostgreSQL Stackgres.io StackGres GitHub Kubernetes pg_repack Data on Kubernetes (DoK) Community Data On Kubernetes 2022 Report Data on Kubernetes Whitepaper - Database Patterns - by CNCF TAG Storage Istio Apache Zookeeper Strimzi - CNCF Project for running Apache Kafka on Kubernetes Apache Kafka Postgres extensions The Kubernetes Operator Pattern Presentation about PostgreSQL Hooks from PostgreSQL wiki OCI - Open Container Initiative Why Postgres Extensions should be packaged and distributed as OCI images
API Machinery, Chaos and Dishwashers, with Lucas Käldström
Lucas Käldström is a CNCF Ambassador, Kubernetes contributor and expert. Lucas co-led SIG Cluster Lifecycle, ported Kubernetes to ARM and shepherded kubeadm from inception to GA. Today Lucas runs three meetup groups in Finland, studies at Aalto University, and, when time allows, contributes to cloud native software as a contractor. We chatted about Kubernetes API machinery, Chaos, Entropy, and Dishwashers. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod News of the week Weaveworks shut down their operations Weaveworks CEO Alexis Richardson post on LinkedIn kubetrain.io Bytedance KubeAdmiral on GitHub Bytedance KubeAdmiral Announcement on InfoQ Strimzi joins the CNCF Incubator Microsoft's new Cost Management tools for Azure Links from the interview Lucas Käldström LinkedIn Twitter/X Kubernetes as a dishwasher Understanding Kubernetes Through Real-World Phenomena and Analogies - Lucas Käldström Lucas's research thesis Paper - Large-scale cluster management at Google with Borg API Machinery Dr. Stefan Schimanski KCP - Kubernetes-Like Control Plane Kubernetes API Conventions SIG Architecture Ingress2gateway - Ingress to Gateway Migrator Promise Theory: Principles and Applications (Mark Burgess, Jan Bergstra) In Search of Certainty: The Science of Our Information Infrastructure (Mark Burgess) Sweden Finns Links from the post-interview chat Keynote: Reperforming a Nobel Prize Discovery on Kubernetes - Ricardo Rocha & Lukas Heinrich Why Service Is the Worst API in Kubernetes, & What We’re Doing About It - Tim Hockin Gateway API TCP Routes Community-Powered Kubernetes LTS: Ensuring Stability and Compatibility While Driving Innovation Jeremy Rickard https://github.com/yannh/kubeconform
Kubernetes stale reads, with Madhav Jivrajani
Madhav Jivrajani is an engineer at VMware, a tech lead in SIG Contributor Experience and a GitHub Admin for the Kubernetes project. He also contributes to the storage layer of Kubernetes, focusing on reliability and scalability. In this episode we talked with Madhav about a recent social media post describing a stale reads issue in Kubernetes, and what the community is doing about it. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod Chatter of the week Mofi Rahman co-hosts this episode with Kaslin Twitter/X LinkedIn Kubernetes Podcast episode 211 News of the week Google announced a new partnership with Hugging Face Red Hat's self-managed offering of Ansible Automation Platform on Microsoft Azure The schedule for KubeCon CloudNativeCon EU 2024 is out CNCF Ambassador applications are open The CNCF Hackathon at KubeCon CloudNativeCon EU 2024 CFP is open now The annual Cloud Native Computing Foundation report for 2023 CNCF's certification expiration period will change to 24 months starting April 1st, 2024. Sysdig 2024 Cloud Native Security and Usage Report Links from the interview Madhav Jivrajani Twitter/X LinkedIn Priyanka Saggu Interview Stale reads Twitter/X thread by Madhav "Kubernetes is vulnerable to stale reads, violating critical pod safety guarantees" - GitHub Issue tracking the stale reads CAP Theorem issue CMU Wasm Research Center "A CAP tradeoff in the wild" blog by Lindsey Kuper "Reasoning about modern datacenter infrastructures using partial histories" research paper The Kubernetes Storage Layer: Peeling the Onion Minus the Tears - Madhav Jivrajani, VMware KEP-3157: allow informers for getting a stream of data instead of chunking. KEP 2340: Consistent Reads from Cache Journey Through Time: Understanding Etcd Revisions and Resource Versions in Kubernetes - Priyanka Saggu, KubeCon NA 2023 Kubernetes API Resource Versions documentation
Cilium and eBPF, with Bill Mulligan
This week's guest is Bill Mulligan. Bill is Community Pollinator at Isovalent, working on Cilium and eBPF. We learned how to properly pronounce Isovalent and what it actually means. We also spoke in depth about eBPF, Cilium, network function in Kubernetes and more. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod News of the week The Kubernetes legacy Linux package repositories are going away in January 2024 Kubernetes 1.29 is now available on GKE in the Rapid Channel The VMware Tanzu Application Catalog is fully compliant with SLSA Level 3 AWS extended support for Kubernetes minor versions pricing update The Kubernetes Contributor Summit Paris CFP is Open, closes Feb 4th KubeCon and CloudNativeCon EU 2024 co-located events agenda is live The Cloud Native Glossary is now available in French Blixt, a new experimental LoadBalancer based on the Gateway API and eBPF Links from the interview Bill Mulligan: LinkedIn Twitter/X Covalent bonds on Wikipedia Isovalent Hybridization on Wikipedia Isovalent company site BPF - Berkeley Packet Filtering eBPF project site Fast by Friday: Why eBPF is Essential - Brendan Gregg GKE Dataplane V2 Cilium project site Hubble documentation Cilium Service Mesh Cilium annual report Cilium Certified Associate (CCA) CCA Study Guide from Isovalent on GitHub Istio Certified Associate (ICA) Certified Kubernetes Administrator (CKA) Certified Kubernetes Application Developer (CKAD) Kubernetes and Cloud Native Associate (KCNA) Resources to prepare for the CCA certification Isovalent library The World of Cilium Cisco acquired Isovalent Developing eBPF Apps in Java BGP in eBPF
NAIS, with Johnny Horvi and Frode Sundby
This week’s guests are Johnny Horvi and Frode Sundby from the platform team at NAV (the Norwegian Labour and Welfare Administration). We talked about NAIS, a Kubernetes-based, team-centric platform that aims to provide the tools needed to deploy and operate apps easily. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod News of the week Kubernetes 1.29 features: https://kubernetes.io/blog/2023/12/14/cloud-provider-integration-changes/ https://kubernetes.io/blog/2023/12/20/contextual-logging-in-kubernetes-1-29/ https://kubernetes.io/blog/2023/12/19/pod-ready-to-start-containers-condition-now-in-beta/ https://kubernetes.io/blog/2023/12/19/kubernetes-1-29-taint-eviction-controller/ https://kubernetes.io/blog/2023/12/18/read-write-once-pod-access-mode-ga/ https://kubernetes.io/blog/2023/12/18/kubernetes-1-29-feature-loadbalancer-ip-mode-alpha/ https://kubernetes.io/blog/2023/12/15/kubernetes-1-29-volume-attributes-class/ https://kubernetes.io/blog/2023/12/15/csi-node-expand-secret-support-ga/ Kubernetes 1.29 release lead Interview Cisco acquired Isovalent Cilium 2023 Annual report KubeCon and CloudNativeCon Paris 2024 Hackathon https://www.cncf.io/blog/2023/12/20/kubecon-cloudnativecon-europe-hackathon-challenges-brought-to-you-by-the-united-nations/ https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/ https://unite.un.org/ https://sdgs.un.org/goals OpenFeature incubated as a CNCF project Links from the interview Guests: Johnny Horvi Frode Sundby Nais Nais.io Twitter/X GitHub NAV JBoss IBM Websphere Apache Mesos Links from the post-interview chat Nais on GitHub