At conferences we tend to focus on success stories: how we accomplished something, or projects that were great successes. Although that is inspirational, we forget about the failures we encounter along the way, which can be powerful sources of learning and growth.
This is the story of a failure that could have been avoided. The journey was plagued with monsters such as synchronous and asynchronous communication between services, unexpected production problems, and more. We will reflect on how we tackled these challenges, and the lessons learned that shaped us for future projects.
Join me on this trip about embracing failures as an integral part of success.
I'm a backend developer with several years of experience in e-commerce, marketplaces, and logistics. I really care about developing maintainable code by applying best practices and testing, but I also like to be involved in the product and to understand why we invest time in a new feature and how we are impacting users.
Apart from that, I enjoy sharing knowledge as a speaker, talking about testing, event storming, or microservices, and preparing courses about Kotlin in Spanish.
k6 can run distributed tests and can be controlled remotely through a CLI and API.
As the Technical Evangelist at Varnish Software, Thijs Feryn focuses on web performance, scalability, and content delivery. He demonstrates this through presentations, videos, books, blog posts, and other media.
Thijs is a published author and wrote Getting Started with Varnish Cache and Varnish 6 by Example. As a public speaker, he has a track record of over 270 presentations in 20 different countries, where he is often praised for his energetic and engaging presentation style.
One of the most important cloud computing benefits is to focus on efficiency in every aspect of our infrastructure, and builders can accelerate the sustainability of their workloads through optimization and informed architecture patterns.
In this session we’ll show you practical examples that combine different techniques for building sustainably on Amazon EKS, making use of Graviton, EC2 Spot, and effective autoscaling with Karpenter, among other topics.
Our aim is to provide direction on reducing the energy and carbon impact of cloud architectures running on Amazon EKS, which at the same time will help you optimize for cost.
Federica is a Sr Containers Specialist Solutions Architect at Amazon Web Services. She is passionate about networking and containers.
Outside of the office, she enjoys reading, drawing, and spending time with her friends, preferably in restaurants trying out new dishes from different cuisines.
Christian helps customers build fault-tolerant, elastic, reliable, and cost-optimized workloads in AWS. He’s passionate about Kubernetes, programming, and building tech communities.
Christian has spent 15 years working in different companies to build modern solutions using the cloud.
DevOps has been growing in popularity in recent years, particularly in (software) companies that want to reduce their lead time to be measured in days/weeks instead of months/years.
But what about the secrets? The current trend increases the number of secrets required to run our services. This places a new level of maintenance on our security teams. How can we share and manage the secrets (certificates, passwords, SSH keys, API keys) for our services in this dynamic scenario, where instances are started automatically and where there are multiple instances of the same service for scalability reasons? Are you keeping up?
How are these secrets managed in GitOps?
Come to this session to learn how to keep secrets secret for the whole lifecycle of the application, from the early beginning when you start developing it until the application is up and running in the Kubernetes cluster.
Alex Soto is a Director of Developer Experience at Red Hat. He is passionate about the Java world and software automation, and he believes in the open-source software model. Alex is the co-author of Testing Java Microservices, Quarkus Cookbook, Kubernetes Secrets Management, and GitOps Cookbook, and a contributor to several open-source projects.
A Java Champion since 2017, he is also an international speaker, radio collaborator at Onda Cero, and teacher at Salle URL University.
In this talk, I'll share the lessons from a month-long journey into a production-scale incident that degraded the quality of service in a large-scale multi-tenant, multi-region, multi-cluster Kubernetes platform. We'll start with a seemingly innocent error and delve deeper into a series of unexpected issues affecting application performance.
We'll explore the importance of Service Level Indicators (SLIs) and Service Level Objectives (SLOs) and their role in incident management.
Then, I'll discuss how to leverage data, employ observability tools, and iterate on feedback loops to navigate complex issues.
This talk will highlight the significance of structured incident management and a data-driven approach to ensuring system reliability.
I am a senior SRE from the "Runtime" team running Kubernetes at scale for Adevinta. I am originally from Bangkok, Thailand, and moved to Barcelona just over a year ago.
I started my career as a software engineer, but now I consider myself an SRE practitioner.
I love sharing my knowledge as a way to reflect and to share with others the good and the bad I have learned.
In this talk we show how we designed and implemented automation processes to manage one of the most important programs at Allianz (Global Platform) automatically.
The platform is deployed across multiple countries and clusters and is managed by different teams with full responsibility and autonomy. We also cover how we are pursuing our cloudification strategy with our cloud providers in order to run the platform fully in the cloud.
We talk about APIs, Kubernetes, AWS, CI/CD, tooling, and processes, as well as how enablement teams support business teams in delivering business requirements quickly.
Oliver Tena works as an IT Architect at Allianz Technology SE.
He used to work as a regional Architect and Automation expert for the IberoLatam region and is currently part of the Global team working for many countries around the world.
He is currently focused on automation.
Pere Alcoberro works as a Cloud and DevOps Architect at Allianz Technology SE, involved in development for the Global Platform.
He has strong knowledge of Cloud Infrastructure Architectures and Infrastructure as Code.
Join us as we walk you through our application of GitOps in both our Infrastructure as Code (IaC) and application delivery processes. We'll share how we've integrated ArgoCD and GitHub Actions as core components in our GitOps journey.
Furthermore, we'll give you a glimpse into our microservices and multi-platform environment in Kubernetes, emphasizing security standards, such as cross-account KMS for encryption both at rest and in transit, and multi-account and multi-region operations.
By the end of this talk, we hope to give you some insights and practical tips on KMS, secret management, IRSA, mTLS, static analysis checks (tflint, trivy, etc.), Helm charts, Karpenter, and monorepos that you can apply in your own tech journey.
Roc is a Principal Engineer at UserZoom, now part of UserTesting, where he plays a crucial role in the Fulfillment team. He initially joined as a FullStack engineer and quickly made significant contributions to both the team and platform architecture. He was promoted to the position of Senior Manager Software Engineer. In this role, he continued to excel by contributing across various platform areas, including development, CI/CD, and infrastructure.
In recognition of his valuable contributions and expertise, Roc was further promoted to the role of Principal Engineer, where he took charge of critical architectural decisions, spearheaded the creation of a new platform, and played a vital role in migrating the platform to a more robust CI/CD process and Trunk Based Development. Additionally, Roc served as the Principal DevOps Engineer for nearly a year, during which he successfully revamped the platform infrastructure using Infrastructure as Code (IaC) with Terraform and automation.
Currently, Roc is heavily focused on facilitating the transition of the company and platforms from UserZoom to UserTesting. He actively leads significant changes in the company's IaC cross-team practices. Additionally, he plays a crucial role in driving major evolutions within the Fulfillment team's platform to meet the requirements of UserTesting.
Outside of work, Roc enjoys going on hikes and traveling. However, his true passion is theater, and he regularly attends performances at renowned Barcelona theaters such as TNC or Sala Beckett.
Sergi works as a Senior Cloud Engineer and DevOps at UserZoom, now part of UserTesting. He is passionate about anything related to Kubernetes, IaC, automation, and cloud.
At UserTesting he is focused on helping teams with Terraform and Kubernetes, as well as on the adoption of a GitOps methodology.
Operators are extensions to Kubernetes that simplify application installation and management by leveraging Kubernetes Custom Resources and Controllers. The Kubernetes Operator pattern tries to emulate the role of a human operator, who uses their deep knowledge of the application to install, operate, and debug it. Kubernetes Operators seek to automate these tasks and facilitate the whole application life cycle.
Last year at DevOps Barcelona, Aurélie and Horacio showed how to build a Terraform provider, using an external API and lots of Gophers as an example. This year they would like to take a similar approach to show you, step by step, how to create a Kubernetes Operator. Again taking the Gopher REST API as a base, they will explain the basics of creating an operator, give some pointers on designing a simple yet efficient operator architecture that manages not only Kubernetes objects but also resources outside Kubernetes, and show you the code and the operator in action.
Will they succeed in this new mission?
I'm a DevRel at OVHcloud in Toulouse, France. I have been working as a Developer and Ops for over 16 years. I'm a cloud enthusiast and an advocate of DevOps/Cloud/Golang best practices.
I'm one of the leaders of Duchess France, an association that promotes women developers and women working in IT.
Conferences and meetups organizer since 2016. Technical writer (dev.to/aurelievache), a book author & reviewer, a sketchnoter and a speaker at international conferences.
I created a new visual way for people to learn and understand Cloud technologies: "Understanding Kubernetes/Istio/Docker in a visual way" in sketchnotes and videos.
Spaniard lost in Brittany, coder, speaker, dreamer and all-around geek.
Horacio loves web development in general and everything around Web Components and web standards in particular, but he also loves to discuss Kubernetes, AI, and cloud in general. He is a Google Developer Expert (GDE) in Web Technologies and Flutter.
In today’s fast-paced and complex technological landscape, observability has become a critical aspect of ensuring the reliability and performance of software systems.
However, traditional observability tools and techniques can only go so far in providing insights into the behavior of these systems. Enter ChatGPT, a conversational AI tool that can help bridge the gap between observability and human understanding.
In this session, we will explore ChatGPT and how it could theoretically be used to enhance an enterprise’s observability practices. We will briefly look at how ChatGPT can be trained to understand and interpret system logs, metrics, and other observability data, and explore whether it can provide useful real-time insights and recommendations to engineers and operators.
By attending this session, you will gain a deeper understanding of how conversational AI can be leveraged to improve observability in software systems, and how ChatGPT can help you avoid potential issues before they become critical problems.
Cloud strategist with a background in architecting and building software & services, currently working @Dynatrace, where I help to shape cloud strategy. Opinions are my own!
Experienced Database Administrators (DBAs) bring invaluable expertise and historical context to DevOps endeavors.
This session explores effective strategies for leveraging their knowledge, fostering collaboration, and maximizing their contributions to drive DevOps success.
Discover the importance of experienced DBAs, their role in bridging the gap between traditional practices and modern DevOps methodologies, and practical approaches for inclusive environments that value their insights.
Join us as we explore the immense value experienced DBAs bring to DevOps and unlock their potential to drive success.
Augusto Bott, an experienced IT professional with a strong database background and over two decades of expertise, is a member of the Cloud and Governance team at Wallbox Chargers.
With comprehensive knowledge in data management, software design, and systems administration, Augusto excels at tackling challenges in data-driven DevOps environments.
Join us for valuable discussions and practical strategies, leveraging his expertise to drive excellence in the convergence of databases and DevOps.
Platform teams' mission is to lay the necessary foundation for product teams to increase business value and delight customers. They do this by building platform products that ensure and accelerate high-quality delivery and operations.
Very often, however, platform teams are so overwhelmed with operational workload that they cannot deliver the product roadmap they promised. Recovering production, troubleshooting applications, assisting product/dev teams, managing vendors... all of these consume nearly their entire capacity.
In this talk, we will review together some Lean techniques that can help us reduce our operational load so that we can focus on building the best platform products.
I am immensely passionate about technology, especially platforms. I started my career working in a platform team building the core of CASTOR at CERN, the largest file system at the time. Ever since, my career has nearly always been linked to the platform world.
I now work as a Technical Product Manager at Thoughtworks, helping teams bring Product Thinking to the platform space.
Once upon a time, in the ever-expanding realm of cloud computing, a group of brave engineers embarked on a quest to protect their organization's cloud infrastructure from the perilous threat of misconfigurations. This is the story of their adventure at Datadog, where they unravel the mysteries of detecting and fixing cloud misconfigurations at scale.
In this talk, we invite you to join us on this remarkable journey. We will dive into the world of security, system and software engineers running services in the cloud, as we unveil practical insights and effective strategies for addressing misconfigurations.
Our story commences by shedding light on the significance of cloud misconfigurations and their potential ramifications. Through real-world anecdotes and cautionary tales, we will highlight the profound impact that misconfigurations can have on security, performance, cost and compliance in cloud environments.
As our intrepid adventurers venture deeper into their mission, they uncover a plethora of tools and techniques to detect cloud misconfigurations at Datadog scale. We will delve into the realm of automated monitoring and auditing, exploring how Datadog's security team harnesses the power of intelligent checks and comprehensive scanning to swiftly identify vulnerabilities and misconfigurations.
But detection alone is not enough to protect against the perils of misconfigurations. Our heroes press forward to discover the art of remediation. Through their trials and triumphs, we will unravel the secrets behind prioritizing and rectifying misconfigurations effectively. Topics covered include workflows, leveraging infrastructure-as-code principles for automated remediation, and integrating continuous integration and continuous delivery (CI/CD) pipelines to enforce best practices.
Throughout this enchanting narrative, we will weave together practical examples and relatable experiences that resonate with both beginners and those seeking to enhance their cloud security knowledge. By the end of our tale, attendees will be armed with actionable insights to implement robust detection and remediation strategies in their own cloud environments.
In a world where cloud services are under constant scrutiny, this talk serves as a guiding light, empowering attendees to fortify their cloud infrastructure against the ever-evolving threat landscape. So, come, embark on this extraordinary adventure and discover the story of detecting and fixing cloud misconfigurations at Datadog scale.
Adam is a Senior Security Engineer in Datadog’s internal Platform Security team, where his mission is to improve the security posture of the company's cloud infrastructure. On a day-to-day basis he deals with automating detection and remediation of cloud misconfigurations at scale. Prior to Datadog, Adam worked for companies such as Glovo and Schibsted (now known as Adevinta). He is passionate about new technologies, security, and programming.
When not in the office, he can be seen wandering around discovering new places, doing sports or reading a good book.
One of the emerging standards for cloud (native) security is OPA, the Open Policy Agent, an open source standard under the Cloud Native Computing Foundation.
This talk gives an overview of what OPA can do for you and how you can write declarative policies to check your APIs, Kubernetes, or applications. It's structured into three segments:
1. Why do you want to add a continuous runtime checker to your APIs or applications and what gaps is it covering?
2. How do you write declarative policies with OPA?
3. What does it look like in hands-on examples against APIs, Kubernetes, and applications?
Philipp lives to demo interesting technology.
Having worked as a web, infrastructure, and database engineer for over ten years, Philipp is now a developer advocate and DevRel team lead at Elastic — the company behind the Elastic Stack consisting of Elasticsearch, Kibana, Beats, and Logstash.
When launching new products or dealing with an outage, customer-facing teams see terms thrown around internally that customers might not identify with immediately.
Good communication practices, including good translation of engineering terminology into customer-friendly communication, can save time and increase customer satisfaction when moments matter. Customer Support's unique ability to understand both what is happening internally and what customers need to hear positions your team as the experts to steer communications during major incidents.
Come to this talk to learn how to partner with your customer-facing teams during major incidents.
Kat is a developer advocate at PagerDuty. She enjoys talking and thinking about incident response, customer support, and automating the creation of a delightful end-user and employee experience. She previously ran Global Customer Support at PagerDuty, and as a result, it’s hard to get her to stop talking about the potential career paths for tech support professionals.
In her spare time, Kat is a mediocre plant parent and a slightly less mediocre pet parent to two rabbits, Lupin and Ginny.
AI is an amazingly powerful tool that has the potential to revolutionize DevSecOps workflows. Unfortunately, it also has the potential to take over much of the important work that we currently perform. Our DevOps and SecOps teams have started experimenting with AI in several areas:
> Creating an intentionally vulnerable Python program for a Capture-the-Flag class
> Using our existing FAQs to create a Tier 1 Support chatbot for our software engineers
> Using our incident response logs to create a predictive model to avoid future incidents
> Using our AWS usage metrics to automatically generate auto-scaling policies
> Using AI log analytics to alert on security threats
Our talk will demonstrate these AI interactions and will discuss the tools and techniques that we used to create these AI models. The objective is to provide the audience with a great start so they can explore AI use cases in their own organizations.
We will conclude the talk with some examples of innovative use cases that we discovered while researching how AI is being used in other organizations.
Alex has over 15 years of experience building software and tools.
He has been at Amplify Education since 2015, and in his current role as Director of DevOps since 2021.
His team oversees the long list of pipelines, tools, and infrastructure that are required to take code from developer laptops and run it in production environments.
Johnathan has over a decade of experience in quality assurance, automation, software engineering, and DevOps, taking on roles as both a leader and an individual contributor.
In 2021, he moved into his current role as the Director of Security Operations at Amplify Education.
In this role, Johnathan leads the SecOps team at Amplify, overseeing the introduction of new security technologies and conducting continuous security testing of Amplify's infrastructure.
I started my career at Bell Laboratories and had the opportunity to be mentored by several amazing engineers. From there, I’ve been on some great teams in the fields of biomedical engineering, financial services, and eCommerce start-ups.
I joined Amplify as the Director of DevOps in 2015. Amplify is dedicated to collaborating with educators to create learning experiences that are rigorous and riveting for all students. Our team supports the product owners and software engineers that develop those applications.
I compare our work in DevSecOps to the air traffic control system. Our job is to keep the planes flying and help them land safely.
Authentication is scary, difficult, dangerous, and… essential. Most apps need some form of it, often as a prerequisite for almost all end-user requests.
Join me for a fresh and practical perspective on authentication architecture for the modern, open web, using modern standards.
We'll set up an authentication layer that's fast, distributed, secure, isolated from the rest of your system, and minimally annoying to integrate and maintain.
Dora is a Developer Relations Engineer at Fastly. She cut her teeth on building a global news website, and cultivated her compassion by leading data protection and site reliability engineering teams.
She lives in London, has a pet fox, and dreams of helping to build a faster, more secure, more reliable web, a better web, for everyone.