We Make Resilience Work.

Most organisations are in a resilience crisis they haven't named yet. Similar incidents recur every quarter, despite the postmortems, the dashboards and, increasingly, the AI. We help you see what your practices stopped showing you.

Book a Diagnostic Call or Read the Book

Trusted by Amazon, BMW Group, Autodesk, Steadybit, Mesh-AI

Is this your reality?

You'll recognise this.

01 The same types of incidents keep happening despite the postmortems and the action items

02 You invested in chaos engineering but can't tell if it's actually making you more resilient

03 Your teams do incident reviews, but the learning never spreads beyond the room

04 You're spending more time fighting fires than improving your systems

05 You know something's broken organisationally, but can't pinpoint what

06 Leadership is asking "why does this keep happening?" and you don't have a good answer

07 You're deploying AI into operations but can't tell if it's making better decisions or just faster ones

08 You're betting on AI to improve operations but nobody can explain what happens when it's wrong

These aren't isolated problems. They share a common root.

What this actually is

Most of our engagements are undeclared crisis management projects.

The organisation has the practices — incident reviews, chaos engineering, DR plans — but the feedback loops that should turn those practices into learning are broken. Signals get absorbed without action. Prevention work loses to the backlog every quarter. The people closest to the problems can see what's wrong, but the organisational structure doesn't carry that information to where decisions get made.

How we work

Start with diagnosis, then build from there.

Most organisations start with the Resilience Assessment. Once we've identified what's actually broken, we can help you strengthen specific capabilities or partner for long-term transformation. But diagnosis comes first — you need to know what to fix.

The problem you're facing: Your organisation keeps having the same types of incidents despite doing retros, chaos engineering, and architecture reviews. Something organisational is broken — but what?

We diagnose the feedback loop failures and organisational patterns that prevent your teams from learning and adapting. We embed with your teams to see how work actually happens versus how it's described. We participate in your incident reviews, observe GameDays, sit in on chaos experiments, and join operational readiness reviews. We watch how teams interact, what incentives and pressures they face, and where the gaps appear between policy and practice.

Through this combination of observation and structured interviews, we identify exactly what's blocking resilience and give you a clear roadmap to fix it.

Embedded observation of your actual resilience practices (incident reviews, GameDays, ORRs, chaos experiments)
Stakeholder interviews across engineering, ops, and leadership
Deep analysis of your feedback loops and incident patterns
Written report with prioritised, actionable recommendations
2-hour executive readout session with your leadership team

6-8 weeks from kickoff to delivery

Once we've identified what's broken, we help you build specific capabilities to address the gaps. These are focused 2-4 month engagements that build one capability deeply.

Recent Strengthen engagements have included designing chaos engineering programmes from scratch, rebuilding incident analysis processes to focus on organisational learning rather than action-item compliance, and establishing Operational Readiness Reviews that teams actually trust.

Chaos Engineering Programmes: Design and implement systematic resilience testing
Operational Readiness Reviews: Validate that systems are actually ready for production
Incident Analysis Process: Improve how your teams learn from failures

2-4 months, after assessment

For organisations ready for comprehensive change, we offer ongoing strategic partnership to embed resilience into your culture and operations. We become a long-term partner embedded in your leadership rhythm, joining monthly strategy sessions, running quarterly health assessments, and serving as a sounding board when new challenges emerge.

Monthly strategic sessions with leadership
Quarterly organisational health assessments
Ongoing advisory as you implement changes
Access for architecture reviews and escalations

6-12+ months of collaboration

Our approach

No prescriptive checklists. We study your organisation's culture first

Founder-led. The person you talk to is the person doing the work

Diagnosis before prescription. We don't sell solutions to problems we haven't seen

What you need to hear, not what you want to hear

Ready to start? Book a Diagnostic Call

What people say

The work speaks for itself.

On credibility

"More often than not, 'consultants' can talk the talk, but cannot walk the walk. If you want to improve the resilience of your systems and operations, Adrian has proven that he can deliver. He is an educator at heart, with in-depth knowledge based on real experience."

Werner Vogels

VP & CTO, Amazon

On the approach

"Adrian doesn't come to you with a prescriptive checklist. Instead, he studies your organisation's culture carefully to understand deep underlying contributing factors that impact resilience. Be prepared for what you need to hear, not what you want to hear. But fear not - Adrian understands human psychology and delivers his insights in a respectful and constructive manner that drives effective and sustainable change. He is an accelerator for organisational learning and improvement."

Jason Niemczyk

Senior Principal SRE, Autodesk

On domain expertise

"Adrian has now become the go-to independent expert in this space. Most companies don't realize that a good resilience program will speed up their time to market for everything else, and Adrian can help you get there."

Adrian Cockcroft

Tech Advisor, Former VP Architecture, AWS

On enterprise impact

"Collaborating with Adrian has been transformative for our team's and BMW Group's enterprise approach to resiliency and chaos engineering. As a fellow techy at BMW, I had the opportunity to work closely with Adrian on several key projects. His deep understanding of resiliency best practices and chaos engineering was instrumental in scaling our chaos experimentation initiatives."

Hrvoje Lukavski

Lead Product Manager, BMW Group

On socio-technical depth

"Adrian has an exceptional ability to understand the deep interplay between people, teams, and the complex technical problems they are trying to solve. He navigates highly complex socio-technical systems with ease and helps organisations focus on what truly matters. Adrian has a rare talent for enabling teams to work more closely together and build systems that are not only reliable, but resilient by design."

Benjamin Wilms

CEO, Steadybit

On practical impact

"Adrian brought a blend of deep expertise and practical insight to our team. He didn't just teach resilience patterns, he challenged our engineers to think differently about operational excellence and how to design systems with failure in mind. He was engaging, thought-provoking, and left the team with actionable ways to improve how we build and operate software."

Steve Bryen

CTO, Mesh-AI

Adrian Hornsby

Founder / Resilium Labs

LinkedIn Newsletter

Adrian Hornsby has spent nearly 25 years building and operating software systems, from research and telecommunications at Nokia through multiple startups to nearly a decade at AWS, where he progressed from Solutions Architect to Principal Engineer on the AWS Fault Injection Service team. He authored much of AWS's resilience and chaos engineering guidance, trained field communities across the organisation, and worked with internal teams including Prime Video, Amazon Search, and Lambda. He also holds a patent for fault-injection impact zone identification.

He is the author of Why We Still Suck at Resilience and writes the Resilience Bites newsletter. His work draws on resilience engineering research to explain why organisations keep having similar incidents despite doing all the right things, and his framework goes beyond tooling and process into the organisational patterns, feedback loops, and tensions that determine whether resilience efforts actually work.

Today, through Resilium Labs, Adrian works with Fortune 500 companies and growth-stage organisations across the world to diagnose broken resilience programmes and help teams build the capabilities to fix them. Most of these engagements are undeclared crisis management projects — organisations that aren't in flames, but can't see what's wrong. He advises VC portfolio companies on engineering practices, serves on advisory boards, and speaks regularly at conferences across the globe.

Based in Finland. Working globally.

Insights

Writing on resilience, organisations, and AI.

The Invoice That Arrives After the Incident Is Over
Why learning always loses when the other side of the comparison is zero

May 2026 26

The Severity Argument You Keep Having
The argument no rubric will ever settle, and what would

May 2026 25

When No One in the Room Has Carried the Pager
Why I built the Resilience Companion, and why “bootstrap” is the right word for it.

May 2026


Tools

Free tools. No sign-up.

Built to make visible the things most organisations leave invisible.

Cost of Downtime (calculator): What an outage actually costs, including the months that follow.
Where Does Learning Stop? (self-assessment): Six questions to surface the patterns that keep incidents recurring.
Resilience Companion (open source): AI facilitator for operational readiness reviews and incident analysis.
Changebook (open source): Production change tracking from plan to verification. A pilot's checklist for ops.

Start a conversation

Ready to find out what's actually broken?

The best first step is a conversation to understand your current challenges and resilience goals. We'll help you figure out which step in the journey makes sense for your organisation.

Resilium Labs Oy
+358 (0)504361615
adhorn@resiliumlabs.com
