
Your customers know about outages before your team does.

I help engineering leaders stop firefighting and start designing for reliability, through reliability targets, operational ownership, and measurable outcomes rather than more tools.

Start with an Explorer
1. Explorer: Find out where you stand
2. Assessment: Build the plan
3. Implementation: Make it real

You know reliability is the problem. You just don't know where to start.

New leadership. Years of accumulated issues nobody owns. Monitoring that exists but doesn't drive decisions. Your teams are aware something needs to change — but fixing everything at once isn't an option. You need a structured way to prioritise.

01

See what you actually have

Most organisations overestimate their monitoring and underestimate their gaps. I map what's real — what's measured, what's missing, and who owns what — so you stop guessing.

02

Turn ambition into targets

Leadership wants “better reliability.” But what does that mean for each team? I help you define measurable targets that connect engineering work to what your customers actually experience.

03

Make it part of how teams work

Reliability that depends on heroics doesn't scale. Error budgets, ownership, and reviews become part of your operating rhythm — not an extra burden on top of delivery.
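To make "error budgets" concrete, here is a minimal sketch of the arithmetic behind one, assuming a request-based availability target. The function name, parameters, and returned keys are illustrative assumptions, not part of any specific tool or of the engagement deliverables.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarise an error budget for one measurement window.

    slo_target is the fraction of requests that must succeed, e.g. 0.999.
    All names here are hypothetical and chosen for illustration only.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The budget is the number of failures the target tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests
    # Fraction of the budget still unspent; negative means the budget is blown.
    remaining_fraction = 1 - failed_requests / allowed_failures

    return {
        "allowed_failures": allowed_failures,
        "failed_requests": failed_requests,
        "remaining_fraction": remaining_fraction,
    }


budget = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
```

With a 99.9% target over one million requests, the budget is roughly 1,000 failed requests, so 250 failures leave about three quarters of it unspent. A team can use that remaining fraction as the input to a release or freeze decision in its regular review.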

04

Transfer capability, then leave

Everything I build is designed to be owned by your team. Documented processes, trained engineers, proven playbooks. My goal is to make myself unnecessary.

SLO Maturity Model — where are you today?

Most organisations start at Level 1-2. The biggest ROI is in the jump to Level 3. Not everyone needs Level 5.

1. Ad-hoc: You learn about problems from customers
2. Defined: Reliability targets exist on paper
3. Operational: Targets are monitored, alerts fire, teams respond
4. Standardised: Error budgets drive decisions, cross-team consistency
5. Strategic: Reliability informs investment and roadmap

Self-Assessment

Most teams score Level 1-2 and don't realise it until a customer finds the problem first. Leave your email — I'll send you the self-assessment so you can see where you stand and what closing the gap looks like.


Three tiers. Start small. Scale what works.

Every engagement starts with Explorer. If the findings don't warrant going further, you keep the assessment. No lock-in, no assumptions.

Start Here

Explorer

2–5 days · On-site or remote
From €6,000

Find out where you stand. A structured reliability audit across your product domains — giving you the data to decide what to prioritise and where to invest.

  • Reliability scorecard across your product domains
  • Gap analysis: monitoring coverage, alerting gaps, ownership blind spots
  • Recommended pilot domain with target dimensions and measurement approach
  • Scoped recommendation for next steps — if findings warrant it
Build the Plan

Assessment

Strategy → Presentation → Workshops → Pilot
Scoped based on Explorer findings

From “we know the gaps” to “we have a plan and a working pilot.” I build the reliability strategy, present it to your leadership for alignment, then run workshops where your teams define their own targets — and implement monitoring for the pilot domain.

  • Reliability strategy tailored to your organisation and presented to leadership
  • Facilitated workshops: teams define and own their reliability targets — not imposed top-down
  • Pilot implementation: monitoring, alerting, and dashboards for one domain
  • Scalable playbook for rolling out to remaining domains
Make It Real

Implementation

Ongoing · Flexible commitment
Scoped to your needs

Hands-on support for teams rolling out reliability improvements across domains. I work alongside your engineers — building, coaching, and reviewing until the capability is theirs.

  • Multi-domain rollout of reliability targets, alerting, and dashboards
  • Team coaching, incident reviews, and operational shadowing
  • Quarterly maturity reviews and course correction
  • Infrastructure, observability, and cost governance improvements as needed
“8 production alarms in ALARM state. Zero subscribers. The team learned about outages from customers.”

Trading platform, AWS EKS — Level 1 to Level 3 in one quarter. First issue caught by monitoring before a customer reported it.
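
The pattern in the quote, alarms firing with nobody listening, is easy to check for once alarm state and notification targets are listed side by side. A minimal sketch, assuming the alarm inventory has already been exported into simple records; the `Alarm` type, its fields, and the sample alarm names are hypothetical and do not mirror any cloud provider's API.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    name: str
    state: str        # e.g. "OK", "ALARM", or "INSUFFICIENT_DATA"
    subscribers: int  # hypothetical field: how many people or channels get notified


def unheard_alarms(alarms: list[Alarm]) -> list[str]:
    """Return names of alarms that are firing but would notify nobody."""
    return [a.name for a in alarms if a.state == "ALARM" and a.subscribers == 0]


# Illustrative inventory, not real data.
inventory = [
    Alarm("api-5xx-rate", "ALARM", 0),
    Alarm("db-cpu-high", "ALARM", 2),
    Alarm("queue-depth", "OK", 0),
]
silent = unheard_alarms(inventory)
```

Here `silent` contains only `api-5xx-rate`: it is firing and nobody is subscribed. The alarm with zero subscribers that is currently `OK` is not returned, although it is a gap waiting to happen and would be worth listing separately in a real audit.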

The advisor behind the practice.

Mateusz Szymczyk
Founder, Reliability Lead

Reliability problems are rarely technical problems. They're organisational problems wearing technical costumes. Teams don't need another monitoring tool. They need clear ownership, measurable targets, and the operational discipline to act on what the data tells them.

12+ years in DevOps and SRE — from enterprise DevOps frameworks to global Kubernetes platforms and observability architectures. I work alongside teams until the practices stick and the capability is theirs to own. Currently implementing reliability frameworks for trading platforms and data platforms on AWS.

Let's start with a conversation
about where your team stands today.

No pitch — just a useful discussion about your reliability challenges and whether working together makes sense.

Start a conversation