Service Delivery and Incident Response Lead
OpsWerks

About the job
We’re looking for a Service Delivery & Incident Response Lead who thrives at the intersection of people leadership, operational reliability, and continuous improvement. You’ll lead engineers supporting mission-critical cloud and infrastructure environments, ensuring stability, responsiveness, and operational excellence 24×7.
This role combines real-time incident command with team development, process optimization, and cross-functional collaboration to keep our systems and our team performing at their best.
Your Role
People & Team Leadership
- Lead, coach, and mentor IT engineers to build strong technical and leadership capabilities
- Set clear performance goals aligned with our Beliefs, Vision, Mission, Methods (BVMM)
- Conduct 1:1s, performance reviews, and career growth discussions
- Foster a culture of ownership, collaboration, and continuous learning
- Maintain balanced workloads, shift coverage, and clear succession plans to sustain healthy 24×7 operations
Service Operations & Reliability
- Oversee daily service health, capacity, and reliability across all supported environments
- Ensure compliance with operational KPIs through proactive planning and improvement
- Balance demand vs. capacity and manage shift coverage to prevent burnout
- Partner with engineering teams to maintain runbooks, knowledge bases, and escalation paths
- Drive automation and workflow optimization to reduce manual overhead
- Use data insights to guide decisions and improvements
Incident & Problem Management
- Lead end-to-end incident response, triage, communication, and resolution in real time
- Act as Incident Commander for high-impact events across a global environment
- Track and improve metrics such as MTTD, MTTM, and MTTR
- Champion blameless Post-Incident Reviews (PIRs) and translate learnings into long-term system and process improvements
Strategic & Cross-Functional Impact
- Represent the team in customer reviews, operational syncs, and briefings
- Collaborate with SREs, product owners, and partner engineers to align priorities and reliability goals
- Contribute to frameworks and governance initiatives
- Lead service onboarding/off-boarding and strengthen operational readiness checkpoints
- Identify and close systemic operational gaps through process and tool improvements
Your Qualifications
- Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related discipline
- 3+ years in Service Delivery, Incident Response, or Operations Leadership within enterprise-scale, 24×7 environments
- Proven experience managing technical teams, driving performance, and leading through critical situations
- Strong grounding in ITSM / ITIL principles (Incident & Problem Management)
- Familiarity with cloud, distributed systems, or enterprise infrastructure
- Skilled in monitoring, alerting, and ticketing tools (e.g., PagerDuty, Datadog, Grafana, Splunk, ServiceNow)
Core Competencies
- People and Performance Leadership
- Incident Command and Escalation Management
- Analytical and Problem-Solving Skills
- Communication and Decision-Making Under Pressure
- Root Cause and Post-Incident Analysis
- Operational Planning and Service Governance
- Stakeholder and Partner Management
- IT Service Management (Incident & Problem Management)
- Observability, Monitoring, and Automation Tools
- Passion for People Development, Operational Discipline, and Continuous Improvement
Good to Have
- ITIL V3 or V4 certification
- AWS Certified SysOps Administrator
- SRE Foundation or Crisis/Incident Management certifications
- Background in SRE practices and operational frameworks that promote reliability and automation
What You’ll Help Us Maintain
- Enterprise-grade reliability: Ensuring highly available, resilient systems powering global business operations.
- Customer-grade experience: Seamless, always-on access to applications, cloud workloads, and core services.