ITIL Problem Management Analyst
Irvine, CA
Contracted
Mid Level
Location: Irvine, CA 92614 (5 days onsite)
Duration: 6 Months
Position Summary
We are seeking an experienced Problem Management Lead Analyst to drive service stability and continuous improvement across IT Operations. This role will own the end-to-end Problem Management process, lead root cause investigations, identify systemic issues, and partner with technology teams to eliminate recurring incidents and improve overall service reliability.
Key Responsibilities
Problem Management Leadership
- Own and govern the end-to-end Problem Management lifecycle in alignment with ITIL best practices.
- Lead proactive and reactive problem investigations to identify underlying causes of recurring incidents and service disruptions.
- Facilitate Root Cause Analysis (RCA) sessions using structured methodologies such as 5 Whys, Fishbone Analysis, Fault Tree Analysis, and Kepner-Tregoe.
- Establish and maintain a Known Error Database (KEDB), ensuring accurate documentation of known errors and workarounds.
- Track corrective and preventive actions through resolution and verify effectiveness of implemented fixes.
- Drive accountability across Infrastructure, Network, Cloud, Security, End User Computing, and Application teams to resolve systemic issues.
- Analyze incident, change, and operational data to identify trends, recurring issues, and opportunities for service improvement.
- Develop and present actionable recommendations to improve platform stability, reduce incident volumes, and enhance service performance.
- Lead recurring service review meetings focused on problem trends, chronic issues, and risk mitigation.
- Identify automation opportunities and process improvements that reduce operational effort and prevent recurring incidents.
- Contribute to operational excellence initiatives, knowledge management, and runbook enhancements.
- Utilize ServiceNow Problem Management capabilities to manage problem records, known errors, corrective actions, and reporting.
- Establish KPIs and metrics related to problem management effectiveness, including recurring incident reduction, RCA completion, and corrective action closure.
- Create executive-level dashboards and reports highlighting service health trends, top recurring issues, and improvement initiatives.
- Ensure compliance with ITIL processes, documentation standards, and audit requirements.
- Partner with Major Incident Management teams to ensure high-priority incidents are transitioned into formal problem investigations when appropriate.
- Lead Post-Incident Reviews (PIRs) focused on identifying root causes and preventive actions.
- Collaborate with Change Management teams to ensure corrective actions are properly planned, tested, and implemented.
- Assess risks associated with recurring issues and provide recommendations for long-term remediation.
Required Qualifications
- 5+ years of experience in IT Operations with at least 3 years focused on Problem Management, Service Reliability, or IT Service Management.
- ITIL Foundation certification required; ITIL Managing Professional or Advanced certifications preferred.
- Strong hands-on experience with ServiceNow Problem Management, Incident Management, and reporting modules.
- Proven experience conducting complex Root Cause Analysis and facilitating cross-functional problem review sessions.
- Strong understanding of enterprise IT infrastructure including Servers, Cloud, Network, End User Computing, and Applications.
- Experience developing metrics, dashboards, and executive reporting.
- Excellent facilitation, communication, and stakeholder management skills.
- Ability to influence technical teams and drive resolution of long-standing operational issues.
- Experience implementing Problem Management programs or maturing ITSM processes.
- Familiarity with SRE, Reliability Engineering, or Operational Excellence frameworks.
- Experience with Power BI, Tableau, or ServiceNow Performance Analytics.
- Knowledge of automation platforms and operational process optimization.