Role description
Key Responsibilities
1. Incident & Major Incident Management
- Lead the Incident Management function to ensure timely service restoration and SLA compliance.
- Actively participate in and take ownership of Major Incident Management, demonstrating strong command during high-severity outages.
- Drive and moderate Major Incident bridge calls, ensuring real-time coordination among technical teams.
- Ensure structured communication to stakeholders during critical incidents, including business impact, technical updates, and recovery timelines.
- Conduct detailed Post-Incident Reviews (PIRs) and drive closure of corrective/preventive actions.
- Continuously refine the Major Incident Management process for improved response and resolution times.
2. Problem Management
- Identify recurring or impactful issues and lead detailed root cause analysis (RCA).
- Facilitate Problem Review Board (PRB) sessions to drive permanent fixes.
- Maintain and enhance the Known Error Database (KEDB).
- Use data to identify patterns and propose long-term stability improvements.
3. Change Management
- Govern the end-to-end Change Management process, ensuring risk-controlled change deployments.
- Lead CAB meetings to evaluate proposed changes for risk, impact, and alignment with business priorities.
- Monitor key metrics such as change success rates and backout ratios; drive continuous improvement.
- Ensure process compliance, documentation quality, and appropriate approvals for all changes.
4. ITSM Tool Setup & Implementation (e.g., ServiceNow)
- Lead the end-to-end ITSM tool implementation, including platform configuration, workflow design, dashboard setup, and integrations.
- Configure Incident, Problem, Change, Service Request, CMDB, and other modules in alignment with ITIL standards.
- Manage data migration, UAT cycles, rollout planning, and stakeholder training.
- Document workflows, governance models, and operational handbooks for long-term tool adoption.
- Continuously refine the ITSM platform based on operational feedback and evolving business needs.
5. Service Quality Assurance & Reporting
- Prepare weekly, monthly, quarterly, and ad-hoc service performance decks.
- Consolidate metrics such as incident trends, MTTR, MTTD, RCA summaries, change success rates, SLA performance, and service stability indicators.
- Present insights, risks, improvement ideas, and KPI performance to leadership and customer stakeholders.
- Ensure data quality and consistency across all reporting streams.
- Maintain historical reporting data for audit and compliance.
6. Process Governance & Continuous Improvement
- Ensure ITIL/ITSM best practices are adopted consistently across the organization.
- Conduct periodic process evaluations and implement corrective actions.
- Lead continuous service improvement (CSI) initiatives across IT operations.
- Deliver process training and awareness sessions for relevant teams.
7. Stakeholder & Vendor Management
- Serve as the primary point of contact for service-related escalations and governance.
- Work closely with business teams, technical units, and external vendors to ensure SLA adherence.
- Facilitate service review meetings and monitor third-party performance.
Required Skills & Qualifications
- 6-12 years of experience in IT Service Management or IT Operations.
- Strong experience in Incident, Major Incident, Problem, and Change Management.
- Demonstrated capability in leading and actively participating in Major Incident Management with strong decision-making abilities under pressure.
- Proven hands-on experience in setting up ITSM tools (preferably ServiceNow) end-to-end.
- Strong understanding of the ITIL framework; ITIL v3/v4 certification preferred.
- Excellent communication skills with the ability to handle executive-level updates during critical events.
- Strong analytical and reporting abilities (Power BI, Excel, PowerPoint).
- Proficiency with ITSM tools like ServiceNow, BMC Remedy, Freshservice, or ManageEngine SDP.
Skills
Major incident management, Change management, ITSM