Telemetry Engineering Operations Manager
United States, Texas, Irving
7000 State Highway 161
Overview

The Cloud & AI organization accelerates Microsoft's mission and bold ambitions to ensure that our company and industry are securing digital technology platforms, devices, and clouds in our customers' heterogeneous environments, as well as ensuring the security of our own internal estate. Our culture is centered on embracing a growth mindset, a theme of inspiring excellence, and encouraging teams and leaders to bring their best each day. In doing so, we create life-changing innovations that impact billions of lives around the world. Microsoft is one of the largest enterprise service companies in the world.

We are hiring a Telemetry Engineering Operations Manager for the Telemetry Engineering team! This role plays a pivotal part in orchestrating the intake, triage, and execution of telemetry-related requests across engineering, product, and operational domains. It ensures that telemetry processes are scalable, standardized, and responsive to evolving business needs by driving the development of Standard Operating Procedures (SOPs), managing service onboarding, and coordinating cross-functional handoffs. The Operations Manager partners closely with engineering leads, product managers, and stakeholders such as the Global Hunting, Oversight, and Strategic Triage (Ghost) team and the Infrastructure and Configuration Engine (ICE) team to clarify ownership boundaries, resolve solutioning gaps, and maintain accountability for request fulfillment. By systematizing intake paths (e.g., predefined services, engineering asks, and ambiguous cases), the role enables efficient routing, reduces ad hoc decision-making, and supports strategic initiatives like heartbeat monitoring and backlog reduction.
Success in this role requires process design acumen, stakeholder engagement, and the ability to translate complex workflows into actionable playbooks that enhance telemetry reliability, compliance, and data aggregation at scale.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
Responsibilities

Lead Intake and Triage Processes: Design and manage structured intake paths for telemetry-related requests, distinguishing between known operational cases and complex engineering asks. Ensure efficient routing to appropriate teams and reduce ambiguity in request handling.

Own Operational Handoffs and Escalations: Oversee the transition of telemetry issues from engineering to operations, particularly in scenarios like data drift or loss. Ensure timely escalation, follow-through, and resolution tracking across service owners and stakeholders.

Drive Service Onboarding and Stakeholder Engagement: Manage onboarding workflows for partner teams such as Infrastructure and Configuration Engine (ICE) and Operations Substrate (OPS Sub), building operational muscle and fostering collaboration. Capture lessons learned to improve future onboarding efforts.

Monitor Backlog and Service Level Agreement (SLA) Performance: Track intake volumes, backlog age, and SLA adherence across telemetry request queues. Develop dashboards and reporting mechanisms to surface trends and inform leadership decisions.

Standardize and Document Operational Models: Create and maintain playbooks and SOPs that define prioritization models, escalation paths, and ownership boundaries. Ensure these models are externally communicable and scalable as the team expands its scope.

Support Strategic Initiatives and Dependency Management: Coordinate delivery of strategic projects like heartbeat monitoring, ensuring external dependencies are tracked, escalated, and resolved. Report progress and risks to leadership with transparency.

Enable Knowledge Management and Process Improvement: Address gaps in data cataloging, request tracking, and historical context by implementing centralized knowledge systems. Promote continuous improvement through retrospectives and stakeholder feedback.

Embody our Culture and Values