Senior Software Engineer
United States, Texas, Irving
7000 State Highway 161
Overview

The Microsoft Azure Compute platform is transforming industries and supporting individuals worldwide by delivering scalable cloud infrastructure to host services and workloads at global scale. Within this platform, the Azure Holmes team is dedicated to building a resilient foundation for running critical workloads with uninterrupted availability, reliability, and scalability. We are seeking engineers with experience in distributed systems who are ready to take on complex challenges and contribute to advancing this essential platform.

Azure Compute is a fault-tolerant distributed system built on standard datacenter hardware. The Holmes team plays a key role by delivering dynamic resource management capabilities that improve customer availability and platform efficiency. Our services enable innovations such as placement reshaping, defragmentation, overbooking, and transparent maintenance, powered by intelligent algorithms designed for optimal performance.

In this role, you will design and build highly available, event-driven microservices that enhance customer experience. You will collaborate with Microsoft Research to integrate advanced machine learning (ML) and artificial intelligence (AI) models, contributing to the evolution of a platform that powers mission-critical workloads at global scale.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities

- Collaborates with appropriate stakeholders to determine user requirements for a scenario.
- Drives identification of dependencies and the development of design documents for a product, application, service, or platform.
- Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance, maintainability, effectiveness, and return on investment (ROI).
- Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items.
- Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system, product, or service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system, product, or service for simple and complex problems when appropriate.
- Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.