Job Description
We are searching for a motivated Site Reliability Engineer (SRE) to help drive Azure special projects. In this role, you will help Microsoft become a world leader in operating mission-critical workloads on dedicated hardware. We're an agile team in Azure focused on bringing state-of-the-art mission-critical software into Microsoft and providing bare-metal machines in the Azure cloud. Come join us!
Customers around the world depend on us to run their mission-critical workloads and place their trust in us to deliver the services they need. To make this work for our growing customer base, we need continual effort to make Azure highly reliable. Join a growing team that owns the system reliability of Azure Specialized. Our team represents a deep investment in improving the availability, reliability, and operational efficiency of our systems and services.
Our SRE team in Azure Specialized partners with other product engineering teams, and you will work closely with engineers and operations staff to ensure mission-critical systems continue to work optimally for our customers.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Qualifications:
Required/Minimum Qualifications
- 3+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field.
- 2+ years of industry experience using any of the following programming languages: C#, Python, Java, C++, OR Go
The ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
- Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
- 4+ years technical experience in software engineering, network engineering, or systems administration
- OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
- OR Master's Degree in Computer Science, Information Technology, or related field.
Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable laws, regulations and ordinances. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request via the Accommodation request form.
Benefits/perks listed below may vary depending on the nature of your employment with Microsoft and the country where you work.
#azurecore
Responsibilities:
Technical Knowledge and Domain-Specific Expertise
- Develops a foundational understanding of distributed systems design, interactions between cloud technology layers and components, basic dependencies at scale, and the code that defines infrastructures. Can contribute to the code base that defines components or features of systems or cloud technologies to improve the reliability and operability of supported products, with direction from other engineers.
- Develops an understanding of the code, features, and operations of specific products at scale as required to contribute to incremental improvements in product availability, reliability, efficiency, observability, and/or performance; participates in on-boarding, code/design reviews, and regular meetings with the engineering teams that develop and/or manage those products.
- Develops and tests basic changes to optimize code and improve the observability, reliability and operability of a defined range of platform, system, or product components or features with direction from other engineers.
- Supports ongoing engagements with product engineering teams by participating in code/design reviews, regular meetings, on-call rotations, and incident responses throughout product development and operations cycles; draws insights from engagements with product engineering teams and basic analyses of telemetry data to propose potential improvements to code and designs for a defined set of product components or features with guidance from other engineers.
- Implements simple configuration and data changes across a predefined range of product components or features with guidance from other engineers to develop an understanding of how configurations, binaries, and data can be managed using code, tooling, and automation.
- Develops an understanding of how to safely and reliably manage changes in production by using existing tools and automation to enable product engineering teams to implement changes across a defined range of components or features, with direction from other engineers.
- Uses existing tools to troubleshoot problems or flaws affecting the availability, reliability, performance, and/or efficiency of components or features with guidance from other engineers. Suggests potential solutions to resolve and prevent recurring issues and brings them to the attention of other engineers or team leads.
- Responds to incidents during regular on-call rotations by identifying the level of impact, troubleshooting basic issues, and deploying appropriate fixes to resolve root cause(s); alerts product teams or owners to major customer-impacting issues and escalates the resolution of complex issues and/or those affecting multiple components or features to other engineers as needed. Shares details related to incidents and their resolution through post-mortem reports and during regular review meetings.
- Develops an understanding of key learnings, insights, and best practices that can be applied to improve system, platform, and/or product development and operations by participating in code/design reviews, incident drills and debriefs, and regular meetings, as well as interactions with more experienced Site Reliability Engineers (SREs) and members of product engineering teams.
Date Posted
08/08/2023