Sr. Site Reliability Engineer
Job Description
Resmed is seeking a Site Reliability Engineer (SRE) to help define and execute a Site Reliability
Engineering strategy for its rapidly expanding Digital Health Technology group. You will use your software engineering
expertise to automate processes and innovate to improve the reliability of the system. You will plan,
design, build, and maintain large-scale engineering solutions. Whether it is a bug fix or an awesome feature, you will own
your work and deliver elegant, scalable solutions.
Let's talk about Responsibilities
• Monitoring and metrics - establishing desired service behavior, measuring how the service is actually behaving (availability, latency, and overall system health), and correcting discrepancies
• Emergency response - noticing and responding effectively to service failures in order to preserve the service's conformance to its SLA (service-level agreement)
• Work to simplify and automate deployment processes and run-time operations, and provide non-disruptive releases
• Change management - altering the behavior of a service while preserving service reliability
• Capacity planning - projecting future demand and ensuring that a service has enough computing resources in appropriate locations to satisfy that demand
• Service turn-up and turn-down - deploying and removing computing resources for a service in a data center in a predictable fashion, often as a consequence of capacity planning
• Performance - design, development, and engineering related to scalability, isolation, latency, throughput, and efficiency
• Scaling systems sustainably through mechanisms such as automation
• Participate in planning discussions with Product Development and other IT teams
• Maintain expertise in the area of architecture, including industry trends, strategies, and products, to ensure that our assets are effectively and efficiently utilized
• Evolving systems by pushing for changes that improve reliability and velocity
• Conducting incident responses and blameless postmortems
Let's talk about Qualifications and Experience
Required:
• Bachelor's degree in Computer Science, Information Systems, or an equivalent technical discipline, and a minimum of 4 years of working experience in an enterprise 24/7 production environment supporting critical, real-time applications
• Minimum 5 years of experience focused on site reliability for high-traffic applications
• Systematic problem-solving approach, combined with strong communication skills and a sense of ownership and drive
• Meticulous analytical skills to identify and understand the root cause of critical issues
• Excellent planning and communication skills, including the use of PowerPoint, Excel spreadsheets, and database queries to analyze and present data
• Expert full-stack debugging and performance optimization ability, including hands-on knowledge of AWS
• Extensive experience with monitoring tools such as Site24x7, Datadog, and AWS native tools
• Track record of monitoring and analyzing system performance and isolating issues or bottlenecks that could impact reliability, performance, and scalability
• Good verbal and written communication skills, and the ability to work effectively with geographically remote teams
Good to have:
• Able to write Terraform, Python, and AWS Lambda code in the AWS environment
• Experience using Atlassian tools like Bamboo, Confluence, and Jira
• Understanding of the Product Development Life Cycle, including Agile Scrum, TDD, and BDD
• Experience with Machine Learning
Joining us is more than saying "yes" to making the world a healthier place. It's discovering a career that's challenging, supportive, and inspiring, where a culture driven by excellence helps you not only meet your goals but also create new ones. We focus on creating a diverse and inclusive culture, encouraging individual expression in the workplace, and we thrive on the innovative ideas this generates. If this sounds like the workplace for you, apply now!
Date Posted
10/21/2022