Site Reliability Engineering Manager

Agero · Remote

Company

Agero

Location

Remote

Type

Full Time

Job Description

About Agero:

Wherever drivers go, we’re leading the way. Agero’s mission is to rethink the vehicle ownership experience through a powerful combination of passionate people and data-driven technology, strengthening our clients’ relationships with their customers. As the #1 B2B, white-label provider of digital driver assistance services, we’re pushing the industry in a new direction, taking manual processes and redefining them as digital, transparent, and connected. This includes: an industry-leading dispatch management platform powered by Swoop; comprehensive accident management services; knowledgeable consumer affairs and connected vehicle capabilities; and a growing marketplace of services, discounts and support enabled by a robust partner ecosystem. The company has over 150 million vehicle coverage points in partnership with leading automobile manufacturers, insurance carriers and many others. Managing one of the largest national networks of service providers, Agero responds to approximately 12 million service events annually. Agero, a member company of The Cross Country Group, is headquartered in Medford, Mass., with operations throughout North America. To learn more, visit www.agero.com.

About the Role:

  • Organize the SRE team and manage its projects, making day-to-day SRE work more effective and improving overall SRE efficiency.
  • Drive the design and engineering of tools, as well as platform solutions, to optimize product engineering and operation efficiencies.
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.
  • Influence and motivate teams across a diverse set of vertical domains and geographic locations to ensure customer and merchant incidents are addressed rapidly and efficiently so that our software applications are available and functional 24x7x365.
  • Work with senior management in the event of issue escalation.
  • Provide clear communication to executives and key stakeholders regarding the business impact, risks, prioritization, mitigation, and estimated time-to-fix for these incidents on a timely basis.
  • Ensure appropriate monitoring is in place for reliable operations of all applications and initiate corrective action plans when appropriate.
  • Document incident information in the incident management system and ensure the data is accurate and complete. Doing so will help you identify incident and data trends (including gaps and inaccuracies) through the normal course of incident management (post-mortem).
  • Provide informal incident process and requirement training to cross-functional teams, as needed, to support consistent incident management execution.
  • Collaborate with Service Owners to define SLOs and build SLIs to ensure systems are meeting their SLAs.
  • Train team members and put processes and procedures in place to support the system and handle critical incidents.
  • Coordinate appropriate resources to resolve critical incidents in accordance with service level agreements and operational level agreements.
  • Own all communication during a major system outage, ensuring IT management and the businesses are kept updated until the incident is resolved.
  • With a thorough understanding of technology assets/environments/services, business needs and SLAs/SLOs, lead the creation, revision and implementation of monitoring tools, processes and uptime reports.

Key Outcomes:

  • Build and invest in relationships with key partners while learning the business and supporting model
  • Implement AIOps machine learning solutions to automate the detection, consolidation, and remediation of alerts, events, and metrics in our platforms.
  • Modernize processes to enable automation for change control, runbooks, documentation publishing, and monitoring solutions.
  • Drive adoption of unified processes for Monitoring, Alerting, Incident Response and cross-product visibility as the enterprise product portfolios evolve.

Skills, Experiences and Education:

  • B.S. in Electrical or Computer Engineering, Computer Science or relevant work experience
  • 7+ years of experience in large, complex information systems and/or cloud environments
  • 7 years of experience in an engineering-centric workflow environment
  • Broad experience in troubleshooting large-scale distributed systems covering application, cloud, OS, networking, and storage areas
  • Self-motivated and proactive, with demonstrated creative and critical thinking capabilities
  • A clear communicator and compassionate leader who loves SRE

Hiring In:

  • Canada: Province of Ontario

D, E & I Mission & Culture at Agero:

We are all Change Drivers at Agero. Each day, we speak to thousands of drivers and tow professionals across one of the most diverse countries in the world. Our mission to safeguard drivers on the road, strengthen our clients’ relationships with their drivers, and support the communities we live and work in unites us as one force driving positive change.

The road to positive change starts inside Agero. In celebrating each other’s differences, we lift each other up and create space for innovation and community. Bringing our whole selves to work powers our commitment, drive, agility, and courage - ensuring we are not only changing the landscape of the driver services industry, we also are making a difference in the lives of our customers with each call, chat, and rescue.

THIS DESCRIPTION IS NOT INTENDED TO BE A COMPLETE STATEMENT OF JOB CONTENT, RATHER TO ACT AS A GUIDE TO THE ESSENTIAL FUNCTIONS PERFORMED. MANAGEMENT RETAINS THE DISCRETION TO ADD TO OR CHANGE THE DUTIES OF THE POSITION AT ANY TIME.

To review Agero's privacy policy click the link: https://www.agero.com/privacy.

Date Posted

01/27/2023
