Site Reliability Engineering Manager

Agero · Remote

Company

Agero

Location

Remote

Type

Full Time

Job Description

About Agero:

Wherever drivers go, we’re leading the way. Agero’s mission is to rethink the vehicle ownership experience through a powerful combination of passionate people and data-driven technology, strengthening our clients’ relationships with their customers. As the #1 B2B, white-label provider of digital driver assistance services, we’re pushing the industry in a new direction, taking manual processes and redefining them as digital, transparent, and connected. This includes: an industry-leading dispatch management platform powered by Swoop; comprehensive accident management services; knowledgeable consumer affairs and connected vehicle capabilities; and a growing marketplace of services, discounts and support enabled by a robust partner ecosystem. The company has over 150 million vehicle coverage points in partnership with leading automobile manufacturers, insurance carriers and many others. Managing one of the largest national networks of service providers, Agero responds to approximately 12 million service events annually. Agero, a member company of The Cross Country Group, is headquartered in Medford, Mass., with operations throughout North America. To learn more, visit www.agero.com.

About the Role:

  • Organize the SRE team and manage its projects, making day-to-day SRE work more effective and improving overall SRE efficiency.
  • Drive the design and engineering of tools, as well as platform solutions, to optimize product engineering and operation efficiencies.
  • Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.
  • Influence and motivate teams across a diverse set of vertical domains and geographic locations to ensure customer and merchant incidents are addressed rapidly and efficiently so that our software applications are available and functional 24x7x365.
  • Work with senior management in the event of issue escalation.
  • Provide clear communication to executives and key stakeholders regarding the business impact, risks, prioritization, mitigation, and estimated time-to-fix for these incidents on a timely basis.
  • Ensure appropriate monitoring is in place for reliable operations of all applications and initiate corrective action plans when appropriate.
  • Document incident information in the incident management system and ensure the data is accurate and complete. Doing so will help you identify incident and data trends (including gaps and inaccuracies) through the normal course of incident management (post mortem).
  • Provide informal incident process and requirement training to cross-functional teams, as needed, to support consistent incident management execution.
  • Collaborate with Service Owners to define SLOs and build SLIs to ensure systems are meeting their SLAs.
  • Train team members and put processes and procedures in place to support the system and handle critical incidents.
  • Coordinate appropriate resources to resolve critical incidents in accordance with service level agreements and operational level agreements.
  • Own all communication during a major system outage, ensuring IT management and the businesses are kept updated until the incident is resolved.
  • With a thorough understanding of technology assets/environments/services, business needs and SLAs/SLOs, lead the creation, revision and implementation of monitoring tools, processes and uptime reports.

Key Outcomes:

  • Build and invest in relationships with key partners while learning the business and support model
  • Implement AIOps machine learning solutions to automate the detection, consolidation, and remediation of alerts, events, and metrics in our platforms.
  • Modernize processes to enable automation for change control, runbooks, documentation publishing, and monitoring solutions.
  • Drive adoption of unified processes for Monitoring, Alerting, Incident Response and cross-product visibility as the enterprise product portfolios evolve.

Skills, Experiences and Education:

  • B.S. in Electrical or Computer Engineering, Computer Science or relevant work experience
  • 7+ years of experience in large, complex information systems and/or cloud environments
  • 7 years of experience in an engineering-centric workflow environment
  • Broad experience in troubleshooting large-scale distributed systems covering application, cloud, OS, networking, and storage areas
  • Self-motivated and proactive, with demonstrated creative and critical thinking capabilities
  • A clear communicator and compassionate leader who loves SRE

Hiring In:

  • Canada: Province of Ontario

D, E & I Mission & Culture at Agero:

We are all Change Drivers at Agero. Each day, we speak to thousands of drivers and tow professionals across one of the most diverse countries in the world. Our mission to safeguard drivers on the road, strengthen our clients’ relationships with their drivers, and support the communities we live and work in unites us together as one force driving positive change.

The road to positive change starts inside Agero. In celebrating each other’s differences, we lift each other up and create space for innovation and community. Bringing our whole selves to work powers our commitment, drive, agility, and courage - ensuring we are not only changing the landscape of the driver services industry, we also are making a difference in the lives of our customers with each call, chat, and rescue.

THIS DESCRIPTION IS NOT INTENDED TO BE A COMPLETE STATEMENT OF JOB CONTENT, RATHER TO ACT AS A GUIDE TO THE ESSENTIAL FUNCTIONS PERFORMED. MANAGEMENT RETAINS THE DISCRETION TO ADD TO OR CHANGE THE DUTIES OF THE POSITION AT ANY TIME.

To review Agero's privacy policy click the link: https://www.agero.com/privacy.

Date Posted

01/27/2023
