Senior Site Reliability Engineer - Team Lead

Trumid · Remote

Type

Full Time

Job Description

Who are we?

Trumid is a financial technology company and fixed income electronic trading platform focused on US dollar-denominated Investment Grade, High Yield, Distressed, and Emerging Market bonds. Trumid optimizes the credit trading experience by combining agile technology and market expertise, with a focus on product design. The result is a differentiated ecosystem of protocols and trading solutions delivered within one intuitive platform. Learn more at www.trumid.com.

Who are you?

Hope is not a strategy.

The Senior Site Reliability Engineer – Team Lead combines software and systems engineering to build and run large-scale, distributed, low-latency, fault-tolerant systems. You’ll have the opportunity to manage the challenges of scaling complex systems while using your expertise in coding, system design/architecture and system analysis. In this newly created role, you’re a player-coach who sets the roadmap for best practices and applies your technical expertise to make an impact daily. The role involves collaborating with cross-functional teams and implementing those practices to ensure seamless operation of a high-throughput, complex cloud-based infrastructure.

Are you someone who thrives in environments characterized by rapid innovation and a readiness to embrace imminent change? Do you live by the core tenets of availability, latency, performance, efficiency, change management, monitoring and capacity planning? As an SRE you will own the lifecycle of a suite of services that enable observability across our development and product environments, allowing our teams to maintain metrics and log platforms using Prometheus, Grafana, Terraform and Kubernetes.

What will you do in this role?

  • Implement strategies to achieve high performance in the four key DORA metrics:
    • Deployment Frequency: Facilitate frequent and reliable deployments to production.
    • Lead Time for Changes: Improve the speed from code commit to code running in production.
    • Mean Time to Recovery: Minimize the time it takes to restore a service after an incident or downtime.
    • Change Failure Rate: Reduce the number of deployment failures relative to the total number of deployments.
  • Provide technical leadership in system design and architecture, infrastructure automation and performance optimization, drawing on a strong technical background in distributed systems, cloud infrastructure and software engineering
  • Design and architect strategies to ensure high availability, scalability, monitoring and alerting of critical systems
  • Collaborate with engineering teams to optimize system design, deployment and maintenance so that systems are built with scalability, reliability and observability top of mind
  • Collaborate with engineering teams to ensure smooth deployment and release processes, driving a culture of continuous integration and delivery (CI/CD)
  • Define and monitor SLOs and KPIs for systems and services to proactively identify and resolve issues
  • Drive the adoption of best practices, standards and methodologies for incident management and response, monitoring and alerting, with a focus on metrics and reporting

A bit about our growing tech stack and more about us!

  • Hands-on experience with containerization technologies such as Docker and Kubernetes
  • Infrastructure as code tools such as Terraform and Ansible
  • Monitoring and observability tools such as Prometheus and Grafana
  • Deep focus on continuous improvement, problem solving and troubleshooting
  • Collaborative team environment, strong desire for you to share your technical expertise and knowledge and demonstrate your leadership abilities

Just a few perks that our employees enjoy!

  • Remote first
  • Highly competitive compensation
  • Fully paid medical, dental and vision coverage
  • Team-oriented and collaborative company culture

Trumid is an equal opportunity employer.

In compliance with New York City Pay Transparency Law, the base salary range for this role in New York City is $175,000 to $275,000. This range does not include discretionary bonus or other forms of compensation or benefits offered in connection with this job. Trumid incorporates several factors when determining a candidate’s compensation.

Date Posted

06/06/2023
