Site Reliability Engineer Technical Lead

Nethermind · Europe

Company

Nethermind

Location

Europe

Type

Full Time

Job Description

Are you the one? We're seeking an experienced Site Reliability Engineer to lead and mentor our SRE team. You're a seasoned professional with a proven track record in designing and implementing robust SRE processes at scale. You excel in cloud and hybrid environments, have a deep understanding of containerization, and are passionate about creating resilient, high-performance systems that can handle extreme traffic peaks. Beyond technical expertise, you're a skilled communicator and collaborator, able to bridge the gap between technical teams and stakeholders. You thrive in cross-functional environments and can effectively represent SRE concerns at the leadership level.

Responsibilities:

  • Lead the implementation and refinement of SRE practices across the organization, including SLOs, error budgets, and blameless postmortems

  • Design and implement automation to eliminate toil and improve system reliability and efficiency

  • Lead initiatives and architect scalable hybrid cloud solutions for Web3 infrastructure

  • Manage error budgets and make data-driven decisions about when to prioritize reliability vs. new features

  • Drive SRE practices to ensure high availability, performance, and reliability under varying load conditions

  • Collaborate closely with the Platform engineering team to build reliability into services from the ground up

  • Collaborate closely with Nethermind’s Infrastructure Leadership department to align SRE strategies with overall technical vision

  • Drive the adoption of observability best practices and implement comprehensive monitoring systems

  • Develop and maintain service level indicators (SLIs) and objectives (SLOs), working with product owners to define appropriate reliability targets

  • Mentor team members in SRE practices and foster a culture of continuous learning

  • Lead capacity planning efforts, using quantitative analysis to predict and address future scaling challenges

  • Contribute to long-term technical roadmaps, balancing reliability concerns with product innovation

Skills:

  • 5+ years of experience in Site Reliability Engineering or DevOps

  • Expert knowledge of cloud platforms (AWS, GCP)

  • Expert knowledge of Kubernetes

  • Proven experience in designing and implementing scalable, efficient, resilient systems

  • Deep understanding of Linux/Unix systems and networking protocols

  • Strong programming skills in Python or Go

  • Strong background in monitoring, observability, and logging systems (e.g., Grafana, Prometheus, Loki)

  • Expertise in CI/CD tools (e.g., GitHub Actions, ArgoCD)

  • Excellent communication skills, both written and verbal, with the ability to explain complex technical concepts to various audiences

  • Experience in producing technical documentation, runbooks, presentations, and post-mortem reports

  • Experience in and passion for mentoring and upskilling team members

Nice to have:

  • Experience leading technical teams

  • Contributions to open-source projects or thought leadership in SRE

  • Familiarity with MLOps and big data technologies

  • Knowledge of blockchain technology and infrastructure

  • Experience with chaos engineering principles and tools

  • Familiarity with traffic management and CDN technologies

  • Systems or backend engineering background

Date Posted

09/14/2024

