Senior Site Reliability Engineer (m/f/d)
Job Description
Team: IT
This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer (m/f/d) in Ireland.
This role offers the opportunity to shape and scale the infrastructure powering a modern AI-driven platform used by frontline employees across industries worldwide. As part of a highly collaborative Platform Squad, you will take ownership of critical reliability and scalability initiatives while driving architectural decisions that directly impact system resilience and performance. You will work on high-throughput, cloud-native environments built on Kubernetes and modern observability stacks, helping engineering teams operate more efficiently and securely. The position combines hands-on technical leadership with mentoring responsibilities, making it ideal for experienced engineers who enjoy solving complex infrastructure challenges while elevating team capabilities. You will play a key role in defining platform reliability standards, improving operational excellence, and enabling global scalability in a fast-growing tech environment. This is a high-impact opportunity for engineers passionate about automation, distributed systems, and cloud-native infrastructure.
Accountabilities:
- Drive the architecture and evolution of scalable cloud infrastructure and Kubernetes environments designed for high availability and global growth.
- Define and implement platform reliability strategies, including zero-downtime deployments, disaster recovery, rollback mechanisms, and resilience improvements.
- Improve and maintain observability systems, monitoring frameworks, and telemetry infrastructure to support operational excellence and system transparency.
- Build and optimize Infrastructure as Code and self-service platform capabilities to reduce operational overhead and improve developer experience.
- Lead platform-related incident response activities, conduct blameless post-mortems, and implement long-term systemic improvements.
- Collaborate closely with engineering teams to define technical roadmaps, architecture standards, and scalable operational practices.
- Mentor and support teammates through technical guidance, design reviews, and knowledge sharing initiatives.
- Drive continuous improvement in CI/CD pipelines, GitOps workflows, automation strategies, and cloud-native infrastructure operations.
Requirements:
- 5+ years of hands-on experience in Site Reliability Engineering, Platform Engineering, DevOps, Cloud Infrastructure, or similar infrastructure-focused engineering roles.
- Proven expertise operating and scaling high-throughput, highly available production systems.
- Deep practical experience with Kubernetes in cloud environments such as Azure, AWS, or GCP.
- Strong understanding of observability concepts, including monitoring, SLIs, SLOs, error budgets, logging, and distributed tracing.
- Proficiency in Go or Python, with strong software engineering and automation skills.
- Experience with Infrastructure as Code tools such as Pulumi, Terraform, or OpenTofu, along with GitOps workflows and CI/CD automation.
- Strong knowledge of cloud-native technologies, distributed systems, and reliability engineering best practices.
- Demonstrated experience leading infrastructure initiatives, writing technical proposals, and driving architecture decisions.
- Strong communication skills with the ability to collaborate effectively across technical teams and stakeholders.
- Comfortable participating in on-call rotations and managing critical production incidents.
- Additional experience with service meshes, API gateways, Kubernetes operators, or highly available PostgreSQL environments is considered a plus.
Benefits:
- Remote-first work environment with flexibility to work from home across eligible locations.
- Opportunities for in-person collaboration through team events, workshops, and office gatherings.
- Flexible work arrangements supporting strong work-life balance.
- Wellness and lifestyle benefits, including fitness memberships and bike leasing programs.
- Inclusive, collaborative, and growth-focused company culture.
- Opportunity to contribute directly to the scaling of a fast-growing international technology platform.
- Access to regular team events, culture initiatives, and company gatherings.
- Possibility to work remotely from locations within the European Union depending on team arrangements.
- Strong emphasis on personal development, ownership, and long-term career growth.
Date Posted: 05/27/2026