Senior Site Reliability Engineer

Cloudbeds Remote

Company

Cloudbeds

Location

Remote

Type

Full Time

Job Description

How You'll Make an Impact:

As a Sr. Site Reliability Engineer, you'll be the guardian of our platform's reliability and performance, ensuring millions of hospitality transactions flow seamlessly across the globe. You'll architect and implement scalable AWS cloud solutions that keep the most ambitious hotels running 24/7 while fostering a culture of automation, resilience, and continuous improvement across our engineering teams.

Our SRE Team:

We're a bottom-up, collaborative team that thrives on healthy debate and shared ownership of our infrastructure. You'll have endless opportunities to influence architecture decisions while working with cutting-edge cloud technologies at scale. We believe the best solutions come from engineers who are empowered to innovate, experiment, and challenge the status quo.

What You Bring to the Team:

  • Design and implement reliable, scalable, and efficient cloud infrastructure to support our global platform's growth

  • Maintain and support highly loaded Kubernetes (EKS) clusters and infrastructure-related components

  • Develop and continuously improve monitoring and logging systems using Prometheus, Datadog, and Loki stacks

  • Participate in the on-call rotation to support the production environment and ensure rapid response to outages

  • Lead incident response efforts, ensuring minimal service impact while documenting learnings and implementing preventive measures

  • Collaborate with development teams to establish Service Level Objectives (SLOs) and ensure systems meet or exceed reliability targets

  • Champion SRE best practices across engineering, mentoring teams on resiliency, performance optimization, and scalability

  • Automate platform operations with infrastructure-as-code (Terraform) and configuration management tools

What Sets You Up for Success:

  • 5+ years of hands-on experience as an SRE or Systems Engineer working extensively with AWS cloud infrastructure

  • 3+ years of production experience with Kubernetes, Docker, and Helm charts at scale

  • Proven track record implementing and scaling Elastic Kubernetes Service (EKS) platforms

  • Strong expertise with monitoring, logging, and alerting technologies (ELK, Datadog, Loki, or AWS CloudWatch)

  • Experience with GitOps and ArgoCD

  • Working knowledge of web infrastructure, including NGINX Ingress controllers, MySQL/PostgreSQL/Aurora, Redis/Memcached, and SQS

Bonus Skills to Stand Out:

  • Advanced Database Administration experience with Aurora MySQL or PostgreSQL

  • Experience supporting and enhancing the release process through CI/CD pipeline development and optimization

  • Experience working in PCI-compliant environments and security-focused infrastructure

  • Familiarity with Kong API Gateway and API management at scale

Date Posted

12/07/2025
