Senior Site Reliability Engineer at Jobgether

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer in Brazil.

This role sits at the core of a fast-scaling logistics technology environment where reliability, performance, and automation are critical to powering large-scale distributed systems. You will help design and operate the internal platform that enables engineering teams to deliver high-quality software with confidence and speed. The position blends infrastructure engineering, cloud operations, and software reliability practices in a highly collaborative global setup. You will take ownership of mission-critical systems while continuously improving observability, incident response, and system resilience. Working closely with multiple engineering squads, you will influence architectural decisions and drive platform-wide reliability initiatives. This is a high-impact role where your work directly strengthens system stability, efficiency, and scalability across the organization.

Accountabilities

You will be responsible for ensuring the reliability, scalability, and performance of critical infrastructure and platform services while enabling engineering teams to operate efficiently in production environments.

Design, deploy, and operate scalable cloud-based systems while balancing reliability, cost, and development velocity
Own and improve SLIs/SLOs, ensuring platform services consistently meet reliability targets
Lead incident response, root-cause analysis, and postmortem processes to prevent recurring issues
Build and enhance observability through monitoring, logging, and alerting frameworks
Support infrastructure-as-code and automation initiatives to improve deployment consistency and efficiency
Collaborate with engineering teams to improve system design, performance, and operational readiness
Contribute to CI/CD pipelines, deployment strategies, and release engineering practices
Provide production support, including occasional off-hours incident handling when required

Requirements

You bring strong hands-on experience in cloud infrastructure, DevOps, and site reliability engineering, with the ability to operate in complex distributed environments.

5+ years of experience in SRE, DevOps, or Cloud Engineering roles
Strong expertise in AWS, Kubernetes, Docker, and modern cloud-native architectures
Proficiency in Linux/UNIX systems administration and production troubleshooting
Experience with infrastructure-as-code tools such as Terraform, Ansible, or Chef
Strong programming/scripting skills (Python, Bash, or similar) for automation and tooling
Solid understanding of networking, system design, and distributed systems principles
Experience with monitoring, logging, and incident management tools and practices
Familiarity with CI/CD pipelines and DevOps best practices
Exposure to PostgreSQL or database operations is a plus
Strong English communication skills and ability to work in global, distributed teams
Problem-solving mindset with high ownership, initiative, and attention to detail

Benefits

Competitive base salary aligned with market standards
Equity package with ownership opportunities in a high-growth tech environment
Unlimited PTO and flexible time-off policy
Remote-first setup within Brazil
Opportunity to work on large-scale distributed systems in a global engineering organization
Collaborative, high-impact engineering culture focused on innovation and continuous improvement

