Site Reliability Engineer - SRE

Jobgether · India

Company

Jobgether

Location

India

Type

Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Site Reliability Engineer (SRE) in India.

This role offers the opportunity to shape and scale the reliability backbone of a fast-growing, product-led SaaS platform. You will ensure high availability, performance, and security across complex cloud-native systems serving financial planning and decision-making use cases. Working in a remote-first and highly collaborative culture, you will partner closely with engineering teams to embed reliability into every stage of the software lifecycle. The position requires ownership of multi-cloud infrastructure and a strong focus on automation and observability. You will help build resilient systems that scale with rapid business growth while maintaining strict security and compliance standards. This is a hands-on role where engineering excellence directly impacts customer trust and product success.

Accountabilities:

  • Design, manage, and optimize scalable multi-cloud infrastructure across AWS and GCP, ensuring high availability, cost efficiency, and security compliance.
  • Lead Kubernetes orchestration, including cluster design, deployment strategies, and configuration management for consistent and reliable environments.
  • Implement and maintain service mesh solutions to secure and monitor service-to-service communication across distributed systems.
  • Build and optimize CI/CD pipelines using Git and Jenkins, improving deployment speed, reliability, and automated testing coverage.
  • Develop Infrastructure as Code (Terraform) to provision and manage cloud resources in a repeatable and version-controlled manner.
  • Drive automation initiatives using Python to reduce operational toil, streamline maintenance, and improve system resilience.
  • Own observability systems using tools like Prometheus, Grafana, ELK/EFK, CloudWatch, and Google Cloud Operations Suite to ensure full system visibility.
  • Lead incident response, postmortems, and reliability engineering practices, defining and tracking SLIs, SLOs, and SLAs.
  • Collaborate with development teams to embed DevOps and reliability best practices into application design and delivery.

Requirements:

  • 5+ years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles in a SaaS or high-scale environment.
  • Strong expertise in AWS (EC2, EKS, RDS, VPC, IAM, S3) and GCP (GKE, Compute Engine, Cloud SQL, IAM, Cloud Storage).
  • Advanced knowledge of Kubernetes and Docker, including deployment, scaling, and lifecycle management.
  • Solid experience with Terraform and Infrastructure as Code principles.
  • Strong programming skills in Python for automation and tooling development.
  • Hands-on experience with observability stacks including Prometheus, Grafana, and ELK/EFK.
  • Deep understanding of cloud networking, distributed systems, and security best practices (zero-trust, IAM, RBAC).
  • Experience building CI/CD pipelines and working with Git-based workflows.
  • Strong problem-solving skills, ownership mindset, and ability to thrive in fast-paced, remote-first environments.

Benefits:

  • Fully remote-first work environment with high autonomy and flexibility.
  • Opportunity to work on modern, cloud-native, high-scale infrastructure.
  • Competitive compensation aligned with experience and market standards.
  • Culture of openness, transparency, and idea-driven innovation.
  • Strong focus on learning, experimentation, and professional growth.
  • Collaborative, engineering-led environment with high ownership and impact.
  • Exposure to cutting-edge technologies in multi-cloud, Kubernetes, and observability stacks.

Date Posted

05/06/2026
