Senior Site Reliability Engineer, Tenant Services: Geo

Jobgether · India

Company

Jobgether

Location

India

Type

Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Site Reliability Engineer, Tenant Services: Geo in India.

This role is focused on ensuring the reliability, scalability, and operational excellence of large-scale distributed systems that support data replication and disaster recovery workflows for enterprise customers. You will join a high-impact SRE team responsible for executing and improving complex migration and cutover processes in a SaaS environment. The position blends deep infrastructure engineering with hands-on operational work, including incident response, automation, and observability. You will help ensure that critical customer data migrations are safe, repeatable, and increasingly low-risk over time. Working in a fully remote, global setup, you will collaborate closely with multiple engineering, support, and infrastructure teams. The environment is fast-paced, highly collaborative, and driven by strong engineering values and automation-first thinking.

Accountabilities:

  • Execute end-to-end data migrations and cutovers, including planning, validation, execution, and post-cutover verification and cleanup activities.
  • Participate in on-call rotations and shift coverage to handle incidents, ensure system availability, and support live migration events across global time zones.
  • Operate and improve replication and migration systems, including data hygiene checks, validation workflows, and escalation handling.
  • Design and maintain automation, tooling, and runbooks to reduce operational complexity and make processes repeatable and reliable.
  • Build and enhance observability systems, including monitoring, alerting, dashboards, and SLO tracking for migrations and system health.
  • Collaborate with multiple engineering and support teams to improve reliability, capacity planning, and disaster recovery processes.
  • Contribute to incident response, post-incident reviews, and root cause analysis, ensuring learnings are converted into long-term improvements.
  • Continuously reduce operational toil through automation and process optimization.

Requirements:

  • Strong experience operating large-scale, highly available distributed systems in a SaaS or cloud environment.
  • Hands-on experience with major cloud platforms, including networking, compute, storage, and managed services.
  • Solid Kubernetes experience, including deployment, troubleshooting, and ecosystem tooling such as Helm.
  • Proficiency with infrastructure as code and configuration tools such as Terraform, Ansible, or Chef.
  • Strong programming ability in at least one language (preferably Go or Ruby) plus scripting skills in Python or Shell.
  • Experience with observability stacks such as Prometheus, Grafana, and logging systems for troubleshooting and performance analysis.
  • Exposure to data replication, backup/restore, or migration scenarios where data integrity and downtime risk are critical.
  • Experience working in on-call environments and handling production incidents under pressure.
  • Strong communication skills with the ability to engage customers during migrations and incidents.
  • Ability to work independently in a remote, asynchronous environment with a strong ownership mindset.
  • Clear problem-solving skills with a focus on long-term system improvements rather than short-term fixes.

Benefits:

  • Flexible Paid Time Off
  • Equity compensation and Employee Stock Purchase Plan
  • Growth and Development Fund
  • Parental leave
  • Home office support
  • Team Member Resource Groups
  • Global remote-first working environment
  • Inclusive and values-driven culture

Date Posted

04/10/2026

