Site Reliability Engineer

Tata Consultancy Services · Columbus, OH

Company

Tata Consultancy Services

Location

Columbus, OH

Type

Full Time

Job Description

Job Description Summary

  • The Site Reliability Engineer (SRE) works across teams to build and operate highly reliable systems while minimizing restrictions on release velocity.
  • You'll take a broad view and respond strategically to problems with a focus on reliability and member experience.
  • As an SRE, you'll spend at least 50% of your time developing software to improve observability and reliability.
  • You'll help and protect our members by ensuring the availability and performance of our critical systems and by ensuring our product teams achieve and maintain high standards of quality and efficiency.
  • With your understanding and work across value streams, from user-facing applications to underlying platforms, you'll partner with numerous development and infrastructure teams to solve challenging problems.

Key Responsibilities:

  • Applies strategies in monitoring, automation, performance, and process engineering to improve the user experience, application architecture, and resiliency.
  • Diagnoses availability, latency, and performance issues, and makes improvements in code and configuration to achieve service level objectives efficiently at scale with minimal human intervention.
  • Assists peers and leaders to influence and guide product teams to implement SRE principles and practices.
  • Creates and implements tools of moderate to low complexity to automate toil and improve the reliability of the systems.
  • Supports services in production as part of a 24x7 on-call rotation.
  • Works with architects and engineers to design reliability into new and existing systems.
  • Works to ensure reliable interactions between systems and Software as a Service (SaaS) providers through engineering and relationship management.
  • May perform other responsibilities as assigned.

Experience:

  • Four years or more of technology experience with application development or system management, using multiple technologies within one or more domains.
  • Proven experience with CI/CD, infrastructure as code, and other modern technology practices, with at least two years of experience building and operating distributed systems.
  • Experience improving availability, latency and performance with a systematic problem-solving approach.
  • Strong technical and communication skills; knowledge of planning, management, and execution of the Accelerated Solutions Delivery framework; and Information Security acumen.
  • Insurance/financial services industry knowledge a plus.

Date Posted

09/16/2023