Senior Reliability Engineer

EPAM Systems Río Grande, Mexico

Company

EPAM Systems

Location

Río Grande, Mexico

Type

Full Time

Job Description

We are seeking a Senior Reliability Engineer to join our remote team. This role is crucial for ensuring our systems' ongoing stability and efficiency, focusing on minimizing downtime and maximizing performance. The ideal candidate will have a proven track record of improving system reliability and a strong technical acumen in managing complex infrastructures. Your expertise will help shape our operational strategies, ensuring our services are robust and resilient against disruptions.

Responsibilities

  • Lead initiatives to enhance system reliability, availability, and resilience
  • Design and implement robust monitoring solutions to proactively identify potential issues
  • Mentor junior engineers in reliability best practices and advanced troubleshooting techniques
  • Collaborate with cross-functional teams to ensure seamless deployments and operations
  • Develop automation scripts to streamline operational processes and reduce human error
  • Conduct detailed root cause analysis for critical incidents and drive continuous improvement
  • Establish and maintain service level objectives (SLOs) and service level indicators (SLIs) to measure system performance
  • Advocate for and implement reliability-focused changes in the software development lifecycle
Requirements

  • Minimum of 3 years of experience in a Reliability Engineer role
  • Advanced scripting skills in Python and PowerShell
  • Strong knowledge of cloud platforms, specifically Azure and GCP
  • Proficient with Azure DevOps pipelines for efficient CI/CD workflows
  • Expertise in debugging and troubleshooting complex systems
  • Experience with monitoring tools such as GCP Cloud Logging, Grafana, and Azure Logs
  • In-depth understanding of Site Reliability Engineering (SRE) principles
  • Fluent English communication skills at a B2 level or higher
Nice to have
  • Experience with Kubernetes and container orchestration platforms
  • Proven ability to lead projects focused on system scalability and disaster recovery planning
  • Familiarity with advanced data analytics and machine learning tools to predict system failures
We offer
  • Career plan and real growth opportunities
  • Unlimited access to LinkedIn learning solutions
  • International Mobility Plan within 25 countries
  • Constant training, mentoring, online corporate courses, eLearning and more
  • English classes with a certified teacher
  • Support for employee initiatives (algorithms club, Toastmasters, agile club and more)
  • Enjoyable working environment (gaming room, napping area, amenities, events, sports teams and more)
  • Flexible work schedule and dress code
  • Collaborate in a multicultural environment and share best practices from around the globe
  • Hired directly by EPAM & 100% under payroll
  • Benefits required by law (IMSS, INFONAVIT, 25% vacation bonus)
  • Insurance: life, and major medical expenses with dental and vision coverage (for the employee and direct family members)
  • 13% employee savings fund, capped at the legal limit
  • Grocery coupons
  • 30-day December bonus
  • Employee Stock Purchase Plan
  • 12 vacation days plus 4 floating days
  • Official Mexican holidays, plus 5 extra holidays (Maundy Thursday and Good Friday, November 2nd, December 24th and 31st)
  • Monthly non-taxable amount for the electricity and internet bills
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
By applying to our role, you are agreeing that your personal data may be used as set out in EPAM's Privacy Notice and Policy.

Date Posted

11/19/2024
