Lead Site Reliability Engineer (Azure)
Company
EPAM Systems
Location
Pune, India
Type
Full Time
Job Description
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a highly skilled and experienced Lead Site Reliability Engineer with a focus on Azure environments to join our team.
In this crucial role, you will leverage your expertise to enhance the reliability and scalability of our cloud-based platforms, ensuring efficient operation and optimal performance. This position involves collaborating closely with cross-functional teams to migrate existing services to the OpenShift platform and make our infrastructure cloud-agnostic. As a leader, you will guide your team in creating resilient systems and processes that support both internal and external customers relying on our desktop applications and services.
Responsibilities
- Oversee the migration of services to OpenShift and work towards making our infrastructure cloud-agnostic
- Run pipelines using Azure DevOps for environment configuration and application deployment
- Leverage Python, Bash, and PowerShell to automate routine and complex tasks
- Implement and manage Kubernetes and container-based environments
- Monitor cloud resources efficiently and improve system performance in line with SLI metrics
- Debug and resolve operational issues swiftly and effectively
- Collaborate with development and operations teams to ensure system reliability and security
- Mentor team members and lead by example in maintaining best practices for site reliability
- Continuously assess, improve, and optimize existing system architecture and applications
- Stay up to date with technological advancements and integrate innovative tools and techniques
Requirements
- 5+ years of experience as a Systems Engineer with a development background
- 1+ years of relevant leadership experience
- Proficiency in Linux and Docker with hands-on experience in Kubernetes
- Capability to use at least one of the following scripting languages: Python, Bash, PowerShell
- Background in infrastructure management including networking and operating systems
- Familiarity with monitoring tools in cloud environments and understanding of SLI concepts
- Familiarity with Azure and/or GCP as cloud service providers
- Experience working with Windows
- Knowledge of CI/CD pipelines, particularly Azure DevOps
- Understanding of Istio and GitOps tools like ArgoCD
We Offer
- Opportunity to work on technical challenges that may have an impact across geographies
- Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn Learning solutions
- Possibility to relocate to any EPAM office for short and long-term projects
- Focused individual development
- Benefit package:
  - Health benefits
  - Retirement benefits
  - Paid time off
  - Flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)
Date Posted
01/21/2025