DevOps - Automation & Monitoring Lead

Western Digital · Other US Location

Company

Western Digital

Location

Other US Location

Type

Full Time

Job Description

Company Description

As a leading provider of storage solutions, Western Digital relies more and more heavily on software development workflows. As Secure Development Factory (SDF) Senior Infrastructure Engineer, you will be at the heart of Western Digital’s engineering process, delivering the software development tools and infrastructure that empower engineering teams to develop and deliver high-quality products quickly. The sheer diversity of Western Digital’s products demands a secure development infrastructure that allows teams to develop many types of software at scale without sacrificing security, development velocity, stability, code quality, or code health.


This position requires partnering with WDC engineering teams, IT teams, and others to deliver the right scalable tools and infrastructure that help engineers develop, test, debug and release software and products quickly. We impact thousands of Western Digital developers worldwide and millions of customers by increasing the pace of product development, improving product quality, and directly impacting WDC’s bottom line.


The ideal candidate will have a passion for technology, a relentless focus on the customer experience, and the ability to multitask, assimilate data, make decisions, and prioritize complex work while paying attention to the details. Clear, professional communication with internal customers, vendors, and co-workers is an absolute must.

Job Description

We are seeking a skilled and proactive DevOps Observability and Automation Lead to join our dynamic team. In this role, you will be responsible for enhancing our DevOps automation framework and the observability and reliability of our applications. You will collaborate closely with engineering, operations, and development teams to ensure our systems are highly available, scalable, and efficient.

Key Responsibilities:

  • Design and implement observability strategies, including monitoring, logging, and alerting solutions, to ensure high availability and performance of systems.
  • Lead efforts to automate infrastructure deployment, configuration management, and continuous integration/delivery pipelines.
  • Develop and maintain tools for deployment, monitoring, and operations, ensuring operational best practices are followed.
  • Extensive experience in shell scripting/programming and systems automation tools (Ansible preferred; also Salt, Puppet, Chef, Kickstart, Terraform).
  • Strong documentation skills and the ability to work at a fast pace amid rapid change.
  • Excellent analytical, problem solving, and troubleshooting skills to manage complex process and technology issues.
  • Expertise in custom workflow design, automation, and product license management.
  • Collaborate with software development teams to integrate observability and automation into the software development lifecycle.
  • Analyze system performance and reliability metrics to identify and address bottlenecks and optimize performance.
  • Implement security best practices in observability and automation solutions.
  • Mentor and coach team members on best practices related to observability tools, automation frameworks and DevOps methodologies.

Qualifications


  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proven experience as a DevOps Engineer, Site Reliability Engineer, or in a similar role with a focus on observability and automation.
  • Hands-on experience with observability tools such as Prometheus, Grafana, Icinga, and the ELK stack.
  • Proficiency in scripting and automation using Python, Shell scripting, or similar languages.
  • Strong understanding of containerization technologies (e.g., Docker, Kubernetes) and cloud platforms (e.g., AWS, Azure, GCP).
  • Experience with infrastructure-as-code tools such as Terraform, Ansible, or Chef.
  • Excellent troubleshooting and problem-solving skills.
  • Ability to work effectively in a fast-paced, dynamic environment.
  • Experience with CI/CD pipelines and related tools (e.g., Jenkins, GitLab CI).
  • Knowledge of agile methodologies and software development lifecycle.

Date Posted

11/18/2024
