Site Reliability Engineering Intern

IBM · New York, NY

Company

IBM

Location

New York, NY

Type

Full Time

Job Description

Introduction
IBM Technology Zone is the one-stop shop for IBMers and business partners to build, show, and share solutions built on IBM technologies to facilitate opportunity progression and customer adoption.

The SRE Intern will learn how to maintain the reliability and performance of our systems, working to ensure the highest levels of uptime.

This role will involve learning and applying SRE practices, being mentored by senior engineers, and collaborating with various teams to implement best practices in site reliability.

Your Role and Responsibilities

  • Implement SRE best practices, including incident management, post-mortem analysis, and capacity planning
  • Collaborate in building and maintaining TechZone automation that deploys and provisions environments at scale
  • Ensure the reliability, availability, and performance of our applications and infrastructure through proactive monitoring, incident response, and optimization
  • Help drive automation efforts to reduce manual intervention, enhance deployment processes, and improve overall system efficiency
  • Take part in team practices that foster a culture of continuous improvement and learning
  • Participate in incident response efforts, conduct root cause analysis, and implement preventive measures to avoid future incidents
  • Maintain detailed and up-to-date documentation of system architecture, operational procedures, and incident reports

Required Technical and Professional Expertise

  • Pursuing a bachelor's degree in computer science or information technology (CIS/MIS)
  • Awareness of and interest in site reliability engineering, DevOps, or a related field
  • Experience in cloud platforms (e.g., AWS, Azure, GCP), containerization technologies (e.g., Docker, Kubernetes), and infrastructure as code tools (e.g., Terraform, Ansible)
  • Experience in one or more programming languages (e.g., Python, Go, Java) for scripting and automation
  • Excellent analytical and problem-solving skills, with the ability to troubleshoot and resolve complex issues efficiently
  • Strong communication and leadership skills, with the ability to collaborate effectively with cross-functional teams

Preferred Technical and Professional Expertise

  • Familiarity with Hybrid Cloud technologies and strategy, including IBM's OCP, Cloud Pak, and Services strategy, and how these elements bring value to clients and end users
  • Experience using telemetry and monitoring software

Date Posted

12/20/2024
