Senior Site Reliability Engineer

IBM • AU Sydney

Company

IBM

Location

AU Sydney

Type

Full Time

Job Description

Introduction
At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.

Your Role and Responsibilities

In this role, you will be part of a team that develops and supports the Apptio Kubernetes Platform (AKP), where all Apptio applications are deployed. In a typical day, you will interact with GitHub, Linux, Kubernetes, ArgoCD, Docker, Confluence, Jira, Slack, and AWS.

You Are: You are passionate about problem solving and reliability and have significant experience in SRE or an adjacent role. Your team can count on you to solve challenging problems across the entire Apptio portfolio. You collaborate with other SREs, developers, and support teams to help provide value to the broader organization. You take responsibility for fixing problems in an automated, code-first way and are happy to step outside your comfort zone to develop your skillset.

We Are: The Platform and Site Reliability Engineering team (PRE) at Apptio is responsible for enhancing and maintaining our Kubernetes platform and driving the adoption of SRE best practices across our engineering teams. We are a distributed team working across three locations: the United States, Poland, and Australia.

Responsibilities

  • Manage deployments of Apptio services to AKP
  • Streamline the deployment process
  • Improve observability of the services within your purview by reviewing KPI dashboards and alerting
  • Author and maintain documentation of deployment and monitoring processes
  • Use runbooks to troubleshoot and triage production issues
  • Detect issues and handle Tier 3 troubleshooting
  • Participate in online “swarm” collaboration sessions
  • Collaborate with service developers
  • Participate in on-call rotation
  • Perform maintenance of the platform (patching, resets, upgrades, etc.)
  • Operate independently and own end-to-end delivery of solutions
  • Mentor junior SRE members
  • Have significant input into the product roadmap and be able to effectively articulate the benefits of alternative technologies


Required Technical and Professional Expertise

  • 5+ years’ experience in an SRE or adjacent role
  • Functional understanding of at least one programming language (preferably Golang) and source control
  • Expertise with distributed application deployment and management via Kubernetes and/or OpenShift Application Platform
  • Expertise with container technologies (e.g. Kubernetes, Docker, OpenShift)
  • Expertise with Infrastructure-as-Code (IaC) concepts (e.g. Terraform)
  • Expertise with cloud provider services
  • Ability to work with RESTful systems and their APIs
  • Familiarity with observability (e.g. Prometheus, OpenTelemetry)
  • Desire to work with a remote team
  • Fluent English language skills
  • AU resident (location should be Brisbane, Sydney, or Melbourne)


Preferred Technical and Professional Expertise
Same as above.

Date Posted

09/05/2024
