In this role, you will be part of the team that builds and supports the Internal Developer Platform (IDP), where all Apptio applications are deployed. In a typical day, you will interact with GitHub, Linux, Kubernetes, ArgoCD, Docker, Confluence, Jira, Slack, and AWS.
The Platform and Reliability Engineering (PRE) team at Apptio is responsible for enhancing and maintaining our IDP and driving the adoption of platform best practices across our engineering teams. We are a distributed team working across three locations: the United States, Poland, and Australia.
When you join IBM, you join a culture of openness, collaboration, and trust. Join us and experience a place where you can co-create your learning and opportunities. A place where teamwork and unique ideas are treasured. A place where you can bring innovation to life.
You Are
Passionate about problem solving, with experience developing platform features designed to improve the developer experience at Apptio and IBM. Your team can count on you to solve challenging problems across the entire Apptio portfolio. You collaborate with other Platform Engineers, developers, and support teams to provide value to the broader organization. You take responsibility for fixing problems in an automated, code-first way and are happy to step outside your comfort zone to develop your skill set.
Responsibilities
- Develop self-service features and services specifically designed to improve developer velocity
- Manage deployments of Apptio services via ArgoCD
- Streamline the CI process via GitHub Actions and create reusable templates for our developers to consume
- Improve observability of the services within your purview by reviewing KPI dashboards and alerting
- Use and contribute to runbooks to troubleshoot and triage production issues
- Detect issues and handle Tier 1-2 troubleshooting
- Participate in online "swarm" collaboration sessions
- Collaborate with Apptio product developers
- Participate in on-call rotation
- Perform maintenance of the platform (patching, resets, upgrades, etc.)
Qualifications
- 2+ years' experience in a Platform Development, DevOps, SRE, or adjacent role
- Experience with at least one programming or scripting language (preferably Golang)
- Experience working with distributed application deployment and management
- Experience working with container technologies (e.g., Kubernetes, Docker)
- Experience working with Infrastructure-as-code (IaC) concepts
- Experience working with cloud provider services such as AWS, Azure, or Google Cloud Platform
- Familiarity with RESTful systems and their APIs
- Desire to work with a remote team
- Fluent English language skills
- Experience with monitoring technologies (Prometheus, Splunk, Datadog, etc.)
- General knowledge of GitOps and CI/CD principles
- Experience with an Internal Developer Platform
- Experience with CNCF products such as Cilium and Karpenter, and good knowledge of the CNCF landscape
- Experience with the HashiCorp product suite (Vault, Terraform, Consul, etc.)