Site Reliability Engineering (SRE) professionals are engineers who specialize in reliability and resiliency, with the right mix of knowledge and skills in software and systems. They are responsible for analyzing business needs, determining problems, and advising on, designing, building, testing, deploying, and maintaining changes to well-engineered information systems and ecosystems.
We're seeking skilled, automation-focused interns to maintain and administer the PowerVS Cloud Infrastructure-as-a-Service environment and provide a reliable and secure offering to clients.
As an automation-focused Site Reliability Engineer intern, you will perform the following tasks:
- Develop, test, deploy, and maintain automation code for the procedures and runbooks defined for PowerVS data center build, operations, and support tasks.
- Develop, test, deploy, and maintain automation code for data collection, log processing, infrastructure monitoring, and backup and restore of critical logs and configuration data in PowerVS data centers.
- Develop automation to reduce manual toil by automating repetitive tasks with shell scripts (bash, etc.), Python, Ansible, and related tools and languages.
- Develop, test, deploy, and maintain automation code to perform code stack updates on infrastructure systems (VIOS, firmware, PowerVC, HMC, Novalink, NIM servers) as well as cloud supporting systems (jump servers, sobox, network nodes, gateways, TSM servers).
- Develop, test, deploy, and maintain automation code to upload and maintain stock images offered in PowerVS environments.
- Develop, test, deploy, and maintain automation code to remotely administer AIX and Linux servers and to maintain user IDs (add/delete) and passwords.
Significant scripting/coding experience for automating various aspects of IBM Power systems administration.
- Automation using Python, shell scripting (bash, etc.), Ansible, and related tools and languages.
- Experience with AIX and/or Linux administration commands and networking.
- Experience with Red Hat OpenShift and Kubernetes.
- Experience with DevOps, CI/CD, and Terraform.
- Good communication: ability to communicate effectively.
- An automation mindset: wherever possible, you should look to use scripting and automation.
Experience with configuring and tuning IBM AIX, VIOS, and PowerVC.
- IBM Cloud CLI, APIs, and Terraform.
- Knowledge of IBM Power Systems, IBM Storage Systems, Cisco ACI, and Juniper vSRX.
- Understanding of system monitoring (Nagios, ELK stack).