Observability Automation Architect
Company
IBM
Location
Bangalore, India
Type
Full Time
Job Description
At IBM, we are driven to shift our technology to an as-a-service model and to help our clients transform themselves to take full advantage of the cloud. With industry leadership in AI, analytics, security, commerce, and quantum computing, and with unmatched hardware and software design and industrial research capabilities, no other company is as well positioned to address the full opportunity of enterprise cloud computing. We are looking for a lead SRE architect to join our IBM Cloud VPC Observability team. This team is dedicated to ensuring that IBM Cloud is at the forefront of reliable enterprise cloud technology. We are building Observability platforms to deliver performance, reliability, and predictability for our customers' most demanding workloads at global scale and with leadership efficiency, resiliency, and security.
Your Role and Responsibilities
- Implement and administer infrastructure and solutions that support the IBM Cloud VPC
- Support the compliance and security integrity of the environment through your work
- Partner with other teams, functional managers, and program managers to deliver mission-critical services to the market
- Support development of new capabilities and enhancement of existing ones for our compute, storage, and network services
- Adopt and build on automation solutions governed by SRE principles, including CI/CD pipelines, configuration management, immutable infrastructure deployment, auto-healing systems, etc.
- Provide technical escalation support for other Infrastructure Operations teams
- Conceptualize, design, implement, and manage reliable, highly performant, scalable automation solutions that build consistency across our infrastructure
- Work with and adopt open source technologies, and participate in new IBM innovations across IaaS
- Bring a self-driven attitude to propose, test, and implement solutions and improvements for review and consideration with your peers
Required Technical and Professional Expertise
- 5+ years of experience in data center infrastructure or relevant work experience
- 5+ years of experience in large-scale infrastructure design, engineering, and support
- 5+ years of experience in IT Change, Incident, Problem, and Asset management
- 5+ years of infrastructure engineering with a proven record of delivering high-quality, large-scale solutions. Experience designing architectures for scale and performance
- 5+ years of practical experience with one or more operating systems: Ubuntu (preferred), CentOS, RHEL, or Debian Linux, and Windows Server
- 5+ years of experience debugging issues across a Linux environment with network, storage, compute, and orchestration components. Does not need to be code debugging
- Development experience with one or more programming languages: PowerShell, Python (preferred), and Ruby
- 2+ years of practical experience with orchestration that uses desired state models and/or finite state machine models of orchestration: Kubernetes (preferred), OpenShift, etc.
- 5+ years of practical experience with containerization and container orchestration: Docker (preferred), Kubernetes (preferred), OpenShift, Rancher, Docker Swarm, Docker Compose
- 5+ years of experience with monitoring technologies: Sysdig (preferred), Grafana, Nagios, Zenoss, ELK, Splunk, Zabbix, etc.
- Familiarity with OpenTelemetry concepts, tracing, metrics, events, and other observability principles
- 2+ years of experience with one or more virtualization technologies: Citrix Xen Hypervisor (preferred), KVM (also preferred), libvirt, QEMU, VMware vSphere, etc.
- 5+ years of experience with one or more automation and configuration management tools/solutions: Ansible and Terraform (preferred), Chef, Python, Bash, Puppet, Rundeck, etc.
- 2+ years of experience with version control systems: GitHub (preferred), GitLab, Subversion, etc.
- Basic experience with databases, both RDBMS such as MySQL or PostgreSQL and non-relational databases such as etcd, TimescaleDB, InnoDB, etc. Not a DBA role
- Working knowledge of network and storage technologies
- Working knowledge of ServiceNow, JIRA, Confluence, and GitHub
- ITIL Foundation V4 certification is a plus
Preferred Technical and Professional Expertise
- Excellent verbal and written communication skills
- Highly responsible, motivated, and able to work with little direction
- Experience with design and development of complex systems
- Ability to troubleshoot complex problems and customer issues
- Working knowledge of Linux clustering, HA, and fault-tolerant system implementations: active/active, active/passive, Pacemaker, keepalived, HAProxy, Corosync, LVM
- 2+ years of experience with complex systems and layered architecture models: OSI, Kubernetes, virtualization, TCP/IP, etc.
- Working knowledge of what TCP/IP, BGP, sockets, routing protocols, routes, and keepalived are, and of the part they play in debugging and in highly available systems at scale
- Ability to debug an issue across the entire OSI stack of a typical Linux environment, across storage, network, compute, OS, system tuning, and orchestration
- Ability to debug stack traces down to particular libraries in code and identify the root cause
- Working knowledge of a message bus and message queues: Kafka (preferred), Spark, RabbitMQ, Redis, etc.
- Extensive experience with databases and debugging their usage with application stacks
- Experience with and understanding of the interactions and dependencies in a typical three-tier model of application stacks, as well as cloud
Date Posted
09/18/2024