AIOps Architect

IBM • CA Toronto

Company

IBM

Location

CA Toronto

Type

Full Time

Job Description

Introduction
At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.

Your Role and Responsibilities
We are looking for an AIOps Architect to lead the development and deployment of AI-enhanced solutions for IT operations. In this role, you will architect cloud-native, compliant platforms that integrate AIOps, cognitive computing, and machine learning models to improve infrastructure performance, reduce downtime, and enhance system observability. You will design scalable, secure, and resilient systems, develop automated operations, and implement robust security practices to ensure compliance and operational excellence.

As an AIOps Architect, you will guide clients in their digital transformation, using state-of-the-art technologies to build intelligent operations platforms that drive efficiency, enhance system reliability, and support business growth.

Core Responsibilities

  • Architect and deploy hybrid, multi-cloud, and cloud-native solutions to support payments transformation, aligning infrastructure, systems, networking, and data center strategies
  • Architect and implement comprehensive Solution Architectures, High-Level Designs (HLD), and Low-Level Designs (LLD) that ensure seamless integration of cloud-native technologies, AI-enhanced monitoring, and automation tools, adhering to best practices in security, compliance, and governance
  • Develop and deploy strategies to enhance scalability, resilience, and operational efficiency across hybrid and multi-cloud environments, integrating automation, observability, and robust security protocols to support seamless, high-performing, and compliant systems
  • Design and implement solutions that optimize cloud operations, infrastructure management, application performance, DevOps pipelines, security frameworks, network architecture, MLOps, and LLMOps
  • Apply deep expertise in monitoring tools (AppDynamics, Dynatrace, Splunk, Instana, QRadar, AWS CloudWatch, Azure Monitor, Google Operations Suite), with a focus on LLM observability and security, for real-time analytics and anomaly detection
  • Develop advanced monitoring and observability frameworks that cover LLM observability and security, enabling robust tracking of application performance, anomaly detection, and real-time analytics for Large Language Models and other AI/ML workloads
  • Integrate supervised learning models for predictive analytics, employing techniques such as data cleaning, event correlation, and root cause analysis to generate actionable insights that drive proactive incident resolution and optimize system performance
  • Design and implement IT Service Management (ITSM) and ITIL frameworks encompassing incident management, problem management, change management, and service level management to standardize operational workflows and enhance service reliability
  • Use AI/ML models, including machine learning-based anomaly detection and reinforcement learning, to automate incident response, performance tuning, and infrastructure scaling, reducing Mean Time to Detection (MTTD) and Mean Time to Resolution (MTTR)
  • Engineer robust security architectures that include Cloud Native Application Protection Platforms (CNAPP), Zero Trust Network Access (ZTNA), and fully automated DevSecOps pipelines, ensuring compliance with stringent regulatory requirements and maintaining security posture across multi-cloud ecosystems
  • Design and deploy High Availability (HA) and Disaster Recovery (DR) solutions using distributed architectures, multi-zone redundancy, data replication, and automated failover, ensuring minimal service disruption and business continuity in multi-region deployments
  • Implement chaos engineering practices, conducting FURPS (Functionality, Usability, Reliability, Performance, Supportability) testing to identify potential failure points, validate system resilience, and ensure seamless recovery under high-stress conditions
  • Lead end-to-end project lifecycle management, including agile project methodologies, DevOps pipelines, resource allocation, risk management, and milestone tracking, to ensure the successful deployment of scalable, robust, and secure solutions aligned with client objectives


Required Technical and Professional Expertise

  • 8+ years of experience in the design, delivery, and scaling of complex, large-scale IT projects, with a focus on cutting-edge technology solutions across hybrid, multi-cloud, and on-premises environments
  • 3+ years of technical leadership as a solution architect, driving the design, integration, and management of hybrid cloud solutions, including seamless coordination across various cloud environments
  • Demonstrated success in leading highly complex projects from initial solution design through to deployment, managing diverse teams and multi-vendor coordination, and ensuring alignment with strategic business goals
  • Strong background in architecting complex multi-cloud systems leveraging hyperscalers (AWS, Azure, IBM Cloud, Google Cloud), with experience in multi-region deployments, multi-cloud networking, and cross-cloud service integration
  • Proven expertise in designing cloud-native solutions with microservices, containers (Docker, Podman), and orchestration platforms (Kubernetes, OpenShift), ensuring modular, scalable, and resilient deployments
  • In-depth understanding of regulatory compliance, security frameworks, and best practices in designing secure, resilient architectures
  • Familiarity with integrating AI/ML models to enhance monitoring, incident response, and predictive maintenance processes
  • Expertise in emerging technologies such as AI-enhanced operations, automation frameworks, and cloud-native security to future-proof systems and improve operational efficiency


Preferred Technical and Professional Expertise
Same as above


Date Posted

11/26/2024


