Job Description
WatsonX Orders is a technology development group within IBM, based in Silicon Valley and Poland, that develops world-class conversational AI. Our mission is to deliver advanced technology solutions that address real-world, data-driven needs in the customer-facing quick service restaurant environment. We are focused on using state-of-the-art Machine Learning, AI, and related technologies to completely transform the customer experience!
Your Role and Responsibilities
We are currently looking for skilled Platform Engineers to develop, maintain, and support container orchestration (Kubernetes), machine learning workloads, network services, and storage layers across cloud and on-premise environments.
Responsibilities:
- Develop and maintain scalable distributed systems in IBM Cloud, AWS, and on-premise.
- Develop and maintain high-performance k8s clusters across multiple regions.
- Develop and maintain telemetry infrastructure and service instrumentation (Python) for metrics, distributed tracing, and logging.
- Support infrastructure for a petabyte-scale data platform and stream analysis services.
- Work with Audio and Speech AI Engineers to accelerate development and deployment of heterogeneous analysis and training pipelines.
- Participate in the definition and management of SLIs, SLOs, and error budgets for infrastructure and production services.
- Design and implement infrastructure-as-code pipelines.
Required Technical and Professional Expertise
- 2+ years of cloud development experience (IBM Cloud preferred, and AWS) designing, implementing, and supporting cloud-based infrastructure.
- 2+ years of experience architecting, deploying, and supporting Kubernetes in cloud and on-prem environments.
- 2+ years of experience designing and supporting distributed systems.
- Experience writing production code in one or more languages such as Python (preferred), Java, or Go in a microservices environment.
- 2+ years of Linux experience (configuring, supporting, and optimizing). Red Hat experience is a bonus.
Preferred Technical and Professional Expertise
- Familiarity running distributed ML workloads in cluster-orchestrated environments.
- Experience building and supporting telemetry and related infrastructure (OpenTelemetry, Jaeger, Grafana, Prometheus).
- Experience with k8s ecosystem tooling such as Helm and deployment tools such as ArgoCD.
- Experience designing and implementing infrastructure-as-code pipelines.
- Experience designing and implementing traffic routing strategies in edge and microservices environments.
Date Posted
09/27/2024