Cost-Efficient Inference Serving and Routing Optimization - MSc and PhD - Summer Internship 2026 - Research Lab

IBM · Multiple Cities

Company

IBM

Location

Multiple Cities

Type

Full Time

Job Description

Introduction

At IBM, work is more than a job - it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.

Your role and responsibilities
We are looking for a highly motivated PhD or MSc student to join our team for a summer internship focused on cost-efficient serving of large-scale AI inference workloads.
The internship will explore advanced routing strategies and KV-cache-aware optimizations in distributed inference systems, with an emphasis on improving performance, scalability, and GPU cost efficiency.
What you will work on
  • Designing and evaluating routing algorithms to optimize inference latency, throughput, and cost
  • Investigating KV cache management strategies for large-scale distributed inference serving
  • Prototyping, benchmarking, and analyzing inference optimization techniques
  • Working with modern inference frameworks and real production-like workloads

Why join us?
This internship offers a unique opportunity to work at the intersection of AI systems and distributed infrastructure, with real-world impact on scalable, cost-efficient inference serving used in production environments.


Required education
Bachelor's Degree
Required technical and professional expertise
  • MSc or PhD student in Computer Science, Machine Learning, Systems, or a related field
  • Strong background or interest in distributed systems, systems research, or ML infrastructure
  • Strong programming skills (Python, Go, or similar)
  • Hands-on experience or familiarity with vLLM (architecture, KV cache behavior, scheduling, or extensions)
  • Interest in AI infrastructure, performance optimization, and cost efficiency
  • Ability to work independently while collaborating effectively within a research and engineering team


Please include your grade sheet with your application.

Preferred technical and professional experience
  • Experience with Kubernetes (K8s) and cloud-native systems
  • Familiarity with inference serving stacks, networking, or GPU-based systems
  • Experience with benchmarking, profiling, or performance analysis

Date Posted

12/28/2025
