Staff Machine Learning Engineer, Multimodal Modeling

Jobgether · US

Company

Jobgether

Location

US

Type

Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Staff Machine Learning Engineer, Multimodal Modeling in the United States.

This role is ideal for a senior machine learning engineer passionate about advancing multimodal AI systems. You will lead the development and fine-tuning of embedding-based retrieval models, unifying text and image representations to improve performance, generalization, and cross-modal alignment. The position requires a strong foundation in representation learning and experience applying state-of-the-art methods to real-world problems. You’ll collaborate closely with engineering and product teams to design scalable, extensible systems, while tackling complex research challenges independently. The environment is fast-paced, high-impact, and innovative, offering the opportunity to shape the future of AI-driven search and recommendation systems. The role offers remote work flexibility and opportunities to lead and mentor others on the team.

Accountabilities:

  • Lead the design, development, and fine-tuning of multimodal models (e.g., CLIP, SigLIP) for embedding-based retrieval systems.
  • Unify and improve cross-modal representations, ensuring high performance and extensibility for evolving product use cases.
  • Implement and optimize model architectures, training loops, loss functions, and data pipelines for real-world applications.
  • Evaluate and improve vector similarity search, contrastive learning methods, and embedding quality metrics.
  • Collaborate with engineering teams to deploy scalable, reliable, and maintainable AI systems.
  • Contribute to team growth by mentoring colleagues and providing technical leadership.
  • Conduct research and development to explore new modeling approaches and emerging AI techniques.

Requirements:

  • 7+ years of industry experience in machine learning, specializing in representation learning, multimodal modeling, or embedding-based retrieval.
  • Deep expertise in at least one domain: computer vision, natural language processing, or recommendation systems.
  • Proficiency in PyTorch and experience fine-tuning foundation models for real-world tasks.
  • Demonstrated ability to customize model architectures, training procedures, and evaluation methods.
  • Strong engineering skills in Python, and familiarity with Git, SQL, and Bash.
  • Experience with multi-GPU and distributed training workflows is a plus.
  • Knowledge of model compression techniques, such as distillation, quantization, or pruning, is desirable.
  • Ability to work independently, navigate ambiguity, and solve open-ended modeling challenges.

Benefits:

  • Competitive salary range of $200,000–$240,000, with equity options.
  • Flexible PTO and 11 company holidays.
  • Fully paid health, dental, and vision benefits with HSA match.
  • 12 weeks of fully paid parental leave, with additional physical recovery time for birthing parents.
  • Fertility, adoption, and surrogacy support up to a $50,000 lifetime maximum.
  • Caregiver support programs and 1:1 equity tax advisor sessions.
  • Work-from-home and productivity stipends, plus home office setup support.
  • Access to employee resource groups and professional development opportunities.

Date Posted

04/08/2026
