Principal AI Research Scientist, Post-Training Alignment

Jobgether · Canada

Company

Jobgether

Location

Canada

Type

Full Time

Job Description

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Principal AI Research Scientist, Post-Training Alignment, in Canada.

This role sits at the forefront of foundation model research, focusing on post-training, alignment, and reinforcement learning for advanced AI systems. You will work on shaping how large-scale models behave, reason, and interact with real-world constraints, with a strong emphasis on reliability, controllability, and safety. The environment blends cutting-edge academic research with direct product impact, allowing your contributions to move quickly from experimentation to deployment. You will design and evaluate long-horizon reasoning systems, agentic behaviors, and alignment methodologies grounded in both human feedback and structured, domain-based signals. Working alongside world-class researchers and engineers, you will help define evaluation standards and readiness criteria for next-generation AI systems. This is a highly influential role for someone passionate about advancing frontier AI while ensuring robust and responsible model behavior at scale.

Accountabilities:

  • Lead research and development in post-training methods for foundation models, including reinforcement learning, preference optimization, and alignment techniques such as RLHF, RLAIF, DPO, and PPO.
  • Design and develop novel algorithms that improve model reliability, controllability, reasoning ability, and alignment with human and system objectives.
  • Define and execute experimental frameworks to evaluate model behavior, robustness, safety, and long-horizon reasoning performance.
  • Architect evaluation systems for agentic workflows, tool use, and real-world task completion, leveraging both human and automated signals.
  • Make principled decisions on when improvements should be addressed through pre-training, post-training, or system-level design changes.
  • Lead model analysis and interpretability efforts to understand failure modes, trade-offs, and emergent behaviors in large-scale systems.
  • Collaborate with infrastructure teams to build scalable, reproducible post-training pipelines and support large-scale experimentation.
  • Establish model readiness criteria and provide clear go/no-go recommendations for production deployment and releases.
  • Contribute to scientific publications, patents, and external research visibility at leading ML and AI conferences.
  • Communicate technical risks, limitations, and strategic trade-offs to both technical peers and senior stakeholders.

Requirements:

  • Deep expertise in reinforcement learning for foundation models and strong command of post-training methodologies such as RLHF, RLAIF, DPO, PPO, or related approaches.
  • PhD or equivalent industry research experience in machine learning, reinforcement learning, AI, or closely related fields.
  • Proven track record in leading or mentoring research teams in academia, industry labs, or advanced AI organizations.
  • Strong publication history in top-tier ML/AI venues such as NeurIPS, ICML, ICLR, CVPR, or SIGGRAPH.
  • Experience in alignment research, preference learning, agentic AI systems, or large-scale model behavior optimization.
  • Strong intuition for model behavior, failure modes, and trade-offs in post-training and alignment settings.
  • Experience designing evaluation systems and defining model readiness criteria for deployment.
  • Familiarity with large-scale training infrastructure and compute/resource trade-offs in ML systems.
  • Ability to communicate complex technical concepts clearly to both technical and non-technical audiences.
  • Experience working with or deploying production AI systems in applied or research-to-production environments.
  • Prior experience in frontier AI labs or equivalent high-impact research organizations is highly valued.

Benefits:

  • Competitive compensation package including base salary, performance bonuses, and potential stock grants
  • Flexible work arrangements, including remote options across Canada and hybrid setups in major hubs
  • Comprehensive health, dental, and wellness coverage
  • Opportunities to publish and present research at top-tier global AI conferences
  • High-impact research environment with direct pathways to production and real-world deployment
  • Access to large-scale compute infrastructure and advanced AI research tooling
  • Strong culture of innovation, collaboration, and scientific rigor
  • Inclusive and diverse workplace committed to belonging and equal opportunity

Date Posted

06/02/2026
