Principal AI Research Scientist Post-Training Alignment at Jobgether

Team: IT

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Principal AI Research Scientist Post-Training Alignment in Canada.

This role sits at the forefront of foundation model research, focusing on post-training, alignment, and reinforcement learning for advanced AI systems. You will work on shaping how large-scale models behave, reason, and interact with real-world constraints, with a strong emphasis on reliability, controllability, and safety. The environment blends cutting-edge academic research with direct product impact, allowing your contributions to move quickly from experimentation to deployment. You will design and evaluate long-horizon reasoning systems, agentic behaviors, and alignment methodologies grounded in both human feedback and structured, domain-based signals. Working alongside world-class researchers and engineers, you will help define evaluation standards and readiness criteria for next-generation AI systems. This is a highly influential role for someone passionate about advancing frontier AI while ensuring robust and responsible model behavior at scale.

Accountabilities:

Lead research and development in post-training methods for foundation models, including reinforcement learning, preference optimization, and alignment techniques such as RLHF, RLAIF, DPO, and PPO.
Design and develop novel algorithms that improve model reliability, controllability, reasoning ability, and alignment with human and system objectives.
Define and execute experimental frameworks to evaluate model behavior, robustness, safety, and long-horizon reasoning performance.
Architect evaluation systems for agentic workflows, tool use, and real-world task completion, leveraging both human and automated signals.
Make principled decisions on when improvements should be addressed through pre-training, post-training, or system-level design changes.
Lead model analysis and interpretability efforts to understand failure modes, trade-offs, and emergent behaviors in large-scale systems.
Collaborate with infrastructure teams to build scalable, reproducible post-training pipelines and support large-scale experimentation.
Establish model readiness criteria and provide clear go/no-go recommendations for production deployment and releases.
Contribute to scientific publications, patents, and external research visibility at leading ML and AI conferences.
Communicate technical risks, limitations, and strategic trade-offs to both technical peers and senior stakeholders.

Requirements:

Deep expertise in reinforcement learning for foundation models and strong command of post-training methodologies such as RLHF, RLAIF, DPO, PPO, or related approaches.
PhD or equivalent industry research experience in machine learning, reinforcement learning, AI, or closely related fields.
Proven track record in leading or mentoring research teams in academia, industry labs, or advanced AI organizations.
Strong publication history in top-tier ML/AI venues such as NeurIPS, ICML, ICLR, CVPR, or SIGGRAPH.
Experience in alignment research, preference learning, agentic AI systems, or large-scale model behavior optimization.
Strong intuition for model behavior, failure modes, and trade-offs in post-training and alignment settings.
Experience designing evaluation systems and defining model readiness criteria for deployment.
Familiarity with large-scale training infrastructure and compute/resource trade-offs in ML systems.
Ability to communicate complex technical concepts clearly to both technical and non-technical audiences.
Experience working with or deploying production AI systems in applied or research-to-production environments.
Prior experience in frontier AI labs or equivalent high-impact research organizations is highly valued.

Benefits:

Competitive compensation package including base salary, performance bonuses, and potential stock grants
Flexible work arrangements, including remote options across Canada and hybrid setups in major hubs
Comprehensive health, dental, and wellness coverage
Opportunities to publish and present research at top-tier global AI conferences
High-impact research environment with direct pathways to production and real-world deployment
Access to large-scale compute infrastructure and advanced AI research tooling
Strong culture of innovation, collaboration, and scientific rigor
Inclusive and diverse workplace committed to belonging and equal opportunity

