DevTech Engineer - Windows LLM and GenAI Open-Source Ecosystem

NVIDIA • Munich, Germany / Remote

Company

NVIDIA

Location

Munich, Germany / Remote

Type

Full Time

Job Description

For more than two decades, NVIDIA has pioneered visual computing, the art and science of computer graphics. With our invention of the GPU - the engine of modern visual computing - the field has expanded to PC games, movie production, product design, medical diagnosis, research and AI.

Today, Large Language Models (LLMs) and Generative AI are changing our world. They help us be productive and collaborative, fuel our creativity, and enable communication across language barriers. Because these models are computationally extremely demanding, NVIDIA's technology is a driving force behind their wider adoption in datacenters and edge computing.

For our team in Wuerselen we are now looking for a Developer Technology Engineer to ...

contribute to the LLM & GenAI open-source ecosystem to enable Windows AI enthusiasts and developers with innovative models and functionality as well as speed-of-light performance on RTX.
engage with our strategic partners and internal teams to overcome the challenges arising when deploying modern LLM & GenAI architectures on local workstations.

What you'll be doing:

Improve the Windows LLM & GenAI user experience on NVIDIA RTX by working on feature and performance enhancements to open-source software (OSS), including but not limited to projects like PyTorch, llama.cpp, and ComfyUI.
Engage with internal product teams and external OSS maintainers to align on and prioritize OSS enhancements.
Work closely with internal engineering teams and external app developers on solving local end-to-end LLM & Generative AI GPU deployment challenges, using techniques like quantization or distillation.
Apply powerful profiling and debugging tools to analyze the most demanding GPU-accelerated end-to-end AI applications and detect insufficient GPU utilization that results in suboptimal runtime performance.
Conduct hands-on training sessions, develop sample code, and host presentations that give guidance on efficient end-to-end AI deployment targeting optimal runtime performance.
Guide developers of AI applications in applying methodologies for efficient adoption of DL frameworks, targeting maximal utilization of GPU Tensor Cores for the best possible inference performance.
Collaborate with GPU driver and architecture teams as well as NVIDIA Research to influence next-generation GPU features by providing real-world workflows and giving feedback on partner and customer needs.

What we need to see:

5+ years of professional experience in local GPU deployment, profiling, and optimization.
BS or MS degree in Computer Science, Engineering, or a related field.
Strong proficiency in C/C++, Python, software design, and programming techniques.
Familiarity with and development experience on the Windows operating system.
Proven theoretical understanding of Transformer architectures - specifically LLMs and Generative AI - and convolutional neural networks.
Experience working with open-source LLM and GenAI software, e.g. PyTorch or llama.cpp.
Experience with CUDA and NVIDIA's Nsight GPU profiling and debugging suite.
Strong verbal and written communication skills in English, along with strong organizational skills and a logical approach to problem solving, time management, and task prioritization.
Excellent interpersonal skills.
Some travel is required for conferences and for on-site visits with external partners.

Ways to stand out from the crowd:

Experience with GPU-accelerated AI inference driven by NVIDIA APIs, specifically cuDNN, CUTLASS, TensorRT.
Confirmed expert knowledge of Vulkan and/or DX12.
Familiarity with WSL2 and Docker.
Detailed knowledge of the latest generation GPU architectures.
Experience with AI deployment on NPUs and ARM architectures.

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.

Date Posted

12/03/2024
