NVIDIA is seeking a highly skilled and motivated CPU Optimization Engineer to join our Compute Developer Technology (Devtech) team. In this role, you’ll research, design, and implement performance optimization strategies across a broad range of workloads—including AI data preprocessing, scientific and HPC applications, and emerging data-intensive computing domains—on NVIDIA Grace/Vera CPUs. You’ll engage directly with developers and researchers to help them achieve breakthrough performance and efficiency while influencing NVIDIA’s future CPU and software architecture.
What you will be doing:
Collaborate with developers, researchers, and framework maintainers across industries to identify and resolve performance challenges in diverse workloads such as AI, data analytics, simulation, and numerical computing.
Profile, analyze, and optimize CPU performance from application-level algorithms down to low-level microarchitecture.
Contribute to open-source frameworks, key software stacks, reference implementations, and performance libraries to unlock full CPU potential.
Work closely with NVIDIA’s architecture, research, libraries, tools, and system software teams to improve our overall platform performance.
Provide insights that shape next-generation CPU designs, compiler toolchains, and development workflows for better developer productivity and throughput.
What we need to see:
BS, MS, or PhD in Computer Science, Computer Engineering, or a related field.
5+ years of relevant experience in performance engineering or CPU optimization.
Strong programming proficiency in C/C++ and/or Python, with a deep understanding of algorithms and software architecture.
Solid grasp of CPU microarchitecture, performance analysis tools, and optimization methodologies.
Proven track record of CPU benchmarking and bottleneck-driven performance tuning.
Excellent communication and organizational skills, with the ability to collaborate effectively across teams and manage multiple priorities.
Ways to stand out from the crowd:
Experience optimizing AI or data preprocessing pipelines on CPUs.
Familiarity with HPC applications, parallel computing, and distributed runtime environments.
Hands-on experience with SIMD instruction sets, low-level intrinsics, or vectorization.
Contributions to open-source performance tools or HPC frameworks.
With highly competitive salaries, a comprehensive benefits package, and a great company culture, NVIDIA is widely considered to be one of the technology industry's most desirable employers. We have some of the most forward-thinking and hardworking people in the world, and our engineering teams are growing rapidly. If you are a creative and autonomous engineer with a real passion for technology, we want to hear from you.
NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status, or any other characteristic protected by law.