Sr. Principal Software Engineer
🇺🇸 United States
CUDA
Machine Learning
Design
Security Engineer
$185,000.00 - $280,000.00
A Moving Experience.
Who is Cerence AI?
Cerence AI is the global leader in AI for transportation, specializing in building AI and voice-powered companions for cars, two-wheelers, and more that enable people to focus on what matters most. With over 500 million cars shipped with Cerence AI's technology, we partner with leading automakers (such as Volkswagen, Mercedes, Audi, Toyota and many more), mobility providers, and technology companies to power intuitive, integrated experiences that create safer, more connected, and more enjoyable journeys for drivers and passengers alike.
Our Driving Force
Our team is dedicated to pushing the boundaries of AI innovation, working around the globe with headquarters in Burlington, Massachusetts, USA and 16 other offices across Europe, Asia, and North America. We bring together diverse backgrounds and varied skill sets with the shared goal of advancing the next generation of transportation user experiences. Our culture is customer-centric, collaborative, fast-paced, and fun, with continuous opportunities for learning and development to support your career growth.
Interested in having a significant impact in a dynamic industry with a high-performing global team? We're looking for an exceptional Senior Principal Software Engineer who is ready to drive the future of mobility with us!
Job Description:
What You Will Work On
Optimize and deploy high-performance LLM inference pipelines
Own inference runtimes across data center, edge, and embedded platforms
Push model performance through quantization, kernel fusion, and cache optimization
Drive latency and throughput improvements that directly impact production products
Enable efficient, reliable deployment without external vendor dependency
Core Responsibilities
Inference Engines & Runtime
Build deep expertise and ownership of:
vLLM
TensorRT-LLM
llama.cpp
QAIRT
Extend and tune inference engines using custom CUDA kernels
Adapt runtimes for constrained and embedded deployment environments
Quantization & Numerical Optimization
Implement and evaluate quantization strategies:
INT8, INT4, FP4, FP8, mixed precision
AWQ
GPTQ
Balance accuracy, latency, memory footprint, and throughput
KV Cache Optimization
Optimize key-value cache performance through:
Paging
Prefix caching
Cache-aware memory layout design
Reduce memory pressure while sustaining high throughput
Latency & Throughput Optimization
Design and tune:
Batching strategies
Continuous batching
Speculative decoding
Optimize tail latency and tokens/sec under real production traffic patterns
What Success Looks Like
Models deploy efficiently on edge and embedded devices, not just servers
Throughput in tokens/sec significantly outperforms baseline implementations
End-to-end latency is minimized and predictable
Inference cost per request is materially reduced
The company is no longer dependent on partners for inference optimization
Required Experience & Skills
Strongly Required
Proven experience optimizing ML inference performance in production
Deep understanding of GPU architecture and memory hierarchies
Hands-on experience with CUDA and low-level performance tuning
Experience deploying models beyond research environments
Critical Technical Skills
Inference engines: vLLM, TensorRT-LLM, llama.cpp, QAIRT
CUDA kernel development and profiling
Quantization techniques: INT8/INT4/FP4/FP8, AWQ, GPTQ
KV cache optimization and memory layout design
Latency optimization: batching, speculative decoding, continuous batching
Common Problems You'll Be Solving
Deploy efficiently on edge or embedded targets
Achieve competitive tokens/sec
Reduce and stabilize inference latency
You will be responsible for closing these gaps, creating a major competitive advantage.
What we offer
We offer a generous compensation and benefits package (in addition to the base salary), including:
Salary range: $185,000.00 USD - $280,000.00 USD. It is not typical for offers to be made at or near the top of the range. The actual salary will be determined based on experience and other job-related factors.
Annual bonus opportunity
Insurance coverage (medical, dental, vision, life, and disability)
Paid time off
Paid holidays
Company contribution to the RRSP (Registered Retirement Savings Plan)
Equity awards for certain positions and levels
Remote and/or hybrid work available depending on the position
All compensation and benefits are subject to the terms and conditions of the underlying plans or programs, as applicable, and may be amended, terminated, or replaced from time to time.
Cerence Inc. (Nasdaq: CRNC and www.cerence.com) is the global industry leader in creating unique, moving experiences for the automotive world. Spun out from Nuance in October 2019, Cerence is a new, independent company that has quickly gained traction as a leader in the automotive voice assistant space, working with all of the world's leading automakers, from Ford and Fiat Chrysler to Daimler, Audi and BMW to Geely and SAIC, to transform how a car feels, responds and learns. Its track record is built on more than 20 years of industry experience and leadership and more than 500 million cars on the road today across more than 70 languages.
As Cerence looks to the future and continues an ambitious growth agenda, we need someone to join the team and help build the future of voice and AI in cars. This is an exciting opportunity to join Cerence's passionate, dedicated, global team and be a part of meaningful innovation in a rapidly growing industry.
EQUAL OPPORTUNITY EMPLOYER
Cerence is firmly committed to Equal Employment Opportunity (EEO) and to compliance with all federal, state and local laws that prohibit employment discrimination on the basis of age, race, color, gender, gender identity, gender expression, sex, sex stereotyping, pregnancy, national origin, ancestry, religion, physical or mental disability, medical condition, marital status, citizenship status, sexual orientation, protected military or veteran status, genetic information and other protected classifications. Cerence Equal Employment Opportunity Policy Statement.
All prospective and current Employees need to remain vigilant when it comes to executing security policies in the workplace. This includes:
- Following workplace security protocols and completing training programs to become familiar with the ways to maintain a safe workplace.
- Following security procedures to report any suspicious activity.
- Respecting corporate security procedures so that those procedures can be effective.
- Adhering to the company's compliance requirements and regulations.
- Supporting a zero-tolerance policy toward workplace violence.
- Having basic knowledge of information security and data privacy requirements (e.g., how to protect data and how to handle it).
- Demonstrating knowledge of information security through internal training programs.









