Connecting World's top Talents with Premier Jobs and Networking.

Large Model Inference Engine R&D Engineer


Job Source

Tencent Group

Location

China, Shanghai

Salary

Negotiable

Job Type

Full Time

Language

Job Posted Date

11-06-2025

Job Description

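1. Develop and optimize large-model inference engines;
2. Promote adoption among public cloud customers, building technical advantages that lead customers to move to the cloud;
3. Work with customers' business teams to analyze performance bottlenecks and to locate and resolve problems;
4. Assist with internal business cluster deployments, iterate continuously on performance, and maintain an industry-leading position.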

Job Requirements

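1. Familiar with mainstream large-model inference frameworks such as vLLM, LightLLM, TensorRT-LLM, LMDeploy, and FasterTransformer;
2. Familiar with at least one of CUDA, Triton (https://openai.com/research/triton), and CUTLASS; expert-level proficiency preferred;
3. Familiar with large-model architectures and their performance bottlenecks; skilled at analyzing performance hotspots and optimization techniques in both single-machine and distributed settings;
4. Familiar with large-model quantization algorithms (INT8, FP8, and mixed-precision quantization), with an understanding of model distillation, sparsification, and pruning techniques;
5. Familiar with inference serving frameworks; service deployment experience preferred; understanding of k8s, containerized services, and the implementation principles of Triton Inference Server (https://github.com/triton-inference-server/server) preferred;
6. Familiar with distributed model deployment and parallelism strategies such as model parallelism and pipeline parallelism; understanding of NVLINK and GPU communication preferred;
7. Proficient in Python and C++;
8. Understanding of GPU architecture preferred.

Bonus points:
1. All else being equal, candidates who have passed Tencent Cloud certification or hold an equivalent qualification will be given priority.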


