Large Model Inference Engine R&D Engineer |
|
Job Source |
Tencent Group |
Location |
China, Shanghai |
Salary |
Negotiable |
Job Type |
Full Time |
Language |
|
Job Posted Date |
11-06-2025 |
Job Description |
|
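1. Develop and optimize large model inference engines;
2. Promote the engine to public cloud customers, building technical advantages that lead customers to move to the cloud;
3. Work with customers' businesses to analyze performance bottlenecks and to locate and resolve problems;
4. Assist with deployment on internal business clusters, iterate continuously on performance, and maintain an industry-leading edge. |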
|
Job Requirements |
|
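1. Familiar with mainstream large model inference frameworks such as vLLM, LightLLM, TensorRT-LLM, LMDeploy, and FasterTransformer;
2. Familiar with at least one of CUDA, Triton (https://openai.com/research/triton), or CUTLASS; mastery preferred;
3. Familiar with large model architectures and their performance bottlenecks; skilled at analyzing performance hotspots and optimization techniques in both single-machine and distributed settings;
4. Familiar with large model quantization algorithms, including int8, fp8, and mixed-precision quantization; knowledge of model distillation, sparsification, and pruning;
5. Familiar with inference serving frameworks; service deployment experience preferred; understanding of k8s, containerized services, and the implementation principles of Triton Inference Server (https://github.com/triton-inference-server/server) preferred;
6. Familiar with distributed model deployment and parallelism strategies such as model parallelism and pipeline parallelism; knowledge of NVLink and GPU communication preferred;
7. Proficient in Python and C++;
8. Knowledge of GPU architecture preferred.
Bonus points: 1. All else being equal, candidates who hold a Tencent Cloud certification or an equivalent qualification will be given priority. |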