Jiusheng Chen’s team just got accelerated.
It’s an amazing achievement for the principal software engineering manager and his crew.
Tuning a Complex System
Bing’s ad service uses hundreds of models that are constantly evolving. Each must respond to a request within as little as 10 milliseconds, about 10x faster than the blink of an eye.
Together, they apply sophisticated techniques to do more work in less time with less computer memory. Model training was based on Azure Machine Learning for efficiency.
Flying With NVIDIA A100 MIG
Next, the team upgraded the ad service from NVIDIA T4 to A100 GPUs.
The latter’s Multi-Instance GPU (MIG) feature lets users split one GPU into several instances.
Chen’s team maxed out the MIG feature, partitioning one physical A100 into seven independent instances. That let the team reap 7x the throughput per GPU, with inference responses within 10ms.
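The arithmetic behind that claim can be sketched with a deliberately simplified model. It is not the team's benchmark. It assumes each MIG instance serves one request at a time and finishes every request in exactly the 10ms budget. The function name `requests_per_second` and both constants are illustrative, not taken from Bing's or NVIDIA's code.

```python
# Simplified throughput model (an assumption for illustration, not measured data):
# every instance handles one request at a time and takes the full latency budget.
LATENCY_BUDGET_MS = 10      # response budget quoted in the article
MIG_INSTANCES_PER_A100 = 7  # maximum MIG split described in the article


def requests_per_second(latency_ms: float, instances: int = 1) -> float:
    """Upper bound on throughput when each instance serves requests serially."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return instances * 1000.0 / latency_ms


single = requests_per_second(LATENCY_BUDGET_MS)
partitioned = requests_per_second(LATENCY_BUDGET_MS, MIG_INSTANCES_PER_A100)
speedup = partitioned / single

print(f"one instance:    {single:.0f} req/s")
print(f"seven instances: {partitioned:.0f} req/s ({speedup:.0f}x)")
```

Under these assumptions, throughput scales linearly with the number of isolated instances as long as each one still meets the latency budget. Real workloads will differ, because batching, model size and the memory available to each instance all affect the result.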
Flexible, Easy, Open Software
Triton enabled the shift, in part, because it lets users simultaneously run different runtime software, frameworks and AI models on isolated instances of a single GPU.
The inference software comes in a software container, so it’s easy to deploy. And open-source Triton — also available with enterprise-grade security and support through NVIDIA AI Enterprise — is backed by a community that makes the software better over time.
Accelerating Bing’s ad system with Triton on A100 GPUs is one example of what Chen likes about his job. He gets to witness breakthroughs with AI.
While the scenarios often change, the team’s goal remains the same — creating a win for its users and advertisers.
Written by admin