First details of Singularity, Microsoft’s Artificial Intelligence infrastructure service

Microsoft is developing a new Artificial Intelligence infrastructure service. It is named Singularity, and teams of experts from Microsoft Azure and Microsoft Research are working on it. They have been collaborating on the project for some time, but only now has the Redmond company decided to make some details public. Part of what is known comes from various job postings for the group, which seek professionals to work on «a new AI platform service built from the ground up that will become a huge driver for AI, both inside and outside of Microsoft».

In addition, a group of 26 Microsoft experts working on its development, including Azure CTO Mark Russinovich, has published a paper entitled «Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads», with some technical details about the project. The paper also notes that Singularity is designed to give data scientists and others who work with Artificial Intelligence a way to develop, scale and experiment with their models on a distributed infrastructure service tailored to AI.

The paper explains that «at the core of Singularity is a new workload-aware scheduler that can transparently preempt and elastically scale deep learning workloads to drive high utilization without impacting their correctness or performance, across a fleet of accelerators, among them GPUs and FPGAs». It also says that this service, thanks to the scheduler, will help the company control costs.

The scheduler is what the paper covers in most detail, although it offers some other data that give clues about the system’s architecture. For example, a performance analysis of Singularity mentions a test run on Nvidia DGX-2 servers, each with a dual-socket Xeon Platinum 8168 (20 cores per socket), eight V100 GPUs, 692 GB of RAM and an InfiniBand-based network. The Singularity fleet includes thousands of GPUs, as well as FPGAs and possibly other accelerators.

The system software automatically decouples tasks from accelerator resources, which means that scaling a task up or down is a matter of «simply changing the number of devices the workers are mapped to. This is completely transparent to the user, as the size of the task remains the same no matter how many physical devices are running it». This is possible thanks to a new technique called «replica splicing», with which «multiple workers can be time-sliced on the same device with minimal overhead, while allowing each worker to use all of the device’s memory».
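As a rough illustration of the decoupling described above, the following Python sketch keeps the number of logical workers fixed while the number of physical devices changes. It is not Singularity's implementation: the function name `map_workers_to_devices` and the round-robin placement are assumptions made only for this example, and the sketch ignores the memory management that real time-sharing of a device would require.

```python
def map_workers_to_devices(world_size, num_devices):
    """Assign each logical worker rank to a physical device, round-robin.

    Hypothetical helper for illustration only. The number of workers
    (world_size) never changes; only the device count does.
    """
    if world_size < 1 or num_devices < 1:
        raise ValueError("world_size and num_devices must be positive")
    mapping = {device: [] for device in range(num_devices)}
    for rank in range(world_size):
        mapping[rank % num_devices].append(rank)
    return mapping


# A job with 8 workers on 8 devices: one worker per device.
full_scale = map_workers_to_devices(world_size=8, num_devices=8)

# The same job scaled down to 2 devices: 4 workers share each device
# and would take turns on it, while the job still sees 8 workers.
scaled_down = map_workers_to_devices(world_size=8, num_devices=2)
```

In both cases the job itself is unchanged; only the placement of its workers differs, which is why the change can be hidden from the user.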

To do this, the experts working on Singularity rely on what they call a «device proxy», which «runs in its own address space and has a one-to-one correspondence with a physical accelerator device». Thus, when a worker makes device API calls, «they are intercepted and sent, via shared memory, to the device proxy process, which runs in a different address space, and whose lifetime is independent of that of the worker process». All of this makes it possible to schedule more tasks and to do so more efficiently, keeping thousands of servers in use for longer. It also allows tasks to be scaled up or down quickly and without problems.
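The proxy arrangement described above can be pictured with a small Python simulation. This is only a sketch under strong assumptions: everything runs in a single process, the `Channel` class stands in for the shared-memory transport, and the names `DeviceProxy`, `Channel`, `WorkerShim`, `malloc` and `free` are hypothetical, not taken from the Singularity paper. The point it shows is that device state lives in the proxy, so it survives when a worker goes away.

```python
class DeviceProxy:
    """Stand-in for the per-device proxy: it owns the device state."""

    def __init__(self, device_id):
        self.device_id = device_id
        self.buffers = {}        # handle -> size in bytes (simulated memory)
        self._next_handle = 0

    def handle(self, call, *args):
        # Dispatch an intercepted device call by name.
        if call == "malloc":
            (size,) = args
            handle = self._next_handle
            self._next_handle += 1
            self.buffers[handle] = size
            return handle
        if call == "free":
            (handle,) = args
            del self.buffers[handle]
            return None
        raise ValueError(f"unknown device call: {call}")


class Channel:
    """Hypothetical transport; a real system would use shared memory
    between two separate processes."""

    def __init__(self, proxy):
        self.proxy = proxy

    def send(self, call, *args):
        return self.proxy.handle(call, *args)


class WorkerShim:
    """Intercepts a worker's device calls and forwards them to the proxy."""

    def __init__(self, channel):
        self.channel = channel

    def malloc(self, size):
        return self.channel.send("malloc", size)

    def free(self, handle):
        self.channel.send("free", handle)


proxy = DeviceProxy(device_id=0)
channel = Channel(proxy)

worker = WorkerShim(channel)
first = worker.malloc(1024)
second = worker.malloc(2048)
del worker                               # the worker goes away...
state_after_exit = dict(proxy.buffers)   # ...but the proxy keeps its state

replacement = WorkerShim(channel)        # a new worker attaches later
replacement.free(first)
```

Because the worker never holds device state itself, it can be stopped, moved or replaced without the proxy losing track of what is allocated on the device.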

This is not the first time that Microsoft has discussed plans to make FPGAs (field-programmable gate arrays) available as a service to its customers. In 2018 Microsoft went public with Project Brainwave, designed to offer fast Artificial Intelligence processing in Azure. At that time, Microsoft unveiled a preview of Azure Machine Learning Hardware Accelerated Models, powered by Brainwave in the cloud.

This was seen as a first step in making FPGA processing accessible to customers for AI workloads. Singularity may be the next phase in Brainwave becoming a service available to customers, but it is not clear at this time whether that is the case, nor when it will be publicly available.
