Vela: IBM's first supercomputer optimized for AI

IBM has presented Vela, its first cloud-native supercomputer optimized for artificial intelligence (AI), designed with the flexibility to “easily” deploy an infrastructure of this magnitude in any IBM Cloud data center worldwide.

The American technology company has explained that AI models have a growing impact on our day-to-day lives, relying on increasingly complex algorithms and new techniques that, in turn, require more resources and more raw computing power in order to function properly.

This is where supercomputers come into play. However, as the American multinational explains, traditional supercomputers were not designed for AI development and, although they work well for it, their traditional design involves technology choices that “increase costs and limit flexibility of implementation”.

With all this in mind, IBM has developed Vela, a cloud-native supercomputer specifically optimized for AI development, which can be deployed in any of its data centers and at lower cost. According to the company, the system has been running since May of last year, hosted in the IBM Cloud. For the moment its use is reserved for the IBM Research community, where it has become the “go-to environment” for researchers building more advanced AI capabilities.

IBM has explained that by building the supercomputer in the cloud it sacrificed some performance but, in exchange, significantly increased productivity. This is possible because everything the system needs can be configured through software. It can also use the data sets that already reside in IBM Cloud Object Store, so there is no need to build a separate storage back end.

Along the same lines, the company has explained that it can use its virtual private cloud (VPC) networking capabilities to collaborate with partners “using advanced security practices.”

As for the performance penalty that the cloud implies, IBM has stressed that after “a significant amount of research and discovery” it managed to bring the virtualization overhead below 5 percent by exposing all of the node's capabilities (GPU, CPU, networking and storage) to the virtual machine (VM).

As for the composition of the AI supercomputer's nodes, each one combines GPUs with 80GB of memory apiece, 1.5TB of dynamic random access memory (DRAM) and, as local storage, four 3.2TB NVMe (non-volatile memory express) drives. This configuration reflects the intention of using Vela to train large AI models.

Each node has two 2nd generation Intel Xeon processors. Vela also uses NVLink and NVSwitch to interconnect the eight 80GB A100 graphics processing units (GPUs) in each node.
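To put the per-node figures above in perspective, the following is a minimal Python sketch that simply adds them up. The class and method names (`VelaNode`, `total_gpu_memory_gb`, `total_nvme_tb`) are hypothetical and chosen only for illustration; the numbers are the ones quoted in this article, not an official IBM specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VelaNode:
    """Per-node figures as quoted in the article (illustrative only)."""
    gpus: int = 8                # A100 GPUs per node
    gpu_memory_gb: int = 80      # memory per GPU, in GB
    dram_tb: float = 1.5         # DRAM per node, in TB
    nvme_drives: int = 4         # NVMe drives per node
    nvme_drive_tb: float = 3.2   # capacity per NVMe drive, in TB

    def total_gpu_memory_gb(self) -> int:
        # Aggregate GPU memory across all GPUs in the node.
        return self.gpus * self.gpu_memory_gb

    def total_nvme_tb(self) -> float:
        # Aggregate local NVMe capacity in the node.
        return self.nvme_drives * self.nvme_drive_tb


node = VelaNode()
print(f"GPU memory per node: {node.total_gpu_memory_gb()} GB")
print(f"Local NVMe per node: {node.total_nvme_tb():.1f} TB")
```

Under these figures, a single node would offer 640GB of aggregate GPU memory and 12.8TB of local NVMe storage.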

IBM has indicated that it will share more information about upcoming improvements in both end-user productivity and performance, made possible by “emerging systems and software innovations.”
