Accelerated Distributed AI Applications on the AWS Cloud
For the past decade or so, many companies have focused their deep learning training on accelerators such as GPUs to make deep learning more accessible and cost-effective over the long run. However, some AI/ML workloads may require different hardware, depending on their complexity. As computations and algorithms grow increasingly intricate, Alvarez argued, it is essential for developers and researchers to become more compute-aware: to understand how deeply an AI workload has been optimized. To become a compute-aware AI/ML programmer, he encouraged aspiring developers to keep performance in mind, making conscious decisions about the software and hardware at every layer of the stack.
In the rapidly growing field of data science, one skill compute-aware developers should possess, according to Alvarez, is the ability to build and deploy a Kubernetes application. Kubernetes manages a user's infrastructure through features such as autoscaling, helping programmers run containerized (Dockerized) applications in a distributed fashion across a cluster of machines. Alvarez explained the general anatomy and function of a Kubernetes system, including how it dynamically recovers from compute failures by rescheduling workloads.
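A minimal sketch of that pattern, assuming the official `kubernetes` Python client and a cluster reachable through the local kubeconfig: it creates a Deployment that Kubernetes keeps at three replicas (rescheduling pods when one fails) and attaches a HorizontalPodAutoscaler so the replica count grows with CPU load. The image name, namespace, and resource figures are illustrative assumptions, not details from the talk.

```python
from kubernetes import client, config

config.load_kube_config()  # reads ~/.kube/config for cluster credentials
apps = client.AppsV1Api()

# A Deployment asks Kubernetes to keep 3 replicas of a containerized app
# running; if a pod or its node fails, the control plane reschedules it.
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="inference-app"),
    spec=client.V1DeploymentSpec(
        replicas=3,
        selector=client.V1LabelSelector(match_labels={"app": "inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "inference"}),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="inference",
                        image="example.com/inference:latest",  # hypothetical image
                        resources=client.V1ResourceRequirements(
                            requests={"cpu": "1"}, limits={"cpu": "2"}
                        ),
                    )
                ]
            ),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)

# A HorizontalPodAutoscaler scales the Deployment between 3 and 10 replicas,
# targeting 70% average CPU utilization across the pods.
autoscaling = client.AutoscalingV1Api()
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="inference-app"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="inference-app"
        ),
        min_replicas=3,
        max_replicas=10,
        target_cpu_utilization_percentage=70,
    ),
)
autoscaling.create_namespaced_horizontal_pod_autoscaler(
    namespace="default", body=hpa
)
```

Declaring desired state this way, rather than starting containers by hand, is what lets Kubernetes address failures dynamically: the control plane continuously reconciles the cluster back toward the three healthy replicas the Deployment requests.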
Alvarez briefly discussed Intel's AI toolkit, showcasing its Intel Cloud Optimization Modules (ICOMs), open-source codebases with codified AI optimizations and instructions for selecting appropriate hardware on each Cloud Service Provider (CSP). To show a module in action, he demonstrated how to configure and deploy an application for loan default risk prediction, using the oneDAL hardware-level accelerations exposed by the daal4py library, which are available only on Intel hardware. Additionally, he gave a tour of the module's codebase, as well as the supporting AWS cloud infrastructure.
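As a sketch of that acceleration pattern (assuming the module follows the common XGBoost-to-daal4py conversion flow; the synthetic data, feature shapes, and hyperparameters below are illustrative stand-ins, not the ICOM's actual code): a gradient-boosted classifier is trained with stock XGBoost, then converted into oneDAL's model format so inference runs through daal4py's Intel-optimized kernels.

```python
import numpy as np
import xgboost as xgb
import daal4py as d4p

rng = np.random.default_rng(0)
X = rng.standard_normal((10_000, 20)).astype(np.float32)  # stand-in loan features
y = (rng.random(10_000) < 0.2).astype(np.int32)           # stand-in default labels

# Train a binary default-risk classifier with stock XGBoost.
booster = xgb.train(
    {"objective": "binary:logistic", "max_depth": 6, "eta": 0.1},
    xgb.DMatrix(X, label=y),
    num_boost_round=100,
)

# Convert the trained booster into oneDAL's gradient-boosted-tree format.
daal_model = d4p.get_gbt_model_from_xgboost(booster)

# Run prediction through daal4py, which dispatches to oneDAL's
# hardware-accelerated kernels on Intel CPUs.
result = d4p.gbt_classification_prediction(nClasses=2).compute(X, daal_model)
print(result.prediction[:5].ravel())  # predicted class labels
```

The design point here is that training code stays unchanged; only the inference path is swapped to the accelerated library, which is typically where a deployed risk model spends most of its compute.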