This post was first published on the Altran blog.
The world is rapidly coming to the realization that a practically unlimited number of devices will generate massive amounts of data that need to be processed in near real time. Edge computing allows applications to be hosted closer to devices or consumers, so it can deliver low-latency, real-time services to end users. Use cases have been debated and discussed over the past few years, and I have written about edge compute use cases in both industrial and consumer contexts. Last year, we discussed the need for open source in infrastructure and runtime platforms for edge compute with respect to mobile and network operators, and debated what their strategy should be as they move toward monetizing 5G and edge computing. In this post, I will focus on accelerating compute at the network edge.
Hardware Acceleration in Edge Compute and Container Ecosystem
We mostly discuss edge computing in the context of latency-sensitive applications, so the focus has been on networking and on where the edge compute is located. However, the workload itself (the application) runs on that compute, and any delay introduced during computation adds to the overall latency. Many such applications need hardware accelerators, such as field-programmable gate arrays (FPGAs) and graphics processing units (GPUs), to reduce compute latency. There are several reasons why hardware acceleration may be required in edge compute:
- Compute acceleration: Machine learning and artificial intelligence (AI) algorithms are finding their way into edge devices and other connected systems that leverage GPUs, FPGAs, and innovative AI chips.
- Localized low-latency decisions: Systems are being designed to analyze and respond to events in real time, which requires hardware acceleration.
- Security and retaining data: Protecting user data from contamination by rogue data fed into AI systems, and from misuse and tampering, is critical for embedded AI. Accelerating such AI algorithms at the edge can help solve that problem.
- Power and energy requirements: Edge compute nodes have constrained power budgets, and using accelerators in an optimized manner can help reduce the overall power footprint.
Edge compute in service provider networks is also expected to host virtualized network functions (VNFs) and containerized NFs. Examples include virtualized access and virtual baseband units for wireless networks; virtual optical line terminals and virtual broadband network gateways for fixed networks; and virtual cable modem termination systems (vCMTS) for cable networks. In many cases, parts of these network functions may benefit from hardware accelerators such as FPGAs.
In recent years, the container ecosystem built around Kubernetes-based deployments has found rapid adoption among application developers and NF vendors. It is imperative that accelerator vendors support cloud-native frameworks for developing, deploying, and monitoring applications built for their accelerators. However, developing applications for such accelerators poses some challenges. There is a lack of generic frameworks that help application developers create performance-optimized code for accelerators that can be deployed using standard cloud-native orchestrators like Kubernetes. OpenCL is one such framework that attempts to solve this problem, and some accelerator vendors are now actively pursuing projects to support the cloud-native ecosystem. Additionally, there is a need to move workloads from one accelerator to another. This is not an easy problem to solve, particularly because of the memory and compute penalties incurred during the transition, as explained by Randy Levensalor of CableLabs in his blog.
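To make the deployment side concrete, here is a minimal sketch of how a containerized workload typically asks a standard orchestrator for an accelerator. Kubernetes exposes accelerators as extended resources advertised by a device plugin, and a pod requests them under its resource limits. The helper name `accelerator_pod_manifest`, the pod name, and the image below are hypothetical, and `nvidia.com/gpu` is used only as an example of a resource name registered by a vendor device plugin. None of this is taken from Project Adrenaline or SNAPS.

```python
import json


def accelerator_pod_manifest(name, image, resource_name, count=1):
    """Build a Pod manifest (as a dict) that requests `count` accelerator devices.

    `resource_name` is the extended resource advertised by a device plugin,
    for example "nvidia.com/gpu". Extended resources are whole devices, so
    `count` must be a positive integer.
    """
    if not isinstance(count, int) or count < 1:
        raise ValueError("count must be a positive integer")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Extended resources are requested under limits;
                    # the scheduler places the pod on a node that has them free.
                    "resources": {"limits": {resource_name: count}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical workload name and image, for illustration only.
    manifest = accelerator_pod_manifest(
        "inference-demo", "example.com/inference:latest", "nvidia.com/gpu"
    )
    print(json.dumps(manifest, indent=2))
```

On a cluster where the matching device plugin is installed, the printed JSON could be piped to `kubectl apply -f -`. Other accelerators would use whatever resource name their own plugin registers.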
Altran has been investigating this problem through Project Adrenaline, a co-innovation partnership with CableLabs in which we jointly developed open source software and collaborated on proofs of concept. Through Project Adrenaline and the CableLabs SNAPS™ open source effort, we have created an initial set of tools that enable a developer to bootstrap an environment for compute acceleration. This includes automated installation software with pre-built methods for installing low-level drivers for different accelerators and building a workload for them. This project, SNAPS-Boot, implements some of these capabilities, and our Kubernetes bootstrap tool, SNAPS-Kubernetes, can be used to create a cloud-native ecosystem for developers. Randy Levensalor from CableLabs and I introduced SNAPS-Kubernetes in 2018 at the OpenStack Summit in Berlin. See the video here.
In the future, we intend to drive the project toward a cloud-native, end-to-end model for application development, deployment, and monitoring on hardware accelerators, essentially creating accelerator-as-a-service. Some of the key ambitions of this project are:
- Integrate a larger set of hardware accelerators to the platform
- Create an integrated monitoring system for heterogeneous accelerators in cloud-native edge deployments
- Simplify and improve lifecycle management frameworks for acceleration resources
- Develop zero-touch provisioning, multi-tenancy, quality of service (QoS), fault, and security aspects of edge compute hardware accelerators
We strongly believe that a robust container ecosystem that supports accelerators is required for a successful edge strategy. We look forward to hearing how you plan to implement hardware accelerators in your edge compute and what challenges you encounter while doing so. Another key area of interest is network hardware accelerators, such as smart network interface cards. If you would like to discuss Project Adrenaline further or share your experience with us, email us at firstname.lastname@example.org.