
Sponsored Article

Moving AI to the edge

Author: Yuan Lee (Advantech) & Dr. Maurits Kaptein (Scailable)

31 August 2021


Over the last decade, Machine Learning (ML) and Artificial Intelligence (AI) have become more pervasive. Speech recognition, object detection, and product recommendation are just a few of the applications enabled by powerful AI models.

Today, almost all of these AI models live in the cloud: data is sent to the cloud to use the AI and the results are subsequently returned. Tomorrow, we argue, AI models will mostly live on the edge, not in the cloud. With the appropriate tooling available, moving AI models to the edge is effortless and reduces the latency, energy consumption, and costs of AI in various applications.

AI training versus deployment

To understand the need for moving AI to the edge, we first need to demystify AI. Novice AI users associate AI with large amounts of data and extensive computing resources: the common picture is that terabytes of data are fed to powerful GPUs. This is partly true: yes, terabytes of data and powerful GPUs are often necessary to create (or train) AI models, and AI training is therefore inherently a cloud activity. However, using trained AI models is different: a surprising number of powerful AI models can, and for many purposes should, be deployed to relatively small and omnipresent edge devices such as ubiquitous 32-bit gateways.

Consider the use of AI for object detection: we might want to count the number of people in a room, identify whether a pallet is obstructing the walking route in a production facility, or use vision to recognise product anomalies. All of these applications rely on AI for object detection. To create such an AI model, we do indeed need loads of data and fast processing. Creating an AI model starts with a vast number of labelled examples consisting of both the input to the model and the desired output (e.g., images of varying numbers of people in a room, together with the correct count for each image). Creating such a dataset is non-trivial, and hence developing new object recognition models is challenging. Next, given a large, labelled dataset, a powerful computer can learn the – often complex – rules that map the input to the output. The process of training an AI model in many ways boils down to iteratively trying new sets of rules that map the input image to the desired output, over and over again, while gradually improving performance. To do this in a reasonable time we need powerful GPUs.
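As an illustration, the sketch below shows what such a training loop might look like in PyTorch for a hypothetical people-counting model. The data, architecture, and hyperparameters are stand-ins chosen for brevity, not a recipe taken from any particular deployment.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical labelled dataset: images and the correct people counts.
images = torch.randn(1024, 3, 64, 64)           # stand-in for real labelled images
counts = torch.randint(0, 10, (1024,)).float()  # stand-in for the correct counts
loader = DataLoader(TensorDataset(images, counts), batch_size=32, shuffle=True)

# A small convolutional model: the "rules" live in its weights.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(16, 1),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Training: iterate over the labelled data many times, nudging the rules each pass.
for epoch in range(10):
    for batch_images, batch_counts in loader:
        optimiser.zero_grad()
        predictions = model(batch_images).squeeze(1)
        loss = loss_fn(predictions, batch_counts)
        loss.backward()    # work out how the rules should change
        optimiser.step()   # adjust them slightly
```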

However, once we have trained the AI model and want to use it, things are different. We are now dealing with one specific set of rules – those that were learned during training – and the need for GPUs effectively vanishes. Furthermore, we are only dealing with single datapoints: we want to know the number of people in a given image. Even if we want to use AI on video, and thus deal with 24 frames per second, the number of computations does not come close to the number of iterations needed to train a model. Thus, training an AI model and using an AI model are very different things. Surprisingly to some, using AI can often be done on relatively small devices and is thus not restricted to the cloud.
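Continuing the hypothetical sketch above, using the trained model is just a single forward pass per frame, with no gradients and no GPU involved:

```python
import torch

# Inference with the trained model from the sketch above: one forward pass
# per camera frame, with gradient tracking switched off.
model.eval()
with torch.no_grad():
    frame = torch.randn(1, 3, 64, 64)   # stand-in for one camera frame
    people_count = model(frame).round().item()
    print(f"Estimated number of people: {people_count:.0f}")
```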

The benefits of edge deployment of trained AI models

While the inherent differences between AI training and AI usage make it theoretically possible to use AI models outside of the cloud, it might not immediately be clear why we would want to move AI usage out of the cloud to various edge and IoT devices in practice. Cloud execution of AI seems to bring a number of benefits: when using AI in the cloud we do not need to worry about the diversity of edge devices (i.e., varying chipsets, operating systems, etc.). Furthermore, cloud usage of AI seems appealing as we can often use the same tools we used to train the AI models (i.e., TensorFlow, PyTorch, etc.), which means that the same data scientists who create the models can also ensure that the models can be used. These arguments, however, overlook a number of important benefits of moving AI models to the edge:

1. Low latency: When AI models are used on edge devices the necessary computations can happen close to the source. Thus, there is no need for sending data to and from the cloud, and latency is greatly reduced. 
2. Low energy: Sending data to and from the cloud consumes significant energy. Running AI models efficiently on edge devices significantly reduces the energy consumption of many AI applications.
3. Low costs: Advanced AI applications come with high network and cloud computation costs. Being able to reduce network traffic and execute the computations on existing edge devices lowers the total cost of ownership of AI.

For applications in which data is generated on the edge and the results from the AI model are also needed on the edge, as is the case with many vision and vibration-sensing AI applications, the benefits of edge AI quickly outweigh the perceived benefits of running AI in the cloud. This is especially true now that new tools have emerged that allow for the modular, safe, and efficient deployment of trained AI models to edge devices.

Tools and methods for edge AI

Trained AI models are simply sets of rules, often specified in terms of matrix multiplications and convolutions, that map the input data (e.g., an image) to the desired output (e.g., a number representing the count of people in the image). Once learned, these rulesets can often be expressed in megabytes of data (as opposed to the terabytes of training data needed to learn them). The term “Edge AI” simply means bringing this ruleset to the data, as opposed to repeatedly sending data to the cloud to evaluate the ruleset.
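To make this concrete, the minimal sketch below spells out such a ruleset by hand: a tiny two-layer network whose fixed weights (randomly generated here purely for illustration; in practice they would be loaded from a file of a few megabytes) map a flattened image to a count using nothing more than matrix multiplications.

```python
import numpy as np

# Once training is done, a model is just fixed numbers plus simple arithmetic.
# These hypothetical "learned" weights occupy roughly 3 MB in float32.
W1, b1 = np.random.randn(12288, 64).astype(np.float32), np.zeros(64, np.float32)
W2, b2 = np.random.randn(64, 1).astype(np.float32), np.zeros(1, np.float32)

def predict(image_pixels: np.ndarray) -> float:
    """Apply the fixed ruleset: two matrix multiplications and a ReLU."""
    hidden = np.maximum(image_pixels @ W1 + b1, 0.0)  # matrix multiply + ReLU
    return float(hidden @ W2 + b2)

flat_image = np.random.rand(12288).astype(np.float32)  # a 64x64x3 image, flattened
print(predict(flat_image))
```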

However, cloud execution of AI is still the norm. This is because those who create AI models understandably work on cloud systems, and because an ecosystem of tools to “deploy” AI models to the cloud exists. Any data scientist who knows how to create an AI model will also know how to spin up, for example, a Docker container that exposes a REST endpoint so that the AI model can be used in the cloud. This process has become effortless over the last few years, and is fully modular; i.e., it allows for continuous and secure updates of the AI model without interfering with the surrounding ecosystem. These modular updates enable continuous innovation by a data science team.
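A minimal sketch of that familiar cloud route is shown below: a small Flask endpoint wrapping the hypothetical people-counting model, ready to be packaged in a container. The route name and the people_counter module are assumptions made for illustration only.

```python
from flask import Flask, jsonify, request
import numpy as np

from people_counter import predict  # hypothetical module wrapping the trained model

app = Flask(__name__)

@app.route("/count-people", methods=["POST"])
def count_people():
    # The edge device must ship every frame to this endpoint and wait for the
    # answer: exactly the round trip that edge deployment removes.
    pixels = np.asarray(request.get_json()["pixels"], dtype=np.float32)
    return jsonify({"count": predict(pixels)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```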

Until recently, such tools did not exist for the effective, secure, and modular deployment of AI to edge devices. Containerised solutions, while simple to use and modular, consume too many resources to be of use for AI deployment on many edge devices. As an alternative, the tinyML community has developed numerous impressive tools for compiling ML/AI models for various targets. This approach is attractive for deploying relatively simple models efficiently to small devices such as MCUs, but it does not provide modular updates – and hence hinders continuous innovation. Furthermore, this deployment process does not align well with the continuous development processes that are common in the data science community. For edge AI to truly become successful we need tools that align with the process of AI development, and that allow for secure, modular, and efficient deployment.

Recent technological advances are making secure, modular, and efficient edge AI deployment possible. The emergence of WebAssembly, which can be thought of as an intermediate representation optimised for portability, has made it possible to move code to edge devices modularly and efficiently. In many respects, the WebAssembly runtime can be thought of as a secure container, providing modularity at the cost of a few kilobytes of disk space (as opposed to gigabytes for most container solutions). Combine this with recent tooling to store models independently of their training tools (such as the ONNX standard), and tooling for compilation from ONNX to WebAssembly as created by Scailable, and it becomes possible to move complex AI models to the edge efficiently and modularly. Thus, we are now able to combine the benefits of the edge (low latency, low energy, and low-cost deployment of AI models) with the traditional benefits of the cloud: modularity and alignment with the data science process.
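The first step of that pipeline, storing the trained ruleset in a framework-independent format, might look like the sketch below, which exports the earlier hypothetical PyTorch model to ONNX. The file name and input shape are assumptions, and the subsequent ONNX-to-WebAssembly compilation is handled by tooling such as Scailable's rather than shown here.

```python
import torch

# Export the trained model (from the earlier sketch) to ONNX so the ruleset is
# stored independently of the training framework. File name and input shape
# are illustrative assumptions.
example_input = torch.randn(1, 3, 64, 64)
torch.onnx.export(
    model,                   # the trained PyTorch model
    example_input,           # example input that fixes the input shape
    "people_counter.onnx",   # portable ruleset, typically a few megabytes
    input_names=["image"],
    output_names=["count"],
)
# From here, tooling such as Scailable's can compile the ONNX file to
# WebAssembly for modular, efficient deployment on edge devices.
```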

The future of AI usage is on the edge

With appropriate tooling, we believe the future of many AI solutions will be on the edge. Yes, we will use the cloud for AI training, and yes, applications for which the models run in the cloud will remain. But for many applications edge AI is simply the better alternative. To further push the boundaries of edge AI, Advantech and Scailable are jointly creating plug-and-play edge AI runtimes that are pre-installed on Advantech gateways. Using the edge AI runtime, all “edge” engineering has been automated. Thus, AI models can be deployed to a gateway without any additional coding on the device, and the deployment process seamlessly integrates with the data science process.

Advances in tooling make edge AI achievable for any data scientist. And, as models can be standardised and deployed anywhere, this tooling opens up the possibility of a future marketplace of AI models, such that AI users do not need to worry about the cloud, the terabytes of data, or the GPUs: they can simply select the AI functionality they desire and use it on their edge device.

