As often happens with quickly and widely adopted innovations, the very factors that make them successes can create unforeseen problems that threaten to overwhelm them. Witness the incredible volume of data the IoT creates even in these early days.
Fortunately, the solution to this data explosion — “edge” processing — promises to also multiply the IoT’s benefits.
That volume of data generated by sensors and other IoT devices is objectively overwhelming. To pick just one example, a typical smart factory will create about 1 petabyte of data daily. Experts predict that by 2025, the total volume of data generated by the IoT in all its manifestations will exceed 175 zettabytes: 10 times the 2016 level.
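To put that in perspective, here’s a quick back-of-the-envelope calculation (a Python sketch of my own, assuming the 1-petabyte-per-day figure above and decimal units) of the sustained uplink a single factory would need just to push its raw data to the cloud:

# Back-of-the-envelope: sustained uplink needed to ship 1 PB/day of raw factory data.
PETABYTE_BYTES = 10**15
SECONDS_PER_DAY = 24 * 60 * 60

daily_bytes = 1 * PETABYTE_BYTES
required_bps = daily_bytes * 8 / SECONDS_PER_DAY  # bits per second

print(f"Sustained uplink needed: {required_bps / 1e9:.1f} Gbit/s")
# Roughly 93 Gbit/s, around the clock, for a single factory.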
Understandably, such a volume overwhelms the conventional data-processing infrastructure, especially the combination of the cloud, data transmission, and centralized processing. Even worse, perhaps the IoT’s most important advantage, the ability to apply data almost instantly to fine-tune things’ operations, is diminished if real-time data must be relayed to the cloud, analyzed, and then returned to the collection point before it finally triggers action.
As a result, there’s a growing trend toward IoT data analysis at “the edge”; i.e., at or near the point of collection.
What exactly is the edge? According to the new Data at the Edge report from a coalition of edge computing companies, it is where “real-time decision-making untethered from cloud computing’s latency takes place [my emphasis].” That’s critical, because it’s the ability to analyze data in near-real time and act on it immediately that differentiates the IoT from past practices, which had to rely on spotty, historical data and centralized, delayed analysis and implementation. (Of course, the report’s findings should be viewed with some skepticism because of the sponsors’ edge advocacy. But of all the material I’ve read over the past few years about the emergence of edge technology, it’s the best pocket guide to edge strategy, so ignore it at your own risk.)
Let’s clarify from the beginning: the edge and the cloud are not mutually exclusive. In fact, as Data at the Edge states, “it will be cloud with edge.” However, just apply a common-sense test: the capacity simply doesn’t exist at present to gather huge volumes of data from sensors in remote locations such as wind turbines and deep-sea drilling platforms, transmit the entire collection to the cloud, process it centrally, and then return the results to the edge. That will become even more the case in the near future, with developments such as autonomous vehicles, where low latency and rapid, data-driven adjustment are a matter of life or death.
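The life-or-death framing is easy to quantify. Here’s a minimal sketch, using hypothetical latency figures of my own (none of them come from the report), of how far a vehicle travels while a decision waits on a cloud round trip instead of local processing:

# Illustrative only: the latency figures below are assumptions, not measurements.
def distance_travelled_m(speed_kmh: float, latency_ms: float) -> float:
    """Metres covered during the given latency at the given speed."""
    return (speed_kmh / 3.6) * (latency_ms / 1000)

SPEED_KMH = 100            # assumed highway speed
CLOUD_ROUND_TRIP_MS = 100  # assumed round trip to a distant cloud region
EDGE_DECISION_MS = 10      # assumed local, on-vehicle or roadside processing time

print(f"Cloud round trip: {distance_travelled_m(SPEED_KMH, CLOUD_ROUND_TRIP_MS):.1f} m travelled")
print(f"Edge decision:    {distance_travelled_m(SPEED_KMH, EDGE_DECISION_MS):.1f} m travelled")
# Roughly 2.8 m versus 0.3 m before the vehicle can even begin to react.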
Processing at the edge isn’t as simple as scaling down the cloud and placing mini-clouds at random. For example, much of the data is gathered in isolated locations, on pipelines or planes, where it’s hard for maintenance workers to adjust the sensors and/or processing equipment, and, worst of all, in some highly sensitive operations such as critical infrastructure, the devices may be tampered with. Thus, the design must be hardened, self-healing, and secure.
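What “self-healing” can mean in practice: below is a minimal sketch (the names and thresholds are mine, not the report’s) of a watchdog that restarts a processing task that has gone silent, since no technician is nearby to do it by hand.

# Sketch of a self-healing edge task: restart locally when heartbeats stop.
import threading
import time

HEARTBEAT_TIMEOUT_S = 30   # assumed: how long silence is tolerated before restarting
CHECK_INTERVAL_S = 5

class EdgeTaskWatchdog:
    def __init__(self, start_task):
        self._start_task = start_task              # callable that (re)launches the task
        self._last_heartbeat = time.monotonic()
        self._lock = threading.Lock()

    def heartbeat(self):
        """Called by the processing task each time it completes a cycle."""
        with self._lock:
            self._last_heartbeat = time.monotonic()

    def run(self):
        self._start_task()
        while True:
            time.sleep(CHECK_INTERVAL_S)
            with self._lock:
                silent_for = time.monotonic() - self._last_heartbeat
            if silent_for > HEARTBEAT_TIMEOUT_S:
                # No maintenance worker on site: recover locally rather than wait for a visit.
                self._start_task()
                self.heartbeat()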
The key factor increasing the volume of data generated at the edge is more powerful, less energy-hungry sensors, which can also make the information they gather more valuable.
In particular, the report cites progress with video sensors:
"… As the resolution and frame rates for video have gotten more advanced, the camera lens has emerged as the most prolific and versatile IoT sensor. Image sensors capture rich context that cannot be acquired with single-purpose devices, like a temperature sensor that periodically takes measurements. When combined with deep learning (DL) algorithms running on powerful graphics processing units (GPU) and tensor processing units (TPU), images open up new uses.”
Deploying these new sensors creates a virtuous cycle: “The more sensors a system uses, the more data the system collects—and the better the predictive model built with the data can be. This, in turn, leads to demand for even more instrumentation with more sensors.”
Data at the Edge also concludes that it’s cheaper to do the analytics at the edge and send on to the cloud only the small portion of the data that is truly significant and might be valuable for further analysis later. Part of the reason is that cloud facilities are so large and consume so much energy that they can realistically be located only in rural areas with cheap real estate and abundant energy, such as hydro power, which makes them inherently remote from the point of use.
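What that split can look like in code: a minimal sketch (the threshold and field names are mine) in which the edge node keeps the raw stream local and forwards only a compact summary plus the readings that cross an anomaly threshold.

# Sketch: aggregate locally, forward only the summary and the significant readings.
from statistics import mean

ANOMALY_THRESHOLD = 3.0   # assumed: how many spread-units away from baseline counts as significant

def summarize_and_filter(readings, baseline, spread):
    """Return a compact summary plus only the anomalous raw readings."""
    anomalies = [r for r in readings if abs(r - baseline) / spread > ANOMALY_THRESHOLD]
    summary = {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
    }
    return summary, anomalies   # only these go to the cloud; the raw stream stays at the edge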
Edge facilities, unlike those remote cloud data centers, are ideally suited to users’ need for nearby, real-time processing. They require only hundreds of square feet, range in scale from regional down to very small, and can be located in a wide range of places, including parking lots and even right at the base of cell towers.
Furthermore, there are even regulatory reasons favoring nearby processing. Some nations restrict data movement across their boundaries, and some cities require that their data stay in the municipality.
In the final analysis, from a strategic perspective, the ultimate argument for edge processing comes down to the IoT’s essence: enabling real-time collection of data from “things” in the field followed by near-real-time processing—and then acting on that data to increase precision, uncover problems early to allow “predictive maintenance,” and feed accurate information about how things do (or don’t) work in the field to allow rapid upgrades and increase customer satisfaction. The latency that is an inevitable aspect of cloud computing is real-time’s enemy.
Given that, it’s no wonder that Gartner predicts that, by 2022, 75% of all data will need to be analyzed and acted on at the edge.