Optimizing a Neural Network? Prepare to Hit a Wall

Over the years, the history of computer vision has gone hand in hand with the history of compute power. It is no surprise, therefore, that some of the major breakthroughs in this field have happened in the past decade or so, alongside major advances in cloud computing and Graphics Processing Units (GPUs). The cloud or a massive GPU, as well as modern dedicated ASICs, can provide plenty of resources for the compute-intensive problem of visual processing.

But what about applications where neither the cloud, a bulky GPU, nor an inflexible ASIC is a good fit? Many considerations can lead to that situation, such as round-trip latency, battery power consumption, cost or sheer physical size, all of which can be a hindrance when it comes to computer vision for smaller cars, mass-production sedans and IoT edge devices. Catering to such use cases requires efficient algorithms that do not mandate massive processors and can extract more AI performance out of any given hardware. This realization has triggered an effort to optimize computer vision technology, and Deep Neural Networks (DNNs) in particular, to enable efficient edge computing solutions.

Ramifications of running computer vision on a tiny grain of silicon

Moving from the abundant resources of the cloud or a GPU to a few square millimeters of silicon means optimizing quite a few factors: performance, accuracy, memory footprint and power consumption. These directly affect the end product in terms of supported frame rates, detection quality, physical size and price point. On top of that, there are tradeoffs between some of these factors, so optimizing one can cause another to degrade. As explained below, this is the case with the tradeoff between performance and accuracy, which can eventually cause the optimization process to hit a wall.

Beyond a certain point, optimizing a DNN stalls due to the performance/accuracy tradeoff

Is it inevitable to choose between performance and accuracy?

Many companies that have tried taking an open-source DNN and optimizing it to fit their needs have come to realize that, beyond a certain point, improving performance comes at the expense of accuracy. This is a no-go for many use cases, especially in automotive, where the demand for accuracy is directly related to safety, or in use cases that are sensitive to false alarms, such as smart cameras.
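As a rough, hypothetical illustration of that tradeoff, the sketch below applies magnitude pruning to a toy layer. Pruning is used here purely as a stand-in for common compression techniques, not as Brodmann17's method: as more weights are zeroed out, the arithmetic needed drops, but the layer's outputs drift further from the original, which in a full network typically shows up as lost accuracy.

```python
import torch

torch.manual_seed(0)
weight = torch.randn(64, 64)   # toy weight matrix of a single layer
x = torch.randn(8, 64)         # a small batch of inputs
reference = x @ weight.T       # output of the unoptimized layer

for sparsity in (0.0, 0.5, 0.9):
    # zero out the smallest-magnitude weights to reach the target sparsity
    k = max(1, int(sparsity * weight.numel()))
    threshold = weight.abs().flatten().kthvalue(k).values
    pruned = torch.where(weight.abs() >= threshold, weight, torch.zeros_like(weight))
    kept = (pruned != 0).float().mean().item()               # fraction of multiply-accumulates left
    drift = (x @ pruned.T - reference).abs().mean().item()   # how far outputs move from the original
    print(f"sparsity={sparsity:.1f}  compute kept={kept:.2f}  output drift={drift:.4f}")
```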

There are several reasons why the optimization process can degrade accuracy. First, the original network is often a general-purpose one that was not designed for the specific use case at hand. Furthermore, the available high-quality open-source DNNs have already been exhaustively optimized, which leaves little room for further significant improvements that do not degrade accuracy.

One of the main optimization techniques used with open-source DNNs is identifying identical or similar computations and reusing their results to eliminate redundancy. Alas, as mentioned above, most of these optimizations have already been implemented in the original code, and most of the remaining ones are subject to the performance/accuracy tradeoff.
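To make computation reuse concrete, here is a minimal, hypothetical PyTorch sketch (the module names and shapes are made up for illustration): a naive two-headed detector runs the same backbone once per head, while the reuse variant computes the shared feature map once and feeds it to both heads, producing identical outputs with roughly half the backbone compute.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
backbone = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())  # shared trunk
box_head = nn.Conv2d(16, 4, 1)   # hypothetical bounding-box head
cls_head = nn.Conv2d(16, 2, 1)   # hypothetical classification head
image = torch.randn(1, 3, 128, 128)

# Naive: each head recomputes the identical backbone features.
boxes = box_head(backbone(image))
classes = cls_head(backbone(image))

# Reuse: the shared computation is done once and fed to both heads.
features = backbone(image)
boxes_r, classes_r = box_head(features), cls_head(features)

# Same outputs, roughly half the backbone work.
assert torch.allclose(boxes, boxes_r) and torch.allclose(classes, classes_r)
```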

To break free of this seemingly inescapable tradeoff, Brodmann17 has designed a maverick DNN from scratch, without relying on an available open-source network as a starting point.

Instead of starting from a network full of redundancies and optimizing it, the new patented technology has no redundancy to begin with, as computation reuse was baked in from square one. As a result, while post-hoc DNN optimization is prone to accuracy degradation, the Brodmann17 DNN provides high performance and accuracy by design. It is important to note that this approach is not yet another optimization method that might sacrifice detection quality for performance; rather, it is a solution designed from the ground up to provide top accuracy at a fraction of the compute power.

DNN optimization on the left is prone to accuracy degradation, while the Brodmann17 DNN on the right already has optimal performance and accuracy by design

This pioneering solution has been engineered so that DNN weights and calculations are constantly shared and reused. The unique design leads to a smaller model size, as depicted above, and thus requires far less computation and memory than comparable neural networks. The Brodmann17 algorithm can therefore run on embedded platforms without massive or expensive hardware, and can extract far more AI performance out of any given processor. This enables a plethora of new use cases: ADAS and autonomous driving systems can benefit from more accurate, slim and power-efficient solutions, and IoT edge devices can add computer vision to their feature list using their existing chipset.
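To see why sharing weights shrinks a model, consider a minimal, hypothetical PyTorch sketch (not the actual Brodmann17 architecture): one variant stores a separate convolution per stage, while the other applies a single convolution at every stage, so the stored parameter count drops roughly in proportion to the number of stages.

```python
import torch.nn as nn

STAGES = 4

# Separate weights per stage: parameter count grows linearly with depth.
separate = nn.ModuleList([nn.Conv2d(32, 32, 3, padding=1) for _ in range(STAGES)])

# Shared weights: the same convolution is reused at every stage,
# so only one set of parameters is stored.
shared = nn.Conv2d(32, 32, 3, padding=1)

def count(module):
    return sum(p.numel() for p in module.parameters())

print("separate stages:", count(separate))  # 4 x (32*32*3*3 + 32) = 36,992 parameters
print("shared stage:   ", count(shared))    #     (32*32*3*3 + 32) =  9,248 parameters
```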