MIT's MCUNet brings deep learning to Internet of Things (IoT) devices
Deep learning is everywhere. This branch of artificial intelligence curates your social media and serves your Google search results. Soon, deep learning could also check your vitals or set your thermostat.
MIT researchers have developed a system that could bring deep learning neural networks to new (and much smaller) places, like the tiny computer chips in wearable medical devices, household appliances, and the 250 billion other objects that constitute the "internet of things" (IoT).
The system, called MCUNet, designs compact neural networks that deliver unprecedented speed and accuracy for deep learning on IoT devices, despite limited memory and processing power. The technology could facilitate the expansion of the IoT universe while saving energy and improving data security.
The research will be presented at next month's Conference on Neural Information Processing Systems. The lead author is Ji Lin, a PhD student in Song Han's lab in MIT's Department of Electrical Engineering and Computer Science. Co-authors include Han and Yujun Lin of MIT, Wei-Ming Chen of MIT and National Taiwan University, and John Cohn and Chuang Gan of the MIT-IBM Watson AI Lab.
The Internet of Things
The IoT was born in the early 1980s. Grad students at Carnegie Mellon University, including Mike Kazar '78, connected a Coca-Cola machine to the internet. The group's motivation was simple: laziness. They wanted to use their computers to confirm the machine was stocked before trekking from their office to make a purchase. It was the world's first internet-connected appliance. "This was pretty much treated as the punchline of a joke," says Kazar, now a Microsoft engineer. "No one expected billions of devices on the internet."
Since that Coke machine, everyday objects have become increasingly networked into the growing IoT. That includes everything from wearable heart monitors to smart fridges that tell you when you're low on milk. IoT devices often run on microcontrollers: simple computer chips with no operating system, minimal processing power, and less than one thousandth of the memory of a typical smartphone. So pattern-recognition tasks like deep learning are difficult to run locally on IoT devices. For complex analysis, IoT-collected data is often sent to the cloud, making it vulnerable to hacking.
"How do we deploy neural nets directly on these tiny devices? It's a new research area that's getting very hot," says Han. "Companies like Google and ARM are all working in this direction." Han is too.
With MCUNet, Han's group codesigned two components needed for "tiny deep learning," the operation of neural networks on microcontrollers. One component is TinyEngine, an inference engine that directs resource management, akin to an operating system. TinyEngine is optimized to run a particular neural network structure, which is selected by MCUNet's other component: TinyNAS, a neural architecture search algorithm.
System-algorithm codesign
Designing a deep network for microcontrollers isn't easy. Existing neural architecture search techniques start with a big pool of possible network structures based on a predefined template, then gradually narrow it down to the one with high accuracy and low cost. While the method works, it's not the most efficient. "It can work pretty well for GPUs or smartphones," says Lin. "But it's been difficult to directly apply these techniques to tiny microcontrollers, because they are too small."
So Lin developed TinyNAS, a neural architecture search method that creates custom-sized networks. "We have a lot of microcontrollers that come with different power capacities and different memory sizes," says Lin. "So we developed the algorithm [TinyNAS] to optimize the search space for different microcontrollers." The customized nature of TinyNAS means it can generate compact neural networks with the best possible performance for a given microcontroller, with no unnecessary parameters. "Then we deliver the final, efficient model to the microcontroller," says Lin.
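One way to picture "optimizing the search space for different microcontrollers" is a filter over candidate network sizes, keeping only those that fit a given chip. The sketch below is a toy illustration, not the TinyNAS algorithm itself: the `Budget` class, the `estimate_cost` formula, the candidate widths and resolutions, and the compute-based ranking are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Budget:
    """Hypothetical memory limits of one microcontroller."""
    sram_bytes: int   # working memory available for activations
    flash_bytes: int  # storage available for weights


def estimate_cost(width: float, resolution: int) -> tuple[int, int]:
    """Toy cost model (an assumption, not measured on hardware).

    Returns (peak activation bytes, weight bytes) for a network scaled
    by a width multiplier and fed square images of the given resolution.
    """
    channels = int(32 * width)                     # 32 base channels, 1 byte each (int8)
    peak_activation = resolution * resolution * channels
    weights = int(1_000_000 * width * width)       # weights grow roughly with width squared
    return peak_activation, weights


def fitting_configs(budget, widths, resolutions):
    """Keep only the (width, resolution) pairs that fit the chip."""
    fits = []
    for width, resolution in product(widths, resolutions):
        activation, weights = estimate_cost(width, resolution)
        if activation <= budget.sram_bytes and weights <= budget.flash_bytes:
            fits.append((width, resolution))
    return fits


def pick_largest(budget, widths, resolutions):
    """Among fitting configs, prefer the one with the most compute.

    Using compute as a stand-in for accuracy is an assumption of this sketch.
    Returns None when nothing fits.
    """
    fits = fitting_configs(budget, widths, resolutions)
    if not fits:
        return None
    return max(fits, key=lambda cfg: (cfg[0] ** 2) * (cfg[1] ** 2))


WIDTHS = [0.5, 0.75, 1.0]
RESOLUTIONS = [96, 128, 160]

small_chip = Budget(sram_bytes=256 * 1024, flash_bytes=1024 * 1024)
large_chip = Budget(sram_bytes=512 * 1024, flash_bytes=2 * 1024 * 1024)

print(pick_largest(small_chip, WIDTHS, RESOLUTIONS))  # (0.75, 96)
print(pick_largest(large_chip, WIDTHS, RESOLUTIONS))  # (1.0, 128)
```

Under this toy model, the smaller chip ends up with a narrower network at lower resolution, while the larger chip can afford a wider one. That is the sense in which the candidate pool is tailored to each device before any detailed search begins.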
To run that tiny neural network, a microcontroller also needs a lean inference engine. A typical inference engine carries some dead weight: instructions for tasks it may rarely run. The extra code poses no problem for a laptop or smartphone, but it could easily overwhelm a microcontroller. "It doesn't have off-chip memory, and it doesn't have a disk," says Han. "Everything put together is just one megabyte of flash, so we have to really carefully manage such a small resource." Cue TinyEngine.
The researchers developed their inference engine in conjunction with TinyNAS. TinyEngine generates the essential code necessary to run TinyNAS' customized neural network. Any deadweight code is discarded, which cuts down on compile time. "We keep only what we need," says Han. "And since we designed the neural network, we know exactly what we need. That's the advantage of system-algorithm codesign." In the group's tests of TinyEngine, the compiled binary code was between 1.9 and 5 times smaller than that of comparable microcontroller inference engines from Google and ARM. TinyEngine also contains innovations that reduce runtime, including in-place depth-wise convolution, which cuts peak memory usage nearly in half. After codesigning TinyNAS and TinyEngine, Han's team put MCUNet to the test.
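The "nearly in half" figure for in-place depth-wise convolution follows from how that operation works: each channel is filtered independently, so a finished channel can overwrite its own input and only one channel's worth of scratch space is needed. The sketch below illustrates the general idea in plain Python; it is not TinyEngine's generated code, and the function names, the 3x3 kernel size, and the zero padding are assumptions made for the example.

```python
def conv3x3_channel(plane, kernel):
    """Filter one H x W channel with a 3x3 kernel and zero padding.

    Returns a new H x W plane (the one-channel scratch buffer).
    """
    h, w = len(plane), len(plane[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += plane[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def depthwise_out_of_place(activations, kernels):
    """Conventional version: a full-size output sits next to the input."""
    return [conv3x3_channel(p, k) for p, k in zip(activations, kernels)]


def depthwise_in_place(activations, kernels):
    """In-place version: each result is copied back over its own input channel."""
    for c, kernel in enumerate(kernels):
        scratch = conv3x3_channel(activations[c], kernel)
        for y, row in enumerate(scratch):
            activations[c][y][:] = row  # overwrite input; scratch is reused next loop
    return activations


def peak_elements(channels, height, width, in_place):
    """Activation elements alive at the worst moment of the layer."""
    plane = height * width
    if in_place:
        return channels * plane + plane   # input plus one-channel scratch
    return 2 * channels * plane           # input plus full output


print(peak_elements(16, 8, 8, in_place=False))  # 2048
print(peak_elements(16, 8, 8, in_place=True))   # 1088
```

For a 16-channel layer the in-place variant needs 1,088 elements instead of 2,048, about 53 percent, and the ratio approaches one half as the channel count grows. The trick works because a depth-wise layer never reads another channel's input once its own channel is done, which is not true of an ordinary convolution.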
MCUNet's first challenge was image classification. The researchers used the ImageNet database to train the system with labeled images, then to test its ability to classify novel ones. On a commercial microcontroller they tested, MCUNet successfully classified 70.7 percent of the novel images; the previous state-of-the-art neural network and inference engine combo was just 54 percent accurate. "Even a 1 percent improvement is considered significant," says Lin. "So this is a giant leap for microcontroller settings."
The team found similar results in ImageNet tests on three other microcontrollers. And on both speed and accuracy, MCUNet beat the competition for audio and visual "wake-word" tasks, where a user initiates an interaction with a computer using vocal cues (think: "Hey, Siri") or simply by entering a room. The experiments highlight MCUNet's adaptability to numerous applications.
"Huge potential"
The promising test results give Han hope that MCUNet will become the new industry standard for microcontrollers. "It has huge potential," he says.
The advance "extends the frontier of deep neural network design even farther into the computational domain of small energy-efficient microcontrollers," says Kurt Keutzer, a computer scientist at the University of California at Berkeley, who was not involved in the work. He adds that MCUNet could "bring intelligent computer-vision capabilities to even the simplest kitchen appliances, or enable more intelligent motion sensors."
MCUNet could also make IoT devices more secure. "A key advantage is preserving privacy," says Han. "You don't need to transmit the data to the cloud."
Analyzing data locally reduces the risk of personal information being stolen, including personal health data. Han envisions smart watches with MCUNet that don't just sense users' heartbeat, blood pressure, and oxygen levels, but also analyze that information and help users understand it. MCUNet could also bring deep learning to IoT devices in vehicles and rural areas with limited internet access.
Plus, MCUNet's slim computing footprint translates into a slim carbon footprint. "Our big dream is for green AI," says Han, adding that training a large neural network can burn carbon equivalent to the lifetime emissions of five cars. MCUNet on a microcontroller would require a small fraction of that energy. "Our end goal is to enable efficient, tiny AI with less computational resources, less human resources, and less data," says Han.
















