XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
News Release Summary
Researchers at the Allen Institute for AI and the University of Washington have developed a way to dramatically shrink and speed up the image-recognition neural networks that typically require expensive, power-hungry GPUs to run. The team, led by Mohammad Rastegari and Ali Farhadi, tackled a straightforward but consequential problem: standard convolutional neural networks store their internal parameters as 32-bit floating-point numbers and perform billions of high-precision multiplications to classify a single image, making them impractical for smartphones and other resource-constrained devices. Their paper describes two related approaches, Binary-Weight-Networks and XNOR-Networks, that replace those full-precision numbers with single bits (each value is either +1 or -1) and substitute the expensive multiply-accumulate operations with fast XNOR and bit-counting instructions that modern CPUs handle efficiently. The binary-weight version of AlexNet cuts memory use by roughly 32 times and comes within 2.9 percentage points of the full-precision AlexNet's top-1 accuracy on the large-scale ImageNet benchmark. The more aggressive XNOR-Net, which binarizes both the stored filters and the data flowing through the network, achieves about 58 times faster convolution operations at the cost of some accuracy. Crucially, the researchers introduced a simple scaling factor, essentially the average magnitude of the original weights, that partially compensates for the information lost in binarization. They report that this detail helps separate their results from the earlier binarization methods BinaryConnect and BinaryNets, which lagged behind by more than 16 percentage points in ImageNet top-1 accuracy. The practical implication is that capable image-recognition models could run in real time on ordinary CPUs in phones or wearables, without needing cloud offloading or specialized hardware.
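To make the scaling-factor idea concrete, here is a minimal sketch in plain Python. It assumes a filter is given as a flat list of floats and approximates it as a scaling factor times a vector of signs, where the scaling factor is the mean absolute weight as described above. The function names `binarize_weights` and `approx_dot` are hypothetical and are not taken from the authors' code.

```python
def binarize_weights(weights):
    """Approximate a real-valued filter as alpha * signs.

    alpha is the mean absolute value of the weights and signs holds
    +1 or -1 for each weight (zero is mapped to +1 here, an assumption).
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    alpha = sum(abs(w) for w in weights) / len(weights)
    signs = [1 if w >= 0 else -1 for w in weights]
    return alpha, signs


def approx_dot(inputs, alpha, signs):
    """Dot product of inputs with the binarized filter.

    With +1/-1 weights the sum needs only additions and subtractions,
    followed by a single multiplication by alpha.
    """
    total = sum(x if s > 0 else -x for x, s in zip(inputs, signs))
    return alpha * total


weights = [0.5, -0.25, 0.75, -0.5]
alpha, signs = binarize_weights(weights)
print(alpha, signs)
print(approx_dot([1.0, 2.0, 3.0, 4.0], alpha, signs))
```

Each sign can be stored in one bit instead of a 32-bit float, which is where the roughly 32 times memory saving comes from. The single float `alpha` per filter is a small overhead on top of that.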
Abstract
We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values resulting in 32x memory saving. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58x faster convolutional operations and 32x memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real-time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a Binary-Weight-Network version of AlexNet is only 2.9% less than the full-precision AlexNet (in top-1 measure). We compare our method with recent network binarization methods, BinaryConnect and BinaryNets, and outperform these methods by large margins on ImageNet, more than 16% in top-1 accuracy.
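As a rough illustration of how binary operations can stand in for multiply-accumulate, the following sketch computes the dot product of two +1/-1 vectors with XNOR and a bit count. It is a simplified example in plain Python, not the authors' implementation: the helper names `pack_signs` and `xnor_dot` are hypothetical, the vectors are packed into Python integers rather than machine words, and scaling factors are left out.

```python
def pack_signs(signs):
    """Pack a list of +1/-1 values into an int; bit i is set when signs[i] is +1."""
    bits = 0
    for i, s in enumerate(signs):
        if s > 0:
            bits |= 1 << i
    return bits


def xnor_dot(a_bits, b_bits, n):
    """Dot product of two length-n +1/-1 vectors given their packed bits."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ b_bits) & mask   # XNOR: 1 where the signs agree
    matches = bin(agree).count("1")     # bit count (popcount)
    # Each agreeing position contributes +1, each disagreeing one -1.
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
packed = xnor_dot(pack_signs(a), pack_signs(b), len(a))
plain = sum(x * y for x, y in zip(a, b))
print(packed, plain)
```

The packed result equals the ordinary dot product because positions where the signs agree contribute +1 and the remaining positions contribute -1. On real hardware the same idea would operate on fixed-width words, processing many elements per instruction.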
Citation
@inproceedings{rastegari2016xnor,
title = {XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks},
author = {Rastegari, Mohammad and Ordonez, Vicente and Redmon, Joseph and Farhadi, Ali},
year = {2016},
booktitle = {European Conference on Computer Vision. ECCV 2016},
url = {http://arxiv.org/abs/1603.05279},
}