Although deep neural networks (DNNs) have shown great potential in several application domains, including computer vision and speech recognition, it is hard to deploy DNN methods on hardware with limited storage, compute capability, and battery power. The authors of this paper propose two efficient approximations to standard neural networks: Binary-Weight-Networks (BWN) and XNOR-Networks. In Binary-Weight-Networks, all the weights are approximated with binary values, while in XNOR-Networks both the weights and the inputs to the convolutional and fully connected layers are approximated with binary values. The authors also evaluate their methods on a large-scale dataset, ImageNet, and show that they outperform comparable binary-network baselines by about 16.3% in top-1 accuracy. Source code is available on GitHub.
Binary Weight Networks
An $L$-layer DNN model is represented by a triplet $\langle \mathcal{I}, \mathcal{W}, * \rangle$. Each element $I=\mathcal{I}_{l\,(l=1,\cdots,L)}$ in $\mathcal{I}$ is the input tensor of the $l^{th}$ layer, and $W=\mathcal{W}_{lk\,(k=1,\cdots,K^l)}$ is the $k^{th}$ weight filter in the $l^{th}$ layer, where $K^l$ is the number of filters in that layer. $*$ denotes the convolution operation between $I$ and $W$. Note that the authors assume the convolutional layers in the network do not have bias terms. The convolution can then be approximated by $I*W\approx(I\oplus B)\alpha$, where $\oplus$ indicates a convolution without multiplication, $B=\mathcal{B}_{lk}$ is a binary filter, $\alpha=\mathcal{A}_{lk}$ is a scaling factor, and $\mathcal{W}_{lk}\approx\mathcal{A}_{lk}\mathcal{B}_{lk}$.
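To make the approximation concrete, below is a minimal NumPy sketch of the binary-weight step $W \approx \alpha B$, using the closed-form choice $B=\mathrm{sign}(W)$ and $\alpha = \frac{1}{n}\|W\|_{\ell 1}$ from the paper. The function and variable names here are illustrative, not taken from the authors' released code.

```python
import numpy as np

def binarize_filter(W):
    """Approximate a real-valued weight filter W with alpha * B,
    where B = sign(W) and alpha = (1/n) * ||W||_1 (mean absolute value),
    the closed-form solution that minimizes ||W - alpha * B||^2."""
    B = np.where(W >= 0, 1.0, -1.0)   # binary filter in {-1, +1}
    alpha = np.abs(W).mean()          # optimal scaling factor
    return alpha, B

# Toy check: one convolution response (a dot product between an input patch
# and the filter) computed with the binary filter and scaled by alpha
# approximates the real-valued response, i.e. I*W ~ (I (+) B) * alpha.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3, 16))       # one 3x3x16 weight filter
patch = rng.normal(size=(3, 3, 16))   # one input patch of the same size
alpha, B = binarize_filter(W)

exact = np.sum(patch * W)             # real-valued convolution response
approx = alpha * np.sum(patch * B)    # multiplication-free up to the final scale
print(exact, approx)
```

Because $B$ only contains $\pm 1$, the inner sum reduces to additions and subtractions of the input values; the single multiplication by $\alpha$ per output is what makes the binary-weight convolution cheap.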