Applied Sciences, Vol. 9, Issue 12 (2019)

Efficient Weights Quantization of Convolutional Neural Networks Using Kernel Density Estimation based Non-uniform Quantizer

Sanghyun Seo and Juntae Kim    

Abstract

Convolutional neural networks (CNNs) have achieved excellent results in image recognition, the task of classifying objects in images. A typical CNN has a deep architecture with a large number of weights and layers to achieve high performance. CNNs therefore require relatively large memory and computational cost, which not only increases the time needed to train the model but also limits real-time application of the trained model. For this reason, various neural network compression methods have been studied to use CNNs efficiently on small embedded hardware such as mobile and edge devices. In this paper, we propose a kernel density estimation based non-uniform quantization methodology that can perform compression efficiently. The proposed method quantizes the weights using a significantly smaller number of sampled weights than the number of original weights. Four-bit quantization experiments on ImageNet classification with various CNN architectures show that the proposed methodology can quantize weights efficiently in terms of computational cost without a significant reduction in model performance.
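The abstract only outlines the approach, so the following is a minimal sketch of how a kernel density estimation based non-uniform quantizer driven by a small weight sample could look. It fits SciPy's gaussian_kde to a random sample of the weights, places the 2^bits quantization levels at equal-probability-mass points of the estimated distribution, and snaps each weight to its nearest level. The function name, the equal-mass level placement, and all parameter values are illustrative assumptions, not taken from the paper.

import numpy as np
from scipy.stats import gaussian_kde

def kde_nonuniform_quantize(weights, bits=4, sample_size=10000, seed=0):
    rng = np.random.default_rng(seed)
    flat = weights.ravel()

    # Fit the KDE on a small random sample of the weights rather than all of them.
    sample = rng.choice(flat, size=min(sample_size, flat.size), replace=False)
    kde = gaussian_kde(sample)

    # Evaluate the estimated density on a grid and form its normalized CDF.
    grid = np.linspace(flat.min(), flat.max(), 2048)
    cdf = np.cumsum(kde(grid))
    cdf /= cdf[-1]

    # Place 2**bits levels at equal-probability-mass points of the CDF, so the
    # levels are dense where the weight density is high (non-uniform spacing).
    n_levels = 2 ** bits
    targets = (np.arange(n_levels) + 0.5) / n_levels
    levels = np.interp(targets, cdf, grid)

    # Assign every weight to its nearest level via the midpoints between levels.
    edges = (levels[:-1] + levels[1:]) / 2.0
    quantized = levels[np.searchsorted(edges, flat)]
    return quantized.reshape(weights.shape), levels

# Example: quantize a synthetic weight tensor to 4 bits (16 levels).
w = np.random.normal(0.0, 0.05, size=(256, 128)).astype(np.float32)
w_q, levels = kde_nonuniform_quantize(w, bits=4)

Because the KDE is fitted on only a few thousand sampled weights, the cost of placing the levels is independent of the size of the full weight tensor, which is the efficiency argument the abstract makes; the actual level-placement rule used by the authors may differ from the equal-mass heuristic assumed here.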
