Analyzing the Loss Functions in Training Convolutional Neural Networks with the Adam Optimizer for Classification of Images

  • Aleksey N. Aparnev
  • Oleg V. Barten′ev
Keywords: neural network, loss function, data set, training history, classification accuracy

Abstract

The problem of determining which of the loss functions used in training convolutional neural networks (CNNs) provide the highest image classification accuracy is considered. The loss functions analyzed are those implemented in the Keras and TensorFlow machine learning libraries, with which CNNs can be designed, trained, and applied in Python. The MNIST, EMNIST, and CIFAR10 datasets are used as image sources for CNN training and validation. On these datasets, CNNs of different architectures are trained, and for each network its training history is saved, i.e., the image classification accuracy on the training and validation samples after each training epoch. The evaluation results are then generalized: the highest classification accuracy achieved makes it possible to justify which loss functions are appropriate for training CNNs.
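To make the evaluation procedure concrete, the following is a minimal sketch, not the authors' exact experimental code: the same small CNN is compiled with the Adam optimizer under several of Keras's built-in loss functions, trained on MNIST, and the per-epoch training history is kept for comparison across losses. The architecture, the number of epochs, and the particular set of loss identifiers are illustrative assumptions rather than the configurations reported in the paper.

```python
# Minimal sketch (illustrative only, not the paper's exact code): train the
# same small CNN on MNIST with the Adam optimizer under several built-in
# Keras loss functions and keep each per-epoch training history.
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST, scale pixels to [0, 1], add a channel axis, and one-hot the
# labels so that losses such as MSE and categorical cross-entropy apply.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None].astype('float32') / 255.0
x_test = x_test[..., None].astype('float32') / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

def build_cnn():
    # Hypothetical architecture; the paper evaluates several different CNNs.
    return models.Sequential([
        layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation='relu'),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(10, activation='softmax'),
    ])

# An illustrative subset of Keras's built-in losses; the paper compares more.
losses = ['categorical_crossentropy', 'mean_squared_error',
          'categorical_hinge', 'kullback_leibler_divergence']

histories = {}
for loss in losses:
    model = build_cnn()
    model.compile(optimizer=tf.keras.optimizers.Adam(), loss=loss,
                  metrics=['accuracy'])
    # fit() returns a History object whose .history dict stores per-epoch
    # 'accuracy' and 'val_accuracy' (named 'acc'/'val_acc' in older Keras).
    h = model.fit(x_train, y_train, epochs=5, batch_size=128,
                  validation_data=(x_test, y_test), verbose=0)
    histories[loss] = h.history

for loss, hist in histories.items():
    print(loss, 'best validation accuracy:', max(hist['val_accuracy']))
```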

Information about authors

Aleksey N. Aparnev

Aparnev Aleksey N. — Ph.D. student of Applied Mathematics and Artificial Intelligence Dept., NRU MPEI, e-mail: apich238@gmail.com

Oleg V. Barten′ev

Barten′ev Oleg V. — Ph.D. (Techn.), Assistant Professor of Applied Mathematics and Artificial Intelligence Dept., NRU MPEI, e-mail: mdf4@mail.ru

References

1. Fukushima K. Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position — Neocognitron // Trans. IECE. 1979. V. J62-A(10). Pp. 658—665.
2. Fukushima K. Neocognitron: A Self-Organizing Neural Network for a Mechanism of Pattern Recognition Unaffected by Shift in Position // Biological Cybernetics. 1980. V. 36. No. 4. Pp. 193—202.
3. LeCun Y. et al. Back-propagation Applied to Handwritten Zip Code Recognition // Neural Computation. 1989. V. 1. No. 4. Pp. 541—551.
4. LeCun Y. et al. Gradient-based Learning Applied to Document Recognition // Proc. IEEE. 1998. V. 86. No. 11. Pp. 2278—2324.
5. LeCun Y. et al. Handwritten Digit Recognition with a Back-Propagation Network // Advances in Neural Information Proc. Syst. 1990. V. 2. Pp. 396—404.
6. Keras: The Python Deep Learning Library [Electronic resource] https://keras.io/ (accessed 01.05.2019).
7. TensorFlow. An Open Source Machine Learning Framework for Everyone [Electronic resource] https://www.tensorflow.org/ (accessed 01.05.2019).
8. Keras Documentation. Convolutional Layers [Electronic resource] https://keras.io/layers/convolutional/ (accessed 01.05.2019).
9. Keras Documentation. Core Layers [Electronic resource] https://keras.io/layers/core/ (accessed 01.05.2019).
10. LeCun Y., Cortes C., Burges C. The MNIST Database of Handwritten Digits [Electronic resource] http://yann.lecun.com/exdb/mnist/ (accessed 01.05.2019).
11. Cohen G. et al. EMNIST: an Extension of MNIST to Handwritten Letters [Electronic resource] https://arxiv.org/pdf/1702.05373.pdf (accessed 01.05.2019).
12. The CIFAR10 Dataset [Electronic resource] https://www.cs.toronto.edu/~kriz/cifar.html (accessed 01.05.2019).
13. Nikolenko S., Kadurin A., Arkhangel'skaya E. Glubokoe Obuchenie [Deep Learning]. St. Petersburg: Piter, 2018 (in Russian).
14. Dzhulli A., Pal S. Biblioteka Keras — Instrument Glubokogo Obucheniya [The Keras Library: A Deep Learning Tool]. Moscow: DMK Press, 2018 (in Russian).
15. Trains a ResNet on the CIFAR10 Dataset [Electronic resource] https://keras.io/examples/cifar10_resnet/ (accessed 01.05.2019).
16. Janocha K., Czarnecki W.M. On Loss Functions for Deep Neural Networks in Classification. Submitted on 18 Feb 2017 [Electronic resource] https://arxiv.org/abs/1702.05659 (accessed 01.05.2019).
17. Barten′ev O.V. Klassifikatsiya Izobrazheniy Neyronnoy Set'yu [Image Classification with a Neural Network] [Electronic resource] http://100byte.ru/python/imgClasses/imgClasses.html (accessed 01.05.2019) (in Russian).
18. He K. et al. Deep Residual Learning for Image Recognition // Proc. 2016 CVPR. 2016. Pp. 770—778.
19. Funktsii Poter' Biblioteki Keras [Loss Functions of the Keras Library] [Electronic resource] http://100byte.ru/python/loss/loss.html (accessed 01.05.2019) (in Russian).
20. Keras Documentation. Built-in Loss Functions [Electronic resource] https://github.com/keras-team/keras/blob/master/keras/losses.py (accessed 01.05.2019).
21. Keras Advanced Activations [Electronic resource] https://keras.io/layers/advanced-activations/ (accessed 01.05.2019).
22. Kingma D., Ba J. Adam: A Method for Stochastic Optimization // arXiv, 2014 [Electronic resource] http://arxiv.org/abs/1412.6980 (accessed 01.05.2019).
23. Keras Documentation. Merge Layers [Electronic resource] https://keras.io/layers/merge/ (accessed 01.05.2019).
24. Keras Documentation. ImageDataGenerator [Electronic resource] https://keras.io/preprocessing/image/ (accessed 01.05.2019).
25. Keras Documentation. How Can I Obtain Reproducible Results Using Keras During Development? [Electronic resource] https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development (accessed 01.05.2019).
---
For citation: Aparnev A.N., Barten′ev O.V. Analyzing the Loss Functions in Training Convolutional Neural Networks with the Adam Optimizer for Classification of Images. Bulletin of MPEI. 2020;2:90—105. (in Russian). DOI: 10.24160/1993-6982-2020-2-90-105.
Section
Mathematical and Software Support of Computing Machines, Complexes and Computer Networks (05.13.11)