Analysis of methods for image recognition and image synthesis for expanding the training sample
DOI: 10.31673/2412-9070.2024.022729
DOI: https://doi.org/10.31673/2412-9070.2024.022729
Abstract
The article examines an important aspect of machine learning: image recognition and image synthesis as a means of expanding the training sample. Methods for image recognition are reviewed, with convolutional neural networks highlighted. One of the main problems in applying these methods is identified, namely a training sample that is too small to train the network. In modern machine learning, access to large amounts of real data is often limited, so the lack of training examples becomes a key problem and data expansion becomes essential. The article discusses and analyzes augmentation methods, which increase the amount of data by applying various transformations and modifications to the original set.
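As an illustration of augmentation by transformation, the following is a minimal sketch rather than the procedure used in the article. It assumes images are NumPy arrays of shape height x width x channels with values in [0, 1]; the function names `augment` and `expand_dataset`, the noise level, and the shift range are hypothetical choices made for this example.

```python
import numpy as np

def augment(image, rng):
    """Return simple label-preserving variants of one H x W x C image in [0, 1]."""
    flipped = image[:, ::-1, :]                    # horizontal mirror
    rotated = np.rot90(image, k=1, axes=(0, 1))    # 90-degree rotation in the image plane
    noise = rng.normal(0.0, 0.05, image.shape)     # assumed noise level, illustrative only
    noisy = np.clip(image + noise, 0.0, 1.0)       # keep pixel values in range
    shift = int(rng.integers(-4, 5))               # assumed shift range of -4..4 pixels
    shifted = np.roll(image, shift, axis=1)        # cyclic horizontal shift
    return [flipped, rotated, noisy, shifted]

def expand_dataset(images, labels, seed=0):
    """Append augmented variants of every image, each keeping the original label."""
    rng = np.random.default_rng(seed)
    out_images, out_labels = list(images), list(labels)
    for img, lab in zip(images, labels):
        for variant in augment(img, rng):
            out_images.append(variant)
            out_labels.append(lab)
    return out_images, out_labels
```

With four transformations per image, a set of N labeled images grows to 5N. Whether a given transformation preserves the label depends on the task (a mirrored digit, for example, may no longer be valid), so the set of transformations has to be chosen per dataset.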
Particular emphasis is placed on generative models such as the Variational Autoencoder (VAE). The authors consider in detail the ability of such models to synthesize new, realistic, context-sensitive images and the effect of these images on the quality of model training, weighing their advantages and disadvantages for expanding training data. The importance of such methods in scenarios with little available data is emphasized.
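For orientation, the core of a VAE training objective can be sketched as follows. This is a generic illustration under stated assumptions, not the model evaluated in the article: the encoder is assumed to output a mean `mu` and a log-variance `log_var` of a diagonal Gaussian, the prior is a standard normal, and squared error stands in for the reconstruction term. All function names are hypothetical.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (the reparameterization trick)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), per sample."""
    return -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=-1)

def vae_loss(x, x_recon, mu, log_var):
    """Mean over the batch of reconstruction error plus KL term.

    Squared error is an assumed reconstruction term chosen for this sketch.
    """
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return float(np.mean(recon + kl_to_standard_normal(mu, log_var)))
```

Once such a model is trained, new images are obtained by drawing a latent vector from the standard normal prior and passing it through the decoder; the decoder itself is omitted here because its architecture is not specified in the text.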
The paper also examines the effect of balanced synthesis and recognition on model performance, taking into account the representativeness and diversity of the data. Examples and research results highlight the practicality of the considered methods in various machine learning scenarios. The purpose of the article is not only to review existing approaches but also to indicate prospects and directions for further research in this area.
Keywords: neural networks; variational autoencoder; image recognition; image generation; augmentation; training sample.