Convolutional neural networks for moving object analysis in video streams

DOI: 10.31673/2412-9070.2025.042042

Authors

  • Т. М. Кисіль, (Kysil T. M.) State University of Information and Communication Technologies, Kyiv
  • О. В. Зінченко, (Zinchenko O. V.) State University of Information and Communication Technologies, Kyiv

DOI:

https://doi.org/10.31673/2412-9070.2025.042042

Abstract

This article presents a comprehensive review and in-depth analysis of methods for applying Convolutional Neural Networks (CNNs) to the processing of moving objects in video streams. Modern CNN-based approaches enable effective solutions to tasks such as object recognition, tracking, and classification in video sequences, which form the foundation of many areas in computer vision. The study examines architectures including 2D-CNNs for processing individual frames, 3D-CNNs for capturing spatiotemporal dependencies, and hybrid models (RNN+CNN and ConvLSTM) that integrate the advantages of different video data processing techniques.
The advantages of CNNs are highlighted, including automatic feature extraction, high adaptability to varying shooting conditions (e.g., changes in lighting, presence of noise), and the ability to handle large-scale data. Particular attention is paid to addressing scalability challenges and optimizing computational resources to ensure real-time data processing, which is critical for fields such as autonomous transportation systems, intelligent video surveillance, and human/object behavior analysis.
A significant portion of the article discusses challenges associated with large-scale video data processing, such as modeling temporal dependencies, preventing the loss of important spatial features, and reducing the energy consumption of models for deployment on mobile and embedded platforms. The article outlines future prospects for CNN development, including the creation of light weight models, the adoption of transfer learning, and the integration of CNNs with transformers to handle dynamic video environments. The obtained results are of significant importance for the further advancement of computer vision algorithms. The proposed recommendations on the use of CNNs will contribute to the development of high-performance intelligent video processing systems and enhance their efficiency across a wide range of applications.

Keywords: convolutional neural networks; object recognition; motion tracking; computer vision; 3D-CNN; hybrid architectures.

Published

2025-09-22

Issue

Section

Articles