A study of neural network approaches to deep stylometry-based authorship analysis

DOI: 10.31673/2412-9070.2025.042554

Authors

  • В. Р. Передера (Peredera V. R.), National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”
  • О. Л. Недашківський (Nedashkivski O. L.), National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”

Abstract

This article presents a comprehensive review of modern neural network approaches to authorship attribution within the framework of deep stylometry. The study focuses on how different deep learning architectures, specifically convolutional neural networks (CNN), recurrent neural networks (RNN), and transformer-based models, are applied to detect stylistic patterns inherent to individual authors. Special attention is given to the advantages and limitations of each architecture in handling stylistic variation, genre shifts, and short or informal text formats, which are typical of digital communication.
The article examines the role of training corpora and evaluates how the amount and quality of data influence model performance. Datasets such as PAN, the Blog Authorship Corpus, and the Enron Email Dataset are discussed in terms of their representativeness and the challenges they pose for generalization. The performance of neural models is analyzed using standard evaluation metrics, including accuracy, F1-score, and ROC-AUC, with an emphasis on cross-domain reliability and robustness to stylistic obfuscation.
In addition, the paper highlights the limitations of current neural systems in terms of interpretability, adaptability, and ethical deployment. Experimental results are illustrated with comparative figures showing model behavior across text types and dataset sizes. The article outlines future directions in the development of interpretable and cross-lingual architectures, as well as the integration of graph-based structures for advanced stylistic representation. 
A promising direction for further research is shown to be the combination of transformer-type models with graph structures, which makes it possible to account not only for the sequence of the text but also for the relationships between its parts at the discourse level. Such models can serve as the algorithmic basis for developing mobile, cross-platform, and web applications that use artificial intelligence technologies, including applications for combating disinformation and for determining tone and classifying text context.

Keywords: software engineering; stylometry; author attribution; deep learning; transformer; neural network; text classification; multimodal analysis; cross-platform; web application. 

Published

2025-09-29

Section

Articles