Applying user data mining using clustering algorithms
DOI: https://doi.org/10.31673/2412-9070.2020.062023
Abstract
This article discusses the application of user data mining with clustering algorithms. As information technology and data collection and storage systems develop, the problem of analyzing large amounts of information is becoming increasingly acute. An equally important task is presenting data visually and compactly. These problems are addressed within an interdisciplinary field of knowledge, Data Mining. Today, the analysis of data obtained from the Internet, known as Web Mining, is becoming increasingly relevant. The main purpose of Web Mining is to collect data (parsing) and then save it in the desired format. Information on the Internet is presented in dedicated formats, such as HTML markup, RSS, Atom, SOAP and others. Web pages may also carry additional meta information and information about document structure. Web Mining has two main areas of focus, Web Content Mining and Web Usage Mining, and, accordingly, Web Mining systems face two types of tasks. Web Content Mining is the automated search for information from various sources on the Internet. The second area, Web Usage Mining, is more adaptive: it involves detecting patterns in the actions of site visitors, as well as collecting statistics and analyzing them afterwards. This work is based on an analysis of clustering algorithms; it makes it possible to evaluate the use of this technology for user data and gives an understanding of the overall impact of intelligent data analysis. The main fields of application are also presented, together with the benefits of integrating this analytical approach.
Keywords: data mining; clustering; scalability; classification; real-time analytical processing; the use of clustering; patterns; data collection; data integration; data analysis; cluster analysis.