Principles of constructing a decision tree based on the C4.5 classification algorithm
DOI №______
Abstract
The basic requirements for the data structure are described, and the main criteria for selecting the data attributes needed to build a tree are determined. The stages of constructing a decision tree are described. For the first stage, the criterion for selecting an attribute is described, and the main variables are highlighted, such as the set of examples, the possible options, and the verification variable. The fourth formula is explained using the properties of entropy and its impact on the final stage. All cases that arise in different situations during the data classification process are described. Recent publications in the fields of data analysis, classification theory, statistics, and information theory are analyzed. The general advantages and disadvantages of decision trees and of the C4.5 algorithm are highlighted.
Key words: algorithm, analysis, classification, decision tree, machine learning.
References
1. Quinlan J.R. C4.5: Programs for Machine Learning / J. Ross Quinlan. - Boston: Morgan Kaufmann Publishers, 1993. - 302 p.
2. Shannon C. Works on Information Theory and Cybernetics / C. Shannon. - Moscow: Inostrannaya Literatura, 1963. - 832 p.
3. Korshunov Yu.M. Mathematical Foundations of Cybernetics / Yu.M. Korshunov. - Moscow: Energoatomizdat, 1987. - 496 p.
4. Breiman L. Classification and Regression Trees / Leo Breiman, Jerome Friedman, Charles J. Stone, R.A. Olshen. - Washington: Taylor & Francis, 1984. - 368 p.