Data Mining

Teachers

Included in study programs

Teaching results

Students who successfully complete the course gain a basic overview of the data mining process in practice. Students acquire the following:
knowledge
− basic concepts, principles, methods and procedures used in data mining,
− the individual stages of the process of extracting information from databases,
− theoretical principles of data mining models.
skills
− Students will be able to carry out the individual steps of the process of extracting information from databases using the professional software SAS Enterprise Miner.
− Students will learn to apply data mining methods and procedures appropriately and to interpret the results.
competencies
− Students will be able to apply the acquired knowledge and skills in solving data mining problems in practice.

Indicative content

The data mining process provides a framework for extracting nontrivial information from data. With the advent of massive storage, increased data collection, and advanced computing paradigms, the volume of data at our disposal keeps growing. To extract knowledge from these massive data assets, we need to employ advanced approaches such as data mining algorithms in addition to simple statistical processing. The course enables students to understand the purpose and possibilities of data mining.

Support literature

1. TEREK, M., HORNÍKOVÁ, A., LABUDOVÁ, V. Hĺbková analýza údajov. Bratislava: Iura Edition, 2010. ISBN 978-80-8078-336-5
2. BERKA, P. Dobývání znalostí z databází. Praha: Academia, 2003. ISBN 80-200-1062-9
3. PETR, P. Data Mining: Díl I. Pardubice: Univerzita Pardubice, 2008, 139 s. ISBN 978-80-7395-098-9
4. SKALSKÁ, H. Data mining a klasifikační modely. Hradec Králové: Gaudeamus, 2010. ISBN 978-80-7435-088-7
5. LABUDOVÁ, V. Hĺbková analýza údajov s programom SAS Enterprise Miner (praktikum). Bratislava: Ekonóm, 2012. ISBN 978-80-225-3402-4
6. LABUDOVÁ, V. Rozhodovacie stromy ako prediktívna modelovacia technika. Slovenská štatistika a demografia: vedecký časopis. Roč. 27, č. 3 (2017), s. 60-76. Bratislava: Štatistický úrad Slovenskej republiky. ISSN 1210-1095
7. KANTARDZIC, M. Data Mining. Concepts, Models, Methods and Algorithms. USA, J. Wiley and Sons, 2003. ISBN 0-471-22852-4
8. GIUDICI, P. Applied Data Mining. New York, J. Wiley and Sons, 2004. ISBN 0-470-84679-8
9. LAROSE, D. T. Discovering Knowledge in Data. An Introduction to Data Mining. USA: Wiley 2005. ISBN 978-0-471-66657-8
10. LAROSE, D. T. Data Mining. Methods and Models. USA: Wiley 2006. ISBN 0-471-66656-4

Syllabus

1. Knowledge discovery in databases and data mining. The data mining process.
2. Data mining: objectives and tasks. Big data and data mining. Application areas of data mining.
3. Data mining methodology. Data mining tools.
4. Databases. Data preparation (data cleaning and preparation, data transformation, classification).
5. Data preparation (outlier detection, data reduction).
6. Decision trees (classification and regression trees).
7. Growing a decision tree (Shannon entropy, Gini index). Pruning decision trees. Generating decision rules.
8. Logistic regression. Point estimation of parameters and the odds ratio. Interpretation.
9. Statistical inference for logistic regression.
10. Artificial neural networks and their architectures.
11. Association rules.
12. Evaluation of models. Criteria for evaluating models.
13. Summary of the course material.

Requirements to complete the course

Seminars (40%):
− Assignment (20%)
− Seminar work (20%)
Final exam (60%):
− Theoretical part (20%)
− Practical part (40%)

Student workload

Total study load (in hours): 156 hours
Lecture participation: 26 hours
Seminar participation: 26 hours
Preparation for seminars: 26 hours
Written assignment: 26 hours
Seminar work preparation: 22 hours
Final exam preparation: 30 hours

Language whose command is required to complete the course

Slovak

Date of approval: 10.02.2023

Date of the latest change: 03.02.2022