Data Analysis for Decision-Making Support

Teachers

Included in study programs

Learning outcomes

Upon successful completion of the course, students will acquire the following knowledge in particular:
- basic knowledge of SQL programming,
- basic knowledge of data processing, visualization and statistical analysis in the R language,
- basic knowledge of data processing, visualization and statistical analysis in the Python language.
Upon successful completion of the course, students will acquire the following skills in particular:
- the ability to use basic tools for data processing, visualization and analysis in R and Python,
- the ability to use basic SQL tools when working with databases.
Upon successful completion of the course, students will acquire the following competences in particular:
- practical skills and competences in applying data analysis methods to solve economic problems and support decision-making based on real data, mainly from retail.
The course places emphasis on cooperation with practitioners.
A student who successfully completes the course should therefore be able to analyze data using various tools, solve problems using various modeling techniques, and build analytical solutions that turn data into outputs which inform the client comprehensively and support the client's business decisions.

Indicative content

Introduction to SQL. Basic structures. Basic commands (SELECT, WHERE, ORDER BY, ...). Data types. Keys. Joining data (JOIN, INNER JOIN, UNION, ...). Writing efficient SQL code. Window functions.
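As an illustration of the SQL topics listed above, the following minimal sketch combines an inner join with a window function. It uses Python's built-in sqlite3 module so that it runs without a database server. The tables customers and purchases, their columns and all values are hypothetical and are not course data. Window functions are assumed to be available, which requires SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical retail data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE purchases (
        purchase_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'West'), (2, 'East');
    INSERT INTO purchases VALUES (1, 1, 10.0), (2, 1, 15.0), (3, 2, 7.5);
""")

# INNER JOIN attaches the region to each purchase; the window function
# computes a running total of the amount per customer, ordered by purchase.
query = """
    SELECT c.region,
           p.purchase_id,
           p.amount,
           SUM(p.amount) OVER (
               PARTITION BY p.customer_id ORDER BY p.purchase_id
           ) AS running_total
    FROM purchases AS p
    INNER JOIN customers AS c ON c.customer_id = p.customer_id
    ORDER BY p.purchase_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

Unlike GROUP BY, the window function keeps one output row per purchase, so the running total appears next to each individual purchase.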
Practical data analysis in R. Packages. Variables. Arrays. Data frames. Functions. For loops. If statements. Practical data analysis in Python.
Practical statistical data analysis in R and Python, including the basics of descriptive statistics, inferential statistics, regression analysis, time series analysis and visualization in these areas. The practical analyses in R and Python will be performed on real data.
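To give a feel for the descriptive statistics mentioned above, here is a minimal sketch using only Python's standard statistics module. The list basket_values is hypothetical. The course itself works with real data and may well use other libraries.

```python
import statistics

# Hypothetical basket values (in currency units), invented for illustration.
basket_values = [12.5, 8.0, 23.4, 15.1, 9.9, 30.2, 11.0, 14.3]

mean_value = statistics.mean(basket_values)           # arithmetic mean
sample_variance = statistics.variance(basket_values)  # sample variance (n - 1)
quartiles = statistics.quantiles(basket_values, n=4)  # three quartile cut points

print("mean:", mean_value)
print("sample variance:", sample_variance)
print("quartiles:", quartiles)
```

The middle quartile is the median. Comparing it with the mean gives a quick hint about skewness, for example when a few unusually large baskets pull the mean upward.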

Recommended literature

BEAULIEU, A. Learning SQL: Generate, Manipulate, and Retrieve Data. O'Reilly, 2020.
ALBERT, J. – RIZZO, M. R by Example. New York: Springer, 2012.
LAFAYE DE MICHEAUX, P. – DROUILHET, R. – LIQUET, B. The R Software: Fundamentals of Programming and Statistical Analysis. New York: Springer, 2013.
UNPINGCO, J. Python for Probability, Statistics, and Machine Learning. Second Edition. Cham: Springer Nature Switzerland AG, 2016.
WICKHAM, H. – GROLEMUND, G. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O'Reilly, 2017.

Syllabus

1. What does a real database look like, and what does real stored data look like? (Introduction to SQL. Basic structures. Basic commands (SELECT, WHERE, ORDER BY, ...). Data types. Keys.)
2. How to extract the data needed for further analysis from a database? (Joining data (JOIN, INNER JOIN, UNION, ...).)
3. How to save hours when retrieving and analyzing data from demanding databases via SQL? (Writing efficient SQL code. Window functions.)
4. Summary of knowledge. (Practicing previous tasks. Indexing.)
5. How to use R, one of the languages most used by data analysts? (Introduction to R. Libraries. Variables. Arrays. Data frames. Functions. For loops. If statements.)
6. How to perform an initial descriptive and exploratory analysis of a data set? (Basic statistical indicators (mean, variance, quantiles). Graphs (histogram, boxplot).) What is the value of the average basket? How many products does it contain? Does the establishment have any wholesalers?
7. How to verify and test hypotheses in R (part 1)? (Statistical inference. Statistical hypothesis testing. Parametric tests.) Is the average purchase amount in a store the same across regions?
8. How to verify and test hypotheses in R (part 2)? (Statistical inference. Statistical hypothesis testing. Nonparametric tests.)
9. How do I find the basic relationships between the variables of a data set? (Correlation and association analysis. Covariance. Correlation and correlation coefficients. Chi-square test, measures of association.) When customers pay with points, do they spend more money? Do customers with more points on their loyalty card buy more expensive products?
10. How do I determine the relationship between two or more independent or dependent groups in a data set? (F-test, t-test and ANOVA, parametric and nonparametric.) Do the shopping carts of customers with a loyalty card contain more expensive products than the shopping carts of customers without a loyalty card? Do customers with higher loyalty card levels buy more than customers with lower levels?
11. How do I reveal more complex relationships in the data and create a prediction? (Regression models: linear and logistic. Simple and multiple linear and logistic models. Method of least squares.) What does the amount of a customer's purchase depend on?
12. How do I identify a trend in data? (Time series analysis. Decomposition of a time series into components. Determining the trend.)
13. How to use the acquired knowledge in the increasingly popular Python language?
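
As a rough illustration of how topic 11 might look in Python (topic 13), the following sketch fits a simple linear regression by the method of least squares using plain Python. The function fit_line and the variables loyalty_points and purchase_amount are hypothetical names with invented values. In practice a statistics library would normally be used instead.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: loyalty points (in hundreds) and purchase amounts.
loyalty_points = [1, 2, 3, 4, 5]
purchase_amount = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(loyalty_points, purchase_amount)
predicted = slope * 6 + intercept  # predicted amount for 600 points

print("slope:", slope)
print("intercept:", intercept)
print("prediction for x = 6:", predicted)
```

The slope estimates how much the purchase amount changes per additional hundred points. Whether that relationship is statistically significant is a separate question, addressed by the hypothesis tests in topics 7 to 10.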

Requirements to complete the course

Active participation in seminars (10 %),
exam project (30 %),
written exam (60 %).

Student workload

26 hours of seminars,
13 hours of preparation for seminars,
13 hours of preparation for the exam project,
26 hours of preparation for the written exam.
Total: 78 hours.

Language whose command is required to complete the course

Slovak

Date of approval: 07.11.2023

Date of the latest change: 08.11.2023