# Applied Data Analysis

- Credits: 6
- Ending: Examination
- Range: 2P + 2C (2 hours of lectures + 2 hours of seminars per week)
- Semester: winter
- Year: 1
- Faculty of Economics and Finance

## Teachers

## Included in study programs

**Learning outcomes**

The aim of the course is to equip students with knowledge and skills in modern methods of applied data analysis and statistical learning, including the use of R software, so that they can conduct empirical economic research and design research methods for solving economic problems.

Upon completion of the course, students should acquire:

a) knowledge of modern methods of research and data visualization, linear regression, general data analysis, and statistical learning.

b) skills in working with data that they can use in their own empirical research. They will also acquire advanced skills in using modern software (R) for empirical economic research: writing their own functions, using functions for data visualization, and estimating advanced statistical learning models empirically.

c) competences to design research for a given economic problem and perform the data analysis. They will be able to further develop their knowledge of data analysis and modern software, understand empirical articles in applied data analysis, and apply the methods from those articles in new contexts.

**Indicative content**

1. Introduction to the R Programming Language and Function Anatomy

2. Basic Mathematical and Statistical Concepts in Statistical Learning, Notation, and Types of Variables

3. Introduction to Data Types, Their Loading, Cleaning, Wrangling, and Merging

4. Visualization of Categorical Data

5. Visualization of Numerical Data

6. Summarizing the Relationship Between Two (Categorical and Numerical) Variables: Linear and Non-Linear Relationships, Scatter Plots, Correlation, and Quantile Plots

7. Randomization and Randomized Controlled Experiments, Unbiasedness and Consistency of Estimates

8. Standard Errors and Confidence Intervals, Hypothesis Testing, Parameters vs. Hyperparameters, Classification vs. Regression

9. Fundamental Algorithms I: Linear Regression

10. Fundamental Algorithms II: Logistic Regression, Decision Trees, Support Vector Machines, k-Nearest Neighbors

11. Anatomy of Statistical Learning, Gradient Descent, Basic Variable Transformations and Algorithm Selection, Underfitting vs. Overfitting

12. Preprocessing of Explanatory Variables (Encoding, Normalization, Handling Missing Data), Regularization, Model Selection and Evaluation

13. Advanced Methods of Statistical Learning

**Supporting literature**

Required readings:

Wickham, H., Çetinkaya-Rundel, M. and Grolemund, G., 2022. R for Data Science. O’Reilly Media, Inc.

Burkov, A. The Hundred-Page Machine Learning Book. GitHub.

Recommended readings:

James, G., Witten, D., Hastie, T. and Tibshirani, R., 2013. An Introduction to Statistical Learning: with Applications in R. New York: Springer.

Imai, K., 2018. Quantitative Social Science: An Introduction. Princeton University Press.

**Requirements to complete the course**

20 % - activity during seminars

20 % - assignments

60 % - final exam

**Student workload**

Total study load: 156h

Out of that:

participation in lectures 26h,

participation in seminars 26h,

preparation for seminars 26h,

assignments 26h,

preparation for the final exam 52h

**Language required to complete the course**

English, Slovak

Date of approval: 12.03.2024

Date of the latest change: 21.05.2024