Predictive Models in Actuarial Science

Teachers

Included in study programs

Learning outcomes

Students will gain the ability to analyze mathematically the properties of linear and nonlinear regression models and to apply this knowledge to specific economic and actuarial problems. The course also covers the generalized least squares method for data with autocorrelation or heteroskedasticity, as well as the estimation of nonlinear regression models. Demonstrations of applications in the R language are an integral part of the course.
Knowledge
1. Using exploratory data analysis (EDA) to summarize and visualize data, or to search for hypotheses about the causes of an observed phenomenon
2. Using linear and nonlinear regression models to fit the causal dependence between statistical variables
3. Selecting the optimal set of relevant predictors
4. Verifying the assumptions of the selected regression model and applying appropriate procedures when they are violated
5. Diagnosing relationships between covariates in a regression model
6. Detecting multicollinearity and influential observations and applying appropriate procedures to deal with them
Competences
Based on the above knowledge, students will be able to work effectively with data. They will be able to propose an optimal regression model that describes the causal relationship between two or more variables and explains as much of the variability of the explained variable as possible through its relationship with the explanatory variables.
Skills
After completing the course, students will be able to:
• define the assumptions of linear and nonlinear regression models and explain their importance for performing regression analysis correctly
• understand the geometric interpretation of regression models
• estimate unknown parameters in regression models
• verify modeling assumptions using formal tests and visual diagnostic tools
• draw conclusions about regression models
• build and verify regression models on sound foundations
• interpret the results correctly, in a clear and comprehensible form
• apply the above knowledge and techniques to their own data using computational tools and programming

Indicative content

Introduction to regression and correlation analysis, regression models and their types. Use in actuarial science. Basic Linear Model theorem, estimation of Linear Model parameters using the least squares method and the geometry of regression models, estimator, Gauss-Markov theorem. Testing the statistical significance of the regression model and the contribution of explanatory variables, F-distribution, decomposition of the sum of squares. Theoretical and computational aspects of statistical inference about Linear Model parameters. Prediction vs. explanation, the concept of causality, designed data vs. observed data, matching. Correlation analysis. Simple correlation characteristics and statistical inference about them. Multiple and partial correlation characteristics. Quality of the regression model, multicollinearity, methods for selecting explanatory variables, criterion-based methods: consistency vs. efficiency. Projection matrix, model residuals, diagnostics of influential observations: outliers, Cook's distance. Verification of assumptions about the random component of the Linear Model, graphical analysis of residuals, verification of homoskedasticity. Diagnostics: assumptions about random errors. Generalized least squares method, Generalized Linear Models (GLMs). Nonlinear models and algorithms for finding local minima in the least squares method, Gauss-Newton and related methods.

Support literature

1. Frees, E. W., Derrig, R., Meyers, G.: Predictive Modeling Applications in Actuarial Science: Volume 1, Predictive Modeling Techniques. Cambridge University Press, 2014
2. Frees, E. W.: Regression Modeling with Actuarial and Financial Applications. Cambridge University Press, 2010
3. De Jong, P., Heller, G. Z.: Generalized Linear Models for Insurance Data. Cambridge: Cambridge University Press, 2008
4. Faraway, J.: Linear Models with R, second edition. CRC Press, 2014
5. Šoltés, E.: Regresná a korelačná analýza s aplikáciami v softvéri SAS. Bratislava: IURA Edition, 2019
6. Agresti, A.: Foundations of Linear and Generalized Linear Models. John Wiley & Sons, 2015
7. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2017
8. James, G., Witten, D., Hastie, T., Tibshirani, R.: An Introduction to Statistical Learning with Applications in R. Springer, 2013
9. Crawley, M. J.: Statistical Modelling. In: The R Book, second edition, 2007
10. Pázman, A., Lacko, V.: Prednášky z regresných modelov: Odhadovanie parametrov strednej hodnoty a štatistická optimalizácia experimentu. Bratislava: Univerzita Komenského, 2012
11. Škrovánková, L., Révészová, L.: Some applications of statistical information theory. In: Creative Mathematics and Informatics. Department of Mathematics and Computer Science, North University of Baia Mare. Baia Mare, 2006

Syllabus

1. Introduction to regression and correlation analysis, regression models and their types. Use in actuarial science.
2. Basic Linear Model theorem, estimation of Linear Model parameters using the least squares method and the geometry of regression models, estimator, Gauss-Markov theorem.
3. Testing the statistical significance of the regression model and the contribution of explanatory variables, F-distribution, decomposition of the sum of squares.
4. Theoretical and computational aspects of statistical inference about Linear Model parameters.
5. Prediction vs. explanation, the concept of causality, designed data vs. observed data, matching.
6. Correlation analysis. Simple correlation characteristics and statistical inference about them.
7. Multiple and partial correlation characteristics.
8. Quality of the regression model, multicollinearity, methods for selecting explanatory variables, criterion-based methods: consistency vs. efficiency.
9. Projection matrix, model residuals, diagnostics of influential observations: outliers, Cook's distance.
10. Verification of assumptions about the random component of the Linear Model, graphical analysis of residuals, verification of homoskedasticity.
11. Diagnostics: assumptions about random errors.
12. Generalized least squares method, Generalized Linear Models (GLMs).
13. Nonlinear models and algorithms for finding local minima in the least squares method, Gauss-Newton and related methods.

Requirements to complete the course

30% exams (using software support)
20% oral final exam
50% written final exam (using software support)

Student workload

Total study load: 130 hours
26 hours - participation in lectures
26 hours - participation in exercises
26 hours - preparation for exercises and homework
10 hours - preparation for written work
42 hours - self-study in preparation for the exam

Language whose command is required to complete the course

Slovak

Date of approval: 10.02.2023

Date of the latest change: 15.05.2022