Statistics for Financial Engineering: Some R Examples. David Ruppert. Cornell University. April 25, 2009.


1 Cornell University April 25, 2009

2 Outline

3 A little about myself: BA and MA in mathematics; PhD in statistics in 1977; taught in the statistics department at North Carolina for 10 years; have been in Operations Research and Information (formerly Industrial) Engineering at Cornell since 1987.

4 A little about myself: started teaching Statistics and Finance to undergraduates in 2001; textbook published in 2004; started teaching Financial Engineering to master's students in 2008; working on a revised and expanded textbook; now programming exclusively in R.

5 Undergraduate Textbook

6 A little about my research: have done research in asymptotic theory of splines; semiparametric modeling; measurement error in regression; smoothing (nonparametric regression and density estimation); transformation and weighting; stochastic approximation; biostatistics; environmental engineering; modeling of term structure; executive compensation and accounting fraud.

7 Three types of regression
Linear regression: $Y_i = \beta_0 + \beta_1 X_{i,1} + \cdots + \beta_p X_{i,p} + \epsilon_i$, $i = 1, \ldots, n$.
Nonlinear regression: $Y_i = m(X_{i,1}, \ldots, X_{i,p}; \beta_1, \ldots, \beta_q) + \epsilon_i$, $i = 1, \ldots, n$, where $m$ is a known function depending on unknown parameters.
Nonparametric regression: $Y_i = m(X_{i,1}, \ldots, X_{i,p}) + \epsilon_i$, $i = 1, \ldots, n$, where $m$ is an unknown smooth function.
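The three model types can be sketched in base R on simulated data. The data-generating curve, sample size, noise level, and starting values below are illustrative assumptions, not taken from the talk; `lm`, `nls`, and `smooth.spline` are from R's stats package.

```r
# Illustrative only: simulated data, not data from the talk.
set.seed(1)
n <- 200
x <- runif(n, 0, 3)
y <- 2 * exp(-1.2 * x) + rnorm(n, sd = 0.05)

# Linear regression: Y = beta0 + beta1 * X + error
fit_lin <- lm(y ~ x)

# Nonlinear regression: m is known up to the parameters (b1, b2)
fit_nls <- nls(y ~ b1 * exp(-b2 * x), start = list(b1 = 1, b2 = 1))

# Nonparametric regression: m is an unknown smooth function,
# estimated here by a smoothing spline
fit_np <- smooth.spline(x, y)

coef(fit_nls)   # compare with the simulated values b1 = 2, b2 = 1.2
```

The linear fit is misspecified here by construction, which is why the nonlinear and nonparametric fits track the simulated curve more closely.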

8 Usual assumptions on the noise. Usually $\epsilon_1, \ldots, \epsilon_n$ are assumed to be: mutually independent (or at least uncorrelated); homoscedastic (constant variance); normally distributed. Much research over the last 50+ years has looked into (1) ways of checking these assumptions and (2) statistical methods that require fewer assumptions.

9 Transform-both-sides
Ideal model (no errors): $Y_i = f(X_i, \beta)$.
Statistical model (first attempt): $Y_i = f(X_i, \beta) + \epsilon_i$, where $\epsilon_1, \ldots, \epsilon_n$ are iid Gaussian.
TBS model: $h\{Y_i\} = h\{f(X_i, \beta)\} + \epsilon_i$, where $\epsilon_1, \ldots, \epsilon_n$ are iid Gaussian and $h$ is an appropriate transformation.

10 Estimation of Default Probabilities. Data: ratings, 1 = Aaa (best), ..., 16 = B3 (worst); default frequency (estimate of default probability).

11 Some statistical models
Nonlinear model: $\Pr(\text{default} \mid \text{rating}) = \exp\{\beta_0 + \beta_1\,\text{rating}\}$.
Linear/transformation model (in recent textbook): $\log\{\Pr(\text{default} \mid \text{rating})\} = \beta_0 + \beta_1\,\text{rating}$.
Problem: cannot take logs of default frequencies that are 0.
(Sub-optimal) solution in textbook: throw out these observations.
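The zero-frequency problem can be seen in a small simulation. This is a sketch under stated assumptions: the default frequencies are simulated from a binomial model with hypothetical values (`n_firms`, `p_true`), not the rating data used in the talk.

```r
# Illustrative only: simulated default frequencies, not the data from the talk.
set.seed(7)
rating <- 1:16
n_firms <- 1000                        # assumed number of firms per rating
p_true <- exp(-11 + 0.6 * rating)      # assumed true default probabilities
freq <- rbinom(length(rating), n_firms, p_true) / n_firms

# Log-linear fit: log(0) is -Inf, so zero frequencies must be dropped
keep <- freq > 0
d_pos <- data.frame(rating = rating, freq = freq)[keep, ]
fit_log <- lm(log(freq) ~ rating, data = d_pos)

# Nonlinear fit: uses all observations, including the zeros
fit_nls <- nls(freq ~ exp(b0 + b1 * rating),
               start = list(b0 = coef(fit_log)[[1]], b1 = coef(fit_log)[[2]]),
               control = nls.control(maxiter = 200, warnOnly = TRUE))

sum(!keep)      # number of observations the log-linear fit discards
coef(fit_log)
coef(fit_nls)
```

The discarded observations are exactly the best ratings, which is where an estimate of a very small default probability is most needed.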

12 A better statistical model. Transform-both-sides (TBS), see Carroll and Ruppert (1984, 1988), using a power transformation: $\{\Pr(\text{default} \mid \text{rating}) + \kappa\}^{\lambda} = \{\exp(\beta_0 + \beta_1\,\text{rating}) + \kappa\}^{\lambda}$. $\lambda$ is chosen by residual plots (or maximum likelihood); $\lambda = 1/2$ works well for this example; log transformations are also commonly used; $\kappa > 0$ will shift data away from 0.

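13 The Box-Cox family. The most common transformation family is due to Box and Cox (1964): $h(y, \lambda) = (y^{\lambda} - 1)/\lambda$ if $\lambda \neq 0$, and $h(y, \lambda) = \log(y)$ if $\lambda = 0$. The derivative has a simple form: $h_y(y, \lambda) = \frac{d}{dy} h(y, \lambda) = y^{\lambda - 1}$ for all $\lambda$.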
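A minimal R sketch of the Box-Cox transformation and its derivative; the function names `boxcox` and `boxcox_deriv` and the test values are hypothetical, chosen only for illustration.

```r
# Box-Cox transformation h(y, lambda) and its derivative with respect to y
boxcox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}
boxcox_deriv <- function(y, lambda) y^(lambda - 1)

y <- c(0.5, 1, 2, 4)
boxcox(y, 0.5)   # square-root-type transformation
boxcox(y, 0)     # log transformation

# As lambda approaches 0, the power form approaches log(y)
max(abs(boxcox(y, 1e-6) - log(y)))
```

Subtracting 1 and dividing by $\lambda$ is what makes the family continuous in $\lambda$ at 0, so the log transformation fits in as a limiting case.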

14 TBS fit compared to others [Figure: log(default probability) versus rating; data with the BOW, nonlinear, and TBS fits]

15 Nonlinear regression residuals [Figure: absolute residual versus fitted values; normal QQ plot, sample quantiles versus theoretical quantiles]

16 TBS residuals [Figure: absolute residual versus fitted values; normal QQ plot, sample quantiles versus theoretical quantiles]

17 Estimated default probabilities
method      Pr{default | Aaa}   as % of TEXTBOOK est
TEXTBOOK    0.005%              100%
nonlinear   0.002%              40%
TBS         %                   16%

18 A Similar Problem: Challenger [Figure: number of O-rings distressed versus temperature]

19 Challenger: Extrapolation to 31°. Logistic regression [Figure: number of O-rings distressed versus temperature; fits using all data and using only data with failures]

20 Variance stabilizing transformation: how it works [Figure: y versus log(x); Pop 1 and Pop 2, each shown before and after transformation]

21 Strength of Box-Cox family. Take $a < b$. Then $\frac{h_y(b, \lambda)}{h_y(a, \lambda)} = \left(\frac{b}{a}\right)^{\lambda - 1}$, which is increasing in $\lambda$ and equals 1 when $\lambda = 1$. $\lambda = 1$ is the dividing point between concave and convex transformations: $h(y, \lambda)$ becomes a stronger concave transformation as $\lambda$ decreases from 1, and a stronger convex transformation as $\lambda$ increases from 1.

22 Strength of Box-Cox family, cont. Example: b/a = 2 [Figure: derivative ratio versus α; concave for α < 1, convex for α > 1]
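The derivative ratio for the b/a = 2 example can be tabulated directly. This is a small sketch; the function name `deriv_ratio` and the grid of exponents are hypothetical choices.

```r
# Derivative ratio h_y(b, lambda) / h_y(a, lambda) = (b / a)^(lambda - 1)
deriv_ratio <- function(ratio_ba, lambda) ratio_ba^(lambda - 1)

lambda <- seq(-1, 2, by = 0.5)
data.frame(lambda = lambda, ratio = deriv_ratio(2, lambda))
```

A ratio below 1 means the transformation compresses the larger value relative to the smaller one (concave); a ratio above 1 means it stretches it (convex).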

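23 Maximum likelihood. Up to an additive constant, the log-likelihood is
$L(\beta, \lambda, \sigma) = -n \log(\sigma) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left[ h(Y_i + \kappa, \lambda) - h\{f(X_i, \beta) + \kappa, \lambda\} \right]^2 + (\lambda - 1) \sum_{i=1}^{n} \log(Y_i + \kappa)$,
where the last term comes from the Jacobian. One can maximize over $\sigma$ analytically: $\hat\sigma^2 = n^{-1} \sum_{i=1}^{n} \left[ h(Y_i + \kappa, \lambda) - h\{f(X_i, \beta) + \kappa, \lambda\} \right]^2$; then maximize over $(\beta, \lambda)$ with optim, for example. $\kappa$ is fixed in advance.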
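A hedged sketch of TBS maximum likelihood with `optim`, with $\sigma$ profiled out. The data are simulated; the exponential mean function, the value of `kappa`, the true parameters, and the starting values are all assumptions made for illustration, and the Jacobian term is computed from the shifted response `y + kappa`.

```r
# Illustrative only: simulated data, not the default data from the talk.
set.seed(2)
x <- rep(1:16, each = 5)               # hypothetical "ratings"
n <- length(x)
kappa <- 1e-4                          # shift, fixed in advance (assumed value)
f <- function(x, beta) exp(beta[1] + beta[2] * x)
h <- function(y, lambda) {
  if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
}

# Simulate so that h(y + kappa, 0.5) = h(f + kappa, 0.5) + eps
eps <- rnorm(n, sd = 0.01)
y <- (sqrt(f(x, c(-6, 0.3)) + kappa) + eps / 2)^2 - kappa

# Negative log-likelihood with sigma profiled out (constants dropped)
negloglik <- function(theta) {
  beta <- theta[1:2]
  lambda <- theta[3]
  r <- h(y + kappa, lambda) - h(f(x, beta) + kappa, lambda)
  sigma2_hat <- mean(r^2)
  n / 2 * log(sigma2_hat) - (lambda - 1) * sum(log(y + kappa))
}

# Start from a log-linear fit and lambda = 1 (no transformation)
start <- c(coef(lm(log(y + kappa) ~ x)), lambda = 1)
fit <- optim(start, negloglik, control = list(maxit = 2000))
fit$par   # estimates of (beta0, beta1, lambda)
```

Profiling out $\sigma$ reduces the search to three parameters, which the default Nelder-Mead method in `optim` handles without derivatives.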

24 Reference for TBS: Transformation and Weighting in Regression by Carroll and Ruppert (1988). Lots of examples, but none in finance.

25 1-Year Treasury Constant Maturity Rate, daily data [Figure: rate versus year] Source: Board of Governors of the Federal Reserve System

26 $\Delta R_t$ versus year [Figure: change in rate versus year]

27 $\Delta R_t$ versus $R_{t-1}$ [Figure: change in rate versus lagged rate]

28 $\Delta R_t^2$ versus $R_{t-1}$ [Figure: squared change in rate versus lagged rate]

29 Drift function. Discretized diffusion model: $\Delta R_t = \mu(R_{t-1}) + \sigma(R_{t-1})\,\epsilon_t$, where $\mu(x)$ is the drift function and $\sigma(x)$ is the volatility function (as before).

30 Estimating Volatility
Parametric model (common in practice): $\mathrm{Var}(\Delta R_t) = \beta_0 R_{t-1}^{\beta_1}$.
Nonparametric model: $\mathrm{Var}(\Delta R_t) = \sigma^2(R_{t-1})$, where $\sigma(\cdot)$ is a smooth function that will be modeled as a spline.
In these models: no dependence on $t$.
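Both volatility models can be sketched on a simulated rate series. Everything here is an assumption for illustration: the drift and volatility functions, the series length, and the starting values. `smooth.spline` from base R is used as a stand-in for a penalized spline fit; it is not the software used in the talk.

```r
# Illustrative only: simulated rates, not the Treasury data from the talk.
set.seed(3)
n <- 5000
mu <- function(r) 0.002 * (5 - r)      # assumed mean-reverting drift
sigma <- function(r) 0.02 * r^0.75     # assumed volatility function
eps <- rnorm(n)
R <- numeric(n)
R[1] <- 5
for (t in 2:n) {
  step <- mu(R[t - 1]) + sigma(R[t - 1]) * eps[t]
  R[t] <- max(R[t - 1] + step, 0.1)    # keep the simulated rate positive
}
lag_rate <- R[-n]
diff_rate <- diff(R)
sq_diff <- diff_rate^2

# Parametric model: Var(Delta R_t) = b0 * R_{t-1}^b1
fit_par <- nls(sq_diff ~ b0 * lag_rate^b1,
               start = list(b0 = 0.001, b1 = 1),
               control = nls.control(maxiter = 200, warnOnly = TRUE))

# Nonparametric model: smoothing spline estimate of sigma^2(R_{t-1})
fit_np <- smooth.spline(lag_rate, sq_diff)

# Compare the two variance estimates at a lagged rate of 5
b <- coef(fit_par)
c(par = b[["b0"]] * 5^b[["b1"]], nonpar = predict(fit_np, x = 5)$y)
```

Under the assumed volatility function the variance is proportional to the rate raised to the power 1.5, so the fitted `b1` can be compared with that value.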

31 Spline Software. The penalized spline fits shown here were obtained using the function spm in R's SemiPar package; the author is Matt Wand.

32 Comparing parametric and nonparametric volatility fits [Figure: diff_rate^2 versus lag_rate, with the nonparametric (nonpar) and parametric (par) fits]

33 Comparing parametric and nonparametric volatility fits: zooming in near [Figure: nonpar and par fits versus lag_rate]

34 Spline fitting: Estimation of drift function [Figure: diff_rate versus lag_rate; fitted drift, its 1st derivative, and its 2nd derivative versus lag_rate]

35 Residuals for diffusion model. $\text{residual}_t := \Delta R_t - \mu(R_{t-1})$, with $E(\text{residual}_t) = 0$; $\text{std residual}_t := \text{residual}_t / \sigma(R_{t-1})$, with $E(\text{std residual}_t^2) = 1$.
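The two residual definitions and their moment checks can be sketched as follows. The lagged rates, drift, and volatility are simulated under assumed forms, and the true functions are used in place of estimates to keep the sketch short.

```r
# Illustrative only: simulated data with known drift and volatility.
set.seed(4)
n <- 2000
lag_rate <- runif(n, 1, 10)            # hypothetical lagged rates
mu <- function(r) 0.01 * (5 - r)       # assumed drift
sigma <- function(r) 0.05 * sqrt(r)    # assumed volatility
diff_rate <- mu(lag_rate) + sigma(lag_rate) * rnorm(n)

res <- diff_rate - mu(lag_rate)        # ordinary residuals, mean 0
std_res <- res / sigma(lag_rate)       # standardized residuals, mean square 1

c(mean_res = mean(res), mean_sq_std = mean(std_res^2))

# Ljung-Box test on squared standardized residuals: a check for
# remaining serial dependence in the volatility
Box.test(std_res^2, lag = 10, type = "Ljung-Box")
```

With real data, `mu` and `sigma` would be replaced by fitted drift and volatility functions, and departures from these moment conditions point to misspecification.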

36 Question Are the drift and volatility functions constant in time?

37 Residual plots: ordinary residuals [Figure: residual versus year]

38 Residual plots: standardized residuals [Figure: autocorrelation function, ACF versus lag; normal Q-Q plot, sample quantiles versus theoretical quantiles]

39 Residual plots: squared standardized residuals [Figure: squared std residual versus year]

40 Residual plots: squared standardized residuals [Figure: autocorrelation function, ACF versus lag]

41 GARCH(p, q). The GARCH(p, q) model is $a_t = \epsilon_t \sigma_t$, where $\sigma_t = \sqrt{\alpha_0 + \sum_{i=1}^{q} \alpha_i a_{t-i}^2 + \sum_{i=1}^{p} \beta_i \sigma_{t-i}^2}$ and $\epsilon_t$ is an iid (strong) white noise process. $a_t$ is weak white noise: uncorrelated but with volatility clustering.
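A base-R sketch that simulates the GARCH(1,1) special case and shows the weak white noise property. The function name `simulate_garch11` and the parameter values are hypothetical.

```r
# Simulate a GARCH(1,1) process: a_t = eps_t * sigma_t,
# sigma_t^2 = alpha0 + alpha1 * a_{t-1}^2 + beta1 * sigma_{t-1}^2
simulate_garch11 <- function(n, alpha0, alpha1, beta1) {
  stopifnot(alpha1 + beta1 < 1)        # finite unconditional variance
  eps <- rnorm(n)
  sigma2 <- numeric(n)
  a <- numeric(n)
  sigma2[1] <- alpha0 / (1 - alpha1 - beta1)   # unconditional variance
  a[1] <- sqrt(sigma2[1]) * eps[1]
  for (t in 2:n) {
    sigma2[t] <- alpha0 + alpha1 * a[t - 1]^2 + beta1 * sigma2[t - 1]
    a[t] <- sqrt(sigma2[t]) * eps[t]
  }
  list(a = a, sigma = sqrt(sigma2))
}

set.seed(5)
sim <- simulate_garch11(5000, alpha0 = 0.1, alpha1 = 0.1, beta1 = 0.8)

# Lag-1 autocorrelations: a_t is nearly uncorrelated, a_t^2 is not
acf(sim$a, plot = FALSE)$acf[2]
acf(sim$a^2, plot = FALSE)$acf[2]
```

The autocorrelation in the squared series is the volatility clustering that the squared standardized residual plots are meant to reveal.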

42 GARCH(1,1) fit using garch in tseries
Call: garch(x = std_drift_resid^2, order = c(1, 1))
Model: GARCH(1,1)
Coefficient(s):
      Estimate  Std. Error  t value  Pr(>|t|)
a0                                   <2e-16 ***
a1                                   <2e-16 ***
b1                                   <2e-16 ***
Box-Ljung test
data: Squared.Residuals
X-squared = 0.13, df = 1, p-value =

43 GARCH: estimated conditional standard deviations [Figure: GARCH conditional std versus year]

44 GARCH: squared residuals with lowess smooth [Figure: GARCH squared residuals versus year]

45 GARCH residuals [Figure: ACF of GARCH residuals versus lag]

46 AR(1)/GARCH(1,1)
Call: garchFit(formula = ~arma(1, 0) + garch(1, 1), data = std_drift_resid)
Mean and Variance Equation: data ~ arma(1, 0) + garch(1, 1) [data = std_drift_resid]
Conditional Distribution: norm
Std. Errors: based on Hessian
Error Analysis:
         Estimate  Std. Error  t value  Pr(>|t|)
mu
ar1                                     < 2e-16 ***
omega                                   e-13 ***
alpha1                                  < 2e-16 ***
beta1                                   < 2e-16 ***

47 AR(1)/GARCH(1,1) residuals [Figure: ACF of AR(1)/GARCH(1,1) residuals versus lag]

48 AR(1)/GARCH(1,1) residuals - QQ plot [Figure: normal Q-Q plot, sample quantiles versus theoretical quantiles]

49 AR(1)/GARCH(1,1)
Call: garchFit(formula = ~arma(1, 0) + garch(1, 1), data = std_drift_resid, cond.dist = "std")
Mean and Variance Equation: data ~ arma(1, 0) + garch(1, 1) [data = std_drift_resid]
Conditional Distribution: std
Error Analysis:
         Estimate  Std. Error  t value  Pr(>|t|)
mu
ar1                                     < 2e-16 ***
omega                                   **
alpha1                                  < 2e-16 ***
beta1                                   < 2e-16 ***
shape                                   < 2e-16 ***

50 AR(1)/GARCH(1,1) residuals - QQ plot [Figure: QQ plot using t(3.91), sample quantiles versus theoretical quantiles]

51 Final model for the interest rate changes: $\Delta R_t = \mu(R_{t-1}) + \sigma(R_{t-1})\,a_t$.
1 Model was fit in two steps: (1) estimate $\mu(\cdot)$ and $\sigma(\cdot)$ with spm in SemiPar; (2) model $a_t$ as AR(1)/GARCH(1,1) with garchFit in fGarch.
2 Could the two steps be combined?
3 Would combining them change the results?

52 Reference for spline modeling: Semiparametric Regression by Ruppert, Wand, and Carroll (2003). Lots of examples, but most from biostatistics and epidemiology.

53 Bayesian statistics. Bayesian analysis allows the use of prior information. Hierarchical priors can: specify knowledge that a group of parameters are similar to each other; estimate their common distribution. WinBUGS can be run from inside R using the R2WinBUGS package; there is a similar BRugs package that runs OpenBUGS; BRugs is no longer on CRAN.

54 midcapD.ts in fEcofin package: 500 daily returns on 20 stocks and the market.

55 Goal. The goal is to use the first 100 days to estimate the mean returns for the next 400 days. Four possible estimators: sample means; Bayes estimation (shrinkage); mean of means (total shrinkage); CAPM: (mean return) = beta × (mean market return).

56 Who won?
Estimate        Sum of squared errors
sample means    1.9
Bayes           0.17
mean of means   0.12
CAPM 1
CAPM 2
Squared estimation errors are summed over the 20 stocks. CAPM 1: use mean of first 100 market returns. CAPM 2: use mean of last 400 market returns.

57 Why does shrinkage help? [Figure: two panels (sample means, Bayes), each plotting estimate against target, with the mean marked]

58 Likelihood and prior. $r_{i,t}$ = $t$th return on the $i$th stock; IN = Independent Normal.
Likelihood: $r_{i,t} = \mu_i + \epsilon_{i,t}$, $\epsilon_{i,t} \sim IN(0, \sigma_\epsilon^2)$.
Hierarchical prior: $\mu_i \sim IN(\alpha, \sigma_\mu^2)$.
Diffuse (non-informative) priors on $\alpha$, $\sigma_\epsilon^2$, $\sigma_\mu^2$. Auto- and cross-sectional correlations are ignored (treated as 0).

59 Data-driven shrinkage. Hierarchical prior: $\mu_i \sim IN(\alpha, \sigma_\mu^2)$. The $\mu_i$ are shrunk towards $\alpha$; $\alpha$ should be (approximately) the mean of the means; $\sigma_\mu^2/\sigma_\epsilon^2$ controls the amount of shrinkage; large $\sigma_\mu^2/\sigma_\epsilon^2$ means less shrinkage; the shrinkage is data-driven because $\sigma_\mu^2$ and $\sigma_\epsilon^2$ are estimated.
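The shrinkage idea can be sketched without MCMC. In the normal-normal model with known $\alpha$, $\sigma_\mu^2$, $\sigma_\epsilon^2$ and $T$ observations per stock, the posterior mean of $\mu_i$ is $w \bar r_i + (1 - w)\alpha$ with $w = \sigma_\mu^2 / (\sigma_\mu^2 + \sigma_\epsilon^2 / T)$. The code below plugs in moment estimates (an empirical Bayes stand-in, not the WinBUGS fit from the talk) on simulated returns; all parameter values are assumptions.

```r
# Illustrative only: simulated returns, not midcapD.ts; a moment-based
# (empirical Bayes) stand-in for the hierarchical Bayes fit.
set.seed(6)
n_stocks <- 20
n_days <- 100
alpha <- 0.001                         # assumed common mean
sigma_mu <- 0.0005                     # assumed spread of the true means
sigma_eps <- 0.02                      # assumed daily noise level
mu <- rnorm(n_stocks, alpha, sigma_mu)

# Returns matrix: rows = days, columns = stocks
r <- matrix(rnorm(n_days * n_stocks, mean = rep(mu, each = n_days),
                  sd = sigma_eps), nrow = n_days)

xbar <- colMeans(r)                             # sample means
s2_eps <- mean(apply(r, 2, var))                # pooled estimate of sigma_eps^2
s2_mu <- max(var(xbar) - s2_eps / n_days, 0)    # moment estimate of sigma_mu^2
w <- s2_mu / (s2_mu + s2_eps / n_days)          # weight on the sample mean
shrunk <- w * xbar + (1 - w) * mean(xbar)       # shrink toward mean of means

# Sum of squared estimation errors for the three estimators
sse <- c(sample_means  = sum((xbar - mu)^2),
         shrinkage     = sum((shrunk - mu)^2),
         mean_of_means = sum((mean(xbar) - mu)^2))
sse
```

When the noise variance of a sample mean is large relative to the spread of the true means, `w` is small and the estimator sits close to the mean of means.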

60 WinBUGS output
> print(means.sim,digits=3)
Inference for Bugs model at "midcap.bug", fit using WinBUGS,
3 chains, each with 5100 iterations (first 100 discarded)
n.sims = iterations saved
           mean sd 2.5% 25% 50% 75% 97.5% Rhat n.eff
mu[1]      1.1e e e e e
mu[2]      1.2e e e e e
mu[3]      7.7e e e e e
mu[4]      4.5e e e e e
mu[18]     8.3e e e e e
mu[19]     5.1e e e e e
mu[20]     4.8e e e e e
sigma_mu   1.5e e e e e
sigma_eps  4.3e e e e e
alpha      8.8e e e e e
deviance   1.2e e e e e
For each parameter, n.eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
DIC info (using the rule, pd = Dbar-Dhat)
pd = 4.1 and DIC =
DIC is an estimate of predictive error (lower deviance is better).


More information

Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

More information

Interaction between quantitative predictors

Interaction between quantitative predictors Interaction between quantitative predictors In a first-order model like the ones we have discussed, the association between E(y) and a predictor x j does not depend on the value of the other predictors

More information

5. Multiple regression

5. Multiple regression 5. Multiple regression QBUS6840 Predictive Analytics https://www.otexts.org/fpp/5 QBUS6840 Predictive Analytics 5. Multiple regression 2/39 Outline Introduction to multiple linear regression Some useful

More information

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne

Applied Statistics. J. Blanchet and J. Wadsworth. Institute of Mathematics, Analysis, and Applications EPF Lausanne Applied Statistics J. Blanchet and J. Wadsworth Institute of Mathematics, Analysis, and Applications EPF Lausanne An MSc Course for Applied Mathematicians, Fall 2012 Outline 1 Model Comparison 2 Model

More information

1 Short Introduction to Time Series

1 Short Introduction to Time Series ECONOMICS 7344, Spring 202 Bent E. Sørensen January 24, 202 Short Introduction to Time Series A time series is a collection of stochastic variables x,.., x t,.., x T indexed by an integer value t. The

More information

Lecture 2: ARMA(p,q) models (part 3)

Lecture 2: ARMA(p,q) models (part 3) Lecture 2: ARMA(p,q) models (part 3) Florian Pelgrin University of Lausanne, École des HEC Department of mathematics (IMEA-Nice) Sept. 2011 - Jan. 2012 Florian Pelgrin (HEC) Univariate time series Sept.

More information

SAS Software to Fit the Generalized Linear Model

SAS Software to Fit the Generalized Linear Model SAS Software to Fit the Generalized Linear Model Gordon Johnston, SAS Institute Inc., Cary, NC Abstract In recent years, the class of generalized linear models has gained popularity as a statistical modeling

More information

Studying Achievement

Studying Achievement Journal of Business and Economics, ISSN 2155-7950, USA November 2014, Volume 5, No. 11, pp. 2052-2056 DOI: 10.15341/jbe(2155-7950)/11.05.2014/009 Academic Star Publishing Company, 2014 http://www.academicstar.us

More information

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480

Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500 6 8480 1) The S & P/TSX Composite Index is based on common stock prices of a group of Canadian stocks. The weekly close level of the TSX for 6 weeks are shown: Week TSX Index 1 8480 2 8470 3 8475 4 8510 5 8500

More information

Imputing Values to Missing Data

Imputing Values to Missing Data Imputing Values to Missing Data In federated data, between 30%-70% of the data points will have at least one missing attribute - data wastage if we ignore all records with a missing value Remaining data

More information

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics

Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics Institute of Actuaries of India Subject CT3 Probability and Mathematical Statistics For 2015 Examinations Aim The aim of the Probability and Mathematical Statistics subject is to provide a grounding in

More information

Multiple Linear Regression in Data Mining

Multiple Linear Regression in Data Mining Multiple Linear Regression in Data Mining Contents 2.1. A Review of Multiple Linear Regression 2.2. Illustration of the Regression Process 2.3. Subset Selection in Linear Regression 1 2 Chap. 2 Multiple

More information

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software

Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software STATA Tutorial Professor Erdinç Please follow the directions once you locate the Stata software in your computer. Room 114 (Business Lab) has computers with Stata software 1.Wald Test Wald Test is used

More information

Time Series Analysis

Time Series Analysis Time Series 1 April 9, 2013 Time Series Analysis This chapter presents an introduction to the branch of statistics known as time series analysis. Often the data we collect in environmental studies is collected

More information

Regression Modeling Strategies

Regression Modeling Strategies Frank E. Harrell, Jr. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis With 141 Figures Springer Contents Preface Typographical Conventions

More information

Univariate and Multivariate Methods PEARSON. Addison Wesley

Univariate and Multivariate Methods PEARSON. Addison Wesley Time Series Analysis Univariate and Multivariate Methods SECOND EDITION William W. S. Wei Department of Statistics The Fox School of Business and Management Temple University PEARSON Addison Wesley Boston

More information

Estimating Volatility

Estimating Volatility Estimating Volatility Daniel Abrams Managing Partner FAS123 Solutions, LLC Copyright 2005 FAS123 Solutions, LLC Definition of Volatility Historical Volatility as a Forecast of the Future Definition of

More information

MSwM examples. Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech.

MSwM examples. Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech. MSwM examples Jose A. Sanchez-Espigares, Alberto Lopez-Moreno Dept. of Statistics and Operations Research UPC-BarcelonaTech February 24, 2014 Abstract Two examples are described to illustrate the use of

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7)

Overview Classes. 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) Overview Classes 12-3 Logistic regression (5) 19-3 Building and applying logistic regression (6) 26-3 Generalizations of logistic regression (7) 2-4 Loglinear models (8) 5-4 15-17 hrs; 5B02 Building and

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

3. Regression & Exponential Smoothing

3. Regression & Exponential Smoothing 3. Regression & Exponential Smoothing 3.1 Forecasting a Single Time Series Two main approaches are traditionally used to model a single time series z 1, z 2,..., z n 1. Models the observation z t as a

More information

2. Simple Linear Regression

2. Simple Linear Regression Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

More information

How To Understand The Theory Of Probability

How To Understand The Theory Of Probability Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL

More information

How To Analyze The Time Varying And Asymmetric Dependence Of International Crude Oil Spot And Futures Price, Price, And Price Of Futures And Spot Price

How To Analyze The Time Varying And Asymmetric Dependence Of International Crude Oil Spot And Futures Price, Price, And Price Of Futures And Spot Price Send Orders for Reprints to reprints@benthamscience.ae The Open Petroleum Engineering Journal, 2015, 8, 463-467 463 Open Access Asymmetric Dependence Analysis of International Crude Oil Spot and Futures

More information

Week 5: Multiple Linear Regression

Week 5: Multiple Linear Regression BUS41100 Applied Regression Analysis Week 5: Multiple Linear Regression Parameter estimation and inference, forecasting, diagnostics, dummy variables Robert B. Gramacy The University of Chicago Booth School

More information

Threshold Autoregressive Models in Finance: A Comparative Approach

Threshold Autoregressive Models in Finance: A Comparative Approach University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Informatics 2011 Threshold Autoregressive Models in Finance: A Comparative

More information

Some useful concepts in univariate time series analysis

Some useful concepts in univariate time series analysis Some useful concepts in univariate time series analysis Autoregressive moving average models Autocorrelation functions Model Estimation Diagnostic measure Model selection Forecasting Assumptions: 1. Non-seasonal

More information

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model

Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort xavier.conort@gear-analytics.com Motivation Location matters! Observed value at one location is

More information

Basic Statistical and Modeling Procedures Using SAS

Basic Statistical and Modeling Procedures Using SAS Basic Statistical and Modeling Procedures Using SAS One-Sample Tests The statistical procedures illustrated in this handout use two datasets. The first, Pulse, has information collected in a classroom

More information

i=1 In practice, the natural logarithm of the likelihood function, called the log-likelihood function and denoted by

i=1 In practice, the natural logarithm of the likelihood function, called the log-likelihood function and denoted by Statistics 580 Maximum Likelihood Estimation Introduction Let y (y 1, y 2,..., y n be a vector of iid, random variables from one of a family of distributions on R n and indexed by a p-dimensional parameter

More information

Forecasting in supply chains

Forecasting in supply chains 1 Forecasting in supply chains Role of demand forecasting Effective transportation system or supply chain design is predicated on the availability of accurate inputs to the modeling process. One of the

More information