Presentation: Forecasting with Bayesian Techniques MP
- File type: ppt / pptx (PowerPoint)
- Total slides: 72
- File size: 2.38 MB
- Author: unknown
Slides and text of this presentation:
Slide 3
![Introduction Two Perspectives](/documents_6/063a42c19110f21f69f86d26721441ad/img2.jpg)
Slide content: Introduction: Two Perspectives in Econometrics
Let θ be a vector of parameters to be estimated using data
For example, if y_t ~ i.i.d. N(μ, σ²), then θ = [μ, σ²] are to be estimated from a sample {y_t}
Classical perspective:
there is an unknown true value for θ
we obtain a point estimator as a function of the data:
Bayesian perspective:
θ is an unknown random variable, about which we hold uncertain initial beliefs, summarized by a prior probability distribution
we describe (changing) beliefs about θ in terms of a probability distribution (not as a point estimator!)
Slide 5
![Why a Bayesian Approach to](/documents_6/063a42c19110f21f69f86d26721441ad/img4.jpg)
Slide content: Why a Bayesian Approach to VAR?
Dimensionality problem with VARs:
y contains n variables, p lags in the VAR
The number of parameters in c and A is n(1+np), and the number of parameters in Σ is n(n+1)/2
Assume n=4, p=4: we are then estimating 78 parameters; with n=8, p=4, we have 300 parameters
A tension: better in-sample fit – worse forecasting performance
Sims (Econometrica, 1980) acknowledged the problem:
“Even with a small system like those here, forecasting, especially over relatively long horizons, would probably benefit substantially from use of Bayesian methods or other mean-square-error shrinking devices…”
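The parameter counts on this slide follow directly from the two formulas n(1+np) and n(n+1)/2. A minimal sketch that evaluates them; the function name `var_param_count` is a hypothetical helper, not something from the presentation:

```python
def var_param_count(n: int, p: int) -> dict:
    """Count the free parameters of a VAR with n variables and p lags."""
    coefficients = n * (1 + n * p)   # intercepts c plus the lag matrices A
    covariance = n * (n + 1) // 2    # distinct elements of the symmetric matrix Σ
    return {
        "coefficients": coefficients,
        "covariance": covariance,
        "total": coefficients + covariance,
    }


for n, p in [(4, 4), (8, 4)]:
    print(f"n={n}, p={p}: {var_param_count(n, p)}")
```

The count grows with the square of n, which is why adding variables to an unrestricted VAR quickly exhausts a typical macroeconomic sample.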
Slide 6
![Why a Bayesian Approach to](/documents_6/063a42c19110f21f69f86d26721441ad/img5.jpg)
Slide content: Why a Bayesian Approach to VAR? (2)
Usually, only a fraction of estimated coefficients are statistically significant
parsimonious modeling should be favored
What could we do?
Estimate a VAR with classical methods and use standard tests to exclude variables (e.g., reduce the number of lags)
Use a Bayesian approach to the VAR, which allows for:
interaction between variables
flexible specification of the likelihood of such interaction
Slide 7
![Combining information prior](/documents_6/063a42c19110f21f69f86d26721441ad/img6.jpg)
Slide content: Combining information: prior and posterior
Bayesian coefficient estimates combine information in the prior with evidence from the data
Bayesian estimation captures changes in beliefs about model parameters
Prior: initial beliefs (e.g., before we saw data)
Posterior: new beliefs = evidence from data + initial beliefs
Slide 10
![Introduction to Bayesian](/documents_6/063a42c19110f21f69f86d26721441ad/img9.jpg)
Slide content: Introduction to Bayesian Econometrics: Objects of Interest
Objects of interest:
Prior distribution:
Likelihood function: - likelihood of data at a given value of θ
Joint distribution (of unknown parameters and observables/data):
Marginal likelihood:
Posterior distribution:
i.e., what we have learned about the parameters from (1) the prior and (2) the observed data
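The formulas for these objects did not survive in the slide text. A sketch of how they are usually written, in generic notation: the symbol p(·) for densities and y for the data are assumptions here and may differ from the original slide.

```latex
\begin{align*}
\text{Prior:} \quad & p(\theta) \\
\text{Likelihood:} \quad & p(y \mid \theta) \\
\text{Joint distribution:} \quad & p(y, \theta) = p(y \mid \theta)\, p(\theta) \\
\text{Marginal likelihood:} \quad & p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta \\
\text{Posterior:} \quad & p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)}
  \;\propto\; p(y \mid \theta)\, p(\theta)
\end{align*}
```

Because the marginal likelihood does not depend on θ, the posterior is proportional to the product of likelihood and prior, which is the quantity one works with in practice.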
Slide 12
![Bayesian Econometrics](/documents_6/063a42c19110f21f69f86d26721441ad/img11.jpg)
Slide content: Bayesian Econometrics: maximizing criterion
For practical purposes, it is useful to focus on the criterion:
Traditionally, priors that let us obtain analytical expressions for the posterior would be needed
Today, with increased computer power, we can use any prior and likelihood distribution, as long as we can evaluate them numerically
Then we can use Markov Chain Monte-Carlo (MCMC) methods to simulate the posterior distribution (not covered in this lecture)
Slide 15
![Estimating a Sample Mean Let](/documents_6/063a42c19110f21f69f86d26721441ad/img14.jpg)
Slide content: Estimating a Sample Mean
Let y_t ~ i.i.d. N(μ, σ²), then the data density function is:
where y = {y_1, …, y_T}
For now: assume the variance σ² is known (certain)
Assume the prior distribution of the mean μ is normal, μ ~ N(m, σ²/ν):
where the key parameters of the prior distribution are m and ν
Slide 16
![Estimating a Sample Mean The](/documents_6/063a42c19110f21f69f86d26721441ad/img15.jpg)
Slide content: Estimating a Sample Mean
The posterior of μ:
…has the following analytical form
with
So, we “mix” the prior mean m and the sample average (data)
Note:
The posterior distribution of μ is also normal: μ | y ~ N(m*, σ²/(ν+T))
Diffuse prior: ν→0 (prior is not informative, everything is in data)
Tight prior: ν→ ∞ (data not important, prior is rather informative)
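The slide formulas are missing from the extracted text. A sketch of the standard derivation for this setup (known σ², prior μ ~ N(m, σ²/ν)); the symbol \bar{y} for the sample average and f(·) for densities are assumptions here:

```latex
\begin{align*}
f(y \mid \mu) &= (2\pi\sigma^2)^{-T/2}
  \exp\!\Big\{-\frac{1}{2\sigma^2}\sum_{t=1}^{T}(y_t-\mu)^2\Big\} \\
f(\mu) &= \Big(\frac{2\pi\sigma^2}{\nu}\Big)^{-1/2}
  \exp\!\Big\{-\frac{\nu}{2\sigma^2}(\mu-m)^2\Big\} \\
f(\mu \mid y) &\propto f(y \mid \mu)\, f(\mu)
  \;\propto\; \exp\!\Big\{-\frac{\nu+T}{2\sigma^2}(\mu-m^*)^2\Big\} \\
m^* &= \frac{\nu}{\nu+T}\, m + \frac{T}{\nu+T}\, \bar{y},
  \qquad \bar{y} = \frac{1}{T}\sum_{t=1}^{T} y_t
\end{align*}
```

The weights make the two limiting cases explicit: as ν→0 the posterior mean m* tends to the sample average, and as ν→∞ it tends to the prior mean m.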
Slide 17
![Estimating a Sample Mean](/documents_6/063a42c19110f21f69f86d26721441ad/img16.jpg)
Slide content: Estimating a Sample Mean: Example
Assume the true distribution is Normal: y_t ~ N(3, 1)
So, μ=3 is known to… God
A researcher (one of us) does not know μ
for him/her, it is a normally distributed random variable, μ ~ N(m, 1/ν)
The researcher initially believes that m=1 and ν=1, so his/her prior is μ ~ N(1, 1)
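A minimal simulation of this example, assuming the standard known-variance update in which the posterior mean is (ν·m + T·ȳ)/(ν+T) and the posterior variance is σ²/(ν+T). The function name, the seed, and the sample sizes are illustrative choices, not taken from the slides:

```python
import numpy as np


def posterior_mean_known_variance(y, m, nu, sigma2=1.0):
    """Posterior mean and variance of μ for y_t ~ N(μ, σ²) with prior μ ~ N(m, σ²/ν)."""
    y = np.asarray(y, dtype=float)
    T = y.size
    m_star = (nu * m + T * y.mean()) / (nu + T)  # weighted mix of prior mean and sample average
    var_star = sigma2 / (nu + T)                 # shrinks as more data arrive
    return m_star, var_star


rng = np.random.default_rng(0)  # fixed seed so the run is reproducible
for T in (5, 50, 500):
    y = rng.normal(loc=3.0, scale=1.0, size=T)  # data from the "true" N(3, 1)
    m_star, var_star = posterior_mean_known_variance(y, m=1.0, nu=1.0)
    print(f"T={T:4d}  posterior mean={m_star:.3f}  posterior variance={var_star:.4f}")
```

With ν=1 the prior carries the weight of a single observation, so the posterior mean should move from the prior value of 1 towards the sample average as T grows.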
Slide 21
![Examples Regression Model I](/documents_6/063a42c19110f21f69f86d26721441ad/img20.jpg)
Slide content: Examples: Regression Model I (2)
Assume that the prior for β is multivariate Normal, β ~ N(m, σ²M):
where the key parameters of the prior distribution are m and M
Bayes' rule states:
i.e., the posterior of β is proportional to the product of the data density and the prior
Slide 23
![Since we do not like black](/documents_6/063a42c19110f21f69f86d26721441ad/img22.jpg)
Slide content: Since we do not like black boxes… there are 2 ways to get m* and M* (2 parameters to characterize posterior)
The long: manipulate the product of density functions (see Hamilton's book, p. 367)
The smart: use GLS regression…
We have 2 ingredients:
prior distribution , which implies
and our regression model that “catches” the impact of the data on the estimate of β
Slide 24
![Define a new regression model](/documents_6/063a42c19110f21f69f86d26721441ad/img23.jpg)
Slide content: Define a “new” regression model
We simply stack our “ingredients” together to mix the information (prior and data) so that now β takes into account both!
The GLS estimator of β… is exactly our posterior mean
And the posterior variance of β is
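The stacking argument can be checked numerically. The sketch below assumes the standard result for a prior β ~ N(m, σ²M) with known σ²: posterior mean m* = (M⁻¹ + X'X)⁻¹(M⁻¹m + X'y) and posterior variance σ²(M⁻¹ + X'X)⁻¹. It compares that formula with GLS on the stacked system [m; y] = [I; X]β + error, where the error covariance is σ²·blockdiag(M, I). All function and variable names are illustrative, and the data are simulated.

```python
import numpy as np


def posterior_via_formula(X, y, m, M, sigma2):
    """Closed-form posterior mean and variance for β ~ N(m, σ²M), known σ²."""
    M_inv = np.linalg.inv(M)
    M_star = np.linalg.inv(M_inv + X.T @ X)
    m_star = M_star @ (M_inv @ m + X.T @ y)
    return m_star, sigma2 * M_star


def posterior_via_stacked_gls(X, y, m, M, sigma2):
    """GLS on the stacked 'prior + data' regression."""
    T, k = X.shape
    y_stack = np.concatenate([m, y])      # prior means on top of the data
    X_stack = np.vstack([np.eye(k), X])   # identity block for the prior "observations"
    # Error covariance up to the common factor σ²: M for the prior block, I for the data.
    V = np.block([[M, np.zeros((k, T))], [np.zeros((T, k)), np.eye(T)]])
    V_inv = np.linalg.inv(V)
    A = X_stack.T @ V_inv @ X_stack
    beta_gls = np.linalg.solve(A, X_stack.T @ V_inv @ y_stack)
    return beta_gls, sigma2 * np.linalg.inv(A)


rng = np.random.default_rng(0)
T, k, sigma2 = 40, 3, 0.5
X = rng.normal(size=(T, k))
y = X @ np.array([1.0, -0.5, 0.25]) + np.sqrt(sigma2) * rng.normal(size=T)
m = np.zeros(k)          # hypothetical prior mean
M = 2.0 * np.eye(k)      # hypothetical prior scale matrix

mean_a, var_a = posterior_via_formula(X, y, m, M, sigma2)
mean_b, var_b = posterior_via_stacked_gls(X, y, m, M, sigma2)
print("means agree:    ", np.allclose(mean_a, mean_b))
print("variances agree:", np.allclose(var_a, var_b))
```

The explicit inverse of V is used only to keep the sketch close to the textbook GLS formula; for large T one would exploit the block-diagonal structure instead.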
Slide 25
![Examples Regression Model II](/documents_6/063a42c19110f21f69f86d26721441ad/img24.jpg)
Slide content: Examples: Regression Model II
So far life was easy(-ier): in the linear regression model
β was random and unknown, but σ² was fixed and known
What if σ² is random and unknown?..
Bayes' rule states:
i.e., the posterior of β and σ² is proportional to the product of the density of the data, the prior of β (given σ²) and the prior of σ²
Slide 26
![Examples Regression Model II](/documents_6/063a42c19110f21f69f86d26721441ad/img25.jpg)
Slide content: Examples: Regression Model II
To manipulate the product
…we assume the following distributions:
Normal for data
Normal for the prior for β (conditional on σ²): β | σ² ~ N(m, σ²M)
and Inverse-Gamma for the prior for σ²: σ² ~ IG(λ,l)
Note: inverse-gamma is handy! It guarantees that random draws satisfy σ² > 0!
Slide 28
![Priors summary In the above](/documents_6/063a42c19110f21f69f86d26721441ad/img27.jpg)
Slide content: Priors: summary
In the above examples we dealt with 2 types of prior distributions of our parameters:
Case 1 prior
assumes β is unknown and normally distributed (Gaussian)
σ² is a known parameter
the assumption of Gaussian errors delivers a normal posterior distribution for β
Case 2 (conjugate) priors
assumes β and σ² are unknown
β and σ² have Normal and Inverse-Gamma prior distributions, respectively
combined with Gaussian errors, this delivers posterior distributions for β and σ² from the same families
Slide 29
![Bayesian VARs Linear](/documents_6/063a42c19110f21f69f86d26721441ad/img28.jpg)
Slide content: Bayesian VARs
The linear regression examples will help us deal with our main object: Bayesian VARs
A VAR is typically written as
where y_t contains n variables, the VAR includes p lags, and the data sample size is T
We have seen that it is convenient to work with a matrix representation for a regression
Can we get it for our VAR? Yes!
…and it will help to get posteriors for our parameters
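The matrix representation can be sketched in code. The VAR equation itself is missing from the extracted slide, so the form assumed here is the textbook one, y_t = c + A_1 y_{t-1} + … + A_p y_{t-p} + e_t, stacked as Y = XB + E, where each row of X is [1, y'_{t-1}, …, y'_{t-p}]. The helper name `build_var_matrices` and the toy data are hypothetical.

```python
import numpy as np


def build_var_matrices(data, p):
    """Stack a (T_total x n) data array into Y and X for a VAR(p) with a constant.

    Row t of X holds [1, y_{t-1}', ..., y_{t-p}'], so Y = X @ B + E,
    where B is (1 + n*p) x n and column i of B belongs to equation i.
    """
    data = np.asarray(data, dtype=float)
    T_total, n = data.shape
    Y = data[p:]                                                 # left-hand side, T_total - p rows
    lags = [data[p - k:T_total - k] for k in range(1, p + 1)]    # lag k block
    X = np.hstack([np.ones((T_total - p, 1))] + lags)
    return Y, X


toy = np.arange(12.0).reshape(6, 2)      # 6 periods, 2 variables
Y, X = build_var_matrices(toy, p=2)
print(Y.shape, X.shape)                  # effective sample and regressor dimensions
print(X[0])                              # constant, first lag, second lag

# With Y and X in hand, equation-by-equation OLS is one least-squares call.
B_ols = np.linalg.lstsq(X, Y, rcond=None)[0]
print(B_ols.shape)
```

In this layout the effective sample loses the first p observations, which serve only as initial lags.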
Slide 31
![How to Estimate a BVAR Case](/documents_6/063a42c19110f21f69f86d26721441ad/img30.jpg)
Slide content: How to Estimate a BVAR: Case 1 Prior
Consider Case 1 prior for a VAR:
coefficients in A are unknown with multivariate Normal prior distribution:
and known Σe
“Old trick” to get the posterior: use the GLS estimator (see Appendix C for details)
Result
So the posterior distribution is multivariate normal
Slide 32
![How to Estimate a BVAR Case](/documents_6/063a42c19110f21f69f86d26721441ad/img31.jpg)
Slide content: How to Estimate a BVAR: Case 2 (conjugate) Priors
Before we turn to the case of an unknown Σe
we need to introduce a multivariate distribution to characterize the unknown random error covariance matrix Σe
Consider a matrix
Each row is a draw from N(0,S)
The n×n matrix
has an Inverse Wishart distribution with l degrees of freedom: Σe ~ IWn(S,l)
If Σe ~ IWn(S,l), then Σe⁻¹ follows a Wishart distribution: Σe⁻¹ ~ Wn(S⁻¹,l)
The Wishart distribution might be more convenient
Σe⁻¹ is a measure of precision (since Σe is a measure of dispersion)
Slide 33
![How to Estimate a BVAR](/documents_6/063a42c19110f21f69f86d26721441ad/img32.jpg)
Slide content: How to Estimate a BVAR: Conjugate Priors
Assume Conjugate priors:
The VAR parameters A and Σe are both unknown
prior for A is multivariate Normal:
and for Σe is Inverse Wishart:
Follow the analogy with the univariate regression examples to write down the moments of the posterior distributions
Recall matrix representation for our VAR:
Posterior for A is multivariate normal:
Posterior for Σe is Inverse Wishart:
See Appendix D for details
Slide 34
![BVARs Minnesota Prior](/documents_6/063a42c19110f21f69f86d26721441ad/img33.jpg)
Slide content: BVARs: Minnesota Prior Implementation
The Minnesota prior – a particular case of the “Case 1 prior” (unknown model coefficients, but known error variance):
Assume a random walk is a reasonable model for every y_it in the VAR
Hence, for every y_it
the coefficient on the first own lag y_i,t-1 has a prior mean of 1
the coefficients on all other lags y_i,t-k, y_j,t-1, y_j,t-k have a prior mean of 0
So, our prior for the coefficients of a VAR(2) example would be:
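The prior-mean matrix from the slide is missing in the extracted text. A sketch of what the random-walk prior mean looks like for a bivariate VAR(2), assuming the coefficient matrix is laid out with rows [constant, lag-1 block, lag-2 block] and one column per equation. The layout, the zero prior mean on the constant, and the helper name are assumptions made for illustration.

```python
import numpy as np


def minnesota_prior_mean(n, p, include_constant=True):
    """Random-walk prior mean: 1 on each variable's own first lag, 0 elsewhere.

    Rows: optional constant, then n rows per lag; column i is equation i.
    """
    offset = 1 if include_constant else 0
    B0 = np.zeros((offset + n * p, n))
    B0[offset:offset + n, :] = np.eye(n)   # first-lag block: own lags get prior mean 1
    return B0


print(minnesota_prior_mean(n=2, p=2))
```

Every entry outside the first-lag identity block is zero, so each equation's prior says that the variable is best predicted by its own previous value.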
Slide 35
![BVARs Minnesota Prior](/documents_6/063a42c19110f21f69f86d26721441ad/img34.jpg)
Slide content: BVARs: Minnesota Prior Implementation
The Minnesota prior
The prior variance for the coefficient of lag k in equation i for variable j is:
… and depends only on three hyperparameters:
the tightness parameter γ (typically the same in all equations)
the relative weight parameter w, which is 1 for own lags and less than 1 for other variables
the parameter q, which governs how the prior tightens with the lag (often set to 1)
is a “scale correction”
the ratio of residual variances for OLS-estimated AR:
Slide 36
![BVARs Minnesota Prior](/documents_6/063a42c19110f21f69f86d26721441ad/img35.jpg)
Slide content: BVARs: Minnesota Prior Implementation
The Minnesota prior
Interpretation:
the prior on the first own lag is
the prior on the own lag k is
the prior std. dev. declines at a rate k, i.e. coefficients for longer lags are more likely to be close to 0
the prior on the first lag of another variable is
the prior std. dev. is reduced by a factor w: i.e. it is more likely that the first lags of other variables are irrelevant
the prior std. dev. on other variables’ lags
declines at a rate k
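The variance formula itself is missing from the extracted slides. The sketch below assumes a common Litterman-style form that matches the verbal description: the prior standard deviation for lag k of variable j in equation i is γ · w_ij · k^(−q) · σ_i/σ_j, with w_ij = 1 for own lags and w otherwise, and σ_i the residual standard deviation from a univariate AR fitted by OLS. This exact functional form, the function name, and the example numbers are assumptions, not a transcription of the slide.

```python
import numpy as np


def minnesota_prior_std(sigma, p, gamma=0.2, w=0.5, q=1.0):
    """Prior std. dev. S[k-1, i, j] for lag k of variable j in equation i.

    sigma: residual standard deviations from univariate AR regressions
           (their ratio is the "scale correction").
    """
    sigma = np.asarray(sigma, dtype=float)
    n = sigma.size
    S = np.empty((p, n, n))
    for k in range(1, p + 1):
        for i in range(n):
            for j in range(n):
                weight = 1.0 if i == j else w      # own lags are not down-weighted
                S[k - 1, i, j] = gamma * weight / k**q * sigma[i] / sigma[j]
    return S


S = minnesota_prior_std(sigma=[1.0, 2.0], p=2)
print("first own lag:          ", S[0, 0, 0])   # equals γ
print("second own lag:         ", S[1, 0, 0])   # γ divided by k^q
print("first lag of other var.:", S[0, 0, 1])   # γ·w times the scale correction
```

Under this form the own-lag standard deviation falls as 1/k when q=1, and cross-variable lags are further scaled by w, which matches the interpretation given on the slide.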
Slide 37
![Remarks Remarks The overall](/documents_6/063a42c19110f21f69f86d26721441ad/img36.jpg)
Slide content: Remarks:
The overall tightness of the prior is governed by γ
with a smaller γ, the model for y_it shrinks towards a random walk
The effect of other lagged variables is controlled by w
with a smaller w, the estimates shrink towards an AR model (y_it is not affected by y_jt)
Practitioner’s advice (RATS Manual) on the choice of hyperparameters:
Set γ=0.2, w=0.5
Focus on forecast errors statistics, when selecting alternative hyperparameters
Loosen priors on own lags and tighten on other lags to improve
Substitute priors manually if there is a strong reason
Slide 38
![BVARs Prior Selection](/documents_6/063a42c19110f21f69f86d26721441ad/img37.jpg)
Slide content: BVARs: Prior Selection
Minnesota and conjugate priors are useful (e.g., to obtain closed-form solutions), but can be too restrictive:
Independence across equations
Symmetry in the prior can sometimes be a problem
Increased computer power allows us to simulate more general prior distributions using numerical methods
Three examples:
DSGE-VAR approach: Del Negro and Schorfheide (IER, 2004)
Explore different prior distributions and hyperparameters: Kadiyala and Karlsson (1997)
Choosing the hyperparameters to maximize the marginal likelihood: Giannone, Lenza and Primiceri (2011)
Slide 39
![Del Negro and Schorfheide](/documents_6/063a42c19110f21f69f86d26721441ad/img38.jpg)
Slide content: Del Negro and Schorfheide (2004): DSGE-VAR Approach
We want to estimate a BVAR model
We also have a DSGE model for the same variables
It can be solved and linearized: approximated with a reduced-form (RF) VAR
Then, we can use coefficients from the DSGE-based VAR as prior means to estimate the BVAR
Several advantages:
DSGE-VAR may improve forecasts by restricting parameter values
At the same time, it can improve the empirical performance of the DSGE model by relaxing its restrictions
Our priors (from DSGE) are based on deep structural parameters consistent with economic theory
Slide 40
![Del Negro and Schorfheide We](/documents_6/063a42c19110f21f69f86d26721441ad/img39.jpg)
Slide content: Del Negro and Schorfheide (2004)
We estimate the following BVAR:
The solution for the DSGE model has a reduced-form VAR representation
where θ are deep structural parameters
Idea:
Combine the artificial observations with the T actual observations (Y,X) to get the posterior distribution
T*=λT “artificial” observations are generated from the DSGE model: (Y*,X*)
Slide 41
![Del Negro and Schorfheide](/documents_6/063a42c19110f21f69f86d26721441ad/img40.jpg)
Slide content: Del Negro and Schorfheide (2004)
Parameter λ is a “weight” of “artificial” (prior) data from DSGE
λ=0 delivers OLS-estimated VAR: i.e. DSGE not important
Large λ shrinks coefficients towards the DSGE solution: i.e. data not important
To find an “optimal” λ, the marginal likelihood is maximized (see Appendix E)
Can implement the procedure analytically… let’s see
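The role of λ can be illustrated with a simplified sketch: least squares on the actual sample stacked with λT artificial observations. This is only a stand-in for the full procedure. It uses simulated artificial data from a made-up coefficient matrix rather than a solved DSGE model, and it ignores the prior on the structural parameters. All names and numbers are hypothetical.

```python
import numpy as np


def mixed_estimate(Y, X, Y_art, X_art):
    """Least squares on actual data (Y, X) stacked with artificial data (Y_art, X_art)."""
    XtX = X.T @ X + X_art.T @ X_art
    XtY = X.T @ Y + X_art.T @ Y_art
    return np.linalg.solve(XtX, XtY)


rng = np.random.default_rng(1)
T = 80
B_data = np.array([[0.2], [0.9]])   # coefficients generating the "actual" data
B_dsge = np.array([[0.0], [0.5]])   # hypothetical coefficients implied by the theory model


def simulate(B, size):
    """Simulate a regression with a constant and one regressor."""
    X = np.column_stack([np.ones(size), rng.normal(size=size)])
    Y = X @ B + 0.1 * rng.normal(size=(size, 1))
    return Y, X


Y, X = simulate(B_data, T)
for lam in (0.0, 1.0, 10.0):
    Y_art, X_art = simulate(B_dsge, int(lam * T))   # T* = λT artificial observations
    estimate = mixed_estimate(Y, X, Y_art, X_art)
    print(f"lambda={lam:5.1f}  estimate={estimate.ravel().round(3)}")
```

With λ=0 there are no artificial rows and the estimate is plain OLS; as λ grows, the artificial observations dominate and the estimate is pulled towards the theory-implied coefficients.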
Slide 48
![Kadiyala and Karlsson Small](/documents_6/063a42c19110f21f69f86d26721441ad/img47.jpg)
Slide content: Kadiyala and Karlsson (1997)
Small Model: a bivariate VAR with unemployment and industrial production
Sample period: 1964:1 to 1990:4.
Estimate the model through 1978:4
Criterion to choose hyperparameters: forecasting performance over 1979:1-1982:3
Use the remaining sub-sample 1982:4-1990:4 for forecasting
Large “Litterman” Model: a VAR with 7 variables (real GNP, inflation, unemployment, money, investment, interest rate and inventories)
Sample period: 1948:1 to 1986:4.
Estimate the model through 1980:1
Use the remaining sub-sample 1980:2-1986:4 for forecasting
Slide 50
![Prior distributions in K amp](/documents_6/063a42c19110f21f69f86d26721441ad/img49.jpg)
Slide content: Prior distributions in K&K
K&K use a number of competing prior distributions…
Minnesota, Normal-Wishart, Normal-Diffuse, Extended Natural Conjugate (see appendix E)
… for and
Parameters of the prior distribution for :
each yit is a random walk (just as in Minnesota priors above)
The variance of each coefficient depends on two hyper-parameters w, :
Slide 54
![Giannone, Lenza and Primiceri](/documents_6/063a42c19110f21f69f86d26721441ad/img53.jpg)
Slide content: Giannone, Lenza and Primiceri (2011)
Use three VARs to compare forecasting performance
Small VAR: GDP, GDP deflator, Federal Funds rate for the U.S.
Medium VAR: includes small VAR plus consumption, investment, hours worked and wages
Large VAR: expand the medium VAR with up to 22 variables
The prior distributions of the VAR parameters ϴ={, Σ, Σe} depend on a small number of hyperparameters
The hyperparameters are themselves uncertain and follow either gamma or inverse gamma distributions
This is in contrast to the Minnesota prior, where hyperparameters are fixed!
Slide 56
![Giannone, Lenza and Primiceri](/documents_6/063a42c19110f21f69f86d26721441ad/img55.jpg)
Slide content: Giannone, Lenza and Primiceri (2011)
We interpret the model as a hierarchical model by replacing p_γ(θ) = p(θ|γ) and evaluate the marginal likelihood:
The hyperparameters γ are uncertain
Informativeness of their prior distribution is chosen via maximizing the posterior distribution
Maximizing the posterior of γ corresponds to maximizing the one-step ahead forecasting accuracy of the model
Slide 58
![In all cases BVARs](/documents_6/063a42c19110f21f69f86d26721441ad/img57.jpg)
Slide content: In all cases BVARs demonstrate better forecasting performance vis-à-vis the unrestricted VARs
BVARs are roughly on par with the factor models, known to be good forecasting devices
Slide 59
![Conclusions BVARs is a useful](/documents_6/063a42c19110f21f69f86d26721441ad/img58.jpg)
Slide content: Conclusions
BVARs are a useful tool to improve forecasts
This is not a “black box”
posterior distribution parameters are typically functions of prior parameters and data
Choice of priors can go:
from a simple Minnesota prior (that is convenient for analytical results)
…to a full-fledged DSGE model that incorporates theory-consistent structural parameters
The choice of hyperparameters for the prior depends on the nature of the time series we want to forecast
No “one size fits all” approach
Slide 61
![Appendix A Remarks about the](/documents_6/063a42c19110f21f69f86d26721441ad/img60.jpg)
Slide content: Appendix A: Remarks about the marginal likelihood
If we have M_1, …, M_N competing models, the marginal likelihood of model M_j, f({y_t} | M_j), can be seen as:
The update on the weight of model Mj after observing the data
The out-of-sample prediction record of model j.
Model comparison between two models is performed with the posterior odds ratio:
Favors parsimonious modeling: in-built “Occam’s Razor.”
Slide 62
![Appendix A Remarks about the](/documents_6/063a42c19110f21f69f86d26721441ad/img61.jpg)
Slide content: Appendix A: Remarks about the marginal likelihood
Predict the first observation using the prior:
Record the first observable and its probability, f(y_1^o). Update your beliefs:
Predict the second observation:
Record f(y_2^o | y_1^o).
Eventually, you get f({y^o}) = f(y_1^o) f(y_2^o | y_1^o) … f(y_T^o | y_1^o, y_2^o, …, y_{T-1}^o).
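This predict-record-update recursion can be checked numerically in the simplest case. The sketch below assumes the known-variance model y_t ~ N(μ, σ²) with prior μ ~ N(m, σ²/ν), for which both the one-step predictive densities and the joint density of the sample are normal. It compares the sum of log predictive densities with the log of the joint density evaluated directly. Function names and the simulated data are illustrative.

```python
import numpy as np


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)


def log_ml_sequential(y, m, nu, sigma2):
    """Sum of log one-step-ahead predictive densities, updating beliefs after each observation."""
    log_ml, m_t, nu_t = 0.0, m, nu
    for obs in np.asarray(y, dtype=float):
        pred_var = sigma2 * (1.0 + 1.0 / nu_t)      # data noise plus uncertainty about μ
        log_ml += log_normal_pdf(obs, m_t, pred_var)
        m_t = (nu_t * m_t + obs) / (nu_t + 1.0)     # update the belief about μ
        nu_t += 1.0
    return log_ml


def log_ml_joint(y, m, nu, sigma2):
    """Log marginal likelihood from the joint normal: y ~ N(m·1, σ²(I + 11'/ν))."""
    y = np.asarray(y, dtype=float)
    T = y.size
    cov = sigma2 * (np.eye(T) + np.ones((T, T)) / nu)
    dev = y - m
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (T * np.log(2 * np.pi) + logdet + dev @ np.linalg.solve(cov, dev))


rng = np.random.default_rng(0)
y = rng.normal(loc=3.0, scale=1.0, size=20)
seq = log_ml_sequential(y, m=1.0, nu=1.0, sigma2=1.0)
joint = log_ml_joint(y, m=1.0, nu=1.0, sigma2=1.0)
print(f"sequential: {seq:.6f}")
print(f"joint:      {joint:.6f}")
```

The two numbers should coincide up to floating-point error, which is the sense in which the marginal likelihood records a model's out-of-sample prediction performance.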
Slide 63
![Appendix B Linear Regression](/documents_6/063a42c19110f21f69f86d26721441ad/img62.jpg)
Slide content: Appendix B: Linear Regression with conjugate priors
To calculate the posterior distribution for parameters
…we assume the following distributions:
Normal for data
Normal for the prior for β (conditional on σ²): β | σ² ~ N(m, σ²M)
and Inverse-Gamma for the prior for σ²: σ² ~ IΓ(λ,k)
Next consider the product
Slide 68
![Appendix D How to Estimate a](/documents_6/063a42c19110f21f69f86d26721441ad/img67.jpg)
Slide content: Appendix D: How to Estimate a BVAR: Conjugate Priors
Note that in the case of the Conjugate priors we rely on the following VAR representation
… while in the Minnesota priors case we employed
Though, if we have priors for vectorized coefficients in the form
we can also get priors for coefficients in the matrix form
For the mean we simply need to convert α back to the matrix form A
The variance matrix for can be obtained from the variance for :