“main” — 2012/8/18 — 14:12 — page 293 — #1 Pesquisa Operacional (2012) 32(2): 293-313 © 2012 Brazilian Operations Research Society Printed version ISSN 0101-7438 / Online version ISSN 1678-5142 www.scielo.br/pope COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD FOR ARCH MODELS USING BRAZILIAN FINANCIAL TIME SERIES Sandra C. Oliveira1* and Marinho G. Andrade2 Received March 16, 2009 / Accepted September 14, 2011 ABSTRACT. In this work we compared the estimates of the parameters of ARCH models using a com- plete Bayesian method and an empirical Bayesian method in which we adopted a non-informative prior distribution and informative prior distribution, respectively. We also considered a reparameterization of those models in order to map the space of the parameters into real space. This procedure permits choosing prior normal distributions for the transformed parameters. The posterior summaries were obtained using Monte Carlo Markov chain methods (MCMC). The methodology was evaluated by considering the Telebras series from the Brazilian financial market. The results show that the two methods are able to adjust ARCH models with different numbers of parameters. The empirical Bayesian method provided a more parsimo- nious model to the data and better adjustment than the complete Bayesian method. Keywords: ARCH models, Bayesian approach, MCMC methods. 1 INTRODUCTION The dynamics of world financial markets requires increasingly sophisticated, complex and effi- cient models to describe the trends and characteristics of financial assets as accurately as possible. The analysis of financial series shows they have a high rate of change of the conditional variance in time. The square root of this rate (standard deviation) is called volatility. Understanding how volatility changes over time is critical to the financial market, influencing the risk assessment of investments and asset pricing. It determines the degree of change of the price of the asset in the future, i.e., a low value implies small changes (low risk) in the future, whereas a high value implies significant variations (high risk) (Enders, 2009). *Corresponding author 1Curso de Administração, Campus Experimental de Tupã – CET, Universidade Estadual Paulista – UNESP, Av. Domin- gos da Costa Lopes, 780, 17602-496 Tupã, SP, Brazil. E-mail: sandra@tupa.unesp.br 2Departamento de Matemática Aplicada e Estatı́stica – SME, Instituto de Ciências Matemáticas e de Computação – ICMC, Universidade de São Paulo – USP, Av. do Trabalhador Sãocarlense, 400, 13566-590 São Carlos, SP, Brazil. E-mail: marinho@icmc.usp.br “main” — 2012/8/18 — 14:12 — page 294 — #2 294 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD Let pt be the price of a given asset at time t , normally a business day. Suppose first that no dividend was paid during the period. The compounded continuous return or simply log-return is given by zt = ln ( pt / pt−1 ) . This definition is commonly used in the literature and in this paper zt will be called simply return. In practice, it is preferable to work with returns which are free of scale than with prices, because the former are easier to model using techniques of time series analysis (Morettin, 2008). The characterization of the statistical properties of financial return series is essential for a cor- rect application of models to the data, allowing inferences about the characteristics of this return, especially with regard to the mean and variance, which determine the expected return and volatil- ity forecast for the coming periods. Estimates of these quantities are fundamental to investment decisions not only in assets but also on their derivatives. There are a large number of nonlinear models for estimating the volatility of financial asset return series. The most widespread in the literature are the autoregressive conditional heteroskedas- ticity (ARCH) models, proposed by Engle (1982), and their extension, the generalized ARCH (GARCH) models, proposed by Bollerslev (1986). These models feature a nonlinear dependence among the returns, according to the serial dependence of the conditional variance. A compre- hensive review of the properties of these models can be found in Degiannakis & Xekalaki (2004) and Bollerslev (2008). Since it is considered that the volatility at a given instant of time depends on past values of the series, the determination of maximum likelihood estimators (MLE) of the parameters of ARCH models requires maximizing a nonlinear function. Therefore, the estimates can only be obtained numerically. Engle (1982) suggested the use of Newton’s method as an iterative method to calcu- late maximum likelihood estimates. This relaxes the restrictions on parameters (e.g., they must be positive and their sum must be less than one), which ensures stationary covariance. On the other hand, the determination of asymptotic properties of MLE with restrictions involves some difficulties and may lead to local maxima, since properties such as asymptotic normality do not allow restrictions (Barndorff-Nielsen & Cox, 1994). In addition, procedures for identification, di- agnosis and adjustment of the models, as well as prediction values of econometric series, require properties of asymptotic theory. Because the models are very far from linear, the asymptotic properties of these estimators can only be verified for very long series and, in general, are more appropriate in the presence of symmetrical distribution of the errors and normal distribution of the data (Geweke, 1986). In this context, the use of bootstrap methods can be considered to improve estimates of the parameters of models of the ARCH family. But the results of this approach can be severely affected in the MLE calculation, which is performed several times in the procedure and presents convergence difficulties, also due to the restrictions imposed on the parameters (Oliveira, 2005). An alternative to estimate these models, bypassing these difficulties, is to consider a Bayesian approach. To the best of our knowledge, Geweke (1989b) was the first author to propose a Bayesian ap- proach for ARCH models, in which a particular case of reparameterization allowed the use of Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 295 — #3 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 295 non-informative prior distributions. The parameters were obtained using algorithms of the Monte Carlo simulation. In a subsequent work, a Bayesian semi-nonparametric approach was proposed for ARCH models by Koop (1994), who used the advantages of Geweke’s Bayesian methodology and also estimated the model in a semi-nonparametric way. A Bayesian approach was proposed for GARCH models specifically, within the class of dynamic models, in Migon & Mazucheli (1999). Nakatsuma (2000) proposed a Markov Chain Monte Carlo (MCMC) method for linear regression models with ARMA-GARCH errors. He used normal prior and Metropolis-Hastings algorithm simulation for the model parameters to generate the posterior distribution. Polasek & Kozumi (2000) and Polasek (2001) developed a Bayesian approach with hierarchical structure for VAR-VARCH (Vector-AR-Vector-ARCH) and PAR-ARCH (Periodic-AR-ARCH) models, respectively, using MCMC. In the context of unobserved components models, Giakoumatos et al. (2005) proposed a Bayesian approach for ARCH models by using auxiliary variables (Pitt & Walker, 2005). More recently, Ausin & Galeano (2007) proposed a Bayesian approach for GARCH models with Gaussian mixture errors, and Barreto et al. (2008) compared the Bayesian and maximum likelihood methods using simulated series, according to ARCH processes with different orders and under conditions of finite and infinite variances. Finally, Andrade & Oliveira (2011) presented a Bayesian approach for ARCH models using the normal prior for their parame- ters and compared credibility intervals with bootstrap intervals. Therefore, given the importance of the subject, in this paper we propose a complete Bayesian method and an empirical Bayesian method for estimating the parameters of ARCH processes, in order to compare the estimates obtained by those methods. In the complete approach, we considered the prior distribution based on the non-informative proposal of Geweke (1989b) and used MCMC simulation methods to obtain posterior summaries. In the empirical approach, we partitioned the data set considering a Bayesian analysis for the first part of the data, using the non-informative prior distribution of Geweke (1989b) and MCMC simulation methods. With the information obtained from that first step, we defined an informative prior distribution, which was combined with the second part of the data, resulting in posterior summaries. We also considered a reparameterization of those models to reduce the rejection rate of the MCMC simulation algorithm and accelerate the convergence process. The proposed method- ology was used to model a series of daily prices of Telebras shares traded on the São Paulo Stock Exchange (Bovespa). 2 ARCH(q) MODELS The regression model proposed by Engle (1982) with zero mean, expressed as a linear combina- tion of exogenous variables, has a structure that can be summarized as: zt |�t−1 ∼ P (0, ht ) (1) ht = α0 + q∑ j=1 α j z 2 t− j (2) Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 296 — #4 296 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD where zt represents a return series, P (∙) is a parametric distribution, usually normal or Student-t and �t−1 = {zt−1, zt−2, . . .} represents a set of information available at t −1. Let zt be a process that satisfies the model zt = h1/2 t εt (3) In this work we assumed that {εt , t ≥ 0} is a sequence of white noise i.i.d. N (0, 1). For the model (1)-(3) to be plausible (ht > 0 for all t), we must have α0 > 0 and α j ≥ 0, j = 1, . . . , q. Furthermore, the process zt has finite variance, and therefore has stationary covariance if and only if all the roots of the polynomial 1− ∑q j=1 α j l j are outside the unit circle, where l is the delay operator, such that l j zt = zt− j . When this condition is satisfied, it can be shown that the unconditional variance of zt is given by α0 / 1 − q∑ j=1 α j   . Therefore the sufficient condition for this process to have stationary covariance is ∑q j=1 α j < 1. 3 LIKELIHOOD FUNCTION Let ZT 1 = {z1, z2, . . . , zT } be an observed trajectory of the return zt . Assuming that the first q observations Zq 1 = { z1, z2, . . . , zq } are known, the likelihood function of the observations ZT q+1 = { zq+1, zq+2, . . . , zT } , conditional on Zq 1 , is given by L ( α|ZT q+1, Zq 1 ) ∝ T∏ t=q+1 ( 1 ht )1/2 exp { − z2 t 2ht } (4) where α = [ α0 α1 . . . αq ]′ is the vector of parameters of the ARCH(q) model and ht = α0 + ∑q j=1 α j z2 t− j . Thus, maximum likelihood estimators for the parameter vector α can be obtained by means of maximizing the likelihood function L ( α|ZT q+1, Zq 1 ) . It is worth noting that the use of the conditional likelihood function instead of the exact likelihood function can be done without great loss of precision in the estimates when the series zt is large. This approximation is generally considered due to the great practical advantages in calculating the estimators (Box et al., 1994). 4 BAYESIAN INFERENCE FOR ARCH(q) MODELS The Bayesian approach for inferring the parameters of ARCH(q) models starts from the com- bination of the likelihood function, L ( α|ZT q+1, Zq 1 ) , with an a prior probability density for the parameters, π(α), by means of Bayes’ rule (Gelman et al., 1995): π ( α|ZT q+1 ) ∝ L ( α|ZT q+1, Zq 1 ) π(α) (5) Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 297 — #5 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 297 The expression π ( α|ZT q+1 ) is called the a posterior probability density of α and tells how the random variable α will be distributed after the data ZT q+1 have been observed. In sections 4.1 and 4.2 we propose two different approaches for Bayesian inference for the parameters α of ARCH(q) models. 4.1 Complete Bayesian model (CBM) with non-informative prior Bayesian analysis requires the specification of an a prior density for π(α) to reflect the prior knowledge about the distribution of π(α). When there is little or no prior knowledge about the distribution of a model’s parameters, one must adopt a non-informative prior density. In this work we adopted a non-informative prior density for the parameters α of ARCH(q) models based on the proposal of Geweke (1989b), defined as: π(α) ∝         1 − q∑ j=1 α j α0      , α0 > 0, α j ≥ 0, j = 1, . . . , q 0 , otherwise (6) Depending on α j = 0, j = 1, . . . , q, this is the invariant prior density of Jeffreys (Box & Tiao, 1973) for the normal linear regression model. We also consider a reparameterization, which consists of: φ j = log ( α j − a j b j − α j ) j = 0, 1, 2, . . . , q (7) The values of a j and b j can be chosen based on some prior information, for example, previous studies on the series analyzed. This reparameterization leads to the choice of a transformation that maps the intervals (−∞, +∞) in the domain ( a j , b j ) and vice versa. Accordingly, we re- duced the rejection rate of the MCMC simulation algorithm, accelerating its convergence. Using the reparametrization defined in (7), we have: π (ϕ) ∝    ( 1 k (ϕ) ) 1 2 , −∞ < ϕ j < ∞, j = 0, 1, . . . , q 0 , otherwise (8) where ϕ = [ φ0φ1 . . . φq ]′ and k (ϕ) = α0 / 1 − q∑ j=1 α j   with α j , j = 0, 1, 2, . . . , q given by (7). Combining the expressions (4) and (8), we write the posterior joint density for the ϕ as: π ( ϕ|ZT q+1 ) ∝ ( 1 k (ϕ) )1/2 × T∏ t=q+1 ( 1 ht (ϕ) )1/2 exp { − z2 t 2ht (ϕ) } (9) Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 298 — #6 298 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD We note that the posterior density (9) has a form that only resembles the known probability density functions, which prevents the calculation of the quantities of interest of the parameters α j , j = 0, 1, . . . , q, such as mean, mode, median, standard deviation, among others (Kotz et al., 2000). Since the adjustment of ARCH processes to financial series generally are of high orders, the use of numerical integration methods would require large computational effort and have low accuracy. This problem can be solved by using Monte Carlo Markov chain (MCMC) simulation methods, specifically the Metropolis-Hastings algorithm. The methods based for this estimation are preferable because they can be generalized to more complex models. Other algorithms could be used (e.g., acceptance/rejection, importance sampling, among others), but the Metropolis- Hastings is one of those requiring the least computational effort (Gilks et al., 1996). 4.2 Empirical Bayes Model (EBM) The empirical Bayesian methods are distinguished by using the observed data to estimate the parameters of the prior and avoid the difficulties of determining the distribution, either due to lack of knowledge about the variable of interest or some subjectivity involved in choosing it. The empirical method can be viewed as an approximation of the complete hierarchical Bayesian analysis (Gelman et al., 1995). In this work we consider a partition of the return series ZT 1 = {z1, z2, . . . , zT }, such that ZT 1 = [ ZT1 1 ZT T1+1 ]′ , where ZT1 1 = { z1, z2, . . . , zT1 } and ZT T1+1 = { zT1+1, zT1+2, . . . , zT } , and we propose a Bayesian framework composed of two steps. In the first step we must perform an analysis with the first part of the series of observed data, ZT1 1 , assuming a non-informative prior density, if possible, π1(α), to make inferences about the parameter vector α = [ α0 α1 . . . αq1 ]′. We thus propose the following Bayesian model: π ( α|ZT1 q1+1 ) ∝ L ( α|ZT1 q1+1, Zq1 1 ) π1(α) (10) where Zq1 1 = { z1, z2, . . . , zq1 } and ZT1 q1+1 = { zq1+1, zq1+2, . . . , zT1 } . This procedure provides posterior estimates for α through the density π ( α|ZT1 q1+1 ) . These estimates are used for inference about the density of α∗ = [ α0 α1 . . . αq2 ]′, π2(α ∗), which is used as an informative prior density for the second stage of the analysis (main analysis), where we combine the likelihood function constructed from the second part of the series of observed data, ZT T1+1, with the prior density π(α∗), leading to the posterior density, given by: π ( α∗|ZT T1+q2+1 ) ∝ L ( α∗|ZT T1+q2+1, ZT1+q2 T1+1 ) π2(α ∗) (11) where ZT1+q2 T1+1 = { zT1+1, zT1+2, . . . , zT1+q2 } and ZT T1+q2+1 = { zT1+q2+1, zT1+q2+2, . . . , zT } . Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 299 — #7 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 299 In this paper, we partition the data set so that the first half of the data series (i.e., T1 = T 2 observations) are used in the first stage of the Bayesian analysis and the second half of the data series are used in the second stage (main analysis). The choice of T1 requires careful analysis of the sensitivity of the parameter estimates in function of the sample size. A minimum training sample is always desirable. However, the computational effort is very great for this choice. Further details and discussions about this procedure can be found in Berger & Pericchi (2004). For the Bayesian analysis of the first step, we consider non-informative prior density of Geweke (1989b), defined in (6)-(8), assuming little or no prior knowledge about the distribution of the model’s parameters. Assuming that the first q1 observations are known, the conditional likeli- hood function is defined by means of the volatility ht rewritten as a function of the parameters α j , j = 0, 1, . . . , q1, transformed into φ j , j = 0, 1, . . . , q1, by (7), i.e.: L ( ϕ|ZT1 q1+1, Zq1 1 ) ∝ T1∏ t=q1+1 ( 1 ht (ϕ) )1/2 exp { − z2 t 2ht (ϕ) } (12) where ϕ = [ φ0 φ1 . . . φq1 ]′. Combining expressions (12) and (8), we can write the prior joint distribution for ϕ as: π ( ϕ|ZT1 q1+1 ) ∝ ( 1 k (ϕ) )1/2 × T1∏ t=q1+1 ( 1 ht (ϕ) )1/2 exp { − z2 t 2ht (ϕ) } (13) where k (ϕ) = α0 / 1 − q1∑ j=1 α j   and α j = b j eφ j + a j 1 + eφ j ; j = 0, 1, . . . , q1. For the Bayesian analysis of the second stage, we set an informative prior density for α∗, π(α∗), which is built from the analysis of the first stage, i.e., using the posterior estimates obtained by means of the density π ( α|ZT1 q1+1 ) . We consider that all α j , j = 0, 1, . . . , q2 are independent with a prior density defined in the interval [ a j , b j ] , with a j > 0 and b j < 1, such that a j ≤ α j ≤ b j , since the constraint ∑q2 j=1 α j < 1 requires that α j ∈ (0, 1) , j = 1, 2, . . . , q2. Because we want to generate values for α j close to the averages of each interval ( a j + b j )/ 2 with greater frequency than values near the limits a j and b j , we adopt the proposed reparameterization in equation (7), such that φ∗ j is a component of the vector of parameters ϕ∗ = [ φ∗ 0 φ∗ 1 . . . φ∗ q2 ]′ and a j and b j is selected based on the upper and lower bounds of the 95% credibility intervals for α, obtained from the posterior density π ( α|ZT1 q1+1 ) . In this case, the conditional likelihood function is defined with volatility ht rewritten in terms of the parameters α∗ j , j = 0, 1, . . . , q2, transformed similarly in φ∗ j , j = 0, 1, . . . , q2, by (7), i.e.: L ( ϕ∗|ZT T1+q2+1, ZT1+q2 T1+1 ) ∝ T∏ t=T1+q2+1 ( 1 ht (ϕ∗) )1/2 exp { − z2 t 2ht (ϕ∗) } (14) Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 300 — #8 300 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD where ϕ∗ = [ φ∗ 0 φ∗ 1 . . . φ∗ q2 ]′ . Assuming that φ∗ j , j = 0, 1, . . . , q2 are normally distributed, i.e., π(φ∗ j ) ∼ Normal (0, σ 2 j ), the a prior joint density is given by: π ( ϕ∗) = q2∏ j=0 π ( φ∗ j ) (15) Combining expressions (14) and (15), we can write the posterior joint density for ϕ∗ as: π ( ϕ∗|Z T T1+q2+1 ) ∝   q2∏ j=0 π ( φ∗ j )   × T∏ t=T1+q2+1 ( 1 ht (ϕ∗) )1/2 exp { − z2 t 2ht (ϕ∗) } (16) Thus, the posterior conditional densities for φ∗ j , j = 0, 1, . . . , q2, obtained by means of expres- sion (16), are given by: π ( φ∗ j |ϕ ∗ − j , ZT T1+q2+1 ) ∝ L ( ZT T1+q2+1|Z T1+q2 T1+1 , ϕ∗ ) π ( φ∗ j ) (17) where ϕ∗ − j is a vector formed by the model’s parameters, except the parameter φ∗ j , i.e., ϕ∗ − j = [ φ∗ 0 φ∗ 1 . . . φ∗ j−1φ ∗ j+1 . . . φ∗ q2 ]′ . 5 ALGORITHMS TO ESTIMATE THE PARAMETERS OF ARCH(q) MODELS To obtain representative samples from the posterior joint density (9), (13) and (16), we use MCMC simulation algorithms. The MCMC simulation method is a form of Monte Carlo integration. The idea is to simulate an irreducible non-periodic Markov chain whose stationary density is the density of interest, i.e., the posterior density. There are two methods to generate Markov chains with specified stationary density, the Metropolis-Hastings (Chib & Greenberg, 1995), which has been used for many years in statistical physics, and Gibbs sampling, which was introduced in the statistical literature by Gelfand & Smith (1990). In general, the use of these algorithms is necessary when the non- iterative generation of the density to be sampled is too complicated or costly. The Metropolis-Hastings algorithm When posterior conditional densities are not easily identified as having a standard form (normal, gamma, etc.), preventing the direct generation from these densities, the Metropolis-Hastings algorithm can be used. This technique requires a nucleus, i.e., a transition density q ( φ, φ′ ) , not necessarily having equilibrium probability π , but which represents a passing rule to define a chain. Consider also the probability of acceptance p ( φ(r−1), φ′ ) set forth below. The algorithm follows these steps: Step 1: Assign an initial value φ = φ(0) and start the iteration count at r = 1. Step 2: Move the chain to a new value φ′ generated from the density q ( φ(r−1), φ′ ) . Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 301 — #9 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 301 Step 3: Generate u from the uniform density (0, 1). Step 4: Accept the value generated φ′ if u ≤ p ( φ(r−1), φ′ ) = min ( 1, π ( φ′ ) q ( φ′, φ(r−1) ) π ( φ(r−1) ) , q ( φ(r−1), φ′ ) ) Otherwise keep φ(r) = φ(r−1). Step 5: Update r and go to step 2. For r sufficiently large, φ(r) is a sample of the posterior distribution. For the vector case ϕ =[ φ0 φ1 . . . φq ] , we have a transition density given by q ( ϕ, ϕ′ ) and an acceptance probability given by p ( ϕ(r−1), ϕ′ ) , and we must do the same. One feature that should be considered in the use of the Metropolis-Hastings algorithm is the fact that it can generate correlated samples, since if the generated candidate is rejected, the previous value is taken as a candidate. To avoid biased estimates, it is common to disregard the initial part of the chain in order to avoid the influence of initial conditions, and to select values for each k steps to eliminate the sample correlation. The evaluation of this procedure is reflected in the acceptance rate, and the convergence is checked using the criterion of Geweke (1992). To ensure that the Metropolis-Hastings algorithm will converge to an equilibrium density, one must observe the regularity conditions of the posterior distributions, which are guaranteed in our case. Assuming a quadratic loss function, the Bayesian estimates for α (Sections 4.1 and 4.2) can be expressed as evaluating the expectancy of a function of interest g (α) with respect to the posterior density π (α|Z). In general, it is given by: E [g (α)] = ∫ α g (α) π (α|Z) dα (18) The exact solution of the integral in (18) cannot be calculated analytically for the models consid- ered in this work. However, if the expression E [g (α)] exists, an approximation to expression (18) can be obtained by Monte Carlo simulation (Geweke, 1989a). 6 MODEL SELECTION CRITERIA There are several criteria that can be used for the selection of the order q of ARCH models under a Bayesian framework. In this paper, we use the following criteria: Bayesian information criterion (BIC), Akaike information criterion (AIC), deviation information criterion (DIC) and predictive ordinate criterion (POC), which are described more generally below. 6.1 Bayesian information criterion (BIC) and Akaike information criterion (AIC) The Bayesian information criterion (BIC) is a model selection criterion proposed by Schwarz (1978) and modified by Carlin & Louis (2000) to be applied in the Bayesian context. This crite- Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 302 — #10 302 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD rion is a weighting between the maximum of the posterior expected value of the log-likelihood function and the number of model parameters. The best model is the one that has the lowest BIC value, given by: B I C = −2E(l) + M ln (T ) (19) where E(l) is the expected value taken with respect to the posterior density of the log-likelihood function, M is the size of the parameter vector and T is the size of the series. The Akaike information criterion (AIC), proposed by Akaike (1974), was also modified to be used in the Bayesian context. The best model is the one that has the lowest AIC value, given by: AI C = −2E(l) + 2M (20) Lower values of BIC and/or AIC indicate better models, but the BIC penalizes the number of parameters more and tends to select models with fewer parameters. 6.2 Deviation information criterion (DIC) The deviation information criterion (DIC) is a generalization of the BIC. Proposed by Spiegel- halter et al. (2002), this criterion is defined as: DI C = D ( α̂ ) + 2MD (21) where D ( α̂ ) = −2 ln L ( Z| _ α ) + C is the deviation assessed in the posterior mean, ln L ( Z| _ α ) is the natural logarithm of the likelihood function, C is a constant that is canceled and need not be known when comparing models and MD is the effective number of parameters of the model, given by: MD = D − D ( α̂ ) , where D = E {D (α)} is the posterior mean deviation, which measures the model’s goodness of fit. The model with the lowest DIC value will be considered the most suitable to represent the data. The DIC is commonly used for Bayesian models whose posterior distribution of parameters was obtained through MCMC simulation algorithms as well as proposed models for analysis of financial series. 6.3 Predictive ordinate criterion (POC) In the criterion for selecting models based on predictive ordinate density (Carlin & Chib, 1995) we use the distribution of zT +m , subject to the data Z and the parameters α = [ α0 α1 . . . αq ]′. Thus, the predictive density is given by: π (zT +m |Z) = ∫ α π(zT +m |Z, α)π(α|Z)dα cm = π (zT +m |Z) = ∫ α ( 1 2π hT +m )1/2 exp { − 1 2 z2 T +m hT +m } π (α|Z) dα (22) Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 303 — #11 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 303 where π (α|Z) is the posterior distribution of the parameters α and the integral in (22) is a mul- tiple integral in the parameters space. Assessing the observations α (i) j , j = 0, 1, . . . , q obtained by MCMC simulation and considering hT +m as a function of the parameters α, the Monte Carlo estimate of the predictive density is given by: ĉm = 1 M M∑ i=1 1 √ 2πhT +m(α(i)) exp { − 1 2 z2 T +m hT +m(α(i)) } (23) A low value of ĉm indicates that the observation zT +m is unlikely for the model in question. Thus, the POC consists of choosing the model l that provides the greatest value of ĉ (l) = T +K∏ m=T +1 ĉm (l) . 7 APPLICATION The analyzed time series consists of daily prices of shares of Telebras (Brazilian telecom hold- ing company controlled by the federal government), traded on the Bovespa in the period from January 2, 1992 to January 5, 1996, for a total of 986 observations. This was one of the most liquid papers (according to the number of trades and financial volume) in the Brazilian stock market during the study period, i.e. one of the most representative of those making up the Bovespa index (Ibovespa). The behavior of the Telebras stock has been widely studied in the literature and among the works on this issue, we can mention: Baidya & Costa (2001), Oliveira (2005) and Sáfadi & Andrade (2007). Let pt be the closing price of Telebras shares on a trading day, in which the return is given by zt = ln ( pt / pt−1 ) . The share prices of Telebras and their returns are illustrated in Figure 1. The graphs in Figure 2 also show the behavior of the series of squared returns of Telebras, z2 t . We can see that this series is correlated. This behavior is typically associated with that of ARCH (q) models. In implementing the Metropolis-Hastings algorithm, for each model (CBM and EBM), we sim- ulated a chain with 50,000 iterations for each parameter, then discarded 50% of the values to decrease the effect of initial conditions and then took values spaced 5 to 5, for a total sample of 5,000 observations. The convergence of the algorithm was verified by Geweke’s criterion (1992), at a significance level of 5% for the test under the null hypothesis Ho. The convergence was considered for values obtained by the diagnosis between −1.96 and 1.96. In this work, all the computational algorithms were implemented using the MATLAB c© software. For CBM, the adjusted model for the Telebras series, according to the AIC, DIC and POC (Table 1), was an ARCH (7), described as: zt = ( α0 + α1z2 t−1 + α2z2 t−2 + α3z2 t−3 + α4z2 t−4 + α5z2 t−5 + α6z2 t−6 + α7z2 t−7 ) 1 2 εt (24) Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 304 — #12 304 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD 0 200 400 600 800 1000 0.01 0.02 0.03 0.04 0.05 0.06 0.07 time (days) 0 200 400 600 800 1000 −0.3 −0.2 −0.1 0 0.1 0.2 0.3 0.4 time (days) (a) (b) Figure 1 – Price (a) and return (b) of the Telebras series. 0 200 400 600 800 1000 0 0.02 0.04 0.06 0.08 0.1 0.12 time (days) (a) (b) Figure 2 – Series of squared returns of Telebras, z2 t (a), and its autocorrelation function (b). Table 1 – Model selection criteria for the ARCH (q) process, considering CBM. Model order AIC BIC DIC POC – E{log l} ( × 103) ( × 103) ( × 103) ( × 1027) ( × 103) q = 6 −5.13244 −5.09818 −1.27337 1.16306 −2.00322 q = 7 −5.13583 −5.09668 −1.27675 1.25362 −2.00591 q = 8 −5.12954 −5.08550 −1.27302 0.77599 −2.00377 Table 2 shows the estimates and 95% credibility intervals (CI) for α = [ α0 α1 . . . αq ]′, using the CBM, obtained in accordance with the procedures proposed for the inference about parameters of ARCH(q) processes, q = 7. The convergence of parameters was checked by Geweke’s criterion (GC), which was observed for those. The last column of this table also shows the acceptance rates (AR) for the parameter values generated by the Metropolis-Hastings algorithm. As previously mentioned, in the EBM we partitioned the data set such that the first half of the data series was used in Bayesian analysis in the first step, according to the procedure presented Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 305 — #13 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 305 Table 2 – Complete Bayesian estimate and 95% CI. Mean Standard Median Mode CI (95%) GC AR (%) Deviation α0 0.00069 0.00003 0.00069 0.00055 [0.00063; 0.00076] 0.90273 85.098 α1 0.10307 0.01923 0.10219 0.11560 [0.06635; 0.13683] 0.15009 86.314 α2 0.15264 0.01982 0.15474 0.15780 [0.11469; 0.18763] 0.11001 87.456 α3 0.10277 0.01884 0.10270 0.10770 [0.06639; 0.13749] 1.88097 86.368 α4 0.09878 0.01902 0.09564 0.10460 [0.06468; 0.13449] 0.00754 84.584 α5 0.05173 0.01857 0.05338 0.05240 [0.01688; 0.08664] −1.55891 84.318 α6 0.10362 0.01912 0.10107 0.12090 [0.06587; 0.13771] 0.78153 85.960 α7 0.10379 0.01871 0.10292 0.11350 [0.06727; 0.13761] −0.79080 85.606 in Section 4.2, to obtain posterior estimates α = [ α0 α1 . . . αq1 ]′. Thus, the model fit to the data Telebras series, according to the AIC, BIC and POC (Table 3) was an ARCH (6), given by: zt = ( α0 + α1z2 t−1 + α2z2 t−2 + α3z2 t−3 + α4z2 t−4 + α5z2 t−5 + α6z2 t−6 ) 1 2 εt (25) Table 3 – Model selection criteria for the ARCH (q1) process, considering EBM: Bayesian analysis of the first step. Model order AIC BIC DIC POC – E{log l} ( × 103) ( × 103) ( × 103) ( × 1028) ( × 102) q1 = 5 −2.45388 −2.42452 −5.89629 2.31037 −5.82941 q1 = 6 −2.48262 −2.44836 −6.00084 2.66551 −5.98308 q1 = 7 −2.47961 −2.44046 −6.07358 2.46414 −5.97805 Table 4 shows the estimates and 95% credibility intervals (CI) for α = [ α0 α1 . . . αq1 ]′, obtained in the first stage of the EBM, according to an ARCH(q1) process, q1 = 6. This information served to elucidate the informative prior distribution of α∗, π(α∗), where their hyperparameters a j and b j assumed values of lower and upper bounds, respectively, in the 95% credibility intervals presented in Table 4 (in bold), and σ 2 j = 1, for j = 0, 1, . . . , q2. Thus, the second half of the data series was used in the second stage (main analysis), with π(α∗) resulting in posterior summaries of interest. The model fit to the Telebras data series in the main analysis, according to the model selection criteria (Table 5), was an ARCH (5), given by: zt = ( α0 + α1z2 t−1 + α2z2 t−2 + α3z2 t−3 + α4z2 t−4 + α5z2 t−5 ) 1 2 εt (26) Table 6 shows the estimates and 95% credibility intervals (CI) for α∗ = [ α0 α1 . . . αq2 ]′, using EBM, obtained from the main analysis by an ARCH(q2) process, q2 = 5. Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 306 — #14 306 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD Table 4 – Empirical Bayesian estimate and 95% CI of the first step. Mean Standard Median Mode CI (95%) GC AR (%) Deviation α0 0.00071 0.00003 0.00071 0.00070 [0.00064; 0.00077] –0.85421 87.734 α1 0.09549 0.03084 0.09359 0.09556 [0.04179; 0.15933] 0.56397 69.398 α2 0.09081 0.03617 0.08880 0.09103 [0.02916; 0.16344] 1.15489 73.296 α3 0.09826 0.03416 0.09723 0.09803 [0.03566; 0.16578] –0.14777 75.122 α4 0.17912 0.04231 0.17801 0.17971 [0.10014; 0.26113] –0.01410 73.816 α5 0.09748 0.03483 0.09578 0.09665 [0.03348; 0.16691] –1.03893 76.228 α6 0.19613 0.03854 0.19580 0.19677 [0.12089; 0.26848] –1.68313 70.296 Table 5 – Model selection criteria for the ARCH(q2) process, considering EBM: main Bayesian analysis. Model order AIC BIC DIC POC – E{log l} ( × 103) ( × 103) ( × 103) ( × 1027) ( × 102) q2 = 4 –2.61215 –2.58768 –6.49695 1.53242 –6.61074 q2 = 5 −2.62362 −2.59426 −6.51874 1.83186 −6.67811 q2 = 6 –2.60406 –2.56980 –6.48157 0.90524 –6.59028 Table 6 – Empirical Bayesian estimate and 95% CI: Main analysis. Mean Standard Median Mode CI (95%) GC AR (%) Deviation α0 0.00071 0.00003 0.00071 0.00071 [0.00070; 0.00072] –0.41660 98.400 α1 0.12299 0.01847 0.12546 0.12881 [0.08050; 0.15075] –1.56153 48.112 α2 0.12926 0.01741 0.13228 0.14029 [0.08874; 0.15455] 0.97803 33.782 α3 0.11652 0.02331 0.11922 0.13008 [0.06607; 0.15356] –0.53695 65.944 α4 0.15254 0.02725 0.14804 0.13122 [0.11293; 0.21611] –0.78737 52.852 α5 0.09784 0.02562 0.09736 0.09750 [0.05117; 0.14758] –0.95037 89.318 The graphs in Figure 3 represent the histograms constructed from the sample selected for the parameters α0, α1, . . . , α7, i.e., the posterior distribution of the parameters of the ARCH (7) model, which was adjusted to the Telebras series considering a complete Bayesian model. The graphs in Figure 4 represent the histograms constructed from the sample selected for the parameters α0, α1, . . . , α5, i.e., the posterior distribution of the parameters of the ARCH (5) model, which was adjusted to the Telebras series considering the main analysis of an empirical Bayesian model. Figure 5 shows the estimated volatility versus the observed volatility (values of z2 t ). When the model is appropriate to the data, i.e., it properly estimates the volatility of the series, we expect the graphic representation of this relationship to show those values grouped in a straight line. In this respect, this behavior can be observed in both cases (complete and empirical models). Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 307 — #15 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 307 Figure 3 – Posterior distribution of the parameters – CBM. Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 308 — #16 308 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD Figure 4 – Posterior distribution of the parameters – EBM. Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 309 — #17 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 309 estimated volatility (a) estimated volatility (b) Figure 5 – Estimated volatility versus z2 t : Complete Bayesian model (a) and empirical Bayesian model (b). As the BIC and other selection criteria similar to it are influenced by sample size, the criterion based on predictive ordinate density is the best for the final comparison between the two Bayesian procedures presented. Thus, although the plots in Figure 5 show that both approaches presented very close results regarding fit, the graphs about predictive ordinate (and values of POC) in Figure 6 highlight that the EBM best fitted the data of daily Telebras share prices, and it is even more parsimonious than the CBM. This fact suggests that the use of a prior information can lead to a more representative model of that data series. Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 310 — #18 310 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD Figure 6 – Predictive ordinate criterion (cm , m = 30): Complete Bayesian model and empirical Bayesian model. Finally, Figure 7 illustrates the estimated volatility from the best model, EBM. It can be seen that events were detected generating significant behavioral changes in the series of returns during the period studied, often irregular, of which we can highlight the crisis in Mexico in February and March 1995. 0 200 400 600 800 1000 0 0.005 0.01 0.015 0.02 0.025 time (days) Figure 7 – Estimated volatility – Telebras series. 8 CONCLUDING REMARKS In general, it is important to emphasize the feasibility of the Bayesian approach in the inference of the parameters of ARCH(q) processes, since it allows the possibility of incorporating the Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 311 — #19 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 311 experience of experts in finance, which usually is a relevant issue in the analysis of economic and financial series. In this context, studies show that empirical Bayesian models present similar results to those pre- sented by complete Bayesian models, but the former have the advantage of being easily integrated for various cases. In this paper, we performed an empirical Bayesian analysis, through a detailed study. However, due to the complexity of this analysis, posterior estimates could only be found numerically and, so we used MCMC simulation algorithms. The results were compared to those obtained by using the complete Bayesian model and showed that the two methods have adjusted ARCH processes with different orders. We noted that the empirical Bayesian model provides a more parsimonious adjustment to the Telebras series, i.e., with a smaller number of parameters for the ARCH process, compared to the complete Bayesian model. Moreover, the selection criterion of models based on the ordinate predictive density showed that the empirical Bayesian model provided better adjustment than the complete Bayesian model. It is worth noting that the use of empirical prior distributions and of reparametrization contributed to faster convergence of the process of inference of the parameters of ARCH(q) models using MCMC simulation algorithms. 9 ACKNOWLEDGMENTS We thank the reviewers for their careful reading and valuable comments, and Fundação para o Desenvolvimento da UNESP – FUNDUNESP for financial support (Process no. 00502/07-DF). REFERENCES [1] AKAIKE H. 1974. A new look at the statistical identification model. IEEE Trans on Automatic Control, 19: 716–723. [2] ANDRADE MG & OLIVEIRA SC. 2011. A comparative study of Bayesian and Maximum Likelihood approaches for ARCH models with evidence from Brazilian financial series. New Mathematics and Natural Computation, 7: 347–361. [3] AUSÍN MC & GALEANO P. 2007. Bayesian estimation of the Gaussian mixture GARCH model. Computational Statistics and Data Analysis, 51: 2636–2652. [4] BARNDORFF-NIELSEN OE & COX DR. 1994. Inference and Asymptotics. Chapman-Hall, London. [5] BARRETO GA, OLIVEIRA SC & ANDRADE MG. 2008. Estimação de Parâmetros de Modelos ARCH(p): Abordagem Bayes-MCMC versus Máxima Verossimilhança. Revista Brasileira de Es- tatı́stica, 69: 7–24. [6] BERGER JO & PERICCHI LR. 2004. Training samples in objective Bayesian model selection. The Annals of Statistics, 32: 841–869. [7] BOLLERSLEV T. 2008. Glossary to ARCH (GARCH). CREATES Research Paper 2008-49. Available at SSRN: . Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 312 — #20 312 COMPARISON BETWEEN THE COMPLETE BAYESIAN METHOD AND EMPIRICAL BAYESIAN METHOD [8] BOLLERSLEV T. 1986. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31: 307–327. [9] BOLLERSLEV T, CHOU RY & KRONER KF. 1992. ARCH Modeling in Finance: A Review of the Theory and Empirical Evidence. Journal of Econometrics, 52: 5–59. [10] BOX GE, JENKINS GM & REINSEL G. 1994. Time Series Analysis: Forecasting and Control. Third Edition. Englewood Cliffs: Prentice Hall, San Francisco. [11] BOX GE & TIAO GC. 1973. Bayesian Inference in Statistical Analysis. Addison-Wesley, New York. [12] CARLIN B & CHIB S. 1995. Bayesian Model Choice via Markov Chain Monte Carlo Methods. Journal of the Royal Statistical Society, 57: 473–484. [13] CARLIN B & LOUIS T. 2000. Bayes and Empirical Bayes Methods for Data Analysis. Texts in Sta- tistical Science Series, Chapman and Hall, London. [14] CHIB S & GREENBERG E. 1995. Understanding the Metropolis-Hastings Algorithm. The American Statistician, 49: 327–335. [15] COSTA PHS & BAIDYA TKN. 2001. Propriedades estatı́sticas das séries de retornos das principais ações brasileiras. Pesquisa Operacional, 21: 61–87. [16] DEGIANNAKIS S & XEKALAKI E. 2004. Autoregressive Conditional Heteroscedasticity (ARCH) Models: A review. Quality Technology and Quantitative Management, 1: 271–324. [17] ENDERS W. 2009. Applied Econometric Times Series. 3rd edition, John Wiley & Sons, New York. [18] ENGLE R. 1982. Autoregressive Conditional Heteroscedasticity with Estimates of the Variance of UK Inflation. Econometrica, 50: 987–1007. [19] GELFAND AE & SMITH AF. 1990. Sampling – Based Approaches to Calculating Marginal Densities. Journal of the Statistical Association, 5: 398–409. [20] GELMAN A, CARLIN J, STERN H & RUBIN D. 1995. Bayesian Data Analysis. London: Chapman & Hall. [21] GEWEKE J. 1992. Evaluating the accuracy of sampling-based approaches to the calculation of pos- terior moments (with discussion). In: BERGER J, BERNARDO J, DAWID A & SMITH A. (editors), Bayesian Statistics, 4: 164–193. [22] GEWEKE J. 1989a. Bayesian inference in econometric models using Monte Carlo integration. Econo- metrica, 57: 1317–1339. [23] GEWEKE J. 1989b. Exact Predictive Densities for Linear Models with ARCH Disturbances. Journal of Econometrics, 40: 63–86. [24] GEWEKE J. 1986. Exact Inference in the Inequality Constrained Normal Linear Regression Model. Journal of Applied Econometrics, 1: 127–141. [25] GIAKOUMATOS S, DELLAPORTAS P & POLITIS D. 2005. Bayesian Analysis of the Unobserved ARCH Model using Auxiliary Variable Sampling. Statistics and Computing, 15: 103–112. [26] GILKS WR, RICHARDSON S & SPIEGELHALTER DJ. 1996. Markov Chain Monte Carlo in Practice. London: Chapman and Hall. [27] KOOP G. 1994. Bayesian Semi-nonparametric ARCH Models. Review of Econometrics and Statistics, 76: 176–181. Pesquisa Operacional, Vol. 32(2), 2012 “main” — 2012/8/18 — 14:12 — page 313 — #21 SANDRA C. OLIVEIRA and MARINHO G. ANDRADE 313 [28] KOTZ S, BALAKRISHNAN N & JOHNSON NL. 2000. Continuous Multivariate Distributions, Models and Applications, 2nd edition, Wiley Series in Probability and Statistics: Applied Probability and Statistics. Wiley-Interscience, 1: 1–722. [29] MIGON HS & MAZUCHELI J. 1999. Modelos GARCH Bayesianos: Métodos Aproximados e Aplicações. The Brazilian Review of Econometrics, 19: 111–138. [30] MORETTIN PA. 2008. Econometria Financeira. Edgard Blucher Ltda, São Paulo. [31] NAKATSUMA T. 2000. Bayesian Analysis of ARMA-GARCH Models: A Markov Chain Sampling Approach. Journal of Econometrics, 95: 57–69. [32] NAKATSUMA T & TSURUMI H. 1996. ARMA-GARCH Models: Bayes Estimation versus MLE, and Bayes Non-Stationarity Test. Technical report, Department of Economics, New Brunswick. [33] OLIVEIRA SC. 2005. Modelos Estocásticos com Heterocedasticidade para Séries Temporais em Finanças. Doctoral Thesis, Universidade de São Paulo, Instituto de Ciências Matemáticas e de Computação. [34] PITT M & WALKER S. 2005. Constructing stationary time series models using auxiliary variables with applications. Journal of the American Statistical Association, 100: 554–564. [35] POLASEK W. 2001. MCMC Methods for Periodic AR-ARCH Models. Technical report, University of Basel. [36] POLASEK W & KOZUMI H. 2000. The VAR-VARCH Model: A Bayesian Approach. Technical report, University of Basel. [37] SÁFADI T & ANDRADE MG. 2007. Abordagem Bayesiana de Modelos de Séries Temporais. Mini- curso, Universidade Federal do Rio Grande do Sul, 12a Escola de Séries Temporais e Econometria. [38] SCHWARZ G. 1978. Estimating the Dimension of a Model. The Annals of Statistics, 6: 461–464. [39] SPIEGELHALTER D, BEST N, CARLIN B & VAN DER LINDE A. 2002. Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society, 64: 583–639. Pesquisa Operacional, Vol. 32(2), 2012