A Comprehensive Look at The Empirical Performance of Equity Premium Prediction

A comprehensive interpretation of the paper "A Comprehensive Look at The Empirical Performance of Equity Premium Prediction".

Data

The dependent variable is the equity premium, that is, the excess return on stocks.

Our dependent variable is always the equity premium, that is, the total rate of return on the stock market minus the prevailing short-term interest rate.

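The explanatory variables fall into three groups: variables related to stock characteristics, variables related to interest rates, and macroeconomic variables:

  • Stock characteristics

    • Dividends: d/p (dividend price ratio) & d/y (dividend yield)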

      Dividends are 12-month moving sums of dividends paid on the S&P 500 index.

    • Earnings: e/p (earnings price ratio) & d/e (dividend payout ratio)

      Earnings are 12-month moving sums of earnings on the S&P 500 index.

    • Stock Variance (svar)

      Stock Variance is computed as sum of squared daily returns on the S&P 500.

    • Cross-Sectional Premium (csp): measures the relative valuation of high-beta and low-beta stocks.

    • Book Value

    • Corporate Issuing Activity

      • Net Equity Expansion (ntis) is the ratio of 12-month moving sums of net issues by NYSE listed stocks divided by the total end-of-year market capitalization of NYSE stocks.
      • Percent Equity Issuing (eqis), is the ratio of equity issuing activity as a fraction of total issuing activity.
  • Interest-rate-related variables

    • Treasury Bills (tbl)
    • Long Term Yield (lty)
    • Corporate Bond Returns
      • Corporate Bond Yields
      • Default Yield Spread (dfy)
      • Default Return Spread (dfr)
    • Inflation (infl)
  • Macroeconomic variables

    • Investment to Capital Ratio (i/k)
    • i/k = aggregate investment / aggregate capital of the whole economy
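To make the stock-characteristic variables more concrete, here is a minimal sketch of how the four valuation ratios and svar could be computed from raw series. The function names (`moving_sum`, `valuation_ratios`, `stock_variance`) and the toy inputs are hypothetical. The log-difference form of the ratios is how I recall the paper defining them; this summary does not state it, so treat it as an assumption.

```python
import numpy as np

def moving_sum(x, window=12):
    """Trailing moving sum; the first window-1 entries are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    csum = np.cumsum(x)
    out[window - 1:] = csum[window - 1:] - np.concatenate(([0.0], csum[:-window]))
    return out

def valuation_ratios(price, dividends, earnings):
    """Log ratios built from 12-month moving sums (log form is an assumption)."""
    price = np.asarray(price, dtype=float)
    d12 = moving_sum(dividends)          # 12-month moving sum of dividends
    e12 = moving_sum(earnings)           # 12-month moving sum of earnings
    lagged_price = np.concatenate(([np.nan], price[:-1]))
    return {
        "d/p": np.log(d12) - np.log(price),
        "d/y": np.log(d12) - np.log(lagged_price),
        "e/p": np.log(e12) - np.log(price),
        "d/e": np.log(d12) - np.log(e12),
    }

def stock_variance(daily_returns):
    """svar: sum of squared daily returns within one period."""
    r = np.asarray(daily_returns, dtype=float)
    return float(np.sum(r ** 2))

# Toy monthly data, purely illustrative.
price = np.linspace(100.0, 123.0, 24)
dividends = np.full(24, 0.25)
earnings = np.full(24, 0.5)
ratios = valuation_ratios(price, dividends, earnings)
svar_demo = stock_variance([0.01, -0.02])
```

The first eleven entries of each ratio are NaN because a full 12-month window is not yet available.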

Methods

  • Simple univariate model

  • ‘‘Kitchen Sink’’ Regression (all)

    Includes all of the variables above (it does not include cay, partly because of limited data availability).

  • Consumption, wealth, income ratio (cay)

    where c is the aggregate consumption, a is the aggregate wealth, and y is the aggregate income.

    Because the cay is constructed using look-ahead (in-sample) estimation regression coefficients, we also created an equivalent measure that excludes advance knowledge from the estimation equation and thus uses only prevailing data. In other words, if the current time period is ‘s’, then we estimated using only the data up to ‘s’ through

    This measure is called caya (‘‘ante’’) to distinguish it from the traditional variable cayp constructed with look-ahead bias (‘‘post’’).

  • model selection (ms)

    With $k$ variables there are $2^k$ possible model specifications, one for each subset of the variables. In each period $t$, the best model is selected, the criterion being minimum OOS prediction errors.

    The latter two models, cay and ms, are revised every period, which render IS regressions problematic. This is also why we did not include caya in the kitchen sink specification.
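The model selection idea can be sketched as follows: enumerate all $2^k$ subsets of the predictors and keep the one with the smallest cumulative squared OOS error. This is a minimal illustration on synthetic data, not the paper's code. The helper names (`all_specifications`, `cumulative_oos_sse`, `select_model`), the `start` parameter, and the data are all hypothetical.

```python
from itertools import combinations
import numpy as np

def all_specifications(names):
    """Every subset of the predictors, including the empty (intercept-only) one."""
    return [c for k in range(len(names) + 1) for c in combinations(names, k)]

def ols_forecast(X_hist, y_hist, x_now):
    """Fit y on [1, X] by least squares and forecast from x_now."""
    A = np.column_stack([np.ones(len(y_hist)), X_hist])
    coef, *_ = np.linalg.lstsq(A, y_hist, rcond=None)
    return float(np.concatenate(([1.0], x_now)) @ coef)

def cumulative_oos_sse(X, y, cols, start):
    """Sum of squared one-step-ahead errors; y[t] is predicted from X[t-1]."""
    sse = 0.0
    for t in range(start, len(y)):
        # Only data up to t-1 are used to fit and to forecast y[t].
        pred = ols_forecast(X[:t - 1][:, cols], y[1:t], X[t - 1, cols])
        sse += (y[t] - pred) ** 2
    return sse

def select_model(X, y, names, start):
    """Return (specification, sse) with the smallest cumulative OOS error."""
    best = None
    for spec in all_specifications(names):
        cols = np.array([names.index(n) for n in spec], dtype=int)
        sse = cumulative_oos_sse(X, y, cols, start)
        if best is None or sse < best[1]:
            best = (spec, sse)
    return best

# Synthetic data in which only the first predictor matters.
rng = np.random.default_rng(0)
names = ["d/p", "tbl", "infl"]
X = rng.normal(size=(120, 3))
y = np.empty(120)
y[0] = 0.0
y[1:] = 0.5 * X[:-1, 0] + 0.1 * rng.normal(size=119)
best_spec, best_sse = select_model(X, y, names, start=20)
```

To mimic re-selection in every period, `select_model` could be called on the data truncated at each date. The cost grows as $2^k$, so this brute-force form is only practical for a small number of predictors.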

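Forecasting

  1. First, split the T observations into m in-sample observations and q out-of-sample observations.
  2. Next, to forecast period m+1, regress on the first m periods, which give m-1 usable observations (each return is paired with the previous period's predictor), to obtain the coefficients α and β.
  3. Then plug the period-m value of the explanatory variable into the fitted equation to obtain r for period m+1.
  4. Then forecast r for period m+2. By now all period-(m+1) data have been realized, so regress on the first m+1 periods (m usable observations) to obtain new coefficients α and β (possibly different from the first set), and plug in the period-(m+1) explanatory variable to obtain r for period m+2.
  5. Repeat this process until all q out-of-sample forecasts have been made.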
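The recursive procedure above can be sketched in a few lines. This is a minimal illustration assuming a single predictor and plain OLS; the function name `recursive_oos_forecasts` and the toy data are hypothetical. Arrays are 0-based, so `r[m]` is the return of period m+1 in the numbering used above.

```python
import numpy as np

def recursive_oos_forecasts(x, r, m):
    """Expanding-window OLS forecasts of r[m], ..., r[T-1].

    x[t] is the predictor observed in period t and r[t] the return of
    period t (0-based), so the regression pairs r[t] with x[t-1].
    """
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    T = len(r)
    forecasts = np.empty(T - m)
    for j, t in enumerate(range(m, T)):
        # Periods 0..t-1 are known, giving t-1 usable (x[s-1], r[s]) pairs.
        A = np.column_stack([np.ones(t - 1), x[:t - 1]])
        (alpha, beta), *_ = np.linalg.lstsq(A, r[1:t], rcond=None)
        # Plug the latest known predictor into the freshly fitted equation.
        forecasts[j] = alpha + beta * x[t - 1]
    return forecasts

# Toy check: with an exact linear relation the forecasts match the outcomes.
x_demo = np.arange(10.0)
r_demo = np.empty(10)
r_demo[0] = 0.0
r_demo[1:] = 1.0 + 2.0 * x_demo[:-1]
oos_forecasts = recursive_oos_forecasts(x_demo, r_demo, m=5)
```

The coefficients are re-estimated at every step, so no forecast uses information from its own period or later.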

Empirical Procedure

OOS Statistic

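$e_N$ denotes the OOS forecast errors of the historical mean (the unconditional forecast); $e_A$ denotes the OOS forecast errors of the OLS regression model (the conditional forecast).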
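The formulas for the OOS statistics did not survive in this post. As a rough sketch, two statistics commonly built from these two error vectors are the OOS R² and the difference in root mean squared errors. The definitions below are the standard ones and are an assumption here, and the function name `oos_statistics` is hypothetical.

```python
import numpy as np

def oos_statistics(e_N, e_A):
    """OOS R^2 and RMSE difference of the conditional model vs. the historical mean."""
    e_N = np.asarray(e_N, dtype=float)   # errors of the historical-mean forecast
    e_A = np.asarray(e_A, dtype=float)   # errors of the OLS-model forecast
    mse_N = np.mean(e_N ** 2)
    mse_A = np.mean(e_A ** 2)
    return {
        # Positive when the conditional model has the smaller MSE.
        "R2_oos": 1.0 - mse_A / mse_N,
        # Positive when the conditional model has the smaller RMSE.
        "delta_rmse": np.sqrt(mse_N) - np.sqrt(mse_A),
    }

stats_demo = oos_statistics([1.0, -1.0, 1.0, -1.0], [0.5, -0.5, 0.5, -0.5])
```

Both numbers are negative whenever the model predicts worse than the historical mean, which is the case the conclusions of the paper emphasize.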

For our encompassing tests in Section 6, we compute

Bootstrap Resampling

The paper

We then generate 10,000 bootstrapped time series by drawing with replacement from the residuals. The initial observation—preceding the sample of data used to estimate the models—is selected by picking one date from the actual data at random. This bootstrap procedure not only preserves the autocorrelation structure of the predictor variable, thereby being valid under the Stambaugh (1999) specification, but also preserves the cross-correlation structure of the two residuals.
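A rough sketch of such a residual bootstrap follows. The data-generating process is not shown in this post, so the code assumes a constant expected return under the null and an AR(1) process for the predictor. That assumption, the function name `residual_bootstrap`, and the toy data are all illustrative rather than the paper's own code.

```python
import numpy as np

def residual_bootstrap(x, r, n_boot=1000, seed=0):
    """Simulate (x*, r*) series by jointly resampling the two residuals.

    Assumed null model: r[t] = alpha + u1[t], x[t] = mu + rho * x[t-1] + u2[t].
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    r = np.asarray(r, dtype=float)
    T = len(x)
    alpha = r[1:].mean()
    u1 = r[1:] - alpha
    A = np.column_stack([np.ones(T - 1), x[:-1]])
    (mu, rho), *_ = np.linalg.lstsq(A, x[1:], rcond=None)
    u2 = x[1:] - (mu + rho * x[:-1])
    x_star = np.empty((n_boot, T))
    r_star = np.empty((n_boot, T - 1))
    for b in range(n_boot):
        # One index per date for both residuals keeps their cross-correlation.
        idx = rng.integers(0, T - 1, size=T - 1)
        # Initial observation: one actual data point picked at random.
        x_star[b, 0] = x[rng.integers(0, T)]
        for t in range(1, T):
            x_star[b, t] = mu + rho * x_star[b, t - 1] + u2[idx[t - 1]]
        r_star[b] = alpha + u1[idx]
    return x_star, r_star

# Toy series, purely illustrative.
rng_demo = np.random.default_rng(1)
x_obs = 0.1 * np.cumsum(rng_demo.normal(size=40))
r_obs = rng_demo.normal(scale=0.05, size=40)
x_star, r_star = residual_bootstrap(x_obs, r_obs, n_boot=50)
```

Each simulated pair of series can then be fed through the same forecasting statistics to build their distribution under the null. The paper uses 10,000 replications; the smaller number here only keeps the example fast.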

Our approach: Moving Block Bootstrap

Bootstrap

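Its core idea is to use the data itself to estimate the variability of statistics computed from that data. The computing power of modern computers makes the method very easy to implement.

In the context of parameter estimation, the bootstrap means computing the error of a sample statistic as an estimate of the population statistic using only the sample at hand (the sample "pulls itself up by its own bootstraps"), without making any assumption about the population distribution (such as the normality assumption of traditional methods).

The bootstrap principle states that the variation of the bootstrap sample statistic u* around the original sample statistic u (the variation of u* for short) is a good approximation to the variation of the original sample statistic u around the population statistic v (the variation of u for short).

To compute the variation of u*, we only need to resample the original sample data with replacement a large number of times.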
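As a small illustration of this principle, the sketch below resamples a sample with replacement and uses the spread of u* to estimate the standard error of the sample mean. The function name `bootstrap_statistic` and the synthetic sample are hypothetical, and it assumes independent observations, which is why the block variants discussed next are needed for time series.

```python
import numpy as np

def bootstrap_statistic(sample, statistic=np.mean, n_boot=5000, seed=0):
    """Return the statistic computed on n_boot resamples drawn with replacement."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    # Each row is one bootstrap resample of the same size as the original.
    idx = rng.integers(0, n, size=(n_boot, n))
    return np.array([statistic(sample[row]) for row in idx])

# Synthetic sample, purely illustrative.
sample = np.random.default_rng(42).normal(loc=1.0, scale=2.0, size=200)
u = sample.mean()                        # original sample statistic
u_star = bootstrap_statistic(sample)     # bootstrap sample statistics
se_boot = u_star.std(ddof=1)             # variation of u* around u
se_formula = sample.std(ddof=1) / np.sqrt(len(sample))
```

For the mean, `se_boot` should land close to the textbook value `se_formula`. The appeal of the bootstrap is that the same code works for statistics with no simple formula.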

Block Bootstrap

The block bootstrap is used when the data, or the errors in a model, are correlated. In this case, simple case or residual resampling will fail, as it is not able to replicate the correlation in the data. The block bootstrap tries to replicate the correlation by resampling blocks of data instead.

Because time series are autocorrelated, resampling should use the block bootstrap. As the name suggests, each draw takes, with replacement, a block of n consecutive data points from the series (n is set by the block size). The three main block bootstrap algorithms are:

  • Moving Block Bootstrap (Künsch 1989, Liu and Singh 1992);
  • Circular Block Bootstrap (Politis and Romano 1992);
  • Stationary Bootstrap (Politis and Romano 1994).

The figure below illustrates how the Moving Block Bootstrap (MBB) works:

Figure: how the Moving Block Bootstrap (MBB) works

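As the figure shows, the main problem with MBB is that observations at the two ends of the original series are undersampled. The Circular Block Bootstrap (CBB) was proposed to avoid this. As the name suggests, it joins the end of the original data to its beginning to form a circle (hence "circular") and then resamples with the given block size, so that the two ends are no longer undersampled.

The last method is the Stationary Bootstrap (SB). Its main difference from the other two is that the block size is not fixed: block sizes in SB follow a geometric distribution, and the block size given as input is its expected value. The bootstrapped samples it produces satisfy the stationarity requirement better, so it performs better when the original time series struggles to satisfy stationarity.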
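The three schemes differ only in how the resampled index sequence is built, which the following sketch tries to make explicit. The function names (`mbb_indices`, `cbb_indices`, `sb_indices`) and the demo values are hypothetical, and this is a simplified illustration rather than a reference implementation.

```python
import numpy as np

def mbb_indices(n, block_size, rng):
    """Moving block bootstrap: blocks start only where they fit without wrapping."""
    n_blocks = -(-n // block_size)  # ceiling division
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_size)).ravel()
    return idx[:n]

def cbb_indices(n, block_size, rng):
    """Circular block bootstrap: blocks may start anywhere and wrap around."""
    n_blocks = -(-n // block_size)
    starts = rng.integers(0, n, size=n_blocks)
    idx = ((starts[:, None] + np.arange(block_size)) % n).ravel()
    return idx[:n]

def sb_indices(n, mean_block_size, rng):
    """Stationary bootstrap: geometric block lengths with the given mean."""
    p = 1.0 / mean_block_size       # probability of starting a new block
    idx = np.empty(n, dtype=int)
    idx[0] = rng.integers(0, n)
    for t in range(1, n):
        if rng.random() < p:
            idx[t] = rng.integers(0, n)      # start a new block
        else:
            idx[t] = (idx[t - 1] + 1) % n    # continue the current block
    return idx

# Resample a toy series with each scheme.
rng_bb = np.random.default_rng(0)
n_obs, block = 50, 5
series = np.arange(n_obs, dtype=float)
mbb = mbb_indices(n_obs, block, rng_bb)
cbb = cbb_indices(n_obs, block, rng_bb)
sb = sb_indices(n_obs, block, rng_bb)
resampled = {"mbb": series[mbb], "cbb": series[cbb], "sb": series[sb]}
```

In MBB the first and last observations can only appear in a few of the possible blocks, which is the end-point undersampling described above. The modulo in CBB and SB removes that asymmetry.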

Statistical Power

Our article entertains both IS and OOS tests. Inoue and Kilian (2004) show that the OOS tests used in this paper are less powerful than IS tests. We believe this is the wrong way to look at the issue of power for two reasons:

  • In our forecasting regression context, OOS performance just happens to be one natural and especially useful diagnostic statistic. It can help determine whether a model is stable and well-specified, or changing over time, either suddenly or gradually. It is unreasonable to propose a model if the IS performance is insignificant, regardless of its OOS performance.
  • The OOS tests in our paper do not fail in the way the critics suggest. Low-power OOS tests would produce relatively poor predictions early and relatively good predictions late in the sample. Instead, all of our models show the opposite behavior: good OOS performance early, bad OOS performance late.

Estimation Period

The paper reports results for three time period specifications:

  • The first begins OOS forecasts 20 years after data are available;
  • The second begins OOS forecasts in 1965 (or 20 years after data are available, whichever comes later);
  • The third ignores all data prior to 1927 even in the estimation.

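Conclusions

  • Most models are unstable or even spurious. Even when a single-variable model shows good out-of-sample predictive power over some period, that power rarely persists, for example because the economic structure is unstable or undergoes structural change.
  • As of the end of 2005, most models have lost statistical significance both IS and OOS. OOS, most models not only fail to beat the unconditional benchmark (the historical mean) in a statistically or economically significant way, they actually underperform it. If we focus on the period after 1975, no model performs notably well OOS, and hardly any reaches an acceptable level of IS significance.
  • When we shift perspective from researcher to investor, we believe the evidence shows that these models cannot support or inform investment decisions today.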

References

用 Bootstrap 进行参数估计大有可为

Author

Haojun(Vincent) Gao

Posted on

2019-05-01

Updated on

2022-02-22
