The second theorem addresses the use of the estimated factors to forecast $y_{t+1}$. Intuitively, the uniform consistency of $\hat{F}_t$ suggests that these estimated factors can in effect be treated as the true factors for the purposes of forecasting $y_{t+1}$. Two complications arise, however. The first is that, if the parameters in (2.2) evolve over time and $\mathrm{var}(\Delta\beta_t) = O(1)$, then $\beta_t$ is not consistently estimable even if the factors were known. Here, we provide formal results only for the case of no time variation in the forecasting equation, i.e. $\beta_t = \beta$.
The second complication arises when the true number of factors is unknown, as is the case in practice. We therefore consider the problem of estimating the number of factors, $q$, that enter the forecasting equation using an information criterion. The information criterion is of the form $IC_q = \ln \hat{\sigma}^2(q) + q\,g(T)$, where $\hat{\sigma}^2(q) = SSR(q)/T$ and $SSR(q)$ is the sum of squared residuals from estimation of (2.2) by OLS using $q$ estimated factors. The function $g(T)$ is the penalty function, for example $g(T) = \ln T / T$ for the Bayes Information Criterion (BIC). The information criterion estimate of $r$, $\hat{r}$, solves $\min_{1 \le q \le k} IC_q$.
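The selection rule described above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the text: the criterion is taken to be $\ln \hat{\sigma}^2(q) + q\,g(T)$ with the BIC penalty as the default, the factors are estimated by principal components of a standardized panel, the forecasting regression has no intercept, and the function names `estimate_factors` and `select_num_factors` as well as the simulated data are purely hypothetical.

```python
import numpy as np

def estimate_factors(X, k):
    """Assumed estimator: first k principal components of the T x N panel X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize each series
    U, _, _ = np.linalg.svd(Z, full_matrices=False)   # left singular vectors
    return np.sqrt(Z.shape[0]) * U[:, :k]             # scaled so F'F / T = I

def select_num_factors(y_next, F_hat, penalty=None):
    """Minimize IC_q = ln(SSR(q)/T) + q*g(T) over q = 1, ..., k.

    y_next[t] holds the target dated t+1, aligned with row t of F_hat.
    penalty is g(T); the default is the BIC choice ln(T)/T.
    """
    T, k = F_hat.shape
    g = np.log(T) / T if penalty is None else penalty
    ic = np.empty(k)
    for q in range(1, k + 1):
        Fq = F_hat[:, :q]                              # first q estimated factors
        beta, *_ = np.linalg.lstsq(Fq, y_next, rcond=None)  # OLS, no intercept
        resid = y_next - Fq @ beta
        ic[q - 1] = np.log(resid @ resid / T) + q * g
    return int(np.argmin(ic)) + 1, ic

# Illustrative simulated data (hypothetical design, not from the paper).
rng = np.random.default_rng(0)
T, N, r, k = 200, 50, 2, 6
F = rng.standard_normal((T, r))                        # true factors
Lam = rng.standard_normal((N, r))                      # factor loadings
X = F @ Lam.T + rng.standard_normal((T, N))            # observed panel
y_next = F @ np.array([1.0, 1.0]) + 0.5 * rng.standard_normal(T)

F_hat = estimate_factors(X, k)
r_hat, ic = select_num_factors(y_next, F_hat)
print("selected number of factors:", r_hat)
```

Because the candidate regressions are nested, $\ln \hat{\sigma}^2(q)$ cannot increase with $q$, so the penalty term $q\,g(T)$ is what keeps the criterion from always choosing the largest model.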

The following theorem provides sufficient conditions for forecasts based on $\hat{F}$ to be uniformly consistent for forecasts based on $F^0$. In (2.2), the efficient forecast of $y_{t+1}$ given past $(y_t, X_t)$ (and given $\beta_t = \beta$) is $\beta' F_t^0$. Theorem 2 implies that this efficient forecast can be achieved (in a mean-square sense) asymptotically even if the factors, and indeed the number of factors, are unknown. Theorem 2(a) states that forecast efficiency can be achieved even if "too many" factors are estimated, and that the overestimation introduces no additional error asymptotically. In practice, however, one might be concerned about the effect of estimating more coefficients than are needed, so it might be desirable to use an information criterion to reduce the number of factors. Theorem 2(b) provides conditions under which doing so produces an efficient forecast and moreover provides a consistent estimate of the number of factors.