12 Panel Data Models

Two data dimensions:

\begin{align*} \color{red}{i = 1, 2, \ldots, N} \color{black}{} \quad &\text{(cross-section units)} \\ \color{blue}{t = 1, 2, \ldots, T} \color{black}{} \quad &\text{(time periods)} \end{align*}

Repeated observations on the same units over time

Usually: \color{red}{N \gg T}

Observed (controlled) heterogeneity:

y_{it} = x'_{it}\beta + \color{red}{\underbrace{z'_i \gamma}_{\alpha_i}} \color{black}{} + u_{it}

\Rightarrow individual characteristics are assumed to be constant in time

dealing with \alpha_i by

dummy variables
subtracting the individual-specific means

12.1 Fixed effect model

\alpha_i is “deterministic”: Dummy variable model

\begin{align} y_{it} &= x'_{it}\beta + \color{red}{\alpha_i} \color{black}{ + u_{it} } \\ &= x'_{it}\beta + \gamma_2\color{red}{ D_{2i}} \color{black}{ + \gamma_3} \color{red}{ D_{3i} } \color{black}{ + \cdots + \gamma_N} \color{red}{D_{Ni} } \color{black}{ + u_{it} } \end{align}

\color{blue}{u_{it} \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2)}

subtracting individual specific means (“entity-demeaned”) yields:

y_{it} \color{red}{- \bar{y}_i} \color{black}{ = (x_{it}} \color{red}{- \bar{x}_i} \color{black}{)' \beta + u_{it} - \bar{u}_i}

with \bar{y}_i = T^{-1} \sum_{t=1}^{T} y_{it} (\bar{x}_i and \bar{u}_i are defined analogously)

\Rightarrow individual effects cancel out
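
The cancellation can be made explicit with one intermediate step. Averaging the model over t for each unit and subtracting the result from the original equation removes \alpha_i, because it does not vary over time:

\begin{align*} \bar{y}_i &= \bar{x}'_i\beta + \alpha_i + \bar{u}_i \\ y_{it} - \bar{y}_i &= (x_{it} - \bar{x}_i)'\beta + (\alpha_i - \alpha_i) + (u_{it} - \bar{u}_i) \end{align*}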

both approaches yield the same results
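
The equivalence can be checked numerically. The following is a minimal sketch in plain Python for a single regressor; the data, the helper functions `solve` and `ols`, and all variable names are hypothetical and chosen only for illustration. The dummy-variable regression here uses a full set of N dummies without a constant, which gives the same slope as a constant plus N - 1 dummies.

```python
# Sketch: within estimator vs. dummy-variable (LSDV) estimator, one regressor.
# Data and helper names are hypothetical, chosen only for illustration.

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (M[r][n] - s) / M[r][r]
    return z

def ols(X, y):
    """OLS coefficients from the normal equations X'X b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Toy panel: N = 3 units, T = 4 periods, keyed by unit index.
x = {0: [1, 2, 3, 5], 1: [2, 4, 4, 6], 2: [0, 1, 3, 4]}
y = {0: [2.5, 1.8, 4.9, 5.0], 1: [7.0, 9.5, 7.2, 9.3], 2: [0.5, -0.8, 2.6, 1.9]}

# Within estimator: regress entity-demeaned y on entity-demeaned x.
sxy = sxx = 0.0
for i in x:
    xbar = sum(x[i]) / len(x[i])
    ybar = sum(y[i]) / len(y[i])
    for xit, yit in zip(x[i], y[i]):
        sxy += (xit - xbar) * (yit - ybar)
        sxx += (xit - xbar) ** 2
beta_within = sxy / sxx

# LSDV estimator: regress y on x and one dummy per unit (no constant).
units = sorted(x)
X_rows, y_rows = [], []
for i in units:
    for xit, yit in zip(x[i], y[i]):
        X_rows.append([float(xit)] + [1.0 if i == j else 0.0 for j in units])
        y_rows.append(yit)
beta_lsdv = ols(X_rows, y_rows)[0]

print(beta_within, beta_lsdv)
```

The two printed slopes agree up to floating-point error; the remaining LSDV coefficients are the estimated individual effects.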

individual and time effects (two-way effects):

y_{it} = x'_{it} \beta + \alpha_i + \lambda_t + u_{it} \Rightarrow including also time dummies

12.2 Random effects model

If \color{red}{\alpha_i \stackrel{\text{iid}}{\sim} \mathcal{N}(0, \sigma^2_\alpha)} then the GLS estimator is obtained from

\begin{align*} y_{it} \color{red}{- \theta} \color{black}{ \bar{y}_i = (x_{it}} &\color{red}{- \theta}\color{black}{ \bar{x}_i)' \beta + (1 - \theta) \alpha_i + u_{it} - \theta \bar{u}_i} \\ \\ \text{where} \quad \theta = \ &1 - \sqrt{\frac{ \sigma^2_u}{T \sigma^2_\alpha + \sigma^2_u} } \end{align*}
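
To get a feeling for the weight \theta, the following sketch evaluates the formula for a few variance configurations. The function name `gls_theta` and the numerical values are assumptions made for illustration only. The limiting cases show how the transformation sits between pooled OLS (\theta = 0) and the within transformation (\theta close to 1).

```python
import math

def gls_theta(sigma2_u, sigma2_alpha, T):
    """Quasi-demeaning weight: 1 - sqrt(sigma2_u / (T * sigma2_alpha + sigma2_u))."""
    return 1.0 - math.sqrt(sigma2_u / (T * sigma2_alpha + sigma2_u))

# No individual heterogeneity: theta = 0, i.e. no demeaning (pooled OLS).
theta_pooled = gls_theta(sigma2_u=1.0, sigma2_alpha=0.0, T=5)

# Equal variances and T = 3: theta = 1 - sqrt(1/4) = 0.5.
theta_mid = gls_theta(sigma2_u=1.0, sigma2_alpha=1.0, T=3)

# Very long panel: theta approaches 1, i.e. full entity demeaning.
theta_large_T = gls_theta(sigma2_u=1.0, sigma2_alpha=1.0, T=10_000)

print(theta_pooled, theta_mid, theta_large_T)
```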

Estimation of \sigma^2_\alpha is based on the fact that

\text{var}(\alpha_i + \bar{u}_i) = \text{var} \left( \alpha_i + \frac{1}{T} \sum_{t=1}^{T} u_{it} \right) = \color{blue}{ \sigma^2_\alpha + \frac{1}{T} \sigma^2_u}

such that

\begin{align} \hat{\sigma}^2_\alpha &= \frac{1}{N} \sum_{i=1}^{N} \overbrace{ (\bar{y}_i - \bar{x}'_i \widehat\beta)^2}^{(\widehat{\alpha_i + \bar{u}_i})^2} - \frac{1}{T} \hat{\sigma}^2_u \\ \hat{\sigma}^2_u &= \frac{1}{N(T - 1) \color{blue}{- K}} \color{black}{\sum_{i=1}^{N} \sum_{t=1}^{T} \hat{u}^2_{it} }\\ \hat{u}_{it} &= y_{it} - \bar{y}_i - (x_{it} - \bar{x}_i)' \widehat\beta \end{align}

Goodness of fit

Some software packages compute the dummy-variable R^2, i.e., the regression R^2 that includes the dummies as ‘explanatory’ variables

The dummy variables do not ‘explain’ anything but just represent heterogeneity \Rightarrow R^2 is too large

Good practice to present the “within-R^2”, that is, the R^2 of the demeaned (within) regression
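
The difference between the two measures can be illustrated with a small sketch in plain Python. It relies on the fact that the dummy-variable regression and the within regression have identical residuals, so only the total sum of squares in the denominator differs. The toy data and variable names are hypothetical.

```python
# Sketch: dummy-variable R^2 vs. within R^2 for a single regressor.
# Toy data (N = 3, T = 4) are hypothetical, chosen only for illustration.
x = {0: [1, 2, 3, 5], 1: [2, 4, 4, 6], 2: [0, 1, 3, 4]}
y = {0: [2.5, 1.8, 4.9, 5.0], 1: [7.0, 9.5, 7.2, 9.3], 2: [0.5, -0.8, 2.6, 1.9]}

# Entity-demean both variables.
x_dm, y_dm = [], []
for i in x:
    xbar = sum(x[i]) / len(x[i])
    ybar = sum(y[i]) / len(y[i])
    x_dm += [xit - xbar for xit in x[i]]
    y_dm += [yit - ybar for yit in y[i]]

# Within (FE) slope and residual sum of squares.
beta = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)
rss = sum((b - beta * a) ** 2 for a, b in zip(x_dm, y_dm))

# Within R^2: variation of y around the unit-specific means.
tss_within = sum(b * b for b in y_dm)
r2_within = 1.0 - rss / tss_within

# Dummy-variable R^2: same residuals, but variation around the grand mean,
# which also contains the differences between the unit means.
all_y = [v for i in y for v in y[i]]
grand_mean = sum(all_y) / len(all_y)
tss_total = sum((v - grand_mean) ** 2 for v in all_y)
r2_lsdv = 1.0 - rss / tss_total

print(r2_within, r2_lsdv)
```

Because the total sum of squares is at least as large as the within sum of squares, the dummy-variable R^2 can never be smaller than the within-R^2.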

Interpretation of the panel data model. Assume that \alpha_i is correlated with \bar{x}_i such that \alpha_i = \bar{x}'_i \lambda + \mu_i, yielding

y_{it} = \underbrace{(x_{it} - \bar{x}_i)'}_{\text{"short-run"}} \beta \ + \!\!\!\!\!\! \underbrace{\bar{x}'_i}_{\text{"long-run"}}\!\!\!\!\!\!\!\gamma + \mu_i + u_{it}

where \color{blue}{\gamma = \beta + \lambda}

Estimating this model yields \hat\beta_{FE} as an estimator for the “short-run” coefficients. The random effects model implies \lambda = 0 and therefore \color{red}{\beta = \gamma}.
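
The decomposition into "short-run" and "long-run" parts follows from substituting the assumed relation for \alpha_i into y_{it} = x'_{it}\beta + \alpha_i + u_{it} and then adding and subtracting \bar{x}'_i \beta:

\begin{align*} y_{it} &= x'_{it}\beta + \bar{x}'_i \lambda + \mu_i + u_{it} \\ &= (x_{it} - \bar{x}_i)'\beta + \bar{x}'_i (\beta + \lambda) + \mu_i + u_{it} \end{align*}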

12.3 Model specification

a) Tests for individual specific effects: Null hypothesis:

H_0 : \color{red}{\alpha_1 = \alpha_2 = \cdots = \alpha_N = \mu}

F-statistic:

F = \frac{(S_0 - S_1) / (N - 1)}{S_1 / (NT - N - K)} \sim F (\color{red}{N - 1} \color{black}{,} \color{blue}{ NT - N - K} \color{black}{)}

where S_0 and S_1 are the residual sums of squares (RSS) of the pooled OLS and the FE regression, respectively, and K is the number of regressors in x_{it}
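
As a small arithmetic illustration, the statistic can be computed directly from the two residual sums of squares. The function name and the input values below are hypothetical; the p-value would additionally require the F distribution function, which is not part of this sketch.

```python
def f_stat_individual_effects(S0, S1, N, T, K):
    """F-statistic for H0: alpha_1 = ... = alpha_N, with its degrees of freedom."""
    df1 = N - 1            # number of restrictions
    df2 = N * T - N - K    # residual degrees of freedom of the FE regression
    F = ((S0 - S1) / df1) / (S1 / df2)
    return F, df1, df2

# Hypothetical inputs: RSS of pooled OLS = 120, RSS of FE = 80, N = 11, T = 5, K = 4.
F, df1, df2 = f_stat_individual_effects(S0=120.0, S1=80.0, N=11, T=5, K=4)
print(F, df1, df2)  # (40 / 10) / (80 / 40) = 2.0 with (10, 40) degrees of freedom
```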

b) Hausman test: Deciding between random and fixed effects:

H_0: random effects, i.e., \color{red}{E(x_{it}\alpha_i) = 0}

Under the null hypothesis, \widehat\beta_{\text{FE}} and \widetilde\beta_{\text{RE}} are “similar”, i.e., E(\widehat\beta_{\text{FE}} - \widetilde\beta_{\text{RE}}) = 0
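
The notes do not spell out a statistic for this comparison. A common way to formalize "similar" is the classical Hausman quadratic form, sketched here under the assumptions that the difference of the two estimated covariance matrices is positive definite and that K denotes the number of time-varying regressors:

\begin{align*} H = (\widehat\beta_{\text{FE}} - \widetilde\beta_{\text{RE}})' \left[ \widehat{\text{var}}(\widehat\beta_{\text{FE}}) - \widehat{\text{var}}(\widetilde\beta_{\text{RE}}) \right]^{-1} (\widehat\beta_{\text{FE}} - \widetilde\beta_{\text{RE}}) \ \stackrel{a}{\sim} \ \chi^2_K \quad \text{under } H_0 \end{align*}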

Hausman-Wu Test: test of \delta = 0 in

\widetilde y_{it} = \widetilde x'_{it}\beta + \color{blue}{(x_{it} - \bar{x}_i)' \delta} \color{black}{ \ + \ \epsilon_{it}} with \widetilde y_{it} = y_{it} - \theta \bar{y}_i and \widetilde x_{it} = x_{it} - \theta \bar{x}_i as the GLS-transformed variables.