Instrumental Variables
Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2009
July 2, 2009

Sections: Endogeneities; IV under Exact Identification; IV: the Overidentified Case

Motivation

Consistency of OLS crucially depends on $E(u_i x_i) = 0$. In general, when this assumption does not hold, $x_i$ is said to be endogenous.

Endogeneities: Examples

Simultaneous Equations

The simplest supply and demand system:

$$q_i^s = x_i^s \beta_1^s + \beta_2^s p_i + \epsilon_i^s$$
$$q_i^d = x_i^d \beta_1^d + \beta_2^d p_i + \epsilon_i^d$$
$$q_i^s = q_i^d$$

In equilibrium:

$$p_i = (\beta_2^s - \beta_2^d)^{-1} (x_i^d \beta_1^d - x_i^s \beta_1^s + \epsilon_i^d - \epsilon_i^s)$$

In both supply and demand, $p_i$ as an explanatory variable depends on the error terms, so simple OLS is not consistent.

Omitted Variables

$$Y = \beta_1 X_1 + \beta_2 X_2 + u$$

If we 'omit' $X_2$:

$$Y = \beta_1 X_1 + \nu, \qquad \nu \equiv \beta_2 X_2 + u$$

Then the predeterminedness assumption will be violated unless $X_1$ and $X_2$ are orthogonal or $\beta_2 = 0$.

Explanatory variable measured with error

$$C_i = \beta_1 + \beta_2 X_i^* + u_i$$

and assume all assumptions for consistency hold. Now suppose we observe a 'noisy' version of $X_i^*$:

$$X_i = X_i^* + \omega_i$$

where $\omega_i$ is a measurement error. We will assume $\omega_i$ is i.i.d., with $E(\omega_i) = 0$, $V(\omega_i) = \sigma_\omega^2$, and uncorrelated with $X_i^*$ and $u_i$. Replacing $X_i^* = X_i - \omega_i$, the regression model can be written as

$$C_i = \beta_1 + \beta_2 X_i + \nu_i, \qquad \nu_i = -\beta_2 \omega_i + u_i$$

Now

$$Cov(X_i, \nu_i) = E(X_i \nu_i) = E[(X_i^* + \omega_i)(-\beta_2 \omega_i + u_i)] = -\beta_2 \sigma_\omega^2 \neq 0$$

whenever $\beta_2 \neq 0$. Then the OLS estimator that regresses $C_i$ on $X_i$ is inconsistent. We will derive more details in the next homework.
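A small simulation may help illustrate the measurement-error result. This is only a sketch under assumed values: the parameter values, the sample size, the normal distributions, and the helper `ols_slope` are hypothetical choices, not part of the lecture. Since $Cov(X_i, \nu_i) = -\beta_2 \sigma_\omega^2$, the OLS slope on the noisy regressor should converge to $\beta_2 V(X_i^*) / (V(X_i^*) + \sigma_\omega^2)$ rather than to $\beta_2$.

```python
import random

def ols_slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx

random.seed(0)
n = 20000
beta1, beta2 = 1.0, 2.0                 # hypothetical true parameters
sigma_xstar, sigma_omega = 1.0, 1.0     # assumed std. dev. of X* and of omega

x_star = [random.gauss(0.0, sigma_xstar) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]
omega = [random.gauss(0.0, sigma_omega) for _ in range(n)]

c = [beta1 + beta2 * xs + ui for xs, ui in zip(x_star, u)]   # C_i
x_obs = [xs + w for xs, w in zip(x_star, omega)]             # noisy X_i

slope_true_regressor = ols_slope(x_star, c)    # uses the unobserved X*
slope_noisy_regressor = ols_slope(x_obs, c)    # uses the observed X

# Probability limit of the OLS slope when the noisy regressor is used
plim_noisy = beta2 * sigma_xstar**2 / (sigma_xstar**2 + sigma_omega**2)

print(slope_true_regressor, slope_noisy_regressor, plim_noisy)
```

With these assumed variances the noisy-regressor slope is pulled toward zero, which is the usual attenuation pattern.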

IV under Exact Identification

Model as before. $z_i$ is a vector of $K$ instrumental variables.

1. Linearity: $y_i = x_i' \beta_0 + u_i$, $i = 1, \ldots, n$.
2. Random sample: $\{x_i, z_i, u_i\}$ is a jointly i.i.d. process.
3. IV validity 1): rank. $E(z_i x_i') = \Sigma_{zx}$ is a finite, invertible matrix.
4. IV validity 2): orthogonality. $E(z_{ik} u_i) = 0$ for all $i$ and $k = 1, \ldots, K$.
5. $V(z_i u_i) = S$, finite and positive definite.

The Method of Moments and the IV estimator

The orthogonality condition

$$E(z_i u_i) = E[z_i (y_i - x_i' \beta_0)] = 0$$

is a set of $K$ moment conditions for the $K$ unknown parameters. Method of moments: choose as estimator the parameter values that make the sample moment conditions hold. $\hat\beta_{IV}$ satisfies:

$$\frac{1}{n} \sum_{i=1}^n z_i (y_i - x_i' \hat\beta_{IV}) = 0$$

This is a system of $K$ linear equations in $K$ unknowns, with solution

$$\hat\beta_{IV} = \left( \sum_{i=1}^n z_i x_i' \right)^{-1} \left( \sum_{i=1}^n z_i y_i \right) = (Z'X)^{-1} Z'Y$$

Existence: asymptotically guaranteed by the rank condition.
Exact identification: same number of variables and instruments ($K$).
If $x_i$ is exogenous, it is a valid vector of instruments for itself. Then $\hat\beta_{IV} = \hat\beta_{OLS}$.

Large sample properties

Consistency: $\hat\beta_{IV} \to_p \beta_0$

Proof (sketch): $\hat\beta_{IV} = (Z'X)^{-1} Z'Y$. Replacing $Y = X\beta_0 + u$:

$$\hat\beta_{IV} = \beta_0 + (Z'X)^{-1} Z'u = \beta_0 + \left( \frac{Z'X}{n} \right)^{-1} \frac{Z'u}{n}$$

with $Z'X/n \to_p \Sigma_{zx}$ (finite and invertible) and $Z'u/n \to_p 0$, using the rank and orthogonality conditions.

Asymptotic normality:

$$\sqrt{n} (\hat\beta_{IV} - \beta_0) \to_d N\left(0, \Sigma_{zx}^{-1} S (\Sigma_{zx}')^{-1}\right)$$
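The exactly identified estimator $(Z'X)^{-1} Z'Y$ can be sketched numerically in the scalar case ($K = 1$, no intercept), where it reduces to $\sum z_i y_i / \sum z_i x_i$. The data-generating process below is purely an assumption made for illustration: the coefficient values, the way $u_i$ is made to depend on $x_i$, and the helper `dot` are hypothetical.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))

random.seed(1)
n = 50000
beta0 = 1.5                                   # hypothetical true coefficient

z = [random.gauss(0, 1) for _ in range(n)]    # instrument
v = [random.gauss(0, 1) for _ in range(n)]    # part of x that is correlated with u
e = [random.gauss(0, 1) for _ in range(n)]

u = [0.8 * vi + ei for vi, ei in zip(v, e)]   # E(x_i u_i) = 0.8, so x is endogenous
x = [zi + vi for zi, vi in zip(z, v)]         # E(z_i x_i) = 1: rank condition holds
y = [beta0 * xi + ui for xi, ui in zip(x, u)]

beta_ols = dot(x, y) / dot(x, x)   # (X'X)^{-1} X'Y
beta_iv = dot(z, y) / dot(z, x)    # (Z'X)^{-1} Z'Y

print(beta_ols, beta_iv)
```

Under this assumed design, OLS converges to $\beta_0 + E(x_i u_i)/E(x_i^2) = 1.5 + 0.8/2$, while the IV estimate should stay close to $\beta_0$.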
Finishing both proofs (consistency and asymptotic normality) is a useful homework exercise.

Sources of valid instruments

Instrument Validity

IV validity 1): rank. $E(z_i x_i') = \Sigma_{zx}$, finite and invertible. Intuitively, this requires the instruments to be correlated with the variables to be instrumented. This can be checked empirically. More on this later.

IV validity 2): orthogonality. $E(z_{ik} u_i) = 0$. Instruments must be uncorrelated with any unobserved determinant of $y_i$. This depends on things we do not observe and on how we set up the model.

Where do valid instruments come from?

A complex question. It is both an econometric and a modeling issue.

• Instrument for prices in the demand function: price determinants related to the supply side (costs).
• Angrist and Krueger (1991): wages as a function of education. Education is endogenous. Instrument: month of birth! Individuals born in the first months of the year may leave school earlier, so they should have less education than the rest. Validity?
• Growth regressions: what is truly exogenous for growth? (Durlauf, 2001).

IV: the Overidentified Case

Suppose now that there are $p > K$ instruments. Then the MM logic implies that

$$\frac{1}{n} \sum_{i=1}^n z_i (y_i - x_i' \hat\beta_{IV}) = 0$$

is a system of $p$ linear equations with $K$ unknowns, which in general has no exact solution.

If we only care about consistency, there is something obvious we can do. Less obvious? We will explore a consistent and hopefully more efficient strategy.

Model as before. $z_i^0$ is a vector of $p > K$ instruments.

1. Linearity: $y_i = x_i' \beta_0 + u_i$, $i = 1, \ldots, n$.
2. Random sample: $\{x_i, z_i^0, u_i\}$ is a jointly i.i.d. process.
3. IV validity 1): rank. $E(z_i^0 x_i') = \Sigma_{z^0 x}$ is a $p \times K$ matrix that exists, is finite, and has full column rank.
4. IV validity 2): orthogonality. $E(z_i^0 u_i) = 0$ for all $i$.
5. $V(z_i^0 u_i) = \sigma^2 E(z_i^0 z_i^{0\prime})$, finite and positive definite.

Instrumental variables: $p > K$ instruments

$z_i^0$ is a vector of $p > K$ instruments.

One could discard $p - K$ instruments and still obtain a consistent estimator.

It is easy to show that any linear combination of IVs is a valid instrument, so one could combine the $p$ instruments in some way to obtain $K$ of them:

$$Z = Z^0 A$$

with $Z$ of dimension $n \times K$, $Z^0$ of dimension $n \times p$, and $A$ of dimension $p \times K$. The estimator would be

$$\hat\beta_{IV} = (Z'X)^{-1} Z'Y$$

just as in the previous case.

Problem: how to choose $A$?

Result: the linear combination of IVs with minimum asymptotic variance corresponds to

$$A = (Z^{0\prime} Z^0)^{-1} Z^{0\prime} X$$

Then the optimal IV estimator is

$$\hat\beta_{IV} = (Z'X)^{-1} Z'Y \quad \text{with} \quad Z = Z^0 A = Z^0 (Z^{0\prime} Z^0)^{-1} Z^{0\prime} X = P_{Z^0} X$$

Substituting, and since $P_{Z^0}$ is symmetric:

$$\hat\beta_{IV} = (Z'X)^{-1} Z'Y = (X' P_{Z^0} X)^{-1} X' P_{Z^0} Y$$

Since $P_{Z^0}$ is idempotent and symmetric:

$$\hat\beta_{IV} = (X' P_{Z^0} X)^{-1} X' P_{Z^0} Y = (X' P_{Z^0}' P_{Z^0} X)^{-1} X' P_{Z^0}' P_{Z^0} Y = (X^{*\prime} X^*)^{-1} X^{*\prime} Y^*$$

with $X^* \equiv P_{Z^0} X$ and $Y^* \equiv P_{Z^0} Y$.

So the optimal IV estimator for the case of $p > K$ instruments can be computed in two stages.

1. Obtain $X^*$ and $Y^*$. Question: what is this, intuitively?
2. Run OLS with $X^*$ and $Y^*$.

This optimal IV estimator is called the two-stage least squares (2SLS) estimator.
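The two-stage computation can be sketched for a toy case with one regressor and two instruments (no intercept). Everything about the design is an assumption for illustration: the coefficients, the sample size, and the helpers `dot` and `project` are hypothetical. Here `project` returns the fitted values from regressing a variable on the two instruments, which is what multiplying by $P_{Z^0}$ does.

```python
import random

def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))

def project(z1, z2, w):
    """Fitted values from regressing w on (z1, z2), i.e. P_{Z^0} w."""
    s11, s12, s22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    b1, b2 = dot(z1, w), dot(z2, w)
    det = s11 * s22 - s12 * s12
    a1 = (s22 * b1 - s12 * b2) / det      # solve the 2x2 normal equations
    a2 = (s11 * b2 - s12 * b1) / det
    return [a1 * p + a2 * q for p, q in zip(z1, z2)]

random.seed(2)
n = 50000
beta0 = 1.5                                # hypothetical true coefficient

z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
v = [random.gauss(0, 1) for _ in range(n)]
e = [random.gauss(0, 1) for _ in range(n)]
u = [0.8 * vi + ei for vi, ei in zip(v, e)]                    # endogeneity via v
x = [0.7 * a + 0.5 * b + vi for a, b, vi in zip(z1, z2, v)]
y = [beta0 * xi + ui for xi, ui in zip(x, u)]

# Stage 1: project X and Y on the instruments
x_star = project(z1, z2, x)
y_star = project(z1, z2, y)

# Stage 2: OLS of Y* on X*
beta_2sls = dot(x_star, y_star) / dot(x_star, x_star)

# Same estimator written as (X' P X)^{-1} X' P Y
beta_formula = dot(x_star, y) / dot(x_star, x)

print(beta_2sls, beta_formula)
```

The two expressions agree up to rounding because $P_{Z^0}$ is symmetric and idempotent, so regressing $Y$ (rather than $Y^*$) on $X^*$ in the second stage gives the same coefficient.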

Instrumental variables: operational issues

If $E(x_i u_i) = 0$, trivially $z_i = x_i$ and $\hat\beta_{IV} = \hat\beta_{OLS}$.

$K$ explanatory variables, only one of them endogenous: $x_i^1$ exogenous, $x_i^2$ endogenous. $z_i$ can include $x_i^1$. At least 1 more instrument is needed. Trivially, in the first stage, $x_i^{1*} = x_i^1$ (why?).

Instrumental variables: finite-sample results

The key property of the IV method is consistency. What happens in finite samples?

• IV is biased (Sawa, 1962).
• The bias approaches that of OLS as the $R^2$ between the instruments and the endogenous variables tends to zero (Bound, Jaeger and Baker, 1995).
• If the correlation between the instruments and the endogenous variables is low, a small correlation between the instruments and the error can cause a large inconsistency in IV (Bound, Jaeger and Baker, 1995).
• In large samples, a large $p$ is better (why?), but in small samples a large $p$ biases IV in the direction of OLS.

Hausman test

It is a test of 'endogeneity' (careful, this is not quite accurate).

• Under $H_0$, $\hat\beta_{OLS}$ is consistent and efficient.
• Under $H_A$, $\hat\beta_{OLS}$ is inconsistent.
• $\hat\beta_{IV}$ is always consistent.

$$H = (\hat\beta_{OLS} - \hat\beta_{IV})' \left[ V(\hat\beta_{IV}) - V(\hat\beta_{OLS}) \right]^{-1} (\hat\beta_{OLS} - \hat\beta_{IV})$$

has an asymptotic $\chi^2(K)$ distribution under $H_0$.

Special case: a single coefficient $\beta_1$

$$H = \frac{\hat\beta_{1,OLS} - \hat\beta_{1,IV}}{\left( V(\hat\beta_{1,IV}) - V(\hat\beta_{1,OLS}) \right)^{1/2}} \sim N(0, 1)$$

asymptotically under $H_0$.

Test of Overidentifying Restrictions

If $p > K$ there are more instruments than needed.
'Validity':

• $Z$ is uncorrelated with $u$.
• Correct specification: the $Z$ that do not belong in the regression indeed do not belong.

Test: $nR^2 \sim \chi^2(p - K)$ asymptotically, where $R^2$ comes from the regression

$$\hat r = Z\pi + \text{residual}, \qquad \hat r \equiv y - x' \hat\beta_{IV}$$

$R^2$ far from 0: instruments correlated with the error, or misspecified model.

Asymptotic theory for IV estimators

Consistency

Writing $Z$ for the full instrument matrix $Z^0$:

$$\begin{aligned}
\hat\beta_{IV} &= (X' P_{Z^0} X)^{-1} X' P_{Z^0} Y \\
&= (X' P_{Z^0} X)^{-1} X' P_{Z^0} (X \beta_0 + u) \\
&= \beta_0 + (X' P_{Z^0} X)^{-1} X' P_{Z^0} u \\
&= \beta_0 + \left( X'Z (Z'Z)^{-1} Z'X \right)^{-1} X'Z (Z'Z)^{-1} Z'u \\
&= \beta_0 + \left[ \frac{X'Z}{n} \left( \frac{Z'Z}{n} \right)^{-1} \frac{Z'X}{n} \right]^{-1} \frac{X'Z}{n} \left( \frac{Z'Z}{n} \right)^{-1} \frac{Z'u}{n}
\end{aligned}$$

• From here on you are on your own. Use the assumptions and the strategy from the previous class.

Asymptotic Normality

From the previous result:

$$\sqrt{n} (\hat\beta_{IV} - \beta_0) = \left[ \frac{X'Z}{n} \left( \frac{Z'Z}{n} \right)^{-1} \frac{Z'X}{n} \right]^{-1} \frac{X'Z}{n} \left( \frac{Z'Z}{n} \right)^{-1} \sqrt{n} \, \frac{Z'u}{n}$$

• From here, just as we did for OLS, one must establish asymptotic normality of $\sqrt{n} \, Z'u / n$ and use Slutsky's theorem; this is left as an exercise. The result is:

$$\sqrt{n} (\hat\beta_{IV} - \beta_0) \to_d N(0, V)$$

with

$$V = \sigma^2 \left[ E(xz') E(zz')^{-1} E(zx') \right]^{-1}$$
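A sample analog of $V$ can be sketched by replacing each expectation with a sample mean and $\sigma^2$ with the mean squared IV residual. The toy design below (one regressor, two instruments, no intercept, homoskedastic errors) and all names such as `mean_prod` and `bilinear` are assumptions made for illustration, not the lecture's own computation.

```python
import math
import random

random.seed(3)
n = 50000
beta0 = 1.5                                # hypothetical true coefficient

# Assumed design: K = 1 regressor, p = 2 instruments
z1 = [random.gauss(0, 1) for _ in range(n)]
z2 = [random.gauss(0, 1) for _ in range(n)]
v = [random.gauss(0, 1) for _ in range(n)]
e = [random.gauss(0, 1) for _ in range(n)]
u = [0.8 * vi + ei for vi, ei in zip(v, e)]                    # Var(u) = 1.64
x = [0.7 * a + 0.5 * b + vi for a, b, vi in zip(z1, z2, v)]
y = [beta0 * xi + ui for xi, ui in zip(x, u)]

def mean_prod(a, b):
    """Sample mean of the elementwise product, an estimate of E(ab)."""
    return sum(p * q for p, q in zip(a, b)) / len(a)

# Sample analogs of E(zz'), E(zx) and E(zy)
s11, s12, s22 = mean_prod(z1, z1), mean_prod(z1, z2), mean_prod(z2, z2)
m_zx = (mean_prod(z1, x), mean_prod(z2, x))
m_zy = (mean_prod(z1, y), mean_prod(z2, y))

def bilinear(a, b):
    """a' S^{-1} b for the symmetric 2x2 matrix S = [[s11, s12], [s12, s22]]."""
    det = s11 * s22 - s12 * s12
    return (s22 * a[0] * b[0] - s12 * (a[0] * b[1] + a[1] * b[0])
            + s11 * a[1] * b[1]) / det

q_xx = bilinear(m_zx, m_zx)                # estimate of E(xz') E(zz')^{-1} E(zx')
beta_2sls = bilinear(m_zx, m_zy) / q_xx    # (X' P X)^{-1} X' P Y

# Residuals use the original x, not its first-stage fitted values
resid = [yi - beta_2sls * xi for xi, yi in zip(x, y)]
sigma2_hat = sum(r * r for r in resid) / n

v_hat = sigma2_hat / q_xx                  # estimate of V
se = math.sqrt(v_hat / n)                  # approximate standard error of beta_2sls

print(beta_2sls, v_hat, se)
```

In this assumed design the population value is $V = 1.64 / (0.7^2 + 0.5^2)$, so `v_hat` should land near it for large $n$.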