Variables instrumentales

Transcripción

Variables instrumentales
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Instrumental Variables
Walter Sosa-Escudero
Econ 507. Econometric Analysis. Spring 2009
July 2, 2009
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Motivation
Consistency of OLS crucially depends on E(ui xi ) = 0.
In general, when this assumption does not hold xi will be
endogenous
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Endogeneities: Examples
Simultaneous Equations
The simplest supply and
 s
 qi
qd
 is
qi
demand system:
= xsi β1s + β2s pi + si
= xdi β1d + β2d pi + di
= qid
In equilibrium:
pi = (β2s − β2d )−1 (xdi β1d − xsi β1s + di − si )
In both, supply and demand, pi as an explanatory variable depends
on the error term. Simple OLS is not consistent
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Omitted Variables
Y = β1 X1 + β2 X2 + u
if we ‘omit’ X2
Y = β1 X1 + ν
ν ≡ β1 X2 . Then the predeterminedness assumption will be
violated unless X1 and X2 are orthogonal or β2 = 0.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Explanatory variable measured with error
Ci = β1 + β2 Xi∗ + ui
and assume all assumptions for consistency hold. Now suppose we
observe a ‘noisy’ version of Xi :
Xi = Xi∗ + ωi
ωi is a measurement error. We will assume ωi is iid, with
E(ωi ) = 0, V (ωi ) = σω2 and uncorrelated with Xi∗ and ui .
Replacing Xi∗ = Xi − ωi , the regression model can be written as
Ci = β1 + β2 Xi + νi
with ν = −β2 ωi + ui
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Now
Cov(Xi , νi ) = E(Xi , νi ) = E [(Xi∗ + ωi ) (−β2 ωi + ui )]
= −β2 σω2 6= 0
Then the OLS estimator that regresses Yi on Xi is inconsistent.
We will derive more details in the next homework.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
IV under Exact Identification
Model as before. zi is a vector of K instrumental variables.
1
Linearity: yi = x0i β0 + ui
2
Random sample: {xi , zi , ui } is a jointly i.i.d. process.
3
i = 1, . . . , n.
IV validity 1): rank. E(zi x0i ) = Σzx is a finite, invertible
matrix
4
IV validity 2): orthogonality. E(zik ui ) = 0 for all i and
k = 1, . . . , K.
5
V (zi ui ) = S finite positive definite.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
The Method-of-Moments and the IV estimator.
The orthogonality condition
E(zi ui ) = E zi (yi − x0i β0 ) = 0
is a set of K moment conditions for the K unknown parameters.
Method-of-moments: choose as estimator the values of the
parameters that force the sample moments to hold. β̂IV satisfies:
n
1X
zi (yi − x0i β̂IV ) = 0
n
i=1
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
This is a system of K linear equations with K unknowns with
solution
!−1 n
!
n
X
X
0
β̂IV =
zi x i
zi yi = (Z 0 X)−1 Z 0 Y
i=1
i=1
Existence: asympotically guaranteed by the rank condition.
Exact identification: same number of variables and
instruments (K).
If xi is exogenous, it is a valid vector of instruments for itself.
Then β̂IV = β̂OLS .
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Large sample properties
p
Consistency: β̂IV → β0
Proof (sketch): β̂IV = (Z 0 X)−1 Z 0 Y . Replacing:
β̂IV
= β0 +
= β0 +
(Z 0 X)−1
Z 0u
0 −1 0 Z X
n
p
→ Σzx < ∞
Z u
n
p
→0
using the rank and ortogonality conditions.
Asymptotic normality:
p
d
0−1
(β̂IV − β0 ) → N (0, Σ0−1
zx SΣzx ).
Finish both proofs as a useful homework
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Sources of valid instruments
Instrument Validity
IV validity 1): rank. E(zi x0i ) = Σzx , finite and invertible.
Intuitively this requires the instruments to be correlated with
the variables to be instrumented. This might be checked
empirically. More later.
IV validity 2): orthogonality. E(zik ui ) = 0. Instruments must
be uncorrelated with whatever is not observed that is a
determinant of yi . This depends on things we do not observe
and on how we setup the model.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Where do valid instruments came from?
Complex question. It is an econometric and a modeling issue.
Instrument for prices in the demand function: price
determinants related to the supply side (costs).
Angrist and Krueger (1991): wages as a function of
education. Education endogenous. Instrument: month of
birth!: individuals born in the first months of the year may
abandon school earlier, then they should have less education
than the rest. Validity?
Growth regression: What is truly exogenous for
growth?(Durlauf, 2001).
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
IV: the Overidentified Case
Suppose now that there are p > K instruments. Then the MM
logic implies that
n
1X
zi (yi − x0i β̂IV ) = 0
n
i=1
is a system of p linear equations with K unknowns.
If we only care about consistency there is something obvious
we can do.
Less obvious?
We will explore a consistent and hopefully more efficient
strategy.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Model as before. zi0 is a vector of p > K instruments.
1
2
3
Linearity: yi = x0i β0 + ui
i = 1, . . . , n.
Random sample: {xi , zi0 , ui } is a jointly i.i.d. process.
IV validity 1): rank. E(zi0 x0i ) = Σz 0 x is a p × K matrix,
exists, is finite, an has full column rank.
4
IV validity 2): orthogonality. E(zi0 ui ) = 0 for all i.
5
V (zi0 ui ) = σ 2 E(zi0 zi0 ) finite positive definite.
0
Walter Sosa-Escudero
Instrumental Variables
that
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Variables instrumentales: p > K instrumentos
zi0 vector de p > K instrumentos.
Podria descartar p − K instrumentos. Voy a obtener un
estimador consistente.
Es facil probar que cualquier combinacion lineal de VI’s es un
instrumento valido: podria combinar los p instrumentos de
alguna manera, para obtener K:
Z = Z 0A
0 ,A
con Zn×K , Zn×p
p×K . El estimador seria:
β̂IV = (Z 0 X)−1 Z 0 Y
tal como en el caso anterior.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Problema: ¿como elegir A?
Resultado: La combinacion lineal de VI’s que tiene varianza
asintotica minima corresponde a:
A = (Z 00 Z 0 )−1 Z 00 X
Entonces, el estimador de VI optimo es:
β̂IV = (Z 0 X)−1 Z 0 X
con
Z = Z 0 A = Z 0 (Z 00 Z 0 )−1 Z 00 X = PZ 0 X
Reemplazando, y dado que PZ 0 es simetrica:
β̂IV
= (Z 0 X)−1 Z 0 X
= (X 0 PZ 0 X)−1 X 0 PZ 0 Y
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Dado que PZ 0 es idempotente y simetrica:
β̂IV
= (X 0 PZ 0 X)−1 X 0 PZ 0 Y
= (X 0 PZ0 0 PZ 0 X)−1 X 0 PZ0 0 PZ 0 Y
= (X ∗0 X ∗ )−1 X ∗0 Y ∗
con X ∗ ≡ PZ 0 X y Y ∗ ≡ PZ 0 Y .
Entonces: el estimador de VI optimo para el caso de p > K
instrumentos se puede computar en dos etapas.
1
Obtener X ∗ y Y ∗ . Pregunta: ¿que es intuitivamente esto?
2
Correr MCO con X ∗ y Y ∗ .
A este estimador de VI optimo se lo llama estimador de minimos
cuadrados en dos etapas.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Variables instrumentales: cuestiones operativas
Si E(xi ui ) = 0, trivialmente zi = xi y β̂V I = β̂M CO .
K variables explicativas, solo 1 de ellas endogena: x1i
exogena, x2i endogena. zi puede incluir x1i . Falta al menos 1
1
instrumento. Trivialmente, en la primera etapa, x1∗
i = xi
(porque?).
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Variables instrumentales: resultados de muestra finita
La propiedad clave del metodo de VI es consistencia. ¿Que sucede
en muestras finitas?
VI es sesgado (Sawa, 1962).
El sesgo se aproxima al de MCO cuando el R2 entre los
instrumentos y las variables endogenas tiende a cero (Bound, Jaeger
y Baker, 1995).
Si la correlacion entre los instrumentos y las variables endogenas es
baja, una pequeña correlacion entre los instrumentos y el error
puede causar una gran inconsistencia en VI (BJB, 1995).
En muestras grandes, p grande es mejor (porque?), pero en
muestras chicas, p grande sesga a VI en la direccion de MCO.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Test de Hausman
Es
•
•
•
un test de ‘endogenidad’ (cuidado, no es tan asi)
Bajo H0 , β̂M CO es consistente y eficiente.
Bajo HA , β̂M CO es inconsistente.
β̂IV es siempre consistente.
−1
H = (β̂M CO − β̂IV )0 V (β̂IV ) − V (β̂M CO )
(β̂M CO − β̂IV )
tiene distribucion asintotica χ2 (K) bajo H0 .
Caso particular: un solo coeficiente β1
H=
(β̂1,M CO − β̂1,IV )2
(V (β̂1,IV ) − V (β̂1,M CO ))1/2
Walter Sosa-Escudero
∼ N (0, 1)
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Test de Restricciones de Sobreidentificacion
Si p > K hay mas instrumentos que los necesarios.
‘Validez’:
• Z no esta correlacionado con u
• Especificacion correcta: los Z que no pertenecen a la regresion
efectivamente no pertenecen.
Test: nR2 ∼ χ2 (p − k)
R2 , de regresar: r̂ = Zπ + residuo
r̂ ≡ y − x0 β̂V I
R2 6= 0 : instrumentos correlacionados con el error o modelo mal
especificado.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Teoria asintotica para estimadores MV
Consistencia
β̂V I
= (X 0 PZ 0 X)−1 X 0 PZ 0 Y
= (X 0 PZ 0 X)−1 X 0 PZ 0 (X 0 β0 + u)
= β0 + (X 0 PZ 0 X)−1 X 0 PZ 0 u
= β0 + (X 0 Z(Z 0 Z)−1 Z 0 X)−1 X 0 Z(Z 0 Z)−1 Z 0 u
!−1
X 0 Z Z 0 Z −1 Z 0 X
X 0 Z Z 0 Z −1 Z 0 u
= β0 +
n
n
n
n
n
n
•
A partir de aqui los dejo solos. Usar los supuestos y la estrategia que usamos en la clase anterior.
Walter Sosa-Escudero
Instrumental Variables
Endogeneities
IV under Exact Identification
IV: the Overidentified Case
Normalidad Asintotica
Del resultado anterior:
√ n β̂V I −β0 =
X 0Z
n
Z 0Z
n
−1
Z 0X
n
!−1
X 0Z
n
Z 0Z
n
• A partir de aqui, tal como lo hicimos para MCO, hay que establecer normalidad asintotica de
el teorema de Slutzky, se los dejo como ejercicio.
El resultado es:
√ d
n β̂V I − β0 → N (0, V )
con
−1
V = σ 2 E(xz 0 )E(zz 0 )−1 E(zx0 )
Walter Sosa-Escudero
Instrumental Variables
−1 √ Z 0u
n
n
√ Z0 u
n n y utilizar

Documentos relacionados