Name-based discrimination in U.S. labor markets

1) Return to the “lakisha” dataset. Replicate the three columns of Table 5 in the associated paper using OLS regression. The empirical specification is straightforward to detect from the table. Your results will be close to, but will not exactly match, the results presented in the paper; check that the direction and magnitude of your estimates agree.

Using the specification in column (1), now include an additional variable for the race of the applicant. Present the output table. In intuitive terms, what is the interpretation of the coefficient on the race variable in this regression (be sure to check how race is defined)?

2) Return to the Card dataset.

  1. Replicate the estimates reported in Column 2 of Table 2 in the working paper I provided previously. (This uses OLS regression. After rounding, your coefficients should match to the hundredths place. You have already done this in PS3.)
  2. Perform a Breusch-Pagan test to check for heteroskedasticity in the main (OLS) equation. What do you find? Is heteroskedasticity a problem? If it is, apply the “fix” we discussed in class to the regression in part 1 of this problem and comment on any changes in the coefficients and standard errors.
  3. Perform IV estimation where you use an individual’s location relative to 2-year and 4-year colleges as instruments for education. [STATA syntax: ivregress 2sls Y X1 X2 X3…Xk-1 (Xk = Z1 Z2), first] where X1…Xk-1 are the exogenous regressors, Xk is the endogenous regressor, and Z1, Z2 are the exogenous instruments.
  4. Present the regression output tables (first and second stages) and discuss how the IV results compare with the OLS results above, which do not account for endogeneity. Discuss any changes in the magnitude and significance level of the coefficient of interest (education) between the two cases.
  5. Perform an F-test of joint significance for the two instruments in the first stage regression. You will need to run the first stage OLS regression manually (instead of having STATA do it for you as part of the ivregress command). Report the test statistic and whether or not you reject the null hypothesis.
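
The sequence of commands in parts 2 through 5 can be sketched on simulated data, as a rough guide to the mechanics only. Everything in this do-file is an assumption made for illustration: the variable names (y, x1, xk, z1, z2, u) are hypothetical stand-ins rather than variables in the Card dataset, the data-generating process is invented, and heteroskedasticity-robust standard errors are assumed to be the “fix” meant in part 2, which may differ from what was covered in class.

```stata
* Illustrative do-file on simulated data. All variable names are
* hypothetical; none of them come from the Card dataset.
clear
set seed 12345
set obs 2000

generate z1 = rnormal()        // first instrument
generate z2 = rnormal()        // second instrument
generate x1 = rnormal()        // exogenous control
generate u  = rnormal()        // unobserved factor that makes xk endogenous

* Endogenous regressor: depends on the instruments, the control, and u
generate xk = 0.5*z1 + 0.3*z2 + 0.4*x1 + u + rnormal()

* Outcome: u enters directly, and the error variance depends on x1
generate y = 1 + 0.1*xk + 0.2*x1 + u + abs(x1)*rnormal()

* OLS followed by the Breusch-Pagan test (H0: constant error variance)
regress y x1 xk
estat hettest

* Same OLS specification with heteroskedasticity-robust standard errors
* (assumed here to be the "fix"); the coefficients are unchanged by this
regress y x1 xk, vce(robust)

* 2SLS; the "first" option also prints the first-stage regression
ivregress 2sls y x1 (xk = z1 z2), first

* First stage run manually, then a joint F-test of the two instruments
regress xk x1 z1 z2
test z1 z2
```

For the assignment itself, the same commands would be run on the Card data, with the dependent variable, controls, education variable, and college-proximity instruments substituted for the placeholders. The manual first stage must include every exogenous regressor from the main equation, not just the instruments.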