Decomposition of gender wage inequalities through calibration: Application to the Swiss structure of earnings survey
Section 1. Introduction
Wage discrimination can be based on different criteria, such as gender, race or religion. Gender wage discrimination occurs when a man and a woman receive different remuneration for a job that requires the same qualifications or that implies identical productivity (see, for instance, Neumark, 1988; Gardeazabal and Ugidos, 2005). Since discrimination must be quantified in order to assess its magnitude, the topic has attracted the interest of statisticians. The original technique proposed by Blinder (1973) and Oaxaca (1973) estimates how much of the difference between the average wages of men and the average wages of women is due to discrimination. However, in general, women and men are unevenly allocated among jobs (see, for instance, Bielby and Baron, 1986). If the members of one of these two groups, usually women, are concentrated in lower-paying jobs, the difference in average wages alone might not be very informative. So instead of analysing the discrimination level in average wages only, it might be interesting to see whether discrimination occurs uniformly across all types of jobs. A detailed review of the statistical literature devoted to the estimation of discrimination can be found in Fortin, Lemieux, and Firpo (2011).
While there are many decomposition methods available in the literature, only two of them are discussed in this paper: the Blinder-Oaxaca method (hereafter, BO) and the semi-parametric method developed by DiNardo, Fortin, and Lemieux (1996) (hereafter, DFL). Neither is presented in its original form; both are adapted to take survey weights into account. The original BO method analyses the difference between the average wages of men and the average wages of women, but it does not allow wage differences to be analysed for other parameters, such as quantiles. The original DFL method addresses this issue. Its starting point is a logistic model in which, for each observation, the probability of being a man or a woman is modelled as a function of the observed characteristics. The ratio of these probabilities is used to construct a reweighting factor whose aim is to bring the distribution of the characteristics of women closer to that of men. Once the two distributions of characteristics are similar, the discrimination level can be estimated for parameters other than the mean. However, the reweighting factor may have a large variance when one or more characteristics are good predictors of gender. Moreover, the reweighted distribution of the characteristics of women may not match the distribution of the characteristics of men. We address the problems of both methods through a calibration approach. The idea behind calibration is the same as that of the DFL method: it consists of bringing the distribution of the characteristics of women closer to that of men, in order to estimate the discrimination level along the entire wage distribution.
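The reweighting idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`fit_weighted_logit`, `dfl_factor`, `weighted_quantile`) and the toy data are hypothetical. The sketch also assumes the usual form of the DFL factor, in which the ratio of conditional probabilities is multiplied by the inverse ratio of the marginal gender shares. The logistic model is fitted with the survey weights, and the factor is then applied to the weights of women to obtain a counterfactual median wage.

```python
import numpy as np

def fit_weighted_logit(X, y, w, n_iter=50, tol=1e-10):
    """Survey-weighted logistic regression fitted by Newton-Raphson.
    X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))                        # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X   # weighted information
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def dfl_factor(X, is_man, w):
    """Assumed DFL form: [P(man|x) / P(woman|x)] * [P(woman) / P(man)]."""
    beta = fit_weighted_logit(X, is_man.astype(float), w)
    p_man = 1.0 / (1.0 + np.exp(-(X @ beta)))
    share_man = np.sum(w * is_man) / np.sum(w)
    return (p_man / (1.0 - p_man)) * ((1.0 - share_man) / share_man)

def weighted_quantile(y, w, q):
    """Smallest y whose cumulative weight reaches a share q of the total."""
    order = np.argsort(y)
    cum = np.cumsum(w[order])
    return y[order][np.searchsorted(cum, q * cum[-1])]

# Toy data (invented for illustration): one binary characteristic x.
x = np.array([1] * 6 + [0] * 4 + [1] * 3 + [0] * 7, dtype=float)
is_man = np.array([1] * 10 + [0] * 10)
w = np.tile([1.0, 2.0], 10)                      # survey weights
wage = 20.0 + 10.0 * x + np.arange(20) % 5       # hypothetical hourly wages
X = np.column_stack([np.ones_like(x), x])

psi = dfl_factor(X, is_man, w)
women = is_man == 0
w_cf = w[women] * psi[women]                     # reweighted weights of women
cf_median = weighted_quantile(wage[women], w_cf, 0.5)
print("counterfactual median wage of women:", cf_median)
```

In this toy example the logistic model is saturated (an intercept and a single dummy), so the reweighted distribution of x among women reproduces that of men exactly. With continuous or numerous characteristics this match is generally only approximate, which is the limitation mentioned above.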
The paper is structured as follows. After the notation is defined in Section 2, the BO decomposition is re-expressed for survey data in Section 3. Sampling weights are taken into account in order to correct for the difference between the sample and the population of interest; the decomposition is therefore termed “weighted BO”. The key concept of women’s counterfactual wage distribution is also presented there: it is defined as the wage distribution women would have if they had the same characteristics as men. We then discuss the use of the counterfactual wage distribution in the wage difference decomposition. In Section 4, the DFL method is developed, again using survey weights. Since the original method does not include survey weights, this version is termed “weighted DFL”. In Section 5, a new approach to computing the counterfactual wage distribution is proposed, based on the calibration method (Deville and Särndal, 1992). Two particular cases of calibration are discussed: linear calibration and raking-ratio calibration. The first yields the same result as the weighted BO method for average wages. The second is similar in approach to the weighted DFL method, but does not assume a logistic model. In other words, the proposed technique can be regarded as a generalization of the two methods discussed above. Section 6 gives an overview of the dataset and descriptive statistics on the observed wages, followed by a brief description of the model used and the results obtained with the methods discussed. Finally, Section 7 summarizes the conclusions, and Appendix B shows the computation of the variance of the counterfactual wage.
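The stated equivalence between linear calibration and the weighted BO method for average wages can be checked numerically. The following Python sketch is only an illustration under stated assumptions, not the procedure of Section 5: the helper names (`linear_calibration_weights`, `wls`) and the simulated data are hypothetical, the calibration targets are taken to be the weighted means of the characteristics of men scaled to the total weight of women, and the characteristics include an intercept.

```python
import numpy as np

def linear_calibration_weights(X, w, target_totals):
    """Linear calibration: g_k = w_k * (1 + x_k' lam), with lam chosen so that
    the calibrated totals of X equal target_totals."""
    T = (X * w[:, None]).T @ X
    lam = np.linalg.solve(T, target_totals - X.T @ w)
    return w * (1.0 + X @ lam)

def wls(X, y, w):
    """Survey-weighted least squares coefficients."""
    return np.linalg.solve((X * w[:, None]).T @ X, X.T @ (w * y))

# Simulated data (invented): intercept, years of experience, a dummy.
rng = np.random.default_rng(0)
n_m, n_f = 200, 180
X_m = np.column_stack([np.ones(n_m), rng.normal(12, 4, n_m), rng.integers(0, 2, n_m)])
X_f = np.column_stack([np.ones(n_f), rng.normal(10, 4, n_f), rng.integers(0, 2, n_f)])
w_m = rng.uniform(1, 3, n_m)                      # survey weights of men
w_f = rng.uniform(1, 3, n_f)                      # survey weights of women
y_f = X_f @ np.array([2.5, 0.03, 0.2]) + rng.normal(0, 0.1, n_f)  # log wages of women

# Weighted mean characteristics of men, used as calibration targets for women.
xbar_m = X_m.T @ w_m / w_m.sum()
g_f = linear_calibration_weights(X_f, w_f, w_f.sum() * xbar_m)

cf_mean_calib = g_f @ y_f / g_f.sum()             # counterfactual mean via calibration
cf_mean_bo = xbar_m @ wls(X_f, y_f, w_f)          # counterfactual mean via weighted BO
print(cf_mean_calib, cf_mean_bo)
```

The two counterfactual means agree up to rounding error because the weighted least squares residuals of women sum to zero when the model contains an intercept. Note that linear calibration can produce negative weights, which is one practical reason to consider raking-ratio calibration as well.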