2 Data and analytical approach

Our data come from two sources: the master files of the Canadian Census for 1981, 1991 and 2001, a 20% sample of the Canadian population, and the U.S. 5% Public-Use Microdata Sample files for 1980, 1990 and 2000. The very large samples available from these sources overcome the limitations inherent in alternative data sources for studying small populations, e.g., single mothers, such as the U.S. Current Population Survey or the Canadian Survey of Consumer Finances and its successor, the Survey of Labour and Income Dynamics. Since data on income, earnings and labour force attachment are for the previous calendar year, we identify our results for Canada with the years 1980, 1990 and 2000 and for the United States with the years 1979, 1989 and 1999. Since the observation years roughly approximate business cycle peaks, we can be reasonably confident that our conclusions are not confounded with business cycle effects.4 We restrict our sample to single mothers under age 65, with one or more children aged 18 or younger.

Our leading question is this: to what extent can the rise in the employment rate and earnings of single mothers be explained by changes in their demographic composition? To establish the contribution of compositional changes, we consider three outcomes. First, we examine rates of labour force participation (indicated by the presence of positive earnings during the previous year) and then (log) annual earnings among those with positive earnings. Earnings are expressed in constant 1999 and 2000 dollars.5

Since trends in annual earnings reflect changes in both wage rates and labour supply (hours and weeks worked), we also include estimates of changes in (log) weekly earnings. We take advantage of the fact that the change in the mean of log annual earnings is simply the sum of the change in the mean of log weekly earnings and the change in the mean of log weeks worked. Comparing the two sets of results allows us to determine the extent to which changes in annual earnings, and in the components affecting those changes, resulted from changes in labour supply (weeks worked) or from changes in earnings per week. Ideally, our estimates for average weekly earnings would also control for hours worked per week. Unfortunately, the Canadian census data do not allow for accurate estimates of hours worked. Instead, we include a control for whether the respondent usually worked on a full-time or part-time basis.
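The additive identity used above can be checked numerically. The following is a minimal sketch with made-up earnings and weeks figures (not census data); the function name `log_means` and all values are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def mean(values):
    return sum(values) / len(values)


def log_means(annual_earnings, weeks_worked):
    """Return the means of log annual earnings, log weekly earnings and log weeks."""
    log_annual = [math.log(e) for e in annual_earnings]
    log_weekly = [math.log(e / w) for e, w in zip(annual_earnings, weeks_worked)]
    log_weeks = [math.log(w) for w in weeks_worked]
    return mean(log_annual), mean(log_weekly), mean(log_weeks)


# Hypothetical earners in two periods: annual earnings and weeks worked.
period1 = log_means([12000.0, 20000.0, 8000.0], [40, 50, 26])
period2 = log_means([15000.0, 26000.0, 9000.0], [44, 52, 30])

change_annual = period2[0] - period1[0]
change_weekly = period2[1] - period1[1]
change_weeks = period2[2] - period1[2]

# The change in mean log annual earnings splits exactly into the two components.
print(change_annual, change_weekly + change_weeks)
```

Because log(annual) = log(weekly) + log(weeks) for every earner, the identity holds for the means and therefore for changes in the means, whatever the data.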

The independent variables include sets of dummies for age (divided into five-year cohorts) and education (less than high school, high school completion, any postsecondary, university degree); the number of children aged 18 and under; a dummy indicating the presence of a child younger than 6; immigrant status; and marital status (never married, divorced or separated, widowed).6 Historically, non-whites have made up a substantial share of the U.S. population and a small but rising share of the Canadian population. We include standard census indicators for 'race' (Whites, Blacks, Asians, Hispanics and Other) for the United States and for 'visible minority' status (Whites, Blacks, Asians and Other) for Canada.
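As an illustration of how categorical regressors of this kind are typically coded, the sketch below builds five-year age groups and 0/1 dummies with an omitted reference category. The helper names, the lowest age bound of 15 and the choice of reference category are assumptions made for the example; the paper does not state them.

```python
# Education categories as listed in the text.
EDUCATION = [
    "less than high school",
    "high school completion",
    "any postsecondary",
    "university degree",
]


def age_group(age, lowest=15, width=5):
    """Return the label of the five-year group containing `age` (27 -> '25-29').

    The lower bound of 15 is an assumption for this sketch.
    """
    if age < lowest:
        raise ValueError("age below the lowest group")
    start = lowest + ((age - lowest) // width) * width
    return f"{start}-{start + width - 1}"


def dummies(value, categories, reference):
    """Return 0/1 indicators for every category except the omitted reference."""
    if value not in categories:
        raise ValueError(f"unknown category: {value}")
    return {c: int(c == value) for c in categories if c != reference}


# Hypothetical respondent: aged 27 with some postsecondary education.
row = {"age_group": age_group(27)}
row.update(dummies("any postsecondary", EDUCATION, reference="less than high school"))
print(row)
```

Omitting one category per set avoids perfect collinearity with the intercept; each coefficient is then read relative to the reference category.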

To determine the extent to which changes in the employment and earnings of single mothers can be explained by changes in their demographic composition, we employ a standard Oaxaca-Blinder (Oaxaca 1973; Blinder 1973) decomposition. The objective is to 'decompose' the change in our dependent variables (employment and earnings) into three portions: the portion that can be 'explained' by changes in demographic composition (as indicated by changes in the means of the explanatory variables in our model), the portion that is 'unexplained' (as indicated by changes in the associated coefficient estimates) and their 'joint effect' or interaction.

For each of our dependent variables, we begin by estimating separate regression models for each time period. For example, we have regression models Y1 = a + B1iX1i + e1 for earnings at time 1 and Y2 = a + B2iX2i + e2 for earnings at time 2. The portion of the difference between the means Ȳ1 and Ȳ2 (ΔȲ) that can be attributed to the differences in the means of X1i and X2i is called the 'explained' component in means-coefficients analysis. The explained portion identifies the contribution of changes in measured characteristics such as education, age and number of children. The remaining portion of ΔȲ represents changes that are 'unexplained' by changes in the values of the independent variables, that is, changes attributable to changes in the coefficients. The unexplained portion contains the effects of all unmeasured variables that are not part of the model including, but not limited to, behavioural changes due to social policy reforms. The unexplained portion associated with changes in the coefficients identifies the share of change that could potentially be accounted for by policy reforms, but our method does not allow us to isolate the magnitude of the policy impact relative to other omitted variables or to real changes in returns to education and other characteristics included in the models.

The size of the explained component may vary greatly, depending on whether B1i or B2i is used as the set of weights (Blau and Graham 1990). The difference between the explained components derived from B1i and from B2i equals the joint effect of means and coefficients captured by the interaction term. A very large interaction term implies that the results are conditional on the choice of weights and that no unique interpretation of the shares allocated to the explained and unexplained components is possible. As our results show, our findings are largely unaffected by this problem.
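The three-way split described above can be sketched in a few lines. This is a minimal illustration, assuming the period-specific coefficients and regressor means are already in hand; all numbers and the function name `threefold` are hypothetical, and the first element of each vector stands for the intercept (a regressor fixed at 1). It relies on the fact that an OLS fit with an intercept reproduces the sample mean of the dependent variable at the means of the regressors.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def threefold(xbar1, b1, xbar2, b2):
    """Split the change in the mean outcome using period-1 coefficients as weights."""
    dx = [x2 - x1 for x1, x2 in zip(xbar1, xbar2)]
    db = [c2 - c1 for c1, c2 in zip(b1, b2)]
    return {
        "total": dot(xbar2, b2) - dot(xbar1, b1),
        "explained": dot(dx, b1),      # change in means, period-1 coefficients
        "unexplained": dot(xbar1, db),  # change in coefficients, period-1 means
        "interaction": dot(dx, db),    # joint effect of the two changes
    }


# Hypothetical values: [intercept, share with postsecondary, share with child under 6]
xbar1, b1 = [1.0, 0.30, 0.45], [9.00, 0.40, -0.25]
xbar2, b2 = [1.0, 0.50, 0.35], [9.10, 0.45, -0.20]

parts = threefold(xbar1, b1, xbar2, b2)

# Explained component when period-2 coefficients are used as weights instead.
explained_b2 = dot([x2 - x1 for x1, x2 in zip(xbar1, xbar2)], b2)

print(parts)
print(explained_b2 - parts["explained"])  # equals the interaction term
```

The last line shows why a small interaction term matters: it is exactly the amount by which the explained component shifts when the weights are switched from B1i to B2i.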

Because our employment variable (share with positive earnings) is dichotomous, we made separate estimates with both the Oaxaca-Blinder method and the Even-MacPherson (1994) approach. The former uses an ordinary least-squares (OLS) regression model to estimate the probability of being employed; the latter uses a logit model. The advantage of the Oaxaca-Blinder decomposition is that it can decompose the overall change into three components: (a) the share due to changes in composition (the Xs); (b) the share due to changes in the coefficients (the effect of Xs); and (c) the joint effect (or interaction) of changes in composition and coefficients. Its limitation, however, lies in the well-known problem of fitting OLS models for a dichotomous dependent variable. When the outcome is highly skewed (e.g., less than 20% in a category), the results are subject to 'floor' or 'ceiling' effects and can generate predicted probabilities outside the 0–1 range. The Even-MacPherson approach is statistically more appropriate for dichotomous outcomes; however, unlike the Oaxaca-Blinder method, it is unable to identify the contribution due to the means-coefficient interaction. Moreover, the Even-MacPherson approach lacks the ready interpretation of the linear probability (OLS) approach. Since the distribution of our dichotomous outcome is well within the acceptable range for the OLS approach (Moffit 1999), and since both techniques yield substantively identical results, we present the OLS results for ease of interpretation.
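The contrast between the two estimators can be illustrated with a toy example. The sketch below fits a linear probability model and a logit model to the same hypothetical data (one regressor, an outcome with relatively few ones) and compares the fitted probabilities. It illustrates only the range problem of OLS with a dichotomous outcome, not the Even-MacPherson decomposition itself; the data and variable names are invented for the example.

```python
import math

# Hypothetical data: one regressor x and a 0/1 outcome with 30% ones.
x = [float(v) for v in range(10)]
y = [0, 0, 0, 0, 0, 1, 0, 0, 1, 1]
n = len(x)

# Linear probability model: OLS of y on x (closed form for a single regressor).
x_mean = sum(x) / n
y_mean = sum(y) / n
slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
    (xi - x_mean) ** 2 for xi in x
)
intercept = y_mean - slope * x_mean
lpm_pred = [intercept + slope * xi for xi in x]


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


# Logit model fitted by Newton-Raphson on the log-likelihood.
a, b = 0.0, 0.0
for _ in range(25):
    p = [logistic(a + b * xi) for xi in x]
    w = [pi * (1.0 - pi) for pi in p]
    # Score (gradient) of the log-likelihood.
    g_a = sum(yi - pi for yi, pi in zip(y, p))
    g_b = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
    # Information matrix (negative Hessian), inverted by hand for the 2x2 case.
    h_aa = sum(w)
    h_ab = sum(wi * xi for wi, xi in zip(w, x))
    h_bb = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = h_aa * h_bb - h_ab * h_ab
    a += (h_bb * g_a - h_ab * g_b) / det
    b += (h_aa * g_b - h_ab * g_a) / det
logit_pred = [logistic(a + b * xi) for xi in x]

# OLS fitted values can fall below 0; logit fitted values stay inside (0, 1).
print(min(lpm_pred), min(logit_pred))
```

In this toy data the OLS fitted value at the lowest x is negative, while every logit fitted value lies strictly between 0 and 1. When the outcome is less skewed, as the text argues is the case here, the two sets of predictions tend to be much closer.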


4 The U.S. observation years are somewhat superior in this respect. The observation year for the 1990 Census is 1989, before the onset of the recession that began in the middle of 1990.

5 Since our focus is on change in log earnings (which approximates a percentage change) rather than the absolute change, earnings are expressed in national currencies without adjustment for differences in purchasing power.

6 Our marital status indicator is less than ideal since, among the never married, we cannot separate the previously single from those previously in common-law unions.