2 Data
2.1 Longitudinal Worker File
The dataset used in this study is the Longitudinal Worker File (LWF). This is a 10% random sample of all Canadian workers, constructed by integrating data from four sources: the Record of Employment (ROE) files of Human Resources and Social Development Canada; the T1 and the T4 files of Canada Customs and Revenue Agency; and the Longitudinal Employment Analysis Program of Statistics Canada's Business and Labour Market Analysis Division.
The ROE indicates the reason for a job interruption; among those reasons is maternity leave. Maternity leave is protected by employment standards legislation in all Canadian jurisdictions. It is designed to allow mothers to withdraw temporarily from the labour force because of pregnancy and to give them some time to recuperate after childbirth. We identify the motherhood status of an employed woman through her maternity leave.
An immediate question is how well this maternity-leave based concept of childbirth captures actual births. To address this, we use data from Statistics Canada's Survey of Labour and Income Dynamics, from 1993 to 2004, to calculate the number of births to employed women aged 18 to 49. These estimates are contained in Column 2 of Appendix Table A.1. They can be compared with Column 1, the estimated number of mothers from the LWF for the same period using the maternity-leave based concept of childbirth. Notice that Column 1 contains the estimated number of women who took maternity leave (and hence the number of women who gave birth), while Column 2 represents the estimated number of children born to employed women; therefore, some variation between Columns 1 and 2 is to be expected because of sampling errors, as well as non-sampling differences such as multiple births. Nevertheless, the two columns are fairly close to each other, and thus we are reasonably confident that our maternity-leave based concept of childbirth captures actual births very well.
The LWF offers two main advantages over other data sources. First, the large sample size allows us to establish a group of women (the control group, in the language of program evaluation) who did not give birth during a certain period of time. This control group helps us to estimate the counterfactual earnings profile of the mothers. Second, since the data span more than 20 years, they allow us to study how the effects of motherhood on employment and earnings have changed over time.
The LWF also provides more reliable and accurate information than other data sources. For instance, since employers must register with Canada Customs and Revenue Agency and issue each employee a T4 slip that summarizes earnings received during a year, the earnings data should be free from recall errors by individual workers. In terms of information on job separation, the Employment Insurance Act requires an employer to issue a ROE when an earnings interruption occurs for an employee who works in an insurable job. Employers who fail to issue a ROE, or who enter a false or misleading reason for a job separation, may face a penalty or prosecution under the Act. In addition, the LWF contains employer identifiers. This allows us to distinguish mothers who returned to the same employer after taking maternity leave from those who did not.
Like other administrative datasets, the LWF contains little information on workers' individual and family characteristics such as education, work experience, union status, occupation, family income, marital status and so on.6 But the fact that the LWF tracks a nationally representative sample of workers over a very long period of time makes it a unique dataset to examine both the long- and the short-term employment and earnings effects of motherhood.
2.2 Samples of mothers and non-mothers
The lack of certain variables in the data forces us to work with a rather restrictive sample of mothers in order to estimate the motherhood-earnings penalties. To examine the effect of the sampling restrictions, we start with a fairly broad sample of mothers. It consists of those who satisfied the following three conditions: (1) they were aged 20 to 39 in the year they gave birth (year t); (2) in year t, they were employed before taking maternity leave and experienced no other job separation (other than that due to maternity leave); and, (3) in year t-1, they had worked and had experienced no childbirth. The comparison group of women satisfied all of the above conditions, except that they did not give birth in year t.
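The three conditions above can be expressed as a simple filter. The following Python sketch is illustrative only and is not the procedure used to construct the LWF samples: the `YearRecord` layout, its field names and the function `classify_broad` are hypothetical, and annual flags stand in for the timing of employment relative to maternity leave within year t.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class YearRecord:
    """Hypothetical one-year summary for a woman (not the actual LWF layout)."""
    employed: bool           # had paid employment in the year
    maternity_leave: bool    # an ROE citing maternity leave was issued
    other_separation: bool   # an ROE citing any other reason was issued


def classify_broad(birth_year: int,
                   history: Dict[int, YearRecord],
                   t: int) -> Optional[str]:
    """Return 'mother', 'comparison', or None for cohort year t."""
    cur = history.get(t)
    prev = history.get(t - 1)
    if cur is None or prev is None:
        return None
    # Condition (1): aged 20 to 39 in year t.
    if not 20 <= t - birth_year <= 39:
        return None
    # Condition (2): employed in year t, no separation other than maternity leave.
    if not cur.employed or cur.other_separation:
        return None
    # Condition (3): worked in year t-1 and had no childbirth in that year.
    if not prev.employed or prev.maternity_leave:
        return None
    # Childbirth is identified through maternity leave in year t.
    return "mother" if cur.maternity_leave else "comparison"
```

A woman born in 1960 who worked in 1983 and took maternity leave in 1984 would be assigned to the 1984 cohort of mothers; the same woman without a maternity leave in 1984 would fall into the 1984 comparison group.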
We impose the first restriction to avoid potential problems associated with teenage mothers, who may have different earnings profiles than other women on account of, for example, missed or delayed education and thus lower or delayed human capital investment. The second and the third restrictions enable us to focus on women who had held a paid job for a period of time before giving birth, such that they qualified for maternity benefits and were covered by mandatory job protection. These restrictions allow us to construct 20 cohorts of mothers (from the 1984, the 1985, up to the 2003 cohort). They represent about 86% of all employed women who became mothers in the 1984-to-2003 period.
We also assign each woman in the comparison group to a cohort of non-mothers. For example, the 1984 cohort of non-mothers includes women who were aged 20 to 39 in 1984 (year t), who were employed in both 1983 (year t-1) and 1984, and who did not experience any job separation in these two years. We use the broad sample primarily to produce some time series measures of post-childbirth employment, job mobility and earnings for Canadian mothers and the corresponding measures for the comparison groups. Table 1 provides some simple statistics on three representative cohorts (1984, 1994 and 1999). Additional characteristics for these mothers and other women are contained in Appendix Table A.3.
Table 1
The two samples of mothers and other women, mean age and number of observations
Our second (narrow) sample of mothers consists of those who had one or two childbirths in the following five-year periods: 1991 to 1995, 1992 to 1996, 1993 to 1997, 1994 to 1998, 1995 to 1999 and 1996 to 2000.7 They must not have given birth to a child in years outside the specified period. For example, the 1991 cohort of mothers consists of those who had only one birth in 1991 and those who had one birth in 1991 and another in the following four-year period (1992 to 1995). A mother from this cohort must not have given birth outside the 1991-to-1995 period; that is, she experienced no childbirth from 1983 to 1990 or from 1996 to 2004.
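The cohort rule just described can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the actual sample-construction code: the function name `narrow_cohort` and its input (the calendar years, within 1983 to 2004, in which a woman took maternity leave) are hypothetical, and each year is counted as at most one childbirth.

```python
from typing import Iterable, Optional

COHORT_STARTS = range(1991, 1997)  # first-birth years 1991 through 1996
WINDOW_YEARS = 5                   # each cohort spans a five-year period


def narrow_cohort(childbirth_years: Iterable[int]) -> Optional[int]:
    """Return the cohort's starting year, or None if the woman is excluded.

    childbirth_years: every year from 1983 to 2004 in which a childbirth
    (identified through maternity leave) was observed for the woman.
    """
    years = sorted(set(childbirth_years))
    # Only women with one or two childbirths are kept.
    if not 1 <= len(years) <= 2:
        return None
    start = years[0]
    # The first birth defines the cohort and must fall in 1991 to 1996.
    if start not in COHORT_STARTS:
        return None
    # Any second birth must fall within the cohort's five-year period.
    if years[-1] > start + WINDOW_YEARS - 1:
        return None
    return start
```

For instance, births in 1991 and 1994 place a woman in the 1991 cohort, whereas births in 1991 and 1996 exclude her because the second birth falls outside the 1991-to-1995 period.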
We further restrict the above cohorts of mothers to those born between 1954 and 1968, so that it was possible for them to give birth in the first year that they were observed (1983), and so that their earnings were unlikely to have been affected by retirement considerations during the last year of observation (2004). We also require that they were in the labour market, had positive earnings in each year and did not experience any permanent layoff from 1983 to 2004.8
We could have included more cohorts of mothers in our sample, but we have chosen to focus only on the 1991-to-1995, 1992-to-1996 up to the 1996-to-2000 cohorts. While this reduces our sample of mothers, it enables us to estimate the effect of childbirth on the earnings of mothers for up to nine post-childbirth years under our empirical framework. The choice also allows us to examine the effect of childbirth on the pre-childbirth earnings for up to three years. On the other hand, if we were to have included those who had given birth earlier (for example, those who gave birth between 1988 and 1992), we would only have been able to assess the post-childbirth earnings losses for up to five years. Also, if we were to have included a later cohort (for example, those who gave birth between 1997 and 2001), we would only have been able to assess pre-childbirth earnings for up to two years.
A potential problem associated with the birth-year restriction is that some of the mothers would have been relatively 'old' when they were first observed (in 1983); for example, those who were born in 1954 would have been 29 years old. Some of these older women might have already given birth one or more times before they were observed. The births we capture could therefore be a woman's second or third birth, and her earnings could have been affected by those possible, yet unobserved, earlier births. But if we were to select only those who were unlikely to have given birth before they were first observed, say those who were 15 to 19 years old in the first year of observation, we would have obtained a small and non-representative sample of Canadian mothers. The sample would become non-representative once the other restriction (they must have been in the labour market in all observable years) is imposed, because those who had started working at ages 15 to 19 and continued in the labour market every year for the subsequent 20 years were likely to be workers with low levels of education. Hence, we have chosen to include in our sample women as old as 29 when they were first observed. We will check how this may affect the robustness of our results later.
There are several reasons why we require the sampled women to be in the labour market every year. The main reason is that by imposing such a restriction, we can overcome some disadvantages of the data. It is well known that in any earnings study, education and work experience are key explanatory variables. Yet we do not have any information on these variables in the LWF. If we require the sampled women to be in the labour market each year, then their age can be used as a proxy for work experience. As well, if a woman worked continuously, her level of schooling would largely remain constant, and the effect of education on earnings can be accounted for with a fixed-effects model.
Another reason is that we would like to capture all childbirths by employed women. If we allow some women to have been out of the labour market for a few years, then it is possible that they gave birth once or more during these years. With the LWF, we would not be able to capture these births, and yet our estimates are likely to be affected by them if the effect of childbirth on earnings lasts for a few years. In addition, we also want to purge the effects of permanent layoff on the earnings of potential mothers. Previous research shows that a permanent layoff due to plant closure or mass layoff reduces a worker's (including a mother's) annual earnings in the years before, during and after the displacement.9 Hence, by imposing the 'no permanent layoff' restriction, we will end up with an earnings profile that is free from the effect of job displacement. Finally, by requiring mothers to be in the labour market every year, we may avoid the potential biases caused by sample attrition or missing data.
When the above restrictions are imposed, we obtain our narrow sample of mothers. The six cohorts of mothers sum to 7,086 women, who had a total of 9,440 births in the 1991-to-1996 period; and among the 9,440 births, 3,714 were from mothers who had one birth only (see Table 1). In terms of the number of mothers captured, the 7,086 mothers represented about 13% of all employed women (of the same age) who gave birth in the 1991-to-1996 period.10 Table 1 also shows some basic information for the corresponding comparison group (other women). It consists of women who (1) were born in the 1954-to-1968 period, and (2) had positive earnings, did not experience any permanent layoff and did not give birth from 1983 to 2004.11
6 The Survey of Labour and Income Dynamics contains this information, but the six-year panels are not long enough to allow us to examine the earnings changes several years prior to and several years after childbirth.
7 The number of women who gave birth three or more times in a five-year period was very small. We exclude them in order to facilitate the analysis.
8 'Permanent layoff' is defined as a layoff in which the worker did not return to the previous employer in the year of layoff and the year thereafter.
9 The case for Canada is investigated in Morissette, Zhang and Frenette (2007).
10 See Columns 3 and 4 in Appendix Table A.1 and Appendix Table A.4.
11 The age distributions of the mothers and the comparison group are slightly different in the narrow sample. We attempted to establish a different comparison group by restricting their year of birth to the 1957-to-1968 period, which produced mean and median ages similar to those of the treated group. With this alternative comparison group, the regression results changed little compared with those presented in the paper, where the years of birth for the control group are restricted to be the same (from 1954 to 1968) as those of the treated group.