A framework for discussing earnings generation

This section sets out a simple framework for considering earnings generation and its relationship to cognitive skills. This will prove useful in understanding the role of these skills in immigrant and Canadian-born earnings. The framework is based on the one used by Green and Riddell (2003) in a discussion of literacy and earnings among non-immigrants. They distinguish among attributes (personal characteristics that can be acquired by the worker and enhance individual earnings), skills (personal characteristics that aid in productivity in specific tasks and which can be acquired by the worker) and abilities (innate, productive characteristics). In this taxonomy, skills are a subset of attributes: skills concern facility with specific tasks, while attributes also include characteristics such as persistence and willingness to follow orders. Abilities resemble attributes but are innate, whereas attributes can be acquired. In this paper, we group together attributes and skills and refer to them simply as skills. Thus, the key distinction is that between skills and abilities.

Assume, for the moment, there are three skills a worker can possess, and workers can possess them in varying amounts. We begin with three skills because it allows us to emphasize key points. The framework can easily be extended to address the more likely scenario that there are more than three. Individual earnings are determined according to some function of the skills an individual possesses and puts into use, as follows:

Formula 1

where Ei are earnings for individual i, Gki is the amount of skill k that person i sells in the market, and ei is a disturbance term that is independent of the skills. The disturbance term captures either measurement error in earnings or individual idiosyncratic events that are independent of the skill levels. The earnings generation function f(.) can be viewed as derived ultimately from marginal product conditions related to an overall production function that is separable in other (non-skill) inputs. Alternatively, it can be seen as representing worker capacities to capture rents from firms (e.g., Bowles, Gintis and Osborne, 2001). We remain agnostic on which interpretation is correct. In either case, by characterizing the f(.) function, we can learn about the importance of the various skills and how they interact in earnings generation. To help focus ideas, we will think of G1 as cognitive skills of the type measured in literacy, numeracy and problem-solving tests, G2 as other (perhaps manual) skills that are not captured in such tests and that might be acquired through work experience, and G3 as non-cognitive characteristics such as persistence.
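The formula itself is not reproduced in this version of the text. Read off the definitions given here, and assuming the disturbance enters additively, equation (1) would take roughly the following form. This is a reconstruction from the surrounding prose, not the authors' exact typesetting:

```latex
% Assumed general form of equation (1): three skills, additive disturbance
E_i = f(G_{1i}, G_{2i}, G_{3i}) + e_i \tag{1}
```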

The earnings function in equation (1) is quite general. However, it will prove easier to work with a more specific functional form. In our empirical investigations, we find that the data is well characterized by first or second order polynomials in observable variables. Thus, for empirical purposes we work with (see Note 1):

Formula 1'

We are interested in characterizing the f(.) function and obtaining estimates of the γ and δ parameters. Doing so will provide information about the relative importance of the various skills in earnings generation and whether the skills are complements or substitutes in generating earnings.
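As a sketch of what a second-order polynomial version of the earnings function could look like, suppose the first-order coefficients are labelled γ and the second-order coefficients δ. The indexing below is illustrative and may differ from the original Formula 1':

```latex
% Illustrative second-order form (assumed indexing, not the original Formula 1')
\begin{align*}
E_i &= \gamma_0 + \sum_{k=1}^{3} \gamma_k G_{ki}
       + \sum_{k=1}^{3} \sum_{l \ge k} \delta_{kl}\, G_{ki} G_{li} + e_i \\
% Cross-partial between two distinct skills k and l
\frac{\partial^2 E_i}{\partial G_{ki}\, \partial G_{li}} &= \delta_{kl}
  \qquad (k \neq l)
\end{align*}
```

Under this form, skills k and l are complements in earnings generation when the cross coefficient is positive and substitutes when it is negative, which is why estimates of the second-order parameters speak to the complement-or-substitute question.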

Characterizing the earnings function would be relatively straightforward if we observed the skills, Gki. Typically, of course, we do not observe them. What we do observe are some of the inputs used in generating the skills. To see how they enter our framework, consider a set of skill production functions:

Formula 2

where k indexes the skill type, edn corresponds to a set of dummy variables representing different levels of formal schooling, exp is years of work experience and θk is an ability specific to the production of the k-th skill. Of course, an h function could be constructed such that a skill corresponds one for one with an ability (e.g., persistence may be an innate characteristic rather than something that can be produced).
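The skill production formula is likewise missing from this version of the text. Based on the definitions given here, and assuming each argument carries an individual subscript i, equation (2) would read approximately as follows (a reconstruction, not the original typesetting):

```latex
% Assumed general form of equation (2): one production function per skill
G_{ki} = h_k(edn_i, exp_i, \theta_{ki}), \qquad k = 1, 2, 3 \tag{2}
```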

As with the f(.) function, our discussion of the features of the hk(.) functions is simplified by considering a quadratic version:

Formula 2'

where the e, s and θ subscripts on the α's correspond to experience, schooling and ability variables, respectively. Note that edn corresponds to a vector of education dummy variables and thus the α's correspond to either scalar parameters or vectors of parameters as appropriate.

If we do not observe the Gki's directly, we can obtain an estimating equation by substituting equation (2) into (1). This yields a reduced form specification for earnings as a function of schooling and experience. The ability variables are unobserved and thus end up in the error term. An inspection of equations (1') and (2') makes it clear that the coefficient on an observable variable such as educational attainment in the reduced form earnings equation will consist of a combination of the α, γ and δ parameters. More specifically, such a coefficient reflects the combination of how that covariate contributes to production of each of the skills and how those skills contribute to earnings generation.
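To see how the parameters combine, consider a deliberately simplified linear case. All second-order terms are dropped and edn is treated as a single scalar; both are simplifications made here for illustration and are not the specifications in equations (1') and (2'):

```latex
\begin{align*}
% Linear earnings function and linear skill production (illustrative)
E_i &= \gamma_0 + \sum_{k=1}^{3} \gamma_k G_{ki} + e_i, \\
G_{ki} &= \alpha_{k0} + \alpha_{ks}\, edn_i + \alpha_{ke}\, exp_i
          + \alpha_{k\theta}\, \theta_{ki} \\
% Substituting the second line into the first gives the reduced form
E_i &= \Big(\gamma_0 + \sum_k \gamma_k \alpha_{k0}\Big)
      + \Big(\sum_k \gamma_k \alpha_{ks}\Big) edn_i
      + \Big(\sum_k \gamma_k \alpha_{ke}\Big) exp_i
      + \Big(e_i + \sum_k \gamma_k \alpha_{k\theta}\, \theta_{ki}\Big)
\end{align*}
```

Even in this stripped-down case, the coefficient on schooling is a sum over skills of (return to skill k) times (productivity of schooling in creating skill k), and the unobserved abilities sit inside the composite error term.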

We are interested in how much we can learn about the structure of the functions in equations (1) and (2) when we observe some but not all of the skills. Labelling the set of observed skills G1, and using it to refer to a vector of cognitive skills, we obtain a quasi-reduced form earnings regression that includes G1 (the observed cognitive skills), experience and schooling variables. Thus, our general estimating regression is of the form:

Formula 3

where G1i corresponds to our measures of cognitive skills, edn is again a vector of education dummy variables, the β's are either scalars or vectors of parameters as appropriate and u is an error. Notice that the error term will include interactions of the ability variables and the observables. This means that some type of random coefficients estimator may be appropriate. As a first step, we will ignore this latter complication and present results based on mean regression (though we do correct the standard errors for general forms of heteroskedasticity). Given the model set out above, these estimates are not fully efficient and can provide only part of the story of how the various skills interact. Nonetheless, as we shall see, there is still a great deal we can learn from mean regressions, and they have the advantage of being easy to interpret and compare to the existing literature.
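The source does not say which heteroskedasticity correction is used. As a minimal sketch of one common choice, White's HC0 estimator, the following computes a simple regression slope and its robust standard error in plain Python. The function name and the data are hypothetical, and the one-regressor setup is far simpler than the regression described here:

```python
import math

def ols_slope_with_robust_se(x, y):
    """Simple regression of y on x with an intercept.

    Returns (intercept, slope, robust_se), where robust_se is the
    White (HC0) heteroskedasticity-robust standard error of the slope.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]            # demeaned regressor
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # HC0: each squared residual is weighted by its own squared
    # demeaned regressor, so no common error variance is assumed.
    robust_var = sum((d * u) ** 2 for d, u in zip(dx, resid)) / sxx ** 2
    return intercept, slope, math.sqrt(robust_var)

# Hypothetical data: a skill score and an earnings measure
skill = [1.0, 2.0, 3.0, 4.0, 5.0]
earnings = [2.1, 2.9, 4.2, 4.8, 6.0]
intercept, slope, robust_se = ols_slope_with_robust_se(skill, earnings)
print(intercept, slope, robust_se)
```

The point estimates are the ordinary mean-regression estimates; only the standard error changes, which matches the idea of keeping the mean regression and correcting inference.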

The framework set out to this point could be considered the relevant earnings generation model for a Canadian born individual. We assume that immigrants use the same sets of skills to generate earnings in the Canadian labour market. Immigrants could differ from the Canadian born in both of the main building blocks of the model: in the returns they obtain from a given set of skills (i.e., immigrants could have a different f(.) function); and in the production functions for creating individual skills (i.e., immigrants could have different h(.) functions).

Differences in the f(.) function between immigrants and the Canadian born correspond to discrimination in this model since they represent differences in earnings between immigrants and Canadian born workers who are in fact equally productive. Thus, if we could directly observe all relevant skills, we could determine whether shortfalls in earnings for immigrants relative to the Canadian born arise from discrimination. It is tempting to think that differences between immigrants and the Canadian born in the coefficients on the non-interacted G1i terms (i.e., β5 and β6) can provide direct evidence on whether discrimination exists (i.e., on whether immigrant and Canadian born workers with the same observed cognitive skills are paid differently). However, if interactions of G1i with the exp and edn variables are significant, then this interpretation need not hold. A non-zero interaction of, for example, exp and G1i would imply both that the f(.) function involves an interaction of G1i and some other skill (say, G2i) and that exp helps to produce G2i. In that case, the return to G1i is a complicated function that varies with different levels of exp, and β5 and β6 represent the effect of G1i on earnings at the base level for experience. Consequently, one could observe different coefficients related to G1i between immigrants and the Canadian born because exp is differentially productive in creating other skills for the two groups rather than because of discrimination. Thus, the coefficients β5 and β6 provide information about discrimination only if the coefficients on the interactions of G1i and other variables (i.e., β7 and β8) are zero.
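The argument can be illustrated with a small numerical sketch. Every function name, functional form and parameter value below is hypothetical: both groups share the same f(.), which includes a G1 x G2 interaction, and they differ only in how productive experience is in creating G2.

```python
def g2_from_experience(exp, a0, a1):
    """Hypothetical linear skill production function for G2."""
    return a0 + a1 * exp

def earnings(g1, g2, gamma1=1.0, gamma2=1.0, delta12=0.5):
    """One f(.) shared by both groups, with a G1 x G2 interaction.

    The disturbance term is omitted to keep the example deterministic.
    """
    return gamma1 * g1 + gamma2 * g2 + delta12 * g1 * g2

def return_to_g1(exp, a0, a1):
    """Earnings gain from one extra unit of G1 at a given experience level."""
    g2 = g2_from_experience(exp, a0, a1)
    return earnings(1.0, g2) - earnings(0.0, g2)

# Same f(.) for both groups; experience is assumed to be half as
# productive in creating G2 for the second group.
canadian_born = return_to_g1(exp=10, a0=0.0, a1=0.2)
immigrant = return_to_g1(exp=10, a0=0.0, a1=0.1)
print(canadian_born, immigrant)
```

The measured return to G1 differs between the two groups at ten years of experience even though f(.) is identical, so the gap reflects skill production rather than discrimination. In this particular parameterization the returns coincide at exp = 0 because a0 is set to zero for both groups.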

Given results in earlier research both in Canada and in other countries, it seems very likely that the skill production functions differ between immigrants and the Canadian born. Thus, for immigrants, we rewrite these production functions as:

Formula 4

where edn and exp correspond to education and experience obtained in Canada, while fedn and fexp represent foreign-acquired education and experience. A standard claim in the immigrant earnings literature is that credentials recognition problems and mismatches in technological requirements imply that education and experience obtained in most other countries will not be as productive in Canada as Canadian education and experience. If this is not true, then equation (4) collapses to equation (2) and differences in earnings between immigrants and the Canadian born arise either because they have different levels of schooling, experience and ability or because there is discrimination. Often, studies do not have particularly good measures of fedn and fexp so it is difficult to check directly for differences in returns on these skill inputs. However, the IALSS data contains direct questions on education obtained abroad and permits calculation of age at arrival as a continuous variable. This means we can construct reliable versions of both fedn and fexp. With this information the immigrant earnings specification, with G1i included, becomes:
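The text does not describe how fexp is built from age at arrival. One conventional possibility, shown here purely as an assumption, is a Mincer-style potential experience measure split at the date of arrival. The function names, the school starting age of 6 and the example values are all hypothetical:

```python
def foreign_potential_experience(age_at_arrival, foreign_school_years,
                                 school_start_age=6):
    """Years between leaving school abroad and arriving (floored at zero)."""
    return max(0.0, age_at_arrival - foreign_school_years - school_start_age)

def total_potential_experience(age, total_school_years, school_start_age=6):
    """Standard potential experience: age minus schooling minus start age."""
    return max(0.0, age - total_school_years - school_start_age)

def canadian_potential_experience(age, age_at_arrival, foreign_school_years,
                                  canadian_school_years, school_start_age=6):
    """Potential experience not accounted for by the foreign component."""
    total = total_potential_experience(
        age, foreign_school_years + canadian_school_years, school_start_age)
    foreign = foreign_potential_experience(
        age_at_arrival, foreign_school_years, school_start_age)
    return max(0.0, total - foreign)

# Hypothetical respondent: aged 40, arrived at 30 with 16 years of
# schooling abroad and no schooling in Canada.
fexp = foreign_potential_experience(30, 16)
exp = canadian_potential_experience(40, 30, 16, 0)
print(fexp, exp)
```

Because age at arrival is continuous, a split of this kind can be computed for each respondent rather than imputed from broad arrival-period categories.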

Formula 5

Equation (5) includes a wide variety of interactions of fexp and fedn with each other and other variables (see Note 2). Thus, the specification allows for complex interactions among foreign obtained inputs in the production of skills. For example, the interaction of fexp and exp represents the possibility that immigrants are better able to translate their source country experience into earnings after they have more experience in Canada.

A key conclusion of the previous literature on immigrant earnings in Canada is that more recent cohorts of immigrants have poorer earnings when compared to both earlier immigrants and Canadian-born workers with the same measured levels of education and experience. In our framework, that would arise either because of an increase in discrimination against more recent cohorts (for example, because they have a larger visible minority component) or because more recent cohorts have lower skills. With a single cross-section, we cannot separate effects of changes across immigrant cohorts from the effects of gradual adaptation to the Canadian labour market by new immigrants. The Canadian experience coefficients we estimate for immigrants will effectively combine true assimilation effects and the impact on earnings of differences across cohorts. Although this means we cannot distinguish between these features of immigrant adaptation, we are still able to learn much about the immigrant experience and how it relates to measured skills.
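A stylized linear case shows why the two effects cannot be separated. The notation is introduced here for illustration only: YSM is years since migration, C is year of arrival (the cohort), and T is the survey year, so that in a single cross-section YSM equals T minus C for every immigrant.

```latex
\begin{align*}
% Earnings depend on time in Canada (assimilation) and arrival cohort
E_i &= a + b\,\mathit{YSM}_i + c\,C_i + u_i,
      \qquad \mathit{YSM}_i = T - C_i \\
% Substituting the identity leaves a single slope
    &= (a + bT) + (c - b)\,C_i + u_i
\end{align*}
```

Only the combination c - b can be recovered from one cross-section, so an estimated experience profile mixes the assimilation effect b with the cohort effect c.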

Cognitive skills play an important role in this analysis. As stated earlier, we assume that the IALSS test scores provide direct measures of these skills and thus we can examine G1i and its interactions with inputs such as experience and education to learn about the role of various skills in earnings generation. In equation (5), the interactions of cognitive skills with fexp and fedn are of special importance. Nonzero coefficients on these interactions may reflect impacts of cognitive skills in helping immigrants translate their foreign-obtained human capital into the Canadian labour market. Note that in our framework, such an effect would amount to improved cognitive skills leading to more production of G2i and G3i with given levels of fexp and fedn and would be captured by including G1i in the G2i and G3i skill production functions.

To this point we have not mentioned a key component of the immigrant assimilation experience: language skills. Using a variety of approaches to address potential endogeneity and measurement error issues, papers by Chiswick (1991), Chiswick and Miller (1995), Dustmann and Fabbri (2003), and Berman, Lang, and Siniver (2003) find substantial effects of host country language acquisition on immigrant earnings. In our framework, fluency in the host country language can enter either as a skill in its own right (i.e., we would add G4i to equation (1)) and/or as an input to the generation of other skills. In the latter case, employers care only about the usable amounts of each skill that a worker possesses. Thus, an engineer who is well trained but cannot communicate with his or her employer or fellow employees would be counted as having zero usable engineering skills. Language ability in French or English then enters as an input into the production of usable skills, with greater language ability leading to higher usable skills for any given level of other inputs. Both Chiswick (1991) and Dustmann and Fabbri (2003) include self-reported reading skills along with host country fluency in earnings regressions, interpreting the reading and speaking fluencies as separate skills. Chiswick (1991), using a sample of illegal immigrants to the US, finds that reading fluency has a much stronger effect on earnings than speaking fluency when both are included. Dustmann and Fabbri (2003), using United Kingdom immigrant data, find that reading fluency is a more important determinant of employment, but speaking fluency is a more important determinant of earnings. Following these and other authors, we control for language proficiency in our analysis.

Finally, the framework is useful for considering endogeneity issues. In either equation (4) or (5), the error term will contain ability factors and, potentially, the interaction of those factors with skill inputs such as education and experience. As in standard analyses of the endogeneity of schooling, if those ability factors are also inputs into choices about levels of schooling and skills then G1i and edni are endogenous. It is interesting to consider the assumptions under which such an endogeneity problem does not exist. Assume that cognitive ability is only an input into generating cognitive skills (i.e., it enters the G1i production function but not those for G2i and G3i) and other types of ability do not help produce cognitive skills. Thus, for example, social ability does not help produce cognitive skills and cognitive ability does not help produce social skills. In that case, θ1i does not enter the error term: it is fully captured by the included G1i variable. Then, assuming the various types of ability are uncorrelated is sufficient to imply that G1i is exogenous. Further, if schooling choices are related only to generation of cognitive skills (e.g., schooling may help create social skills but that is not why people choose to go to school) then education is also exogenous. These assumptions are strong but no stronger than what is assumed when researchers include measures of ability in earnings regressions to address the schooling endogeneity problem, and we do not view them as completely unreasonable.


Notes

  1. We omit higher order interaction terms because they do not enter our specifications.
  2. We have, however, left out further interactions of Canadian obtained education with source country variables since they turn out not to be important in our empirical analysis.