3 Data and methods

This study is based on the Statistics Canada 2002 Ethnic Diversity Survey (EDS). The EDS is a national survey of over 42,000 non-Aboriginal Canadian residents aged 15 years or over. The survey was designed to provide information on how Canadians of different ethnic backgrounds interpret and report their ethnicity and how people's backgrounds affect their participation in the social, economic and cultural life in Canada. For these purposes, the EDS covers a wide range of topics, including ethnic ancestry, ethnic identity, place of birth, visible minority status, religion, religious participation, knowledge of languages, family background, social networks, civic participation, interaction with society, attitudes, satisfaction with life, trust and socioeconomic activities. The survey also over-samples non-British/French minority groups and thus obtains relatively large samples to allow comparisons between these minority groups and more established, large ethnic communities in various characteristics.

This study focuses on group differences in obtaining university degrees among the second generation, defined here to include Canadian-born children of at least one immigrant parent and those who immigrated to Canada at age 12 or younger. We also include children of Canadian-born parents as the comparison group. Since young adults are more likely to finish university than older people, and ethnic groups differ significantly in age structures, we limit our analysis to a subsample of 6,019 young adults aged 25 to 34. In our study sample of children of immigrant parents (3,330), about 16% are child immigrants who immigrated to Canada in the 1970s and 1980s. The other 84% were born in Canada to parents who immigrated to Canada before the 1970s. The sample size of children of Canadian-born parents is 2,689.

Within the selected sample, we identify the following 18 source country/region groups among children of immigrant parents, each with a minimum sample size of about 50 persons. The grouping is based on the individual's own country of birth for foreign-born youth, the mother's country of birth for Canadian-born youth if the mother was an immigrant, or the father's country of birth if only the father was an immigrant. These source-region groups include eight non-Western countries/regions: Africa, the Caribbean, Latin America, China (including Hong Kong and Taiwan), the Philippines, India, West Asia/Middle East, and other Asia.

There are also 10 groups from Western countries: the United States, the United Kingdom, Germany, Italy, Portugal, the Netherlands, other Northern/Western Europe, Eastern Europe, other Europe, and other countries (mostly Oceania). See Table 1 for the sample size of each identified group. The population composition is much more heterogeneous in some of these groups than in others. For instance, about 14% of the African immigrant group reported themselves as Blacks in response to the survey question on visible minority status, 37% reported themselves as other visible minorities, and 49% reported that they had European ethnic origins. About 62% of Caribbean immigrants reported themselves as Blacks and 23% as other visible minorities. By comparison, 96% of the immigrants from China reported themselves as Chinese in response to the survey question on visible minority status, 95% of the immigrants from India reported being South Asians and 84% of the immigrants from the Philippines reported being Filipinos.
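The assignment rule described above (own country of birth for foreign-born youth, otherwise the mother's country if she was an immigrant, otherwise the father's) can be sketched as follows. This is only an illustration: the function name, the argument names and the use of the string "Canada" to flag Canadian-born parents are assumptions, not part of the EDS file layout, and the further mapping of countries into the 18 groups is not shown.

```python
from typing import Optional

CANADA = "Canada"  # assumed label for Canadian-born persons


def source_country(
    born_in_canada: bool,
    own_birth_country: str,
    mother_birth_country: str,
    father_birth_country: str,
) -> Optional[str]:
    """Return the country used to place a respondent in a source group.

    Returns None for children of two Canadian-born parents, who form
    the comparison group rather than a source country/region group.
    """
    if not born_in_canada:
        # Foreign-born youth: use the respondent's own country of birth.
        return own_birth_country
    if mother_birth_country != CANADA:
        # Canadian-born with an immigrant mother: use her country of birth.
        return mother_birth_country
    if father_birth_country != CANADA:
        # Only the father was an immigrant: use his country of birth.
        return father_birth_country
    return None
```

Note that under this rule the mother's country takes precedence whenever both parents are immigrants from different countries.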

In our regression analyses, we include five sets of explanatory variables. The first set is basic demographic variables, including age (ranging from 25 to 34), sex (female=1), family structure, place of residence and generation status. Family structure has four categories: lived mainly with both biological parents until age 15, lived mainly with birth mother until age 15, lived mainly with birth father until age 15, and lived with neither birth mother nor birth father until age 15. The place of residence is coded as three categories: large metropolitan areas (the largest 8 metropolitan areas in Canada), small metropolitan areas (the other 18 metropolitan areas with a population of at least 100,000) and non-metropolitan areas. Generation status is coded as four categories: generation 1.5 (those whose age at immigration was from 6 to 12), generation 1.75 (those who immigrated before age 6), second generation (born in Canada to two immigrant parents) and generation 2.5 (born in Canada, but with only one immigrant parent). Previous U.S. studies have shown significant differences among these generational groups in adaptive outcomes (Rumbaut 2004).
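The four-category generation coding can be written as a small function. The sketch below is illustrative only; the function name, argument names and string labels are hypothetical, and children of two Canadian-born parents are returned as None because they fall outside the four categories.

```python
from typing import Optional


def generation_status(
    born_in_canada: bool,
    age_at_immigration: Optional[int],
    n_immigrant_parents: int,
) -> Optional[str]:
    """Code generation status using the category definitions in the text."""
    if not born_in_canada:
        if age_at_immigration is None or age_at_immigration > 12:
            # Outside the study definition (immigrated at age 12 or younger).
            raise ValueError("child immigrants must have arrived by age 12")
        # Arrived before age 6: generation 1.75; from 6 to 12: generation 1.5.
        return "1.75" if age_at_immigration < 6 else "1.5"
    if n_immigrant_parents == 2:
        return "2"    # born in Canada, both parents immigrants
    if n_immigrant_parents == 1:
        return "2.5"  # born in Canada, one immigrant parent
    return None       # children of Canadian-born parents (comparison group)
```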

The second set of variables measures father's and mother's education. For each parent, parental education is coded as four categories: with university degree, some postsecondary education, high school graduation and less than high school graduation.

The third set of variables captures the individual's mother tongue and family language environment and is coded as three categories: mother tongue is either English or French; mother tongue is neither English nor French, but spoke English or French with parents until age 15; and, mother tongue is neither English nor French, and did not speak English or French with parents until age 15.

The fourth set of variables captures what Borjas (1992, 1995) refers to as 'ethnic capital', which is essentially group-level human capital, as measured by the average socioeconomic resources among the generation of the respondents' parents. Following Borjas' approach, we derive the percentage with university degrees and the mean earnings of male immigrants aged 35 to 50, by country of birth, from the 1991 Census. We then merge these two variables with our EDS data by the country of birth of respondents' fathers (or of their mothers if the father was not an immigrant). In our EDS sample, we can identify 76 countries (or regions) of birth based on parents' information. We use this same 76-country grouping in deriving variables from the 1991 Census and in matching the two data sources. For children of Canadian-born parents, the father's-generation percentage with university degrees and mean earnings are based on a 24-category grouping of ethnicity from the 1991 Census.
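The matching step for children of immigrants amounts to a lookup keyed on the father's country of birth, falling back to the mother's when the father was not an immigrant. A minimal sketch follows, assuming plain dictionaries for both data sources. All field names are hypothetical and the numeric values are placeholders, not figures from the 1991 Census. The separate 24-category ethnicity match for children of Canadian-born parents is not shown.

```python
CANADA = "Canada"  # assumed label for Canadian-born parents

# Placeholder group-level values keyed by country of birth (illustrative only).
census_by_country = {
    "Italy": {"pct_university": 5.0, "mean_earnings": 40000.0},
    "India": {"pct_university": 30.0, "mean_earnings": 38000.0},
}


def attach_ethnic_capital(respondent: dict, lookup: dict) -> dict:
    """Return a copy of the respondent record with group-level variables added."""
    father = respondent["father_birth_country"]
    mother = respondent["mother_birth_country"]
    # Use the father's country unless he was Canadian-born.
    key = father if father != CANADA else mother
    group_values = lookup[key]  # raises KeyError if the country is unmatched
    return {**respondent, "match_country": key, **group_values}


record = {"father_birth_country": "Canada", "mother_birth_country": "Italy"}
merged = attach_ethnic_capital(record, census_by_country)
```

Letting an unmatched country raise an error, rather than silently producing missing values, makes gaps between the two country groupings visible early.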

The final set consists of one variable: the percentage living in rural areas or small towns (population less than 5,000) among the generation of the respondents' fathers. We follow the same approach used in deriving the above two group-level human capital variables. Since our group-level human capital variables and father's-generation residence are group-level variables, we allow for within-group dependence in the regression estimates.

We construct both logistic and Ordinary Least Squares (OLS) regression models in order to examine to what extent the above five sets of variables can account for the observed differences in university completion rates among immigrant groups (see note 3). Based on OLS results, we also isolate the respective contribution of the above five sets of explanatory variables to each immigrant group's advantage or disadvantage in the outcome (see note 4).
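As a rough illustration of the 'explained' component described in note 4, the sketch below multiplies each pooled-sample coefficient by the gap between a group's mean and the pooled mean for that variable, then sums the products. The function name, variable names and all numbers are hypothetical and are not taken from the study's tables.

```python
def explained_component(group_means: dict, pooled_means: dict, pooled_coefs: dict):
    """Return per-variable contributions and their sum for one group.

    contribution_k = pooled coefficient_k * (group mean_k - pooled mean_k)
    """
    contributions = {
        name: pooled_coefs[name] * (group_means[name] - pooled_means[name])
        for name in pooled_coefs
    }
    return contributions, sum(contributions.values())


# Hypothetical inputs for one group and two explanatory variables.
coefs = {"father_university": 0.2, "female": 0.05}
group = {"father_university": 0.5, "female": 0.5}
pooled = {"father_university": 0.25, "female": 0.5}

contributions, total = explained_component(group, pooled, coefs)
```

The contribution of a whole set of variables, such as parental education, would then be the sum of the contributions of the variables in that set.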

We further examine how the effects of the five sets of variables differ across groups, by running models separately for European and non-European source country/region groups. Following this, we run separate models for the five large source countries/regions: the Caribbean, China, India, the United Kingdom and Italy (see note 5).

The EDS is a probabilistic survey, and a survey weight is assigned to each respondent to represent the target population at the national level. This weight is used in all our descriptive results. In our regression models, we standardize this survey weight by dividing it by the average weight among the selected immigrant groups in our study sample. This standardized weight has the advantage of maintaining the same distributions as those of non-standardized weights while avoiding an overestimation of the critical level (Statistics Canada 2003).
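The standardization described above is a simple rescaling, sketched below with made-up weights. The function name is hypothetical. After rescaling, the weights average to 1, so they sum to the number of respondents rather than to the population total, while the ratio between any two respondents' weights is unchanged.

```python
def standardize_weights(weights: list) -> list:
    """Divide each survey weight by the mean weight of the analysis sample."""
    if not weights:
        raise ValueError("weights must not be empty")
    mean_weight = sum(weights) / len(weights)
    return [w / mean_weight for w in weights]


# Made-up survey weights for three respondents.
raw = [100.0, 200.0, 300.0]
std = standardize_weights(raw)
```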


3 Logistic regression is statistically more appropriate for the dichotomous outcome of whether someone finished university. However, Ordinary Least Squares (OLS) regression yields estimates very close to logistic regression results when the distribution of the outcome variable is in the 25%-to-75% range. As shown in the second and third columns of Table 4, the two approaches produce very similar results. The advantage of OLS regression is that it is straightforward to decompose the contribution of each explanatory variable to the 'explained' portion of each group's advantage or disadvantage, i.e., the difference in a group's university completion rate before and after controlling for all explanatory variables.

4 This is done following one variation of the Oaxaca decomposition method (Oaxaca and Ransom 1994). In this approach, the 'explained' component is calculated as the sum of the differences between group means and the means of the pooled sample of all groups, with the differences weighted by the model coefficients of the pooled sample.

5 Since the sample size was relatively small for individual groups and some variables have few cases in some categories, in running these group-specific models we re-code the family structure into two categories (lived with both birth parents versus others), and place of residence into two categories (large urban areas versus others). We also exclude the language variable from the models for immigrants from the Caribbean and the United Kingdom, since these groups have very few cases with mother tongue other than English or French.