4. Data

4.1 Sources

The data for the analysis come from numerous sources. The main source is the 1993 Survey of Innovation and Advanced Technology (SIAT), a unique, confidential and proprietary data set covering approximately 2,500 plants across the entire Canadian manufacturing sector. SIAT collected information on various aspects of innovation and the adoption of advanced-manufacturing technologies. Specifically, the survey reported each plant's adoption of 22 advanced-manufacturing technologies within 6 technology groups. The technologies are 'general-purpose technologies,' in that they are not specific to any particular industry but can be used in the production process of any industry. 9 These technologies are listed in Table 1, along with the incidence of use in 1993 and 1984.

Table 1
List of advanced manufacturing technologies and incidence of technology use by plants

Table 2
Descriptive statistics of variables

A critical piece of information provided in SIAT is the time at which each plant adopted each of the 22 technologies. This information permits the construction of panel data from the cross-sectional data set. A panel data set consisting of three periods (1984 to 1986, 1987 to 1989 and 1990 to 1992) is therefore constructed. Using time intervals rather than individual years reduces the effects of recall bias caused by the retrospective nature of the data, while leaving ample regional variation in each period. 10
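The step from retrospective adoption dates to a three-period panel can be sketched as follows. This is a minimal illustration rather than the authors' procedure. It assumes that each plant reports the number of years a technology has been in use as of the 1993 survey, that the adoption year is approximated as 1993 minus that number, and that a plant counts as an adopter in a period if it adopted by the end of that period. The function names and the example plant are hypothetical.

```python
SURVEY_YEAR = 1993  # SIAT reference year (from the text)

# Period boundaries as given in the text (inclusive).
PERIODS = [(1984, 1986), (1987, 1989), (1990, 1992)]


def adoption_year(years_in_use):
    """Approximate the adoption year from reported years in use (assumption)."""
    return SURVEY_YEAR - years_in_use


def adopted_by_period(years_in_use):
    """Return one 0/1 flag per period: 1 if the technology was already
    adopted by the end of that period. None means never adopted."""
    if years_in_use is None:
        return [0] * len(PERIODS)
    year = adoption_year(years_in_use)
    return [1 if year <= end else 0 for (_start, end) in PERIODS]


# Hypothetical plant: one technology in use for 5 years, one never adopted.
plant = {"tech_a": 5, "tech_b": None}
panel_rows = {tech: adopted_by_period(y) for tech, y in plant.items()}
```

Because each period spans three years, a report that is off by one year often falls in the same period as the true adoption year, which is the sense in which intervals dampen rounding in recalled dates.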

Additional information on plant characteristics is obtained from the Annual Survey of Manufactures (ASM). The ASM is a longitudinal database that annually collects information on almost all Canadian manufacturing plants. Some 1,902 of the 2,500 plants surveyed in the SIAT are also surveyed in the ASM. Detailed information on these 1,902 plants, such as geographical location, employment, outputs, country of ownership, plant age and multi-plant status, is taken from the ASM.

Table 3
Variable names and definitions

To measure the characteristics of the regional economies, both the ASM and the Census of Population are used. All variables characterizing local manufacturing activities are calculated for the census division in which a plant is located, using information from the ASM. Variables characterizing regional demography are also calculated at the census-division level, using information from the Census of Population. 11

Other supplementary data come from the National Input–Output Tables from 1983 to 1992. We use the National Input–Output Tables at the most detailed level available, w, which consists of 145 3- and 4-digit Standard Industrial Classification (SIC) industries. The tables record the value of the intermediate inputs that each industry buys from other industries and of the outputs that it sells to them. Based on this information, forward and backward linkages are calculated.
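The text does not state the linkage formulas, so the sketch below only illustrates one common convention and should not be read as the authors' definition. It assumes a square matrix `flows[i][j]` holding the value of intermediate goods that industry i sells to industry j. Under this assumed convention, the backward linkage of industry j to supplier i is the share of j's intermediate purchases that come from i, and the forward linkage of industry i to customer j is the share of i's intermediate sales that go to j. All names and numbers are hypothetical.

```python
def backward_shares(flows):
    """backward[j][i]: share of industry j's intermediate purchases bought from i."""
    n = len(flows)
    col_totals = [sum(flows[i][j] for i in range(n)) for j in range(n)]
    return [
        [flows[i][j] / col_totals[j] if col_totals[j] else 0.0 for i in range(n)]
        for j in range(n)
    ]


def forward_shares(flows):
    """forward[i][j]: share of industry i's intermediate sales sold to j."""
    result = []
    for row in flows:
        total = sum(row)
        result.append([v / total if total else 0.0 for v in row])
    return result


# Hypothetical 3-industry flow matrix: flows[i][j] = sales from i to j.
flows = [
    [0.0, 30.0, 10.0],
    [20.0, 0.0, 40.0],
    [0.0, 10.0, 0.0],
]
backward = backward_shares(flows)
forward = forward_shares(flows)
```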

The units of geography employed in this paper are economic regions and census divisions. Province, economic region and census division are geographical units, in descending order of size. An economic region is a statistically categorized region, comprising one or more census divisions, but confined within a province or territory. 12,13

4.2 Construction of variables

Measurement of similarities across industries in terms of patterns of input purchases

The extent of knowledge spillovers from local prior adopters of technology τ to potential adopters may depend on the 'relatedness' between the two industries. A common criticism of earlier geographic studies that use highly aggregated industry units (typically a 2-digit SIC scheme) to define 'related' industries empirically is that 2-digit SIC groups may not capture the similarities of industries. 14 For instance, SIC 39 includes the Broom, brush and mop industry (in SIC 399) and the Jewellery and silverware industry (in SIC 392), which are highly dissimilar in nature.

In the context of studying the effects of knowledge spillovers on technology adoption, the relatedness across industries can be better measured by the similarities in input purchases, which would mimic the similarities in input processes more closely than the standard industry classification does. To measure the similarities in input purchases, we use information on the patterns of input purchases from the National Input–Output Tables for 145 3- and 4-digit SIC industries. For each industry i, we calculate its correlation $\rho_{ij}$ with every other industry j in terms of input purchases and then assign every industry to one of three groups, based on the correlation. Industries with a correlation equal to or greater than 0.50 are categorized as 'similar' industries, industries with a correlation from 0.20 up to, but not including, 0.50 are categorized as 'moderately similar' industries, and industries with a correlation of less than 0.20 are categorized as 'different' industries. 15 For each industry, the groups of similar, moderately similar and different industries are neither symmetric nor of equal size. 16 Descriptive statistics on the industry categories based on input purchases are compared with the 2-digit SIC industry categories in Table B.1 in Appendix B.
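The grouping rule above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code. It assumes the correlation is a Pearson correlation computed over each industry's vector of input purchases by input category. The industry labels, the purchase values, and the treatment of a constant vector are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return cov / sqrt(vx * vy)


def categorize(rho):
    """Thresholds from the text: >= 0.50, [0.20, 0.50), < 0.20."""
    if rho >= 0.50:
        return "similar"
    if rho >= 0.20:
        return "moderately similar"
    return "different"


def group_industries(purchases):
    """purchases maps industry -> list of input purchases by input category.
    Returns groups[i][category] -> list of other industries j."""
    groups = {}
    for i, xi in purchases.items():
        groups[i] = {"similar": [], "moderately similar": [], "different": []}
        for j, xj in purchases.items():
            if j != i:  # "every other industry j"
                groups[i][categorize(pearson(xi, xj))].append(j)
    return groups


# Hypothetical input-purchase vectors (four inputs, three industries).
purchases = {
    "A": [10.0, 5.0, 1.0, 0.0],
    "B": [20.0, 11.0, 2.0, 1.0],
    "C": [0.0, 1.0, 6.0, 9.0],
}
groups = group_industries(purchases)
```

Because the Pearson correlation is unchanged when either vector is multiplied by a positive constant, the result is the same whether purchases are entered as dollar values or as shares of an industry's total input purchases.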

Technology users

For each technology τ, $T^{\tau}_{iRt}$ is the number of plants in industry i in region R that have already adopted technology τ as of period t. It is computed as a weighted count, where w is a plant weight provided in the survey to make the sample representative of the population. The unit of geography used in the calculation of the number of technology users is the economic region. Since information on technology adopters is drawn from SIAT, it is important to have enough observations in each cell to keep them representative of the population. Therefore, the number of technology adopters is calculated at the level of the economic region, denoted as R, rather than at the finer level of the census division, denoted as r.
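One plausible way to write the plant-weighted count of adopters defined above is shown below. The plant index $p$ and the adoption indicator $A^{\tau}_{pt}$ are notation assumed here for illustration only and are not taken from the original text.

```latex
T^{\tau}_{iRt} = \sum_{p \in (i,R)} w_p \, A^{\tau}_{pt},
\qquad
A^{\tau}_{pt} =
\begin{cases}
1 & \text{if plant } p \text{ has adopted technology } \tau \text{ as of period } t,\\
0 & \text{otherwise,}
\end{cases}
```

where the sum runs over the surveyed plants p in industry i and economic region R, and $w_p$ is the survey weight of plant p.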

The number of plants in similar industries in the same economic region that have already adopted technology τ as of time t is calculated simply as

$$\mathit{PriorAdopter\_Similar}^{\tau}_{iRt} = \sum_{j \in F} T^{\tau}_{jRt}$$

where i and j index industries, and F represents the group of industries that are categorized as similar industries for industry i. The numbers of plants in the moderately similar industries and in the different industries that have adopted technology τ by time t, $\mathit{PriorAdopter\_ModSimilar}^{\tau}_{iRt}$ and $\mathit{PriorAdopter\_Different}^{\tau}_{iRt}$, are calculated likewise.
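The two steps described in this subsection, a weighted count of adopters per industry and economic region followed by a sum over a group of industries, can be sketched as follows. This is a minimal sketch under assumptions: the record layout, the field names, the technology label `"cad"` and the numeric period index are all hypothetical, and the similar-industry group is taken as given.

```python
from collections import defaultdict


def weighted_adopters(plants, tech, period):
    """Sum the survey weights of plants that adopted `tech` by `period`,
    keyed by (industry, economic_region)."""
    counts = defaultdict(float)
    for p in plants:
        adopted_in = p["adopted"].get(tech)  # period index, or None if never
        if adopted_in is not None and adopted_in <= period:
            counts[(p["industry"], p["region"])] += p["weight"]
    return counts


def prior_adopters(counts, group, region):
    """Add up adopter counts over every industry j in `group`."""
    return sum(counts.get((j, region), 0.0) for j in group)


# Hypothetical plant records; "adopted" maps technology -> adoption period.
plants = [
    {"industry": "B", "region": "R1", "weight": 2.5, "adopted": {"cad": 1}},
    {"industry": "B", "region": "R1", "weight": 1.5, "adopted": {"cad": 3}},
    {"industry": "C", "region": "R1", "weight": 4.0, "adopted": {"cad": 2}},
    {"industry": "B", "region": "R2", "weight": 3.0, "adopted": {"cad": 1}},
    {"industry": "D", "region": "R1", "weight": 1.0, "adopted": {}},
]
similar_to_a = ["B", "C"]  # hypothetical group of industries similar to "A"

counts = weighted_adopters(plants, "cad", period=2)
similar_r1 = prior_adopters(counts, similar_to_a, "R1")
```

The same `prior_adopters` call, given the moderately similar or the different group instead, yields the other two counts.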

Appendix A provides further details about the construction of other variables.

9. The concept of General Purpose Technology (GPT) used here is not as broad as the one used in Bresnahan and Trajtenberg (1995).

10. Plants may round the number of years a given technology has been in use. For example, plants may report 5 years instead of 4 or 6 years, and 10 years instead of 9 or 11 years. Indeed, there are peaks at 5 and 10 years, and a lower number of new technology adoptions are reported for 4, 6, 9 and 11 years.

11. The Census of Population is quinquennial. For each Census, 20% of households receive the 'long questionnaire,' which seeks detailed information on individuals.

12. In 1991, there were 10 provinces and 2 territories in Canada, with each province and territory being divided into a number of economic regions. There were 68 economic regions, each divided into one or more census divisions. There were 290 census divisions across provinces and territories.

13. While boundaries of census divisions tend to stay constant over the years, there was a major restructuring of census divisions in the provinces of Quebec and British Columbia in the late 1980s. To measure the effects of regional economies consistently, it is important to have a constant geographic region, so that regional variables reflect the economic changes within the region and not changes in the size of the geographical unit. Thus, a constant census division code based on 1976 has been assigned to all plants in all years by matching postal codes in MapInfo.

14. For example, Rosenthal and Strange (2001).

15. The benchmark for this grouping choice is based on the distribution of correlations. The distribution of correlations exhibits an asymmetric weak tri-modal pattern: a small percentage of industries in the high range of correlations; a second group concentrated between 0.20 and 0.50; and the remainder in the lower end of the distribution.

16. The average size of each group of industries, in terms of the number of 3-digit SIC industries it contains, is presented in Appendix B, Table B.1, and is compared with the size of 2-digit SIC industries.