Keyword search

Results

All (253) (220 to 230 of 253 results)

221. The U.S. manufacturing plant ownership change database: Research possibilities Archived
Surveys and statistical programs – Documentation: 11-522-X19990015650
Description:
The U.S. Manufacturing Plant Ownership Change Database (OCD) was constructed using plant-level data taken from the Census Bureau's Longitudinal Research Database (LRD). It contains data on all manufacturing plants that have experienced ownership change at least once during the period 1963-92. This paper reports the status of the OCD and discusses its research possibilities. For an empirical demonstration, data taken from the database are used to study the effects of ownership changes on plant closure.
Release date: 2000-03-02
222. Creation of an occupational surveillance system in Canada: Combining data for a unique Canadian study Archived
Surveys and statistical programs – Documentation: 11-522-X19990015652
Description:
Objective: To create an occupational surveillance system by collecting, linking, evaluating and disseminating data relating to occupation and mortality with the ultimate aim of reducing or preventing excess risk among workers and the general population.
Release date: 2000-03-02
223. Meta analysis of bioassay data from the U.S. national toxicology program Archived
Articles and reports: 11-522-X19990015654
Description:
A meta analysis was performed to estimate the proportion of liver carcinogens, the proportion of chemicals carcinogenic at any site, and the corresponding proportion of anticarcinogens among chemicals tested in 397 long-term cancer bioassays conducted by the U.S. National Toxicology Program. Although the estimator used was negatively biased, the study provided persuasive evidence for a larger proportion of liver carcinogens (0.43, 90% CI: 0.35, 0.51) than was identified by the NTP (0.28). A larger proportion of chemicals carcinogenic at any site was also estimated (0.59, 90% CI: 0.49, 0.69) than was identified by the NTP (0.51), although this excess was not statistically significant. A larger proportion of anticarcinogens (0.66) was estimated than carcinogens (0.59). Despite the negative bias, it was estimated that 85% of the chemicals were either carcinogenic or anticarcinogenic at some site in some sex-species group. This suggests that most chemicals tested at high enough doses will cause some sort of perturbation in tumor rates.
Release date: 2000-03-02
224. An evaluation of data fusion techniques Archived
Surveys and statistical programs – Documentation: 11-522-X19990015666
Description:
The fusion sample obtained by a statistical matching process can be considered a sample out of an artificial population. The distribution of this artificial population is derived. If the correlation between specific variables is the only focus, the strong demand for conditional independence can be weakened. In a simulation study, the effects of violations of some assumptions leading to the distribution of the artificial population are examined. Finally, some ideas concerning the establishment of the claimed conditional independence by latent class analysis are presented.
Release date: 2000-03-02
225. Integrated media planning through statistical matching: Development and evaluation of the New Zealand panorama service Archived
Surveys and statistical programs – Documentation: 11-522-X19990015670
Description:
To reach their target audience efficiently, advertisers and media planners need information on which media their customers use. For instance, they may need to know what percentage of Diet Coke drinkers watch Baywatch, or how many AT&T customers have seen an advertisement for Sprint during the last week. All the relevant data could theoretically be collected from each respondent. However, obtaining fully detailed and accurate information would be very expensive. It would also impose a heavy respondent burden under current data collection technology. This information is currently collected through separate surveys in New Zealand and in many other countries. Exposure to the major media is measured continuously, and product usage studies are common. Statistical matching techniques provide a way of combining these separate information sources. The New Zealand television ratings database was combined with a syndicated survey of print readership and product usage, using statistical matching. The resulting Panorama service meets the targeting information needs of advertisers and media planners. It has since been duplicated in Australia. This paper discusses the development of the statistical matching framework for combining these databases, and the heuristics and techniques used. These included an experiment conducted using a screening design to identify important matching variables. Studies evaluating and validating the combined results are also summarized. The following three major evaluation criteria were used: accuracy of combined results, stability of combined results, and the preservation of currency results from the component databases. The paper then discusses how the prerequisites for combining the databases were met. The biggest hurdle at this stage was the differences between the analysis techniques used on the two component databases. Finally, suggestions for developing similar statistical matching systems elsewhere will be given.
Release date: 2000-03-02
226. Fusion of data and estimation by entropy maximization Archived
Surveys and statistical programs – Documentation: 11-522-X19990015672
Description:
Data fusion as discussed here means to create a set of data on not jointly observed variables from two different sources. Suppose for instance that observations are available for (X,Z) on a set of individuals and for (Y,Z) on a different set of individuals. Each of X, Y and Z may be a vector variable. The main purpose is to gain insight into the joint distribution of (X,Y) using Z as a so-called matching variable. At first, however, it is attempted to recover as much information as possible on the joint distribution of (X,Y,Z) from the distinct sets of data. Such fusions can only be done at the cost of implementing some distributional properties for the fused data. These are conditional independencies given the matching variables. Fused data are typically discussed from the point of view of how appropriate this underlying assumption is. Here we give a different perspective. We formulate the problem as follows: how can distributions be estimated in situations when only observations from certain marginal distributions are available? It can be solved by applying the maximum entropy criterion. We show in particular that data created by fusing different sources can be interpreted as a special case of this situation. Thus, we derive the needed assumption of conditional independence as a consequence of the type of data available.
Release date: 2000-03-02
227. Spatial statistics and environmental epidemiology using routine data Archived
Surveys and statistical programs – Documentation: 11-522-X19990015674
Description:
The effect of the environment on health is of increasing concern, in particular the effects of the release of industrial pollutants into the air, the ground and into water. An assessment of the risks to public health of any particular pollution source is often made using the routine health, demographic and environmental data collected by government agencies. These datasets have important differences in sampling geography and in sampling epochs which affect the epidemiological analyses which draw them together. In the UK, health events are recorded for individuals, giving cause codes, a date of diagnosis or death, and using the unit postcode as a geographical reference. In contrast, small area demographic data are recorded only at the decennial census, and released as area level data in areas distinct from postcode geography. Environmental exposure data may be available at yet another resolution, depending on the type of exposure and the source of the measurements.
Release date: 2000-03-02
228. Diagnostics for comparison and combined use of diary and interview data from the U.S. consumer expenditure survey Archived
Surveys and statistical programs – Documentation: 11-522-X19990015686
Description:
The U.S. Consumer Expenditure Survey uses two instruments, a diary and an in-person interview, to collect data on many categories of consumer expenditures. Consequently, it is important to use these data efficiently to estimate mean expenditures and related parameters. Three options are: (1) use only data from the diary source; (2) use only data from the interview source; and (3) use generalized least squares, or related methods, to combine the diary and interview data. Historically, the U.S. Bureau of Labor Statistics has focused on options (1) and (2) for estimation at the five or six-digit Universal Classification Code level. Evaluation and possible implementation of option (3) depends on several factors, including possible measurement biases in the diary and interview data; the empirical magnitude of these biases, relative to the standard errors of customary mean estimators; and the degree of homogeneity of these biases across strata and periods. This paper reviews some issues related to options (1) through (3); describes a relatively simple generalized least squares method for implementation of option (3); and discusses the need for diagnostics to evaluate the feasibility and relative efficiency of the generalized least squares method.
Release date: 2000-03-02
229. A method of generating a sample of artificial data from several existing tables: Application based on the residential electric power market Archived
Surveys and statistical programs – Documentation: 11-522-X19990015690
Description:
The artificial sample was generated in two steps. The first step, based on a master panel, was a Multiple Correspondence Analysis (MCA) carried out on basic variables. Then, "dummy" individuals were generated randomly using the distribution of each "significant" factor in the analysis. Finally, for each individual, a value was generated for each basic variable most closely linked to one of the previous factors. This method ensured that sets of variables were drawn independently. The second step consisted in grafting some other data bases, based on certain property requirements. A variable was generated to be added on the basis of its estimated distribution, using a generalized linear model for common variables and those already added. The same procedure was then used to graft the other samples. This method was applied to the generation of an artificial sample taken from two surveys. The artificial sample that was generated was validated using sample comparison testing. The results were positive, demonstrating the feasibility of this method.
Release date: 2000-03-02
230. Using meta-analysis to understand the impact of time-of-use rates Archived
Surveys and statistical programs – Documentation: 11-522-X19990015692
Description:
Electricity rates that vary by time-of-day have the potential to significantly increase economic efficiency in the energy market. A number of utilities have undertaken economic studies of time-of-use rate schemes for their residential customers. This paper uses meta-analysis to examine the impact of time-of-use rates on electricity demand, pooling the results of thirty-eight separate programs. There are four key findings. First, very large peak to off-peak price ratios are needed to significantly affect peak demand. Second, summer peak rates are relatively effective compared to winter peak rates. Third, permanent time-of-use rates are relatively effective compared to experimental ones. Fourth, demand charges rival ordinary time-of-use rates in terms of impact.
Release date: 2000-03-02

Data (7) (7 results)

1. Hierarchical File, 2006 Census (Public Use Microdata Files)
Public use microdata: 95M0029X
Description: This hierarchical file provides data on the characteristics of the population. The 2006 Census Public Use Microdata Files (PUMFs) contain samples of anonymous responses to the 2006 Census questionnaire. The files have been carefully scrutinized to ensure the complete confidentiality of the individual responses. The individual file was released on March 4, 2010 and the hierarchical file is available as of today, May 2, 2011.
Microdata files are unique among census products in that they give users access to non-aggregated data. The PUMFs user can group and manipulate these variables to suit data and research requirements. Tabulations excluded from other census products can be created or relationships between variables can be analysed using different statistical tests. PUMFs provide quick access to a comprehensive social and economic database about Canada and its people.

Most of the subject matter covered by the census is included in the microdata files. To ensure the respondents' anonymity, geographic identifiers have been restricted to provinces/territories and large metropolitan areas.

This product, offered on CD-ROM, contains the data file (in ASCII format), user documentation and SAS and SPSS program source codes to enable you to read the set of records. Note: users will require knowledge of data manipulation and retrieval software such as SAS or SPSS to be able to use this product.
Release date: 2023-09-12
2. Canadian Statistical Geospatial Explorer Hub Archived
Data Visualization: 71-607-X2020010
Description: The Canadian Statistical Geospatial Explorer empowers users to discover geo enabled data holdings of Statistics Canada at various levels of geography including at the neighbourhood level. Users are able to visualize, thematically map, spatially explore and analyze, export and consume data in various formats. Users can also view the data superimposed on satellite imagery, topographic and street layers.
Release date: 2023-01-24
3. Canadian Social Environment Typology: Data File and User Guide Archived
Table: 17-20-00022022001
Description: The Canadian Social Environment Typology (CanSET) data file on cluster membership by dissemination area is a downloadable data file. The file includes information on the variables that were used to create the clusters and a data table with cluster options on membership by dissemination area.
Release date: 2022-05-09
4. National Income and Expenditure Accounts: Data Tables Archived
Table: 13-019-X
Description: These data tables provide quarterly information on Canada's National Income and Expenditure Accounts (NIEA), 1961-2012. They contain seasonally adjusted data on gross domestic product (GDP) by income and by expenditure, saving and investment, borrowing and lending of each of four broad sectors of the economy: (i) persons and unincorporated businesses, (ii) corporate and government business enterprises, (iii) governments and (iv) non-residents. Information is also provided for selected subsectors. The tables include data beginning in 1961 and are no longer being released.
Release date: 2012-08-31
5. Livestock Statistics Archived
Table: 23-603-X
Description:
This publication contains data from 1976 to date for major livestock series: cattle and calves, hogs, sheep and lambs, wool, furs, trade and prices, stocks of frozen meats, and apparent per capita meat consumption. Data highlights are also included. New and revised estimates for these data are released four times a year.
Release date: 2003-03-05
6. Trade and Transportation: The Impact of the 1995 Transborder Air Services Accord Archived
Table: 51F0007X
Description:
For most of the post-war period, Canada and the United States have utilized an open regime to govern trade relations between the two countries. Such has not always been the case for transborder air services, however. In 1966, the two countries signed an air services accord (ASA) that governed commercial air services between the two. The 1966 accord was quite restrictive, limiting entry and price competition in transborder markets. This restrictive agreement governed Canada-U.S. air service for almost 30 years, finally being replaced in 1995 with a new ASA that has granted entry and pricing freedom in transborder markets.
Release date: 2001-06-05
7. Data Products: Dimensions Series: 1996 Census of Population Archived
Table: 94F0005X
Description:
This CD-ROM is part of the Dimensions Series which provides an in-depth analysis of census data. More than 150 tables represent a variety of special interest subjects linking a number of Census variables. Statistical information is presented on themes of considerable public interest with some tables examining historical trends and other tables detailing significant sub-populations. Data for geographical levels of Canada, Provinces and Territories are most widely represented with some data tables produced at the Census Metropolitan Area level. The Portrait of Official Language Communities in Canada and the Portrait of Aboriginal Population of Canada contain some information at the community level. Some tables show comparisons with data from earlier censuses to provide an historical perspective.
Release date: 1999-04-06

Analysis (187) (80 to 90 of 187 results)

81. Handling survey nonresponse in cluster sampling Archived
Articles and reports: 12-001-X20070019855
Description:
In surveys under cluster sampling, nonresponse on a variable is often dependent on a cluster level random effect and, hence, is nonignorable. Estimators of the population mean obtained by mean imputation or reweighting under the ignorable nonresponse assumption are then biased. We propose an unbiased estimator of the population mean by imputing or reweighting within each sampled cluster or a group of sampled clusters sharing some common feature. Some simulation results are presented to study the performance of the proposed estimator.
Release date: 2007-06-28
82. On an optimal controlled nearest proportional to size sampling scheme Archived
Articles and reports: 12-001-X20070019856
Description:
The concept of 'nearest proportional to size sampling designs' originated by Gabler (1987) is used to obtain an optimal controlled sampling design, ensuring zero selection probabilities to non-preferred samples. Variance estimation for the proposed optimal controlled sampling design using the Yates-Grundy form of the Horvitz-Thompson estimator is discussed. The true sampling variance of the proposed procedure is compared with that of the existing optimal controlled and uncontrolled high entropy selection procedures. The utility of the proposed procedure is demonstrated with the help of examples.
Release date: 2007-06-28
83. Defining retirement Archived
Articles and reports: 75-001-X200710213182
Geography: Canada
Description:
Even though the retirement wave will have significant labour market consequences over the next 20 years, no regular statistics are produced on retirement or the retired. Part of the problem stems from lack of clear definitions. For some, retirement means complete withdrawal from the labour force while for others it entails part- or even full-time work. The article examines the challenges faced by statistical organizations in measuring retirement and offers several recommendations to inform a discussion for arriving at international standards.
Release date: 2007-03-20
84. Paradata from Concept to Completion Archived
Articles and reports: 11-522-X20050019432
Description:
Why talk about paradata now, especially when it is already so ubiquitous? Well, perhaps, for that very reason. Certainly, while widely available and extensive, paradata are seldom being used at their full power, either during survey operations or, more commonly, at the survey inference stage.
Release date: 2007-03-02
85. Providing spatial data for secondary analysis: issues and current practices relating to confidentiality Archived
Articles and reports: 11-522-X20050019433
Description:
Spatially explicit data pose a series of opportunities and challenges for all the actors involved in providing data for long-term preservation and secondary analysis - the data producer, the data archive, and the data user.
Release date: 2007-03-02
86. Quality-preserving controlled tabular adjustment: a method for resolving confidentiality and data quality issues for tabular data Archived
Articles and reports: 11-522-X20050019434
Description:
Traditional methods for statistical disclosure limitation in tabular data are cell suppression, data rounding and data perturbation. Because the suppression mechanism is not describable in probabilistic terms, suppressed tables are not amenable to statistical methods such as imputation. Data quality characteristics of suppressed tables are consequently poor.
Release date: 2007-03-02
87. Data swapping is not the panacea Archived
Articles and reports: 11-522-X20050019435
Description:
Data swapping introduces noise in a dataset to improve the protection of statistical confidentiality. We demonstrate in this article that this technique introduces a bias in the estimates.
Release date: 2007-03-02
88. Common metadata constructs for statistical data Archived
Articles and reports: 11-522-X20050019436
Description:
Regardless of the specifics of any given metadata scheme, there are common metadata constructs used to describe statistical data. This paper will give an overview of the different approaches taken to achieve the common goal of providing consistent information.
Release date: 2007-03-02
89. Documenting data elements in statistical agencies Archived
Articles and reports: 11-522-X20050019437
Description:
The explanatory information accompanying statistical data is called metadata, and its presence is essential for the correct understanding and interpretation of the data. This paper will report on the experience of Statistics Canada in the conceptualization, naming and organization of variables on which data are produced.
Release date: 2007-03-02
90. Discovering microdata variables: Comparing DDI compliant documentation to an ISO/IEC 11179 metadata registry Archived
Articles and reports: 11-522-X20050019438
Description:
A variety of standards for documenting the contents of data files have evolved over time, each with their own constituency and users. The Data Documentation Initiative (DDI) is an effort to establish an international XML-based standard for the content, presentation, transport, and preservation of documentation for datasets in the social and behavioural sciences.
Release date: 2007-03-02

Reference (55) (50 to 60 of 55 results)

51. Estimating the incidence of dementia from longitudinal two-phase sampling with nonignorable missing data Archived
Surveys and statistical programs – Documentation: 11-522-X19980015030
Description:
Two-phase sampling designs have been conducted in waves to estimate the incidence of a rare disease such as dementia. Estimation of disease incidence from a longitudinal dementia study has to appropriately adjust for data missing by death as well as the sampling design used at each study wave. In this paper we adopt a selection model approach to model the missing data by death and use a likelihood approach to derive incidence estimates. A modified EM algorithm is used to deal with data missing by sampling selection. The non-parametric jackknife variance estimator is used to derive variance estimates for the model parameters and the incidence estimates. The proposed approaches are applied to data from the Indianapolis-Ibadan Dementia Study.
Release date: 1999-10-22
52. Weighting and variance estimation for the exploration of possible time trends in data from the U.S. Third National Health and Nutrition Examination Survey Archived
Surveys and statistical programs – Documentation: 11-522-X19980015031
Description:
The U.S. Third National Health and Nutrition Examination Survey (NHANES III) was carried out from 1988 to 1994. This survey was intended primarily to provide estimates of cross-sectional parameters believed to be approximately constant over the six-year data collection period. However, for some variables (e.g., serum lead, body mass index and smoking behavior), substantive considerations suggest the possible presence of nontrivial changes in level between 1988 and 1994. For these variables, NHANES III is potentially a valuable source of time-change information, compared to other studies involving more restricted populations and samples. Exploration of possible change over time is complicated by two issues. First, some variables displayed substantial regional differences in level, which was of practical concern. Second, nontrivial changes in level over time can lead to nontrivial biases in some customary NHANES III variance estimators. This paper considers these two problems and discusses some related implications for statistical policy.
Release date: 1999-10-22
53. Multivariate logistic regression for data from complex surveys Archived
Surveys and statistical programs – Documentation: 11-522-X19980015036
Description:
Multivariate logistic regression, introduced by Glonek and McCullagh (1995) as a generalisation of logistic regression, is useful in the analysis of longitudinal data as it allows for dependent repeated observations of a categorical variable and for incomplete response profiles. We show how the method can be extended to deal with data from complex surveys and we illustrate it on data from the Swiss Labour Force Survey. The effect of the sampling weights on the parameter estimates and their standard errors is considered.
Release date: 1999-10-22
54. Statistics Canada's Business Surveys Archived
Surveys and statistical programs – Documentation: 61F0019X19990025579
Geography: Canada
Description:
The Unified Enterprise Survey (UES) incorporates several annual business surveys into an integrated survey framework. It aims to ensure Statistics Canada receives consistent and integrated data from many types and sizes of businesses, with enough detail to produce accurate provincial statistics. This year, 17 industry surveys are included in the UES, as well as two cross-industry surveys of large enterprises.
Release date: 1999-06-25
55. Guide to Data on Elementary and Secondary Education in Canada Archived
Surveys and statistical programs – Documentation: 81F0004G
Description:
The guide lists and briefly describes the main sources of data, and for each source gives: data coverage, main variables available, strengths and limitation of the data, historical continuity, frequency and means of dissemination, indication of the type of analysis that can be performed.
Release date: 1998-03-30

Date modified: 2024-07-08