Keyword search

Results

Data (9 results)

  • Public use microdata: 89M0017X
    Description: The public use microdata file from the 2010 Canada Survey of Giving, Volunteering and Participating is now available. This file contains information collected from nearly 15,000 respondents aged 15 and over residing in private households in the provinces. The public use microdata file provides provincial-level information about the ways in which Canadians donate money and in-kind gifts to charitable and nonprofit organizations, volunteer their time to these organizations, and provide help directly to others. Socio-demographic, income and labour force data are also included on the file.
    Release date: 2024-07-24

  • Public use microdata: 95M0008X
    Description: Microdata files are unique among census products in that they give users access to unaggregated data. This makes the public use microdata files (PUMFs) powerful research tools. Each file contains anonymous individual responses on a large number of variables. The PUMF user can group and manipulate these variables to suit his/her own data and research requirements. Tabulations not included in other census products can be created or relationships between variables can be analysed by using different statistical tests. PUMFs provide quick access to a comprehensive social and economic database about Canada and its people. All subject-matter covered by the census is included in the microdata files. However, to ensure the anonymity of the respondents, geographic identifiers have been restricted to the provinces/territories and large metropolitan areas. Microdata files have traditionally been disseminated on magnetic tape, which required access to a mainframe computer. For the first time, the 1991 PUMFs will also be available on CD-ROM for microcomputer applications. This file contains data based on a 3% sample of the population enumerated in the 1991 Census. It provides information on the demographic, social and economic characteristics of the Canadian population. The Households and Housing File allows users to return to the base unit of the census, enabling them to group and manipulate the data to suit their own data and research requirements.

    This product provides two basic tools to assist users in accessing and using the 1991 Census Public Use Microdata File - Households and Housing CD-ROM.

    Release date: 2023-09-12

  • Public use microdata: 56M0002G
    Description:

    This guide is for the Household Internet Use Survey microdata file. The Household Internet Use Survey is being conducted by Statistics Canada on behalf of Industry Canada. The information from this survey will assist the Science and Technology Redesign Project at Statistics Canada to fulfil a three-year contractual agreement between them and the Telecommunications and Policy Branch of Industry Canada. The Household Internet Use Survey is a voluntary survey. It will provide information on the use of computers for communication purposes, and on households' access to and use of the Internet from home.

    The objective of this survey is to measure the demand for telecommunications services by Canadian households. To assess this demand, the survey measures, among other things, the frequency and intensity of use of what is commonly referred to as "the information highway". This was done by asking questions about the accessibility of the Internet to Canadian households at home, in the workplace and at a number of other locations. The information collected will be used to update and expand upon previous studies done by Statistics Canada on the topic of the Information Highway.

    Release date: 2004-09-28

  • Public use microdata: 82M0011X
    Description:

    The main objective of the 2002 Youth Smoking Survey (YSS) is to provide current information on the smoking behaviour of students in grades 5 to 9 (in Quebec primary school grades 5 and 6 and secondary school grades 1 to 3), and to measure changes that occurred since the last time the survey was conducted in 1994. Additionally, the 2002 survey collected basic data on alcohol and drug use by students in grades 7 to 9 (in Quebec secondary 1 to 3). Results of the Youth Smoking Survey will help with the evaluation of anti-smoking and anti-drug use programs, as well as with the development of new programs.

    Release date: 2004-07-14

  • Public use microdata: 12M0014X
    Geography: Province or territory
    Description: This report presents a brief overview of the information collected in Cycle 14 of the General Social Survey (GSS). Cycle 14 is the first cycle to collect detailed information on access to and use of information communication technology in Canada. Topics include general use of technology and computers, technology in the workplace, development of computer skills, frequency of Internet and E-mail use, non-users and security and information on the Internet. The target population of the GSS is all individuals aged 15 and over living in a private household in one of the ten provinces.
    Release date: 2001-06-29

  • Public use microdata: 82M0010X
    Description:

    The National Population Health Survey (NPHS) program is designed to collect information related to the health of the Canadian population. The first cycle of data collection began in 1994. The institutional component includes long-term residents (expected to stay longer than six months) in health care facilities with four or more beds in Canada, with the principal exclusion of the Yukon and the Northwest Territories. The document has been produced to facilitate the manipulation of the 1996-1997 microdata file containing survey results. The main variables include: demography, health status, chronic conditions, restriction of activity, socio-demographic, and others.

    Release date: 2000-08-02

  • Public use microdata: 75M0007X
    Description:

    The Absence from Work Survey was designed primarily to fulfill the objectives of Human Resources Development Canada. They sponsor the qualified wage loss replacement plan which applies to employers who have their own private plans to cover employee wages lost due to sickness, accident, etc. Employers who fall under the plan are granted a reduction in their quotas payable to the Unemployment Insurance Commission. The data generated from the responses to the supplement will provide input to determine the rates for quota reductions for qualified employers.

    Although the Absence from Work Survey collects information on absences from work due to illness, accident or pregnancy, it does not provide a complete picture of people who have been absent from work for these reasons because the concepts and definitions have been developed specifically for the needs of the client. Absences in this survey are defined as being at least two weeks in length, and respondents are only asked the three reasons for their most recent absence and the one preceding it.

    Release date: 1999-06-29

  • Public use microdata: 12M0010X
    Description:

    Cycle 10 collected data from persons 15 years and older and concentrated on the respondent's family. Topics covered include marital history, common-law unions, biological, adopted and step children, family origins, child leaving and fertility intentions.

    The target population of the GSS (General Social Survey) consisted of all individuals aged 15 and over living in a private household in one of the ten provinces.

    Release date: 1997-02-28

  • Public use microdata: 82F0001X
    Description:

    The National Population Health Survey (NPHS) uses the Labour Force Survey sampling frame to draw a sample of approximately 22,000 households. The sample is distributed over four quarterly collection periods. In each household, some limited information is collected from all household members and one person, aged 12 years and over, in each household is randomly selected for a more in-depth interview.

    The questionnaire includes content related to health status, use of health services, determinants of health and a range of demographic and economic information. For example, the health status information includes self-perception of health, a health status index, chronic conditions, and activity restrictions. The use of health services is probed through visits to health care providers, both traditional and non-traditional, and the use of drugs and other medications. Health determinants include smoking, alcohol use and physical activity. In the first survey, emphasis has also been placed on the collection of selected psycho-social factors that may influence health, such as stress, self-esteem and social support. The demographic and economic information includes age, sex, education, ethnicity, household income and labour force status.

    Release date: 1995-11-21

Analysis (27) (0 to 10 of 27 results)

  • Articles and reports: 11-522-X200600110424
    Description:

    The International Tobacco Control (ITC) Policy Evaluation China Survey uses a multi-stage unequal probability sampling design with upper level clusters selected by the randomized systematic PPS sampling method. A difficulty arises in the execution of the survey: several selected upper level clusters refuse to participate in the survey and have to be replaced by substitute units, selected from units not included in the initial sample and once again using the randomized systematic PPS sampling method. Under such a scenario the first order inclusion probabilities of the final selected units are very difficult to calculate and the second order inclusion probabilities become virtually intractable. In this paper we develop a simulation-based approach for computing the first and the second order inclusion probabilities when direct calculation is prohibitive or impossible. The efficiency and feasibility of the proposed approach are demonstrated through both theoretical considerations and numerical examples. Several R/S-PLUS functions and codes for the proposed procedure are included. The approach can be extended to handle more complex refusal/substitution scenarios one may encounter in practice.

    Release date: 2008-06-26

  • Articles and reports: 12-001-X200800110612
    Description:

    Lehtonen and Veijanen (1999) proposed a new model-assisted generalized regression (GREG) estimator of a small area mean under a two-level model. They have shown that the proposed estimator performs better than the customary GREG estimator in terms of average absolute relative bias and average median absolute relative error. We derive the mean squared error (MSE) of the new GREG estimator under the two-level model and compare it to the MSE of the best linear unbiased prediction (BLUP) estimator. We also provide empirical results on the relative efficiency of the estimators. We show that the new GREG estimator exhibits better performance relative to the customary GREG estimator in terms of average MSE and average absolute relative error. We also show that, due to borrowing strength from related small areas, the empirical BLUP (EBLUP) estimator exhibits significantly better performance relative to the customary GREG and the new GREG estimators. We provide simulation results under a model-based set-up as well as under a real finite population.

    Release date: 2008-06-26

  • Articles and reports: 11-522-X200600110432
    Description:

    The use of discrete variables having known statistical distributions in the masking of data on discrete variables has been under study for some time. This paper presents a few results from our research on this topic. The consequences of sampling with and without replacement from finite populations are one principal interest. Estimates of first and second order moments which attenuate or adjust for the additional variation due to masking of known type are developed. The impact of masking of the original data on the correlation structure of concomitantly measured discrete variables is considered and the need for the further development of results for analyses of multivariate data is discussed.

    Release date: 2008-03-17

  • Articles and reports: 12-001-X20060019262
    Description:

    Hidden human populations, the Internet, and other networked structures conceptualized mathematically as graphs are inherently hard to sample by conventional means, and the most effective study designs usually involve procedures that select the sample by adaptively following links from one node to another. Sample data obtained in such studies are generally not representative at face value of the larger population of interest. However, a number of design-based and model-based methods are now available for effective inference from such samples. The design-based methods have the advantage that they do not depend on an assumed population model, but do depend for their validity on the design being implemented in a controlled and known way, which can be difficult or impossible in practice. The model-based methods allow greater flexibility in the design, but depend on modeling of the population using stochastic graph models and also depend on the design being ignorable or of known form so that it can be included in the likelihood or Bayes equations. For both the design-based and the model-based methods, the weak point often is the lack of control in how the initial sample is obtained, from which link-tracing commences. The designs described in this paper offer a third way, in which the sample selection probabilities become step by step less dependent on the initial sample selection. A Markov chain "random walk" model idealizes the natural design tendencies of a link-tracing selection sequence through a graph. This paper introduces uniform and targeted walk designs in which the random walk is nudged at each step to produce a design with the desired stationary probabilities. A sample is thus obtained that in important respects is representative at face value of the larger population of interest, or that requires only simple weighting factors to make it so.

    Release date: 2006-07-20

  • Articles and reports: 12-002-X20060019253
    Description:

    Before any analytical results are released from the Research Data Centres (RDCs), RDC analysts must conduct disclosure risk analysis (or vetting). When reviewing all analytical output, RDC analysts apply Statistics Canada's disclosure control guidelines as a means of ensuring the protection of survey respondents' confidentiality. For some data sets, such as the Aboriginal Peoples Survey (APS), the Ethnic Diversity Survey (EDS), the Participation and Activity Limitation Survey (PALS) and the Longitudinal Survey of Immigrants to Canada (LSIC), Statistics Canada has developed an additional set of guidelines that involve rounding analytical results in order to ensure further confidentiality protection. This article will discuss the rationale for the additional rounding procedures used for these data sets, and describe the specifics of the rounding guidelines. More importantly, this paper will suggest several approaches to assist researchers in following these protocols more effectively and efficiently.

    Release date: 2006-07-18

  • Articles and reports: 75F0002M2006005
    Description:

    The Survey of Labour and Income Dynamics (SLID) is a longitudinal survey initiated in 1993. The survey was designed to measure changes in the economic well-being of Canadians as well as the factors affecting these changes.

    Sample surveys are subject to errors. As with all surveys conducted at Statistics Canada, considerable time and effort are taken to control such errors at every stage of the Survey of Labour and Income Dynamics. Nonetheless, errors do occur. It is the policy at Statistics Canada to furnish users with measures of data quality so that the user is able to interpret the data properly. This report summarizes a set of quality measures that has been produced in an attempt to describe the overall quality of SLID data. Among the measures included in the report are sample composition and attrition rates, sampling errors, coverage errors in the form of slippage rates, response rates, tax permission and tax linkage rates, and imputation rates.

    Release date: 2006-04-06

  • Articles and reports: 11-522-X20040018740
    Description:

    Illegal immigrants are difficult to sample in Italy since exhaustive sampling frames are generally unavailable. Sampling of centers is a strategy recently developed for surveying immigrant populations.

    Release date: 2005-10-27

  • Articles and reports: 11-522-X20040018751
    Description:

    This paper examines how adaptive sampling methods might be used to extend current national health surveys to enable effective tracking and monitoring of new forms of health threats and trace exposed persons.

    Release date: 2005-10-27

  • Articles and reports: 12-001-X20050018084
    Description:

    At national statistical institutes, experiments embedded in ongoing sample surveys are conducted occasionally to investigate possible effects of alternative survey methodologies on estimates of finite population parameters. To test hypotheses about differences between sample estimates due to alternative survey implementations, a design-based theory is developed for the analysis of completely randomized designs or randomized block designs embedded in general complex sampling designs. For both experimental designs, design-based Wald statistics are derived for the Horvitz-Thompson estimator and the generalized regression estimator. The theory is illustrated with a simulation study.

    Release date: 2005-07-21

  • Articles and reports: 12-001-X20050018087
    Description:

    In Official Statistics, the data editing process plays an important role in terms of timeliness, data accuracy, and survey costs. Techniques introduced to identify and eliminate errors from data need to consider all of these aspects simultaneously. Among others, a frequent and pervasive systematic error appearing in surveys collecting numerical data is the unity measure error. It strongly affects the timeliness, data accuracy and costs of the editing and imputation phase. In this paper we propose a probabilistic formalisation of the problem based on finite mixture models. This setting allows us to deal with the problem in a multivariate context, and also provides a number of useful diagnostics for prioritising cases to be investigated more deeply through a clerical review. Prioritising units is important in order to increase data accuracy while avoiding time wasted on following up units that are not really critical.

    Release date: 2005-07-21
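
    Illustrative note on entry 11-522-X200600110424 (ITC China Survey): the abstract describes estimating first and second order inclusion probabilities by simulation when refusing clusters are replaced by substitutes. The Python sketch below shows the general idea only. It is not the R/S-PLUS code from the paper, and it rests on simplifying assumptions: a toy frame of four clusters, a fixed and known set of refusing clusters, substitutes that never refuse, and size measures smaller than the sampling interval. All function names and data are hypothetical.

```python
import random
from itertools import combinations


def randomized_systematic_pps(sizes, n, rng):
    """Draw n units by randomized systematic PPS sampling.

    sizes maps unit -> positive size measure. Assumes every size is
    smaller than the sampling interval, so no unit is selected twice.
    """
    units = list(sizes)
    rng.shuffle(units)                        # randomize the frame order
    interval = sum(sizes.values()) / n        # sampling interval
    start = rng.random() * interval           # random start in [0, interval)
    points = [start + k * interval for k in range(n)]
    sample, cumulative, j = [], 0.0, 0
    for u in units:
        cumulative += sizes[u]
        # a unit is selected when a selection point falls in its size range
        while j < n and points[j] < cumulative:
            sample.append(u)
            j += 1
    return sample


def sample_with_substitution(sizes, n, refusers, rng):
    """Initial draw, then replace refusers from units outside that draw.

    Assumption for illustration: refusers is fixed and known, and the
    substitute units always agree to participate.
    """
    initial = randomized_systematic_pps(sizes, n, rng)
    kept = [u for u in initial if u not in refusers]
    n_missing = n - len(kept)
    if n_missing == 0:
        return kept
    pool = {u: s for u, s in sizes.items()
            if u not in initial and u not in refusers}
    return kept + randomized_systematic_pps(pool, n_missing, rng)


def estimate_inclusion_probs(sizes, n, refusers, reps, seed=0):
    """Monte Carlo estimates of first and second order inclusion probabilities."""
    rng = random.Random(seed)
    first = {u: 0 for u in sizes}
    second = {pair: 0 for pair in combinations(sorted(sizes), 2)}
    for _ in range(reps):
        drawn = sorted(set(sample_with_substitution(sizes, n, refusers, rng)))
        for u in drawn:
            first[u] += 1
        for pair in combinations(drawn, 2):
            second[pair] += 1
    return ({u: c / reps for u, c in first.items()},
            {p: c / reps for p, c in second.items()})


if __name__ == "__main__":
    toy_sizes = {"A": 10, "B": 20, "C": 30, "D": 40}   # hypothetical clusters
    pi1, pi2 = estimate_inclusion_probs(toy_sizes, 2, {"D"}, reps=20000)
    print(pi1)
    print(pi2)
```

    With no refusals, the estimated first order probabilities should approach n times the unit's size divided by the total size, which is the known value for systematic PPS sampling and gives a quick check on the simulation before substitution is switched on.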

Reference (0 results)

No content available at this time.
