5. Data quality

Spatial data quality elements provide information on the fitness-for-use of a spatial database by describing why, when and how the data are created, and how accurate the data are. The quality elements include an overview reporting on the lineage, positional accuracy, attribute accuracy, logical consistency and completeness. This information is provided to users for all spatial data products disseminated.

Lineage

Lineage describes the history of the spatial data, including descriptions of the source material from which the data were derived, and the methods of derivation. It also contains the dates of the source material, and all transformations involved in producing the final digital files.

The National Geographic Database (NGD) is a joint Statistics Canada-Elections Canada initiative to develop and maintain a spatial database that serves the needs of both organizations. The focus of the NGD is the continual improvement of the quality and currency of spatial coverage using updates from provinces, territories and local sources. The source files used for the creation of the road network file reside on Statistics Canada's Spatial Data Infrastructure (SDI), which was derived directly from data stored on the NGD.

The data in the 2014 Road Network File were derived from the SDI environment based on a copy of the NGD that contains the road network in Canada, as well as street attributes (name, type, direction, address ranges and class).

The files were verified for their spatial and attribute content, translated into French and English, and appropriately named according to the file naming convention. The geographic area unique identifier, name, type, and the relationships among the various geographic levels are found on the SDI.

Final data processing consisted of the conversion from the file geodatabase format, using FME® (Safe Software), into the following GIS file formats: ArcGIS® (.shp), Geography Markup Language (.gml) and MapInfo® (.tab).

Road information was incorporated from a variety of sources, including provincial datasets, municipal maps and field observation. The timeliness of the National Geographic Database varies from region to region depending on the source data.

Positional accuracy

Positional accuracy refers to the absolute and relative accuracy of the positions of geographic features. Absolute accuracy is the closeness of the coordinate values in a dataset to values accepted as or being true. Relative accuracy is the closeness of the relative positions of features to their respective relative positions accepted as or being true. Descriptions of positional accuracy include the quality of the final file or product after all transformations.

The Spatial Data Infrastructure is not Global Positioning System (GPS)-compliant. However, every possible attempt is made to ensure that the standard geographic area boundaries maintained in the Spatial Data Infrastructure respect the limits of the administrative entities that they represent (e.g., census division and census subdivision) or on which they are based (e.g., census metropolitan area or census agglomeration). The positional accuracy of these limits depends on the source materials used by Statistics Canada to identify the location of limits. In addition, due to the importance placed on relative positional accuracy, the positional accuracy of other geographic data (e.g., road network data and hydrographic data) that are stored within the Spatial Data Infrastructure is considered when positioning the limits of the standard geographic areas.

Absolute positional accuracy

Absolute positional accuracy describes the degree to which the position of features in a geographic database reflects their true position on the ground (i.e., the closeness of reported coordinate values to values accepted as true).
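As an illustration only, the closeness of reported coordinates to accepted true values is often summarized as a root-mean-square error (RMSE) over a set of check points. The sketch below assumes paired planar coordinates in metres; the function name and the sample points are hypothetical and do not describe a procedure applied to the Road Network File.

```python
import math


def horizontal_rmse(reported, reference):
    """Root-mean-square horizontal error between paired (x, y) points, in metres."""
    if len(reported) != len(reference) or not reported:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_errors = [
        (x_rep - x_ref) ** 2 + (y_rep - y_ref) ** 2
        for (x_rep, y_rep), (x_ref, y_ref) in zip(reported, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical check points: the first is displaced by 5 m, the second is exact.
reported = [(3.0, 4.0), (10.0, 10.0)]
reference = [(0.0, 0.0), (10.0, 10.0)]
rmse = horizontal_rmse(reported, reference)
```

A smaller RMSE indicates that the reported positions lie closer to the positions accepted as true.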

The 2014 Road Network File includes updates to the road network that were made using the following provincially-sourced data:

  • Ontario Road Network (ORN) in four census divisions in Ontario: Cochrane (3556), Thunder Bay (3558), Rainy River (3559) and Kenora (3560)
  • Alberta in nine census divisions: Division No. 1 (4801), Division No. 2 (4802), Division No. 7 (4807), Division No. 8 (4808), Division No. 9 (4809), Division No. 12 (4812), Division No. 13 (4813), Division No. 15 (4815) and Division No. 18 (4818)

The result of this effort is an improvement in the representation of the road network.

The information present in the Spatial Data Infrastructure road layer was developed for the purposes of statistical analysis and census operations. The absolute position of roads in the Spatial Data Infrastructure varies with the source files and documents used to build and maintain the database. Therefore, the road layer is not suitable for high-precision measurement applications such as engineering, property transfers, or other uses that might require highly accurate measurements of the Earth's surface.

Absolute positional accuracy is not a requirement for census processes.

Relative positional accuracy

Relative positional accuracy describes the degree to which the position of features in a geographic database reflects their true ground relationships.

For the National Geographic Database, relative positional accuracy is important. A road must appear in the proper position relative to other roads and physical features; however, no formal assessment of relative positional accuracy has been undertaken.
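To illustrate how relative accuracy differs from absolute accuracy, the sketch below compares the distances between pairs of features in a dataset with the same distances in a reference. It is a hypothetical example with invented feature labels and coordinates, not an assessment method used for the National Geographic Database.

```python
import math
from itertools import combinations


def relative_distance_errors(reported, reference):
    """Difference between reported and reference separation for each feature pair."""
    errors = {}
    for a, b in combinations(sorted(reported), 2):
        d_reported = math.dist(reported[a], reported[b])
        d_reference = math.dist(reference[a], reference[b])
        errors[(a, b)] = d_reported - d_reference
    return errors


# Hypothetical features: every reported point is shifted by (5, 5) metres.
reference = {"A": (0.0, 0.0), "B": (3.0, 4.0), "C": (0.0, 10.0)}
reported = {"A": (5.0, 5.0), "B": (8.0, 9.0), "C": (5.0, 15.0)}
errors = relative_distance_errors(reported, reference)
```

In this example every feature is displaced from its reference position, so absolute accuracy is poor, yet the separations between features are unchanged, so the relative positions are preserved.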

Attribute accuracy

Attribute accuracy refers to the accuracy of quantitative attributes and the correctness of non-quantitative attributes. No explicit testing for attribute accuracy is done; however, results from internal operations suggest a high degree of accuracy.

During maintenance operations, data entry goes through a data control process to ensure that attributes are properly associated with a specific geometric feature. The control covers both the association itself and its accuracy.

As noted under Lineage, the attributes (names, types and unique identifiers) for all standard geographic areas are sourced from Statistics Canada's Spatial Data Infrastructure. The names and types of administrative standard geographic areas have been updated using source materials from provincial and territorial authorities.

The class attribute is not updated on a regular basis; as such, quality checks are not performed to verify its accuracy.

Logical consistency

Logical consistency describes the fidelity of relationships encoded in the data structure of the digital spatial data. For example, a street arc that does not have a street name should not have a street type.
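The rule in the example above can be expressed as a simple check. The following is a minimal sketch that assumes each street arc is available as a dictionary; the field names ID, NAME and TYPE are hypothetical and may not match the attribute names in the Road Network File.

```python
def is_blank(value):
    """Treat None and empty or whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def find_type_without_name(arcs):
    """Return arcs that carry a street type but no street name."""
    return [
        arc for arc in arcs
        if is_blank(arc.get("NAME")) and not is_blank(arc.get("TYPE"))
    ]


# Hypothetical street arcs: only the third one breaks the rule.
arcs = [
    {"ID": 1, "NAME": "Main", "TYPE": "ST"},
    {"ID": 2, "NAME": "", "TYPE": ""},
    {"ID": 3, "NAME": None, "TYPE": "AVE"},
]
violations = find_type_without_name(arcs)
```

An empty result would indicate that the records satisfy this particular rule; other consistency rules would need their own checks.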

The 2014 Road Network File was verified against data in the Spatial Data Infrastructure and found to be logically consistent.

Consistency with other products

The positions of the arcs in the 2014 Road Network File are not necessarily consistent with previous editions of boundary files or road network files as a result of updates made using provincially and territorially sourced data.

Topology checks were performed with the 2014 Road Network File and the 2014 Census Subdivision Boundary File to measure the degree of integration between these products. The results indicated that the degree of integration was within the default tolerance parameters as defined below.

Tolerance: 0.001 metres
Resolution: 0.0001 metres
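
The sketch below shows one common reading of these two parameters, stated here as an assumption: resolution is the smallest coordinate increment that is stored, and tolerance is the distance within which two vertices are treated as coincident. The function names are hypothetical, and the code does not reproduce the topology checks performed on the products.

```python
import math

TOLERANCE_M = 0.001    # tolerance quoted above, in metres
RESOLUTION_M = 0.0001  # resolution quoted above, in metres


def snap_to_resolution(value, resolution=RESOLUTION_M):
    """Round a coordinate to the nearest multiple of the resolution (assumed meaning)."""
    return round(value / resolution) * resolution


def coincident(p, q, tolerance=TOLERANCE_M):
    """True if two (x, y) vertices lie within the tolerance of each other."""
    return math.dist(p, q) <= tolerance


# Hypothetical vertices from a road arc and a boundary, in metres.
near = coincident((100.0, 200.0), (100.0004, 200.0003))  # about 0.0005 m apart
far = coincident((100.0, 200.0), (100.002, 200.0))       # about 0.002 m apart
```

Under this reading, the first pair of vertices would be treated as the same location and the second pair would not.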

Completeness

Completeness refers to the degree to which geographic features, their attributes and their relationships are included or omitted in a dataset. It also includes information on selection criteria, definitions used, and other relevant mapping rules.
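
As an illustration, inclusion and omission can be quantified by comparing feature identifiers in a dataset against a reference list. The sketch below is hypothetical: the identifiers are invented, and the terms omission and commission are borrowed from general data quality practice rather than from a measure reported for this product.

```python
def completeness_rates(dataset_ids, reference_ids):
    """Share of reference features missing from the dataset (omission) and
    share of dataset features absent from the reference (commission)."""
    dataset_ids, reference_ids = set(dataset_ids), set(reference_ids)
    if not dataset_ids or not reference_ids:
        raise ValueError("both identifier collections must be non-empty")
    omitted = reference_ids - dataset_ids
    extra = dataset_ids - reference_ids
    return {
        "omission_rate": len(omitted) / len(reference_ids),
        "commission_rate": len(extra) / len(dataset_ids),
    }


# Hypothetical identifiers: feature 4 is missing and feature 5 is extra.
rates = completeness_rates(dataset_ids=[1, 2, 3, 5], reference_ids=[1, 2, 3, 4])
```

Here one of four reference features is missing and one of four dataset features has no counterpart, so both rates are 0.25.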

New road features have been added to the National Geographic Database in order to create a more complete road layer and are present in this edition of the road network file.
