Integrating Machine Learning into Coding of the 2021 Canadian Census Using fastText - ARCHIVED
Articles and reports: 11-522-X202100100010
As part of processing for the 2021 Canadian Census, the write-in responses to 31 census questions must be coded. Up to and including 2016, this was a three-stage process, with an "interactive (human) coding" step as the second stage. This human coding step is both lengthy and expensive, spanning many months and requiring the hiring and training of a large number of temporary employees. With this in mind, for 2021 this stage was either augmented with or replaced entirely by machine learning models using the "fastText" algorithm. This presentation will discuss the implementation of this algorithm, the challenges encountered, and the decisions taken along the way.
Key Words: Natural Language Processing, Machine Learning, fastText, Coding
Release date: November 5, 2021