Data Terminology
Data vs. Statistics
Data are numeric files created and organized for further analysis. There are two types of data – aggregate data and microdata.
Aggregate data are statistical summaries organized in a specific data file structure that permits further analysis. Aggregate data are delivered in a variety of formats, including CANSIM tables, Beyond 2020 files, and Excel spreadsheets.
Microdata consist of the raw data observed or collected from a specific unit of observation (e.g. individual respondents, households, or families). A microdata file is composed of individual records, each stored as a row of numbers, and each column holds the values for one variable. Microdata require processing before they are ready for interpretation. Microdata in <odesi> can be subset and downloaded for use in a variety of statistical analysis software packages, which are comprehensive systems for analysing data. Three of the packages most commonly used by researchers are SPSS (Statistical Package for the Social Sciences), SAS (Statistical Analysis System), and Stata.
This is an example of microdata from the Canadian Tobacco Use Monitoring Survey (2008). Each row (left to right) represents a respondent. Each column (top to bottom) represents responses to a particular question. The questions are coded along the top.
Statistics are the summary tables and cross-tabulations that have been produced from the raw data files. Statistics are often prepared for ready use and published in the form of e-publications, e-tables, and databases. (adapted from the Statistics Canada DLI Survival Guide)
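To illustrate how statistics are derived from microdata, here is a minimal Python sketch that counts respondents in each combination of two variables, which is the core of a cross-tabulation. The records, variable names ("sex", "smoker"), and numeric codes are invented for illustration and are not taken from any real survey file.

```python
from collections import Counter

# Hypothetical microdata: one record per respondent, one coded value per question.
# Variable names and codes are made up for this example.
records = [
    {"id": 1, "sex": 1, "smoker": 1},
    {"id": 2, "sex": 2, "smoker": 2},
    {"id": 3, "sex": 2, "smoker": 1},
    {"id": 4, "sex": 1, "smoker": 2},
    {"id": 5, "sex": 2, "smoker": 2},
]


def crosstab(rows, row_var, col_var):
    """Count the records that fall into each (row_var, col_var) combination."""
    return Counter((r[row_var], r[col_var]) for r in rows)


# The resulting counts are a statistic: a summary produced from the raw records.
table = crosstab(records, "sex", "smoker")
for (sex, smoker), n in sorted(table.items()):
    print(f"sex={sex} smoker={smoker} count={n}")
```

In practice this step would usually be done in a statistical package such as SPSS, SAS, or Stata; the sketch only shows the idea of summarizing individual records into a table.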
What is a Variable?
“A variable is a characteristic of a statistical unit being observed that may assume more than one of a set of values to which a numerical measure or a category from a classification can be assigned.”
(from Statistics Canada “Definitions, data sources, and methods”)
More generally, a variable is a set of factors, traits, or conditions that make sense together as a unit of analysis. For example, in this question, the variable is “marital status,” and it is made up of the conditions divorced, legally married and not separated, separated and legally married, never legally married, and widowed.
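In a microdata file, a variable like this is typically stored as a numeric code, with a codebook supplying the label for each code. The Python sketch below shows that idea using the marital status categories; the numeric codes, the name MARITAL_STATUS, and the fallback text are assumptions for illustration, not the coding used by any particular survey.

```python
# Hypothetical codebook entry: the numeric codes are assumed for illustration;
# only the category labels come from the marital status example.
MARITAL_STATUS = {
    1: "Divorced",
    2: "Legally married and not separated",
    3: "Separated and legally married",
    4: "Never legally married",
    5: "Widowed",
}


def label(code, codebook):
    """Return the category label for a coded value, or a fallback if unknown."""
    return codebook.get(code, "Unknown code")


# Coded responses as they might appear in one column of a microdata file.
responses = [2, 4, 4, 1, 5]
print([label(code, MARITAL_STATUS) for code in responses])
```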
Data Terminology Resources
- Statistics Canada’s definitions, data sources, and methods: The information is provided to ensure an understanding of the basic concepts that define the data, including variables and classifications; the underlying statistical methods and surveys; and key aspects of the data quality. Direct access to questionnaires is also provided.
- Statistics Canada Power from the Data! Glossary: The definitions provided here are, in some cases, oversimplifications of highly complex concepts. They provide information for those who have questions about statistics but who do not need highly technical explanations.
- StatSoft Statistics Glossary: StatSoft has freely provided the Electronic Statistics Textbook as a public service for more than 12 years. This textbook offers training in the understanding and application of statistics. The material was developed at the StatSoft R&D department based on many years of teaching undergraduate and graduate statistics courses and covers a wide variety of applications.
Citing <odesi> Data in APA
Author. (Year of publication/production). Title (Version number if relevant) [Data type]. Name of Producer if different from author [Producer]. Name of Distributor [Distributor]. Retrieved from URL
Microdata example:
Statistics Canada. (1993). Survey of Persons Not in the Labour Force, 1992 [Data file]. Data Liberation Initiative [Distributor]. Retrieved from http://search1.odesi.ca/details/view.html?q=survey+of+persons+not+in+the+labour+force&field=TI&coll=odesi&date-gt=1871&date-lt=2011&uri=/odesi/spnlf_71M0014_E_1992.xml
Aggregate data example:
Statistics Canada. (2008). Census of Population, 2006: Profile for Canada, Provinces, Territories, Census Divisions, Census Subdivisions and Dissemination Areas, Profile Series [Table]. Data Liberation Initiative [Distributor]. Retrieved from http://odesi.scholarsportal.info/documentation/CENSUS/2006/PROFILES/B2020/RAWDATA/CUMM/CAN/94-581-XCB2006002.IVT
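The citation pattern above can also be assembled programmatically. The following Python sketch is one possible way to do so; the function name apa_data_citation and its parameters are hypothetical, and the URL in the usage example is a placeholder rather than a real <odesi> address.

```python
def apa_data_citation(author, year, title, data_type, distributor, url,
                      version=None, producer=None):
    """Build a citation string following the Author/Year/Title pattern above."""
    parts = [f"{author}. ({year}). {title}"]
    if version:
        # Version number is included only if relevant.
        parts.append(f" ({version})")
    parts.append(f" [{data_type}].")
    if producer and producer != author:
        # The producer is named only when it differs from the author.
        parts.append(f" {producer} [Producer].")
    parts.append(f" {distributor} [Distributor].")
    parts.append(f" Retrieved from {url}")
    return "".join(parts)


# Usage with the fields of the microdata example and a placeholder URL.
citation = apa_data_citation(
    author="Statistics Canada",
    year=1993,
    title="Survey of Persons Not in the Labour Force, 1992",
    data_type="Data file",
    distributor="Data Liberation Initiative",
    url="http://example.org/dataset",  # placeholder, not a real record URL
)
print(citation)
```

Always check the generated string against the style guide required for your assignment or publication.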
Data Citation Resources
- How to Cite Statistics Canada Products: Citing data from Statistics Canada, step-by-step guidelines
- ICPSR Citing Data Guide: How and why should I cite data from ICPSR?
- IASSIST Quick Guide to Data Citation (PDF): A quick guide to citing data in APA, MLA, and Chicago.
- Citing Data Guide from MSU Libraries: Includes information about citing microdata and statistical tables in APA, ASA, and APSA.
Data Literacy Resources
More Ways to Find Data
- Open Data Portal (Canada)
- Data.gov (USA)
- ICPSR (International)
- IPUMS (International)
- OECD (International)
- UN Data (International)
PDF Handouts
- Searching for variables
- Workshop 1: Women’s Appearance
- Workshop 2: Work Stress
- <odesi> promotional handout
- <odesi> quick guide
Other Scholars Portal Data Resources
Scholars GeoPortal: The Scholars GeoPortal tool provides access to geospatial datasets, including land-based vector data, census geography, and orthophotography. From the GeoPortal, you can download geospatial data such as the census boundary files for a census metropolitan area, download the associated census data from <odesi>, and pull these datasets together in a GIS tool to create a map based on your data.
Dataverse: The Scholars Portal Dataverse network is a repository for research data collected by individuals and organizations associated with subscribing Canadian universities. The Dataverse platform allows researchers to deposit data, create appropriate metadata, and version documents. Access to data and supporting documentation can be controlled down to the file level, and researchers can choose to make content available publicly, only to select individuals, or to keep it completely locked.
Get Help at Your School
Each OCUL school has a local data services librarian who can help you search for and identify data sets, work with microdata files, and get in touch with Statistics Canada liaisons as necessary.
To get help at your institution, find and email your local data librarian.
If you are having technical problems, you can email <odesi> technical support.