Working with Data

Each dataset page contains the following two options: “Explore Variables” and “View All Files”. 

 

Using Data Explorer 

Disclaimer about using Data Explorer: We always recommend that you consult the original data files and documentation when working with data in Odesi and conducting data analysis for your research. 

To view a dataset in the Data Explorer tool, select the “Explore Variables” button to open a new window and begin working with data.

Data Explorer lets you download, subset, view, visualize, and analyze data in a web browser. You can view variable and question information, summary frequencies, and statistics, and you can perform cross tabulations of two or more variables.

 

Full View

 

Variable Groups

Variable groups provide a way for you to find variables of interest by topic or category. Clicking on a variable group in the left-hand column will display only variables belonging to that particular group.

 

Exploring Variables

You can view summary frequencies, categories, and statistics for each variable in the dataset.

 

Summary Statistics 

To view summary statistics for one or more variables, start by selecting the checkboxes to the left of those variables.

Then select the eye icon at the right of the corresponding row, under the “Summary Statistics” column.

A pop-up window will appear, showing the variable’s mean, median, standard deviation, minimum, maximum, total valid count, and total invalid count.
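If you download the data and want to check these figures yourself, the same statistics can be sketched in Python. This is a minimal illustration, not Data Explorer's own implementation: the function name `summarize`, the sample values, the use of `None` to mark an invalid response, and the choice of the sample (rather than population) standard deviation are all assumptions.

```python
import statistics

def summarize(values):
    """Summarize a numeric variable; None is assumed to mark an invalid response."""
    valid = [v for v in values if v is not None]
    return {
        "mean": statistics.mean(valid),
        "median": statistics.median(valid),
        "stdev": statistics.stdev(valid),  # sample standard deviation (assumption)
        "min": min(valid),
        "max": max(valid),
        "valid": len(valid),
        "invalid": len(values) - len(valid),
    }

# Hypothetical responses to an "age" question; two respondents did not answer.
ages = [23, 35, None, 41, 29, None, 52]
summary = summarize(ages)
print(summary)
```

If your figures differ slightly from the pop-up window, check the original documentation for how missing values and weights are handled in that dataset.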

 

Frequencies and Categories

To view the summary frequencies or categories for one or more variables, select the eye icon at the right of the corresponding row, under the “View Categories” column.

A pop-up window will appear, showing a horizontal bar graph; a table with the columns values, categories, count, and weighted count; and a summary of statistics for the variables.
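The difference between the count and the weighted count can be illustrated with a short Python sketch. It assumes each response comes with a survey weight; the function name `frequency_table`, the categories, and the weights are hypothetical, and the weight variable to use in practice depends on the dataset's documentation.

```python
from collections import defaultdict

def frequency_table(responses, weights):
    """Return {category: (count, weighted_count)} for a categorical variable."""
    counts = defaultdict(int)
    weighted = defaultdict(float)
    for value, weight in zip(responses, weights):
        counts[value] += 1          # each respondent counts once
        weighted[value] += weight   # each respondent counts as many as their weight
    return {value: (counts[value], weighted[value]) for value in counts}

# Hypothetical responses and survey weights.
responses = ["Car", "Bus", "Car", "Bicycle", "Bus", "Car"]
weights = [1.5, 2.0, 1.0, 0.5, 1.0, 2.5]
table = frequency_table(responses, weights)

for category, (count, weighted_count) in table.items():
    print(category, count, weighted_count)
```

The count describes the respondents in the file, while the weighted count estimates how many members of the population they represent.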

 

Cross Tabulations

Cross tabulation is a simple way to examine the relationship between two or more variables.

A cross tabulation is a table with two or more dimensions. Each cell displays the frequency, or number of respondents, with the combination of characteristics described by that cell.

To access this feature, select “Cross Tabulation” located above the table, on the upper right side.

A pop-up window will appear, listing all the variables in the left column.

When performing a cross tabulation, each variable can be added to either the horizontal rows or the vertical columns. Add variables using the arrows located between the “All variables” column and the “Rows” and “Columns” fields. The pink arrow at the top corresponds to the “Rows” field, while the pink arrow at the bottom corresponds to the “Columns” field.

Once you are ready to perform cross tabulation, select the “OK” button.
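To show what a cross tabulation contains, here is a minimal Python sketch that counts respondents for each combination of a row variable and a column variable. The function name `cross_tabulate` and the sample data are hypothetical, and the sketch produces unweighted counts only; it is not how Data Explorer builds its tables.

```python
from collections import Counter

def cross_tabulate(rows, cols):
    """Count respondents for every (row category, column category) pair."""
    cells = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    # Counter returns 0 for combinations that never occur.
    return {r: {c: cells[(r, c)] for c in col_labels} for r in row_labels}

# Hypothetical data: one entry per respondent in each list.
region = ["East", "West", "East", "East", "West"]
mode = ["Car", "Bus", "Bus", "Car", "Car"]
table = cross_tabulate(region, mode)

for row_label, counts in table.items():
    print(row_label, counts)
```

Here `region` plays the role of the “Rows” field and `mode` the role of the “Columns” field.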

 

Understanding Types of Variables

A variable is a measurable characteristic that can assume different values. Variables can be either numeric or categorical. 

 

Numeric Variables 

Numeric variables, or quantitative variables, consist of numbers. Numeric variables are classified into two subcategories: continuous and discrete.

Continuous variables can assume an infinite number of real values within a given interval. For instance, a student’s height cannot be negative or greater than three metres, but it can assume an infinite number of values within this range.

Discrete variables can assume only a finite number of real values within a given interval. For example, consider the score given by a judge to a gymnast in competition, where the score ranges from 0 to 10. The range of possible values can be enumerated (0, 0.1, 0.2, etc.), and the number of possible values is finite.
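The gymnastics example can be checked by enumerating the scores. This short Python sketch assumes, as in the example above, that scores run from 0 to 10 in steps of 0.1; the variable name `possible_scores` is illustrative.

```python
# Build the scores from integers to avoid accumulating floating-point error.
possible_scores = [i / 10 for i in range(101)]

print(len(possible_scores))                     # number of distinct scores
print(possible_scores[:3], possible_scores[-1])
```

The list is finite and can be written out in full, which is what makes the variable discrete. No such list exists for a continuous variable such as height.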

 

Categorical Variables

Categorical variables, or qualitative variables, consist of categories representing data characteristics. Categorical variables are classified into two subcategories: nominal and ordinal.

A nominal variable consists of categories, labels, or names that lack a natural order. For example, the mode of transportation taken by commuters is a nominal variable.

An ordinal variable refers to values established based on an order relation between different categories. The variable “Behaviour” is ordinal because the category “Excellent” is positioned above the category “Very good,” which is positioned above the category “Good,” and so on. Although there is a natural ordering, it is limited since we cannot determine the degree to which “Excellent” exceeds “Very good.” 
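One way to work with an ordinal variable in code is to record the order of its categories explicitly. The Python sketch below uses only the three “Behaviour” categories named above; the names `BEHAVIOUR_ORDER`, `RANK`, and `is_higher`, and the sample ratings, are hypothetical.

```python
# Categories listed from lowest to highest (only those named in the text).
BEHAVIOUR_ORDER = ["Good", "Very good", "Excellent"]
RANK = {label: position for position, label in enumerate(BEHAVIOUR_ORDER)}

def is_higher(a, b):
    """Return True if category a is positioned above category b."""
    return RANK[a] > RANK[b]

# Hypothetical ratings, sorted by category order rather than alphabetically.
ratings = ["Good", "Excellent", "Very good", "Good"]
sorted_ratings = sorted(ratings, key=RANK.__getitem__)

print(is_higher("Excellent", "Very good"))
print(sorted_ratings)
```

The ranks support comparison and sorting only. Subtracting one rank from another does not measure how much “Excellent” exceeds “Very good”.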

 

Odesi Terms and Licensing 

Odesi provides access to data from a number of sources, including the Statistics Canada Data Liberation Initiative (DLI). Licensing agreements are established through support from academic libraries and library consortia, and each has its own Terms and Conditions for use. Please review the licensing conditions for each dataset to ensure appropriate use of Odesi’s data collections.

More information about Terms of Use can be found on the Odesi site.

 

Odesi APIs

More information on Odesi APIs will be available soon.