### RDM Services at Your Library

Need help managing your research data? Staff at your local institution’s library can provide assistance with all phases of the data lifecycle, which may include:

- developing data management plans
- documenting research data
- sharing and long-term preservation of data
- using online research data repositories (including Scholars Portal Dataverse)
- publishing options
- author rights

Please select your institution below to contact local research data services:

#### Eastern Canada

Cape Breton University

Dalhousie University

Memorial University

St. Francis Xavier University

#### Ontario

Algoma University

Brock University

Carleton University

Lakehead University

Laurentian University

McMaster University

Nipissing University

OCAD University

Ontario Tech University

Queen’s University

Royal Military College

Ryerson University

Trent University

University of Guelph

University of Ottawa

University of Toronto

University of Waterloo

University of Windsor

Western University

Wilfrid Laurier University

York University

#### Quebec

Bishop’s University

Concordia University

École de technologie supérieure

École nationale d’administration publique

HEC Montréal

Institut national de la recherche scientifique

McGill University

Polytechnique Montréal

Université Laval

Université de Montréal

Université du Québec à Chicoutimi

Université du Québec à Montréal

Université du Québec à Rimouski

Université du Québec à Trois-Rivières

Université du Québec en Abitibi-Témiscamingue

Université du Québec en Outaouais

Université de Sherbrooke

Université TÉLUQ

#### Western Canada

Brandon University

Mount Royal University

Royal Roads University

Trinity Western University

University of British Columbia

University of Calgary

University of Northern British Columbia

University of Regina

University of Victoria

Vancouver Island University

### Terminology

**Cases** – These are the units of analysis, things that have certain characteristics or properties. For example, the cases could be individuals in a statistics class, all residents of Chicago, hamburgers, cities, countries, organizations, or lakes. We want to reach a conclusion about their characteristics.

**Data** refers to quantitative data *and* research files more broadly (e.g. field notes, ethnographic descriptive text, images). Dataverse accepts all kinds of data and files.

**Tabular data** is quantitative data (numbers) arranged in a table. Dataverse can only run statistical analyses on tabular data files. Accepted file formats are: SPSS/POR, SPSS/SAV, Stata, CSV (w/SPSS card), and TAB (w/DDI). Dataverse will maintain the usability of tabular data files over time. For example, if .sav files become obsolete, Dataverse will republish deposited data in new usable formats.

**Network data** is represented in XML files. These files contain information about network properties (nodes, edges). Network data is used for network analysis (e.g. social network analysis). Dataverse can visualize network data from GraphML files.
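As a rough illustration of what a GraphML file holds, the sketch below reads the nodes and edges from a tiny, made-up network using Python's standard library. The graph, the node names, and the variable names are hypothetical and chosen only for this example; this is not how Dataverse itself processes GraphML.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical friendship network written in GraphML.
GRAPHML_TEXT = """
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="friends" edgedefault="undirected">
    <node id="ana"/>
    <node id="ben"/>
    <node id="cam"/>
    <edge source="ana" target="ben"/>
    <edge source="ben" target="cam"/>
  </graph>
</graphml>
""".strip()

# GraphML elements live in this XML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

root = ET.fromstring(GRAPHML_TEXT)

# Nodes are the things in the network; edges are the ties between them.
nodes = [node.get("id") for node in root.findall(".//g:node", NS)]
edges = [(edge.get("source"), edge.get("target"))
         for edge in root.findall(".//g:edge", NS)]

print("Nodes:", nodes)
print("Edges:", edges)
```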

**Frequency** – The frequency for a value is the number of cases that fall into the category and is also called a “count”.
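A minimal sketch of a frequency count, assuming a small made-up sample of hamburgers recorded by how each was COOKED. The data and variable names are hypothetical and serve only to illustrate the definition.

```python
from collections import Counter

# Hypothetical sample: one entry per case (hamburger), recording the value
# of the variable COOKED for that case.
cooked = ["rare", "medium", "medium", "well-done", "medium", "rare"]

# Counter tallies how many cases fall into each category (the frequency).
frequencies = Counter(cooked)

for value, count in frequencies.most_common():
    print(f"{value}: {count}")
```

Here each distinct entry is a value of the variable, and the number printed beside it is that value's frequency.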

**Metadata** – Text that describes your research study. Metadata fields include the abstract, keywords, and data collection mode (among others). All metadata fields in Dataverse are defined on the site itself and are compliant with the DDI standard schema version 2. For an overview of DDI standards, visit ddialliance.org. To view a complete list of DDI fields in Dataverse, see this document (PDF).

**Values** – These are the possible outcomes for a single variable. They differ from case to case. Values can be numbers or named categories. For example, the variable GENDER traditionally has two values, “man” and “woman”. Some people (cases) are men, and some are women.

**Variable** – This is the characteristic or property in which we are interested. It is a characteristic that pertains to the cases. A variable must be able to take on different values for different cases. Variables include characteristics like people’s GENDER, people’s HEIGHT, the DEPTHS of lakes, lake TEMPERATURES, organizations’ REVENUES, and whether a hamburger is COOKED rare, medium, or well-done. Often we look at two or more different variables at a time and ask whether they are related for a specific set of cases. For example, we might want to know if GENDER is related to HEIGHT among human beings or if TEMPERATURE is related to DEPTH for lakes.

**Variable: Character** – In this level of measurement, the values of the variable are “qualities” or categoric pigeonholes, which may or may not be orderable. These categoric values can be given code numbers, but the numbers do not refer to an equal-interval scale or to real quantities. Generally, we cannot compute a mean or other quantitative summary measures for the variable. These categories should be exhaustive and mutually exclusive.

**Variable: Continuous** – This level of measurement is like the interval-ratio level. The values of the variable are quantitative, definite meaningful numbers on a scale. Furthermore, we can think of them as points along a continuum that can be subdivided forever. Measuring length or distance with a ruler is a simple example of collecting data at a continuous level of measurement. Most researchers treat percentages and other kinds of proportions as continuous data. It makes sense to compute a mean and other quantitative summary measures for these data.

**Variable: Discrete** – These are quantitative variables whose values fall along a scale or metric, often with a true 0, but they are not really continuous. The units of measurement are whole numbers, and it makes little sense to indefinitely subdivide the units. Only a whole number makes sense for the value. For instance, people generally don’t have a fraction of a sibling or a fractional number of body piercings, only whole numbers. These data are discrete, but notice that the numbers do refer to a real scale (not just code numbers), and most researchers end up treating them as if they were continuous data. It makes sense to compute a mean (and other quantitative summary measures). For example, we can talk about the “mean number of children born to women in the Yukon” and come up with a fractional amount although each woman has only a whole number of children.

from Garner, R. (2005). *The joy of stats: A short guide to introductory statistics in the social sciences*. Peterborough, Ont: Broadview Press.
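To make the point about discrete variables concrete, here is a small sketch that computes the mean of a whole-number variable. The sibling counts are invented for illustration; the point is only that the mean can be fractional even though every individual value is a whole number.

```python
from statistics import mean

# Hypothetical sample: number of siblings reported by six people (cases).
# Every value is a whole number, because the variable is discrete.
siblings = [0, 1, 1, 2, 3, 2]

# The mean is the total (9) divided by the number of cases (6).
mean_siblings = mean(siblings)

print(f"Mean number of siblings: {mean_siblings}")
```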

### Other Guides

- Scholars Portal: odesi
- Harvard Dataverse Project: Advanced User Guide
- Portage Training Resources

### Resources for Statistics and Data

**Introductory:**

Garner, R. (2010). *The joy of stats: A short guide to introductory statistics in the social sciences*. Toronto, Ont: University of Toronto Press.

Rowntree, D. (2000). *Statistics without tears: A primer for non-mathematicians*. London: Penguin.

**Intermediate:**

Blaikie, N. W. H. (2003). *Analyzing quantitative data: From description to explanation*. London: Sage Publications.

Erickson, B. H., & Nosanchuk, T. A. (1992). *Understanding data* (2nd ed.). Toronto, ON: University of Toronto Press.