The Scholars Portal Dataverse is a repository primarily for research data collected by researchers and organizations affiliated with Ontario universities, although anyone in the world is welcome to use Scholars Portal Dataverse to deposit, share, and archive data.
Dataverse is an open-source tool developed by the Institute for Quantitative Social Science (IQSS) at Harvard University. Scholars Portal Dataverse is provided as a shared Ontario Council of University Libraries service.
Researchers can use Dataverse to directly deposit data, create metadata, release and share data openly or privately, visualize and explore data, and search for data.
For more information about Scholars Portal Dataverse, please contact email@example.com.
You can find the Advanced Dataverse User Guide here.
What is a Dataverse?
A Dataverse is a container for one or more Datasets or Dataverses. Each Ontario University has a Dataverse that contains many Dataverses and Datasets. Researchers can create Dataverses for their own research data and projects, and/or directly deposit Datasets within their Institutional Dataverse.
A Dataverse accepts all kinds of data files: tabular, text, image, etc. All file formats are accepted.
What is a Dataset?
A Dataset is a container for a particular research data set (this can include research data, code, and documentation).
Datasets have an associated metadata record (also referred to as cataloging information or data documentation). This metadata provides contextual information on the dataset. Please see here for more information on creating metadata for datasets.
Why use Dataverse?
Some key benefits to using Dataverse to manage your research data include:
- Effective sharing. Dataverse is a convenient way to disseminate your data, and can facilitate your research team’s collaboration within a secure space.
- Track changes. Dataverse provides increased control over managing changes to a project without overwriting any part of that project, an especially useful feature when working on a team.
- Long-term access and preservation. Persistent identifiers for your data provide reliable long-term access and help guard against data obsolescence.
- Organization and compatibility. Create your own personal web data archive that conforms to metadata standards to maximize system compatibility and searchability.
- Save time. Dataverse has an easy-to-use interface for uploading and searching through your data.
- Increase research visibility. Increase scholarly recognition for your work beyond your research publications.
- Meet grant requirements. Many funding agencies now require that researchers deposit data collected as part of their research project into an archive.
Crosas, M. 2011. The Dataverse Network: An Open-source Application for Sharing, Discovering and Preserving Data. D-Lib Magazine 17(1/2).
King, G. 2007. An Introduction to the Dataverse Network as an Infrastructure for Data Sharing. Sociological Methods and Research 36: 173–199. Available at http://j.mp/iHJcAa
Who can deposit data?
At this time, anyone can sign up for an account and deposit data with Dataverse, although the service is primarily for those affiliated with an Ontario University, including students, researchers, faculty, and staff.
Dataverse also lets groups of researchers upload data and collaborate on a study or Dataverse together. This is useful for large projects with multiple researchers in different locations. For more information about how to use group permissions and assign account roles, see the Permissions sections of this guide or visit the Advanced Dataverse User Guide created by Harvard.
Does this service cost anything?
The Scholars Portal Dataverse service is paid for by the Ontario Council of University Libraries (OCUL). There is no direct cost to researchers or other users.
Are my data protected and safe?
Dataverse allows you to upload data and store it in case your local copy is lost or destroyed. It is generally a good idea to keep a copy of your data in Dataverse.
Any data uploaded to SP Dataverse can be restricted to authorized users only. You can easily restrict your Dataverse and studies so that they are private, or available only to certain IP addresses, to individual accounts, or to specific groups (see the Permissions sections of this Guide for further information). Security measures are in place to protect your data, including sensitive data, from anyone attempting to access or exploit data they are not authorized to use.
Sharing data openly is considered good scientific practice and in most academic disciplines it is actively encouraged, if not required. Please do your due diligence and attempt to make the outputs of your research open for all to see and reuse.
Scholars Portal makes backup copies of the data you upload to Dataverse regularly in the event of a server or system malfunction, malicious attack, or other technical issue.
Dataverse allows you to upload and share your data with anyone in the world. It uses the DOI system, an international standard for simple, persistent identifiers. Even if the location or version of your data changes, the same URL will always point to the most current version.
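As a rough illustration of how a DOI becomes a stable link, the sketch below turns a DOI string into a resolver URL using the public https://doi.org/ resolver. The function name `doi_to_url` and the DOI shown are hypothetical and used only for illustration; they do not refer to a real dataset.

```python
def doi_to_url(doi: str) -> str:
    """Turn a DOI (with or without a leading 'doi:') into a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # The resolver redirects to wherever the dataset currently lives.
    return "https://doi.org/" + doi


# Hypothetical DOI, for illustration only.
print(doi_to_url("doi:10.1234/EXAMPLE/ABC123"))
```

Because the link goes through the resolver rather than pointing at a server directly, it can stay the same even if the dataset is moved.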
Creating a Private URL for your dataset allows you to share your dataset (for viewing and downloading of files) before it is published. It is not necessary to have a Dataverse user account to view the dataset; simply share the link with groups or individuals to allow them to view your data. Please see the Advanced User Guide here for more information on creating your Private URL.
You are also able to provide groups and individuals access to your data using dataset-level and file-level permissions. This approach works for both published and unpublished datasets.
I’d like to use Dataverse to collaborate with my research team, but don’t want to release my data. Does my dataset or dataverse need to be published before I can share it?
No, datasets or dataverses do not need to be published in order to share them. Please see the question above for more details.
Should I deposit my data into an institutional dataverse or into the Scholars Portal root dataverse?
Researchers may choose to deposit their data in an institutional dataverse if they want to share their work with their institution (e.g. to facilitate collaboration, meet grant requirements, etc.).
The Scholars Portal root dataverse may be used to deposit data if researchers are not affiliated with one of the listed institutions, if they are conducting personal research, or if the research being conducted is multi-institutional.
Can I cite my data in a research publication?
Dataverse generates a standard data citation that can be used in any research publication, so you can always be given credit. Users can download the citation in various formats by using the “Cite Data” button. The citation includes a persistent URL that will point people quickly and easily to your study in Dataverse.
Datasets and dataverses deposited after the upgrade to Dataverse 4 are assigned a Digital Object Identifier (DOI) that conforms to the DataCite Canada principles for dataset identifiers.
Does Scholars Portal’s Dataverse share information about my data with any third-party providers? (e.g. Google)
By default, any data that are uploaded and released openly are harvestable by other Dataverses, repositories, search engines, etc. The Dataverse platform supports the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), which allows published and public files to be harvested by other systems for the purposes of data discovery.
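For readers curious what harvesting looks like from the other side, here is a minimal sketch of an OAI-PMH client step. The `ListRecords` verb and the `metadataPrefix` and `set` parameters come from the OAI-PMH specification; the base URL, the function names, and the sample response are assumptions made for illustration and are not the actual Scholars Portal endpoint or its output.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def record_identifiers(xml_text):
    """Extract the record identifiers from a ListRecords response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]


# Made-up response body, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>doi:10.1234/EXAMPLE/A</identifier></header></record>
    <record><header><identifier>doi:10.1234/EXAMPLE/B</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

# Hypothetical endpoint, for illustration only.
print(list_records_url("https://example.org/oai"))
print(record_identifiers(SAMPLE_RESPONSE))
```

A real harvester would also fetch the URL over HTTP and follow the `resumptionToken` that the protocol uses for paging; both are left out to keep the sketch short.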
The Data Sharing API is another way developers can access publicly shared data and metadata for use in other systems or applications.
Scholars Portal has been working with the Shared Access Research Ecosystem (SHARE) to integrate publicly released Dataverse studies into their beta SHARE Notification System. The SHARE Notification System is a higher education-based initiative to strengthen efforts to identify, discover, and track research outputs. Its goal is to aggregate and make discoverable many types of research outputs (including data). Scholars Portal’s Dataverse will be the first instance of Dataverse feeding metadata into the Notification System. This does not affect the workflow for contributors to the Scholars Portal Dataverse.
What is the maximum file size I can upload?
The maximum file size you can upload to Dataverse through the user interface is 2GB per file. This is typical of most online repository systems. Please note that this is the upper limit, and performance will depend on your system and network settings.
For data that are uploaded as “subsettable” files through the UI (for exploring, visualization, and subsetting from the “Data & Analysis” tab), file size limits depend on the number of columns in your dataset (for tabular datasets) as well as the number of cases or rows in your file. The 2GB restriction does not apply to subsettable files; instead, uploads typically fail if the file is larger than 400 to 500MB.
If you have larger datasets, there is also the Dataverse SWORD API, which can support uploads of files up to 2GB.
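The limits described above can be checked before you start an upload. The sketch below is one way to do that, under stated assumptions: it reads 2GB and 400MB as decimal units, treats the lower end of the 400 to 500MB range as a warning threshold, and uses an illustrative list of tabular file extensions. The names `check_upload`, `MAX_FILE_BYTES`, `TABULAR_WARN_BYTES`, and `TABULAR_EXTENSIONS` are hypothetical and are not part of Dataverse.

```python
from pathlib import PurePath

MAX_FILE_BYTES = 2 * 1000**3        # 2GB per-file limit (decimal GB assumed)
TABULAR_WARN_BYTES = 400 * 1000**2  # lower end of the 400 to 500MB range
# Illustrative extensions for Stata, SPSS, R, Excel and CSV files (an assumption).
TABULAR_EXTENSIONS = {".dta", ".sav", ".por", ".rdata", ".xlsx", ".csv"}


def check_upload(name, size_bytes):
    """Classify a file as 'ok', 'may fail as subsettable', or 'too large'."""
    if size_bytes > MAX_FILE_BYTES:
        return "too large"
    extension = PurePath(name).suffix.lower()
    if extension in TABULAR_EXTENSIONS and size_bytes > TABULAR_WARN_BYTES:
        return "may fail as subsettable"
    return "ok"


for name, size in [("survey.dta", 450_000_000), ("photo.tif", 450_000_000)]:
    print(name, "->", check_upload(name, size))
```

To check real files, pass in the size reported by `os.path.getsize`.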
In all cases, it is recommended that file size be considered when assessing data reuse, as larger files are more difficult for researchers to handle and work with. Preparing and structuring your data for reuse requires careful consideration of the end users, the software required for analysis, and download and storage space requirements. Nevertheless, some datasets are simply large, and some disciplines routinely work with larger datasets. Dataverse is continually being developed to support more disciplines and data types, so it is possible that larger datasets will be better managed by Dataverse in the future.
If you have questions about file size limits and upload of data, please contact us at firstname.lastname@example.org.
What if my dataset is too large to upload? Is there support?
The Dataverse SWORD API, which can support uploads of files up to 2GB, is another way to upload data to the system. If you have larger files, consider zipping them or breaking them into logical parts (by year, topic, etc.) so that others may easily reuse them, while still maintaining the ability to reproduce the full data.
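As one possible way to break a large tabular file into logical parts, the sketch below splits a CSV file into one file per value of a chosen column, such as a year. It assumes the file has a header row, is UTF-8 encoded, and that the column values are safe to use in file names. The function name `split_csv_by_column` and the default column name `year` are hypothetical.

```python
import csv
from pathlib import Path


def split_csv_by_column(src, out_dir, column="year"):
    """Write one CSV per distinct value of `column`; return {value: path}."""
    src, out_dir = Path(src), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles, writers, paths = {}, {}, {}
    try:
        with src.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            # Rows are streamed one at a time, so the source file is never
            # loaded into memory in full.
            for row in reader:
                key = row[column]
                if key not in writers:
                    paths[key] = out_dir / f"{src.stem}_{key}.csv"
                    handles[key] = paths[key].open("w", newline="", encoding="utf-8")
                    writers[key] = csv.DictWriter(handles[key], fieldnames=reader.fieldnames)
                    writers[key].writeheader()
                writers[key].writerow(row)
    finally:
        for handle in handles.values():
            handle.close()
    return paths
```

Every part keeps the original header row, so concatenating the parts (minus the repeated headers) reproduces the original rows, although not necessarily in the original order. One output file stays open per distinct value, so this approach suits columns with a modest number of values.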
If this is not possible, please contact us at email@example.com for support.
What can I upload?
Dataverse is ideal for depositing and sharing research data, and supports a variety of research data types and formats. You are welcome to upload research data, supplementary tables and documentation, publications associated with data, presentations, etc.
What types of files can I upload?
Dataverse supports the uploading of any file type. This includes tabular data files (Stata, SPSS, R, Excel (xlsx) and CSV), shapefiles, Flexible Image Transport System files, and compressed files.
How much data can I download from Dataverse?
You can download as much openly available data as you would like from Dataverse; there are no restrictions. You can also harvest the metadata and data of all open and released Dataverses and studies using the Dataverse Data Sharing API.
There are some restrictions on downloading certain files and on selecting multiple files to download at once. Dataverse lets you select multiple files, or use a grouped file download, so that you can download all the files associated with a particular study at once. A grouped file download is delivered as a .zip package and is limited to 500MB. If the study’s data total more than 500MB, a log is included in your download that tells you which files were successfully downloaded within this limit.
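If a study is larger than the grouped download limit, you can plan several smaller selections in advance. The sketch below groups files, in listing order, into batches whose combined size stays within 500MB. It assumes 500MB means decimal megabytes and that the limit applies to the total size of the selected files before zipping. The names `plan_batches` and `LIMIT_BYTES` and the file names are hypothetical.

```python
LIMIT_BYTES = 500 * 1000 * 1000  # 500MB grouped download limit (decimal MB assumed)


def plan_batches(file_sizes, limit=LIMIT_BYTES):
    """Group files, in order, into batches whose total size is within `limit`.

    Returns (batches, too_large): too_large lists files that exceed the
    limit on their own and need to be downloaded individually.
    """
    batches, too_large = [], []
    current, current_total = [], 0
    for name, size in file_sizes.items():
        if size > limit:
            too_large.append(name)
            continue
        # Start a new batch when adding this file would pass the limit.
        if current and current_total + size > limit:
            batches.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += size
    if current:
        batches.append(current)
    return batches, too_large


# Hypothetical file names and sizes in bytes.
sizes = {"a.csv": 300_000_000, "b.csv": 250_000_000,
         "c.csv": 100_000_000, "d.zip": 600_000_000}
print(plan_batches(sizes))
# -> ([['a.csv'], ['b.csv', 'c.csv']], ['d.zip'])
```

This simple in-order grouping does not try to find the smallest possible number of batches; it only guarantees that each batch fits within the limit.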
If you have questions about downloading, please contact us at firstname.lastname@example.org.
How much storage space can I have for my Dataverse?
There is no storage limit currently enforced. However, if you are depositing data that in total exceed 10GB, please contact your library so that we are aware of your storage needs and can approve further storage should you need it.
To schedule an appointment or to get support with storage, contact your local library. For technical questions, contact email@example.com.
How can I link to another study I found from within my Dataverse?
Currently, the ability to link a dataverse to another dataverse, or a dataset to a dataverse, is a superuser-only feature.
If you have questions about linking within Dataverse, please contact us at firstname.lastname@example.org.
Who owns the data I deposit?
Since Dataverse is a service of your university library, you and your institution retain control of the data at all times. Scholars Portal Dataverse is hosted completely on Canadian servers, so there’s no question about jurisdiction or what laws apply.
<odesi> is a web-based data exploration, extraction, and analysis tool. It lets you search for survey questions (variables) across thousands of datasets held in a growing number of collections, and supports basic tabulation and analysis online. It also allows for the downloading of most datasets into statistical software for further analysis. Statistics Canada’s public-use survey data form the core of <odesi>’s social survey data holdings, but we are expanding its survey data to include other national and international data sources. Key polling data collections include the Canadian Opinion Research Archive (CORA), Canadian Gallup, and Ipsos Reid.
We are in the process of making public datasets and dataverses searchable on <odesi>. Stay tuned for future announcements!
The Scholars GeoPortal tool provides access to geospatial datasets, including land-based vector data, census geography, and orthophotography. From the GeoPortal, you can download geospatial data such as the census boundary files for a census metropolitan area, download the associated census data from <odesi>, and pull these datasets together in a GIS tool to create a map based on your data.
- The Open Science Framework (OSF) provides cloud-based management for your projects, enabling you to keep all project files, data, and protocols in one centralized location. The OSF can also be connected to several third-party storage services, including Dataverse 4.0. See the OSF Guide for further information.
- DMP Assistant is a bilingual tool used to assist researchers with the preparation of data management plans (DMPs). The tool follows best practices in data stewardship and walks researchers step-by-step through key questions about data management.