- What is a Dataverse?
- Who can deposit data?
- How do I get started?
- Does this service cost anything?
- What institutions are affiliated with SP Dataverse? How does my institution become a member?
- What types of files can I upload?
- Are there limitations to Dataverse?
- What is the maximum file size I can upload?
- What if my dataset is too large to upload? Is there support?
- Does Dataverse support confidential or sensitive data sharing?
- How much data can I download from Dataverse?
- How can I collaborate on a Dataverse with my research colleagues?
- Are my data protected and safe?
- How can I share my data?
- I’d like to use Dataverse to collaborate with my research team, but don’t want to release my data. Does my dataset or dataverse need to be published before I can share it?
- Should I deposit my data into an institutional dataverse or into the Scholars Portal root dataverse?
- Can I cite my data in a research publication?
- Does Scholars Portal Dataverse share information about my data with any third-party providers? (e.g. Google)
- How much storage space can I have for my Dataverse?
- Is it possible to move or link to a dataset(s) from one Dataverse into another?
- Who owns the data I deposit?
- Can Dataverse integrate with other research tools?
- Are there other Scholars Portal resources related to data?
- Are there other national RDM tools available?
What is a Dataverse?
A Dataverse is a container for one or more Datasets or Dataverses. Each participating institution has a Dataverse that may contain multiple Dataverses and Datasets. A Dataset is a container for a particular research data set (this can include research data, code, and documentation).
Who can deposit data?
Scholars Portal Dataverse is provided by Scholars Portal on behalf of the Ontario Council of University Libraries (OCUL) and other participating institutions. Institutions that use the service are provided with an Institutional Dataverse that is available for use by affiliated students, researchers, faculty, and staff. Permissions for deposit are managed by the individual institutions.
How do I get started?
Dataverse provides flexible options for a variety of data management and sharing use cases.
Demo Dataverse is open to anyone to get started and try out the features.
Sign up to affiliate with and deposit in your institution’s Dataverse using Scholars Portal Dataverse.
Does this service cost anything?
This Dataverse service is provided as a shared service of Scholars Portal on behalf of OCUL and participating institutions. There is no direct cost to researchers or other users.
What institutions are affiliated with SP Dataverse? How does my institution become a member?
Please see the “Help and Support” section of the guide for a list of participating institutions that have Institutional Dataverses.
If your university library is interested in using the SP Dataverse, please contact us.
What types of files can I upload?
Dataverse is ideal for depositing and sharing research data, and supports a variety of research data types and formats. You are welcome to upload research data, supplementary tables and documentation, publications associated with data, presentations, etc.
Dataverse supports the uploading of any file type, including shapefiles, images, Flexible Image Transport System (FITS) files, and compressed files. However, to make use of the Data Explorer feature, you need to upload tabular data files (e.g. Stata, SPSS, R, Excel (xlsx), and CSV).
See the Advanced Dataverse User Guide for more information.
Are there limitations to Dataverse?
SP Dataverse may not be a suitable choice for your research data if any of the following are applicable:
My data requires:
- High-volume computational processing infrastructure to share or access;
- Data encryption due to its sensitivity (i.e., personal or protected information is stored);
- An automated embargo period before any data can be released (at this time, embargoes can only be assigned and managed manually);
- Large file size upload / download capability:
  - At this time, Scholars Portal Dataverse supports upload of files that are less than 2.5GB in size. Note that you may experience further limitations based on network connectivity and connection speed;
  - Dataverse provides additional processing on tabular file formats (see Advanced Dataverse User Guide). Because of this additional processing, files larger than 500MB will fail tabular ingest; they will still upload and be saved, up to the 2.5GB limit;
  - Downloading multiple files may require preparation processing on the Dataverse side. If files are larger than 1GB, downloading files individually from a fast internet connection is recommended.
The Scholars Portal development team is actively working with the Dataverse community to improve our infrastructure to support big data. Please contact us if you have any questions about the above limitations or use cases.
What is the maximum file size I can upload?
Please review the following limitations for large file size upload in SP Dataverse:
Single file upload:
- The maximum file size limit for individual file upload is 2.5GB. The experience for each user may vary based on the stability of the internet connection, speed of connection, file complexity, file format and type;
- For tabular data files, additional processing by Dataverse means that uploads take more time and may fail, depending on the file size and the number of columns and cases (rows) in your file. We recommend keeping files that require this feature under 500MB in size (see Advanced Dataverse User Guide).
Multiple file upload:
- Dataverse will accept uploads of .zip or .tar file bundles. Dataverse automatically unpacks a single .zip package and presents the individual files it contains to users without preserving any hierarchy or organization. Alternatively, .tar files or double-zipped files can be uploaded directly to Dataverse and will preserve your original file organization. Given the additional processing required for .zip files, we recommend you upload zip files that contain fewer than 500 files (see Advanced Dataverse User Guide);
- If you have many separate data files and/or metadata that you would like to upload to create or populate a Dataset or Dataverse, consider using the SWORD API (see Advanced Dataverse User Guide), which can support batch upload of files (same size limits as described above).
It is recommended that file size be factored in when considering data reuse. Larger files can be more difficult for researchers to handle and work with. Preparing and structuring your data for reuse requires careful consideration of potential use cases, the software required for analysis, the computer system, and disk space. Dataverse is continually being developed to support more disciplines and data types, and we anticipate that larger datasets will be better supported and managed in Dataverse in the future.
If you have questions about file size limits, please contact us.
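The limits above can be checked locally before you start an upload. The following is a minimal sketch, not part of Dataverse itself: the function names, the list of tabular extensions, and the choice to treat 1 GB as 10**9 bytes are all assumptions made for illustration.

```python
import os
import zipfile

# Limits quoted in this FAQ. Treating 1 GB as 10**9 bytes is an assumption.
MAX_UPLOAD_BYTES = 2_500_000_000        # single-file upload limit (2.5 GB)
MAX_TABULAR_INGEST_BYTES = 500_000_000  # tabular ingest recommendation (500 MB)
MAX_ZIP_MEMBERS = 500                   # recommended maximum files per .zip

# Hypothetical set of extensions treated as tabular for this check.
TABULAR_EXTENSIONS = {".csv", ".xlsx", ".dta", ".sav", ".rdata"}


def check_upload(size_bytes, extension, zip_member_count=None):
    """Return a list of warnings; an empty list means no limit was hit."""
    warnings = []
    extension = extension.lower()
    if size_bytes >= MAX_UPLOAD_BYTES:
        warnings.append("too large: file must be smaller than 2.5 GB")
    elif extension in TABULAR_EXTENSIONS and size_bytes > MAX_TABULAR_INGEST_BYTES:
        warnings.append("tabular ingest likely to fail: file is larger than 500 MB")
    if zip_member_count is not None and zip_member_count >= MAX_ZIP_MEMBERS:
        warnings.append("zip bundle should contain fewer than 500 files")
    return warnings


def check_file(path):
    """Collect size, extension, and zip member count for a local file, then check it."""
    extension = os.path.splitext(path)[1]
    member_count = None
    if extension.lower() == ".zip" and zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as bundle:
            # Count files only; directory entries are skipped.
            member_count = sum(1 for m in bundle.infolist() if not m.is_dir())
    return check_upload(os.path.getsize(path), extension, member_count)
```

Calling check_file("survey.zip") on a local file (the file name is only an example) returns the list of warnings that apply to it. Network speed and file complexity, which also affect uploads, are not modelled here.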
What if my dataset is too large to upload? Is there support?
If you have large files, consider zipping them individually or breaking them down into logical parts (by year, topic, etc.) that still allow the data to be reproduced, so that others can more easily download and reuse them in Dataverse.
If this is not possible, please contact us for support.
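As an illustration of breaking a large file into logical parts, here is a minimal sketch that splits one CSV file into one file per value of a chosen column (for example, one file per year). The function name, the column name, and the output file naming are hypothetical, and the sketch assumes the column values are safe to use in file names and that there are few enough distinct values to keep one output file open per value.

```python
import csv
from pathlib import Path


def split_csv_by_column(source, column, out_dir):
    """Write one CSV per distinct value of `column`; return the new paths, sorted."""
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles = {}  # column value -> open output file
    writers = {}  # column value -> csv.DictWriter for that file
    paths = {}    # column value -> output path
    try:
        with source.open(newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src)
            if not reader.fieldnames or column not in reader.fieldnames:
                raise ValueError(f"column {column!r} not found in {source}")
            for row in reader:
                key = row[column]
                if key not in writers:
                    # First row for this value: open its file and repeat the header.
                    paths[key] = out_dir / f"{source.stem}_{key}.csv"
                    handles[key] = paths[key].open("w", newline="", encoding="utf-8")
                    writers[key] = csv.DictWriter(handles[key], fieldnames=reader.fieldnames)
                    writers[key].writeheader()
                writers[key].writerow(row)
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(paths.values())
```

Each part keeps the original header row, so the parts can be read independently or concatenated to reproduce the full file (row order within each part follows the source file).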
Does Dataverse support confidential or sensitive data sharing?
Scholars Portal Dataverse does NOT accept content that contains confidential or sensitive information. Dataverse can be used to share de-identified and non-confidential data only. Contributors are required to remove, replace, or redact such information from datasets prior to upload.
If you have sensitive or confidential data that you need to store or share (containing personally identifiable information such as social security or credit card numbers, health records, etc.), please contact your institution to find out about the policies and options for managing this kind of data outside of Dataverse.
How much data can I download from Dataverse?
You can download as much openly available data as you would like from Dataverse; there are no restrictions.
However, there are some limitations in Dataverse when downloading large files or when selecting multiple files for download using regular HTTP-based interactions. Downloads of large files or large volumes of data are more likely to succeed over a fast, stable internet connection than over a slow or intermittent one.
Grouped file downloads based on multiple file selection can be limited to a total size of 100MB to 500MB. The limit depends on the current Dataverse server load and your personal internet connection, because Dataverse streams your selection as a single .zip package. If the selected files total more than the limit, the download includes a log that tells you which files were successfully downloaded at that time.
If you have questions about downloading data please contact us.
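If you need many files, one way to stay within the grouped-download limit is to plan your selections in advance. The sketch below is an assumption-laden illustration, not a Dataverse feature: the function name is hypothetical, the default limit uses the conservative end of the 100MB to 500MB range, and 1 MB is treated as 10**6 bytes.

```python
def plan_download_batches(file_sizes, limit_bytes=100_000_000):
    """Group files so each batch totals at most `limit_bytes`.

    `file_sizes` maps file name -> size in bytes. A file larger than the
    limit is placed in a batch of its own, to be downloaded individually.
    """
    batches = []
    current, current_total = [], 0
    for name, size in file_sizes.items():
        if size > limit_bytes:
            # Too large for any grouped download: fetch it on its own.
            batches.append([name])
            continue
        if current and current_total + size > limit_bytes:
            # Adding this file would exceed the limit, so close the batch.
            batches.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += size
    if current:
        batches.append(current)
    return batches
```

Each inner list is one selection to download as a group. Oversized files are emitted as soon as they are seen, so the order of batches may differ from the order of the input.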
How can I collaborate on a Dataverse with my research colleagues?
Dataverse supports groups of researchers with collaborative data sharing. This is useful for large projects with multiple researchers in different locations. For more information about how to use group permissions and assign account roles, see the “Permissions” section of this guide or visit the Advanced Dataverse User Guide.
Are my data protected and safe?
Dataverse allows you to upload and store backup copies of your research data in case your local copy is lost or destroyed. It is generally a good idea to keep a copy of your research data in Dataverse.
Any data uploaded to Scholars Portal Dataverse can be restricted to only authorized users. You can easily manage the restrictions of your Dataverse and studies to be private, available to only certain IPs, to individual account(s), or to specific groups (see the Permissions sections of this guide or the Advanced Dataverse User Guide for further information).
Security measures are in place to protect your data from anyone attempting to exploit or access data without authorization. Scholars Portal regularly makes backup copies of the data you upload in case of a server or system malfunction, malicious attack, or other technical issue.
How can I share my data?
We encourage you to share data openly as a good scientific practice. Please do your due diligence and attempt to make the outputs of your research open for all to see and reuse.
Dataverse allows you to share your data with anyone in the world. It uses the DOI system, an international standard for simple, persistent identification for access to data on the web. Even if the location or version changes, the same URL will always point to the most current version of your data.
Creating a Private URL for your dataset allows you to share privately (for viewing and downloading of files) before it is published. It is not necessary to have a Dataverse user account to view the dataset; simply share the link with groups or individuals to allow them to view your data.
You are also able to provide groups and individuals access to your data using dataset-level and file-level permissions. This approach works for both published and unpublished datasets.
Please see the Advanced Dataverse User Guide for more information on sharing data and permissions.
I’d like to use Dataverse to collaborate with my research team, but don’t want to release my data. Does my dataset or dataverse need to be published before I can share it?
No, Datasets or Dataverses do not need to be published in order to share them. Please see the question above or the Advanced Dataverse User Guide for more details.
Should I deposit my data into an institutional dataverse or into the Scholars Portal root dataverse?
Researchers may choose to deposit their data in an institutional dataverse if they want to share their work with their institution (e.g. to facilitate collaboration, meet grant requirements, etc.).
The Scholars Portal root dataverse may be used to deposit data if the research being conducted is multi-institutional. If you are not affiliated with a participating institution, please contact us for more information about depositing in the root dataverse.
Can I cite my data in a research publication?
Dataverse automatically generates a standard data citation that can be used in any research publication, so you can easily be given credit. Users can download the citation in various formats by using the “Cite Data” button. The citation includes a persistent URL that will point people quickly and easily to your study in Dataverse.
Datasets will be assigned a Digital Object Identifier (DOI) that conforms to the DataCite Canada service. All datasets published in Dataverse with a DOI will be automatically indexed in DataCite’s Global Search Service.
Does Scholars Portal Dataverse share information about my data with any third-party providers? (e.g. Google)
By default, any metadata about published data in Dataverse are harvestable by other Dataverses, repositories, global search engines, etc. Dataverse is regularly crawled by Google and all of the open metadata is indexed and searchable via the Google search service. The Dataverse platform supports open APIs and the Open Archives Initiative (OAI) harvesting protocol (OAI-PMH), which allows published and public files to be harvested by other systems for the purposes of global data discovery.
Scholars Portal has been working with the Shared Access Research Ecosystem (SHARE) and the Canadian Federated Research Data Repository (FRDR) to integrate publicly released datasets into open web discovery services. The SHARE Notification System is a higher education-based initiative to strengthen efforts to identify, discover, and track research outputs. Its goal is to aggregate and make discoverable many types of research outputs (including data). Scholars Portal Dataverse was the first instance of a Dataverse feeding metadata into the SHARE Notification System. This does not affect the workflow for contributors to the Scholars Portal Dataverse.
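To give a sense of what OAI-PMH harvesting of published metadata looks like from the harvester's side, here is a minimal sketch. The verb ListIdentifiers, the metadataPrefix parameter, the oai_dc format, and the XML namespace come from the OAI-PMH 2.0 protocol. The base URL in the usage note is a placeholder and not the real Scholars Portal endpoint, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_identifiers_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListIdentifiers request URL for a repository endpoint."""
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the request to one named set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_identifiers(response_xml):
    """Return the record identifiers found in an OAI-PMH response document."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]
```

For example, build_list_identifiers_url("https://dataverse.example.org/oai") produces a request URL for the placeholder endpoint, and extract_identifiers can be applied to the XML text that a real endpoint returns. Only metadata for published, public datasets is exposed this way, as described above.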
How much storage space can I have for my Dataverse?
There is no storage limit currently enforced; however, if you are depositing a lot of data that in total exceeds 10GB, it is best to contact your library so that we are aware of the storage needs and can approve further storage should you need it.
To schedule an appointment or to get support with storage, contact your local library.
For technical questions contact us.
Is it possible to move or link to a dataset(s) from one Dataverse into another?
Currently, the ability to move a dataset, or to link a dataset or dataverse to another dataverse, is a superuser-only feature.
If you have questions, please contact us.
Who owns the data I deposit?
Since Dataverse is a service of your university library, you as the researcher (and in some cases your institution and/or collaborators) retain control and ownership of the data. Scholars Portal Dataverse is hosted entirely on Canadian servers in Toronto, ON, so there’s no question about jurisdiction or what laws apply.
Can Dataverse integrate with other research tools?
Dataverse is integrated with the Open Science Framework (OSF), which provides cloud-based management for your projects, enabling you to keep all project files, data and protocols in one centralized location. The OSF can also be connected to several third-party storage services, including Dataverse. See the OSF Guide for further information. This feature is especially useful for collaborating with team members in remote areas.
Other integrations are listed here.
Are there other Scholars Portal resources related to data?
<odesi> is a web-based data exploration, extraction, and analysis tool. It lets you search for survey questions (variables) across thousands of datasets held in a growing number of collections, and supports basic tabulation and analysis online. It also supports downloading datasets into statistical software for further analysis. Statistics Canada’s public-use survey data forms the core of <odesi>’s social survey data holdings. Other major collections include the Canadian Opinion Research Archive (CORA), Canadian Gallup opinion polls, and Ipsos Reid opinion polls.
Publicly available datasets in Dataverse are searchable in <odesi> using the “Canadian Dataverses” collection search option.
The Scholars GeoPortal repository provides access to geospatial datasets, including land-based vector data, census geography, and orthophotography. From the GeoPortal, you can download geospatial data by predefined or customized areas to create a map based on your study area and combine it with your own data.
Are there other national RDM tools available?
DMP Assistant is a bilingual tool used to assist researchers with the preparation of data management plans (DMPs). The tool follows best practices in data stewardship and walks researchers step-by-step through key questions about data management.
The Federated Research Data Repository (FRDR) is a system for Canadian researchers to deposit and share research data and to facilitate discovery of research data in Canadian repositories. Scholars Portal Dataverse is harvested and indexed in FRDR.