
Datasets and DOIs

Datasets are virtual copies of a set of records. Users can create one when they want to analyze a subset of records over an extended period of time, submit some but not all of the records in a project to GenBank, publish a subset of the data, or request a DOI for that subset. By creating a dataset, a user can select records from a variety of projects on BOLD and group them together in one permanent list without moving the records out of their home projects.

For instance, a researcher working on Lepidoptera may want to create a Dataset for one genus ahead of an upcoming publication on those records, while leaving the records in their home projects. When records are added to a dataset, the actual records remain in the original projects, where they can continue to be edited as necessary. Any changes to the actual records appear simultaneously on the records in the dataset.

Creating a new Dataset

Once logged into BOLD, select the New Dataset button in the "Your Datasets" section of the User Console. By choosing to make the dataset public, the user is given the opportunity to request a DOI, available through a partnership between BOLD and DataCite. The assigned DOI can be used in manuscripts in place of supplementary tables, and can be cited by others who use the dataset.

Dataset Properties: The BOLD Dataset creation form, with the DOI request pop-up highlighted.

Fields in the BOLD Dataset properties form (* denotes required fields):
Dataset Title*: A descriptive name indicating the scope of the Dataset.
Dataset Code*: A 4-8 character alphanumeric code, unique in BOLD, used for quick reference. (This cannot be modified after a Dataset is created.)
Dataset Description*: Summary of the use and intention of the Dataset. This description will be displayed on the Dataset Console.
Dataset Access: When set to public, makes the Dataset and its records visible to all BOLD users. Dataset managers can request a DOI to be registered for a dataset once it has been made public.
Bounding Box: The bounding box of the collection area covered by the dataset, given as 2 pairs of GPS coordinates for the top-left and bottom-right corners.
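
The field constraints above can be checked before filling in the form. The following is a minimal sketch, not part of BOLD itself: the names is_valid_dataset_code and BoundingBox are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits only, that coordinates are decimal degrees, and that the box does not cross the 180° meridian.

```python
import re
from dataclasses import dataclass

# Assumption: a Dataset Code is 4 to 8 ASCII letters or digits.
DATASET_CODE_PATTERN = re.compile(r"[A-Za-z0-9]{4,8}")


def is_valid_dataset_code(code: str) -> bool:
    """Return True if the code is 4-8 alphanumeric characters."""
    return DATASET_CODE_PATTERN.fullmatch(code) is not None


@dataclass(frozen=True)
class BoundingBox:
    """Two coordinate pairs: top-left and bottom-right (decimal degrees)."""
    top_left_lat: float
    top_left_lon: float
    bottom_right_lat: float
    bottom_right_lon: float

    def is_valid(self) -> bool:
        # The top-left corner must lie north and west of the bottom-right one.
        # Assumption: the box does not cross the 180-degree meridian.
        lats_ok = -90 <= self.bottom_right_lat <= self.top_left_lat <= 90
        lons_ok = -180 <= self.top_left_lon <= self.bottom_right_lon <= 180
        return lats_ok and lons_ok


if __name__ == "__main__":
    print(is_valid_dataset_code("DSLEP1"))                 # well-formed code
    print(is_valid_dataset_code("DS-LEP"))                 # hyphen is not alphanumeric
    print(BoundingBox(45.0, -80.0, 43.0, -78.0).is_valid())
```

Uniqueness of the code within BOLD cannot be checked locally; only the form itself can confirm that.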

Most of the information provided during dataset creation can later be edited by the dataset manager by clicking Modify Project Properties in the Dataset Console.

Adding and Removing records

Users can add any records they have access to into a dataset, in batches of up to 2,500 records at a time. Datasets can overlap; that is, the same record can be used in multiple Datasets by multiple users. A single Dataset can contain up to 25,000 records.
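
When preparing a long list of records outside BOLD, the two limits above can be applied ahead of time. The sketch below is illustrative only: plan_batches is a hypothetical helper, the record identifiers are made up, and it assumes none of the listed records are already in the dataset.

```python
from typing import List, Sequence

BATCH_LIMIT = 2_500     # maximum records added at one time
DATASET_LIMIT = 25_000  # maximum records in a single Dataset


def plan_batches(record_ids: Sequence[str],
                 already_in_dataset: int = 0) -> List[Sequence[str]]:
    """Split record_ids into batches of at most BATCH_LIMIT records.

    Raises ValueError if adding them would exceed DATASET_LIMIT.
    Assumption: none of record_ids are already in the dataset.
    """
    total = already_in_dataset + len(record_ids)
    if total > DATASET_LIMIT:
        raise ValueError(
            f"Dataset would hold {total} records; the limit is {DATASET_LIMIT}."
        )
    return [
        record_ids[start:start + BATCH_LIMIT]
        for start in range(0, len(record_ids), BATCH_LIMIT)
    ]


if __name__ == "__main__":
    ids = [f"REC{i}" for i in range(6_000)]  # made-up identifiers
    for number, batch in enumerate(plan_batches(ids), start=1):
        print(f"Batch {number}: {len(batch)} records")
```

For 6,000 records this yields three batches, the last one smaller than the others.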

To remove records from a Dataset, select them in the Record List and click Remove Records in the menu on the left-hand side.
