Data publishing
Most institutions publishing data to GBIF need to convert their data into a format suitable for GBIF to process, typically Darwin Core Archive. Datasets shared with GBIF must be formatted as one of the supported dataset classes or data packages, and adhere to the data quality requirements and recommendations.
Tools including the GBIF IPT and BioCASe can convert data stored in spreadsheets and databases to the appropriate formats. The IPT is the most common way to publish data to GBIF.
Some institutional collections management systems, such as Symbiota or EarthCape, can export all or part of their data to GBIF.
Users or institutional systems (custom software) that can generate Darwin Core Archives and make them available on a web server have two options:
- For occasional datasets (one or two per year), contact the GBIF helpdesk, who will register the dataset on your behalf.
- If new datasets will be registered more frequently, you may register the datasets directly using the API.
Further discussion of the options can be found in this blog post.
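The second option, registering directly through the API, might look roughly like the sketch below. It assumes the GBIF test environment at api.gbif-uat.org, HTTP Basic authentication, and a two-step flow (create the dataset, then attach the archive URL as an endpoint). The JSON field names, the licence value and the helper function names are assumptions for illustration and should be checked against the registry API documentation before use.

```python
import base64
import json
import urllib.request

# Assumption: the GBIF test (UAT) environment; switch to production only
# once the workflow has been verified.
API_BASE = "https://api.gbif-uat.org/v1"


def dataset_payload(org_key, installation_key, title, dataset_type="OCCURRENCE"):
    """Build the JSON body for a new dataset (field names are assumptions)."""
    return {
        "publishingOrganizationKey": org_key,
        "installationKey": installation_key,
        "type": dataset_type,
        "title": title,
        "language": "eng",
        "license": "http://creativecommons.org/licenses/by/4.0/legalcode",
    }


def endpoint_payload(archive_url):
    """Build the JSON body that points GBIF at the hosted Darwin Core Archive."""
    return {"type": "DWC_ARCHIVE", "url": archive_url}


def post_json(path, body, user, password):
    """POST a JSON body with HTTP Basic authentication and decode the JSON reply."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        API_BASE + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def register_dataset(org_key, installation_key, title, archive_url, user, password):
    """Create the dataset, then attach the archive URL as its endpoint."""
    # Assumption: the create call replies with the new dataset key as a JSON string.
    dataset_key = post_json(
        "/dataset", dataset_payload(org_key, installation_key, title), user, password
    )
    post_json(
        f"/dataset/{dataset_key}/endpoint",
        endpoint_payload(archive_url),
        user,
        password,
    )
    return dataset_key
```

Nothing is sent until register_dataset is called. The organization key, installation key and credentials have to be supplied by the caller.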
Tools to quality-check your publication
Dataset validator
The dataset validator can be used to validate zipped Darwin Core Archive datasets.
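Before uploading an archive to the validator, a quick local pre-check can catch packaging mistakes. The sketch below is not the validator and does not reproduce its checks. It only confirms that the file is a zip, that a meta.xml descriptor is present and well-formed, and that every data file listed in it exists in the archive. It assumes the archive carries a meta.xml at its root, which may not hold for every archive, and the function name precheck_archive is hypothetical.

```python
import xml.etree.ElementTree as ET
import zipfile


def precheck_archive(source):
    """Return a list of problems found; an empty list means the pre-check passed.

    `source` may be a file path or a binary file-like object.
    """
    if not zipfile.is_zipfile(source):
        return ["not a readable zip file"]
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        if "meta.xml" not in names:
            return ["meta.xml not found at the archive root"]
        try:
            root = ET.fromstring(archive.read("meta.xml"))
        except ET.ParseError as error:
            return [f"meta.xml is not well-formed XML: {error}"]
    problems = []
    for element in root.iter():
        # Strip the XML namespace so that only the local tag name is compared.
        local_name = element.tag.rsplit("}", 1)[-1]
        location = (element.text or "").strip()
        if local_name == "location" and location and location not in names:
            problems.append(f"data file listed in meta.xml is missing: {location}")
    return problems
```

An archive that passes this pre-check can still fail in the dataset validator, which inspects the content of the data files as well.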
Species matching
The species matching tool can be used to normalize species names from a CSV file against the GBIF backbone.
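The same kind of matching can be scripted. The sketch below reads names from CSV text and looks each one up with the species match endpoint of the public GBIF API. The column name scientificName and the helper names are assumptions for illustration. The script is not the species matching tool itself and may not behave identically to it.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

MATCH_URL = "https://api.gbif.org/v1/species/match"


def read_names(csv_text, column="scientificName"):
    """Return the non-empty values of one column (column name is an assumption)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


def match_url(name):
    """Build the request URL for a single name lookup."""
    return MATCH_URL + "?" + urllib.parse.urlencode({"name": name})


def match_name(name):
    """Call the API and return the decoded JSON reply (requires network access)."""
    with urllib.request.urlopen(match_url(name)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling match_name for each value returned by read_names gives one JSON object per name. Fields such as matchType and confidence in the reply help decide whether to accept a match.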
Species API
Name usage, search and parsing can be carried out with the species API.
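As a rough illustration of the three operations, the sketch below builds request URLs for a name usage lookup, a full-text search and the name parser. The helper names are hypothetical, and the paths reflect the public v1 API as understood here, so check them against the API documentation.

```python
import urllib.parse

API_BASE = "https://api.gbif.org/v1"


def name_usage_url(usage_key):
    """URL of a single name usage, identified by its integer key."""
    return f"{API_BASE}/species/{usage_key}"


def search_url(query, limit=20):
    """URL for a full-text search across name usages."""
    return f"{API_BASE}/species/search?" + urllib.parse.urlencode(
        {"q": query, "limit": limit}
    )


def parser_url(name):
    """URL that asks the name parser to split a scientific name into its parts."""
    return f"{API_BASE}/parser/name?" + urllib.parse.urlencode({"name": name})
```

Each URL can be fetched with any HTTP client and returns JSON.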
Flags and issues
When records are published to GBIF, they may receive various data quality flags and issues. The meaning of each issue and how to address it are documented for occurrence and checklist datasets.