Analysing and addressing gaps and biases in primary biodiversity data
The GBIF Ebbe Nielsen Challenge is an annual competition that seeks to inspire innovative applications of open-access biodiversity data.
GBIF—the Global Biodiversity Information Facility—is an open-data research infrastructure funded by the world's governments. We operate through a worldwide network that includes dozens of member states and hundreds of publishing organizations, and a coordinating Secretariat based in Copenhagen.
Through our website, GBIF.org, and associated web services, we provide free and open online access to species occurrence data, sometimes also called primary biodiversity data. Whether drawn from field observations, museum specimens, scientific literature or other sources, these occurrence records represent evidence of an organism observed or collected at a specific time and place.
In 2016, the Challenge will focus on the question of data gaps and completeness, seeking tools, methods and mechanisms to help analyse the fitness-for-use of GBIF-mediated data and/or guide priority setting for biodiversity data mobilization. We expect both data users and data holders to benefit from this year’s emphasis on gaps and completeness.
So: how well do data about life on earth—and specifically, biodiversity data mobilized through GBIF—allow us to understand how the world works? The answer depends on the research question.
Data users need help in assessing whether available data are suitable and sufficient to support their research investigations or to model biodiversity patterns of interest. If they are, the data are often described as ‘fit for use’.
Meanwhile, data holders (and, more importantly, their funders) can benefit from a better understanding of where significant temporal, spatial or other use-specific gaps lie, so that they can guide and prioritize mobilization and digitization efforts toward them.
Three useful (complementary) ways to think about and approach this issue:
- Information: How much detail of real-world biodiversity patterns can be seen through available data?
- Gaps: Where are the major weaknesses in the data aggregated through GBIF.org?
- Ignorance: What are the aspects of biodiversity that we wish we could see but which have not been measured?
Each of these perspectives may be applied to the geographic, taxonomic, temporal, environmental and other aspects of available data.
In this challenge, entrants should explore and demonstrate how the approach used in their tools, methods and mechanisms applies to one or more of the following cases:
- Determining the completeness and consistency of data coverage for any taxonomic group at continental or global levels
- Providing a national view of the coverage and detail of available data for all taxa or particular taxonomic groups throughout the country
- Providing a view of coverage and detail for other area types, such as environments (e.g., marine) or geographies (e.g., protected areas)
- Evaluating the data for an individual species to assess confidence in the completeness of coverage for that species
While the intent of the Challenge is for entries to improve users’ and/or publishers' understanding of GBIF-mediated data, entries may make use of external data (e.g., species traits, environmental information, protected areas, etc.) as appropriate.
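As a rough illustration of the coverage cases listed above, the following Python sketch bins occurrence coordinates into a regular latitude/longitude grid and reports which cells of a study area hold no records. It is only one possible starting point: the function name `coverage_summary`, the bounding-box convention and the one-degree default cell size are assumptions made for this example, not part of any GBIF tool.

```python
import math
from collections import Counter


def coverage_summary(records, bbox, cell_size=1.0):
    """Summarise how many grid cells of a bounding box contain records.

    records   -- iterable of (latitude, longitude) pairs in decimal degrees
    bbox      -- (min_lat, min_lon, max_lat, max_lon); hypothetical convention
    cell_size -- cell edge length in degrees (assumed default of 1.0)
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    n_rows = math.ceil((max_lat - min_lat) / cell_size)
    n_cols = math.ceil((max_lon - min_lon) / cell_size)

    # Count records per cell, ignoring anything outside the bounding box.
    counts = Counter()
    for lat, lon in records:
        if min_lat <= lat < max_lat and min_lon <= lon < max_lon:
            row = int((lat - min_lat) // cell_size)
            col = int((lon - min_lon) // cell_size)
            counts[(row, col)] += 1

    # Cells with no records at all are candidate spatial gaps.
    empty = [
        (row, col)
        for row in range(n_rows)
        for col in range(n_cols)
        if (row, col) not in counts
    ]
    total = n_rows * n_cols
    return {
        "total_cells": total,
        "occupied_cells": len(counts),
        "coverage": len(counts) / total,
        "empty_cells": empty,
        "records_per_cell": dict(counts),
    }


if __name__ == "__main__":
    # Made-up coordinates, used purely to show the call.
    sample = [(0.5, 0.5), (0.7, 0.2), (1.5, 1.5), (5.0, 5.0)]
    print(coverage_summary(sample, bbox=(0, 0, 2, 2)))
```

Cells defined in degrees do not cover equal areas, and an empty cell may reflect a true absence rather than missing data, so a real entry would likely need an equal-area grid and some measure of sampling effort.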
Summarized from the Official Rules
The Challenge is open to individuals, teams of individuals, companies and their employees, and governmental agencies and their employees.
The Challenge is NOT open to:
- Members of the GBIF Secretariat
- Individuals currently under an external contract issued by the GBIF Secretariat
- Members of the GBIF Science Committee
- Heads of Delegation to GBIF
Submissions will consist of three main elements:
- Entry details, including the names of all team members; identification of a lead team representative; the taxonomic, temporal, geographic and other ‘dimensions’ of gap(s) considered; key audience(s) and user groups served (see GBIF Communications Strategy); and the objective of the entry
- Narrative description, which defines the gap(s); justifies or provides evidence of its relevance; describes the tool, method or mechanism for addressing it; and explains the solution’s relevance to GBIF and/or GBIF.org
- Results, in the form of a prototype, demo, video or slides, along with any relevant technical requirements or implementation details.
How to enter
- Register for the Challenge. Registrants must either create a ChallengePost account or log in with an existing ChallengePost account. There is no charge for creating a ChallengePost account, and doing so will ensure that you receive updates and can access the “Enter a Submission” page. Note that all team members must also create a ChallengePost account in order to be added to a Submission.
- Review the suggested reading, which catalogues previous investigations into existing gaps and biases in GBIF-mediated data.
- Familiarize yourself with the methods for accessing GBIF-mediated data, whether through GBIF.org, GBIF web services, or other tools like rgbif.
- Consider existing GBIF analyses of global data trends. If appropriate, see also national mobilization trends, national publishing trends, and the national reports available on each of the detail pages listed here.
- Produce a concept, prototype or method that a) defines and analyses a geographic, taxonomic, temporal, environmental or other gap or bias in GBIF-mediated data, b) identifies the audience(s) affected by this gap and how addressing it improves their use of or access to GBIF-mediated data, and c) outlines how identifying these gaps or biases can help set priorities for data mobilization or enhancements to GBIF.
- Complete all of the required fields on the “Enter a Submission” page of the Challenge Website (each a “Submission”) by the end of the Challenge Submission Period, that is, by midnight CEST (UTC+2) on 30 Sept 2016.
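For entrants exploring the web services mentioned in the steps above, here is a minimal Python sketch of one way to look for temporal gaps in a country's records. It assumes the public GBIF occurrence search endpoint at `https://api.gbif.org/v1/occurrence/search` and its `country`, `facet`, `facetLimit` and `limit` parameters, and it assumes that the response carries a `facets` list with `field` and `counts` entries. Check all of these against the current API documentation before relying on them. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL of the GBIF occurrence search service (API v1).
SEARCH_URL = "https://api.gbif.org/v1/occurrence/search"


def build_year_facet_url(country_code, facet_limit=500):
    """Build a query that asks for record counts per year and no records."""
    params = {
        "country": country_code,    # ISO 3166-1 alpha-2 code, e.g. "DK"
        "facet": "year",            # assumed facet name
        "facetLimit": facet_limit,  # assumed parameter; raises the facet cap
        "limit": 0,                 # only the summary counts are needed
    }
    return SEARCH_URL + "?" + urlencode(params)


def fetch_json(url):
    """Download and decode a JSON response (requires network access)."""
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def records_per_year(payload):
    """Extract {year: count} from the assumed facet structure of a response."""
    counts = {}
    for facet in payload.get("facets", []):
        if facet.get("field", "").upper() == "YEAR":
            for entry in facet.get("counts", []):
                counts[int(entry["name"])] = entry["count"]
    return counts


def years_without_records(counts, first_year, last_year):
    """List the years in an inclusive range that have no records."""
    return [y for y in range(first_year, last_year + 1) if counts.get(y, 0) == 0]


if __name__ == "__main__":
    payload = fetch_json(build_year_facet_url("DK"))
    print(years_without_records(records_per_year(payload), 1950, 2015))
```

Requesting facets with `limit` set to 0 keeps the response small, since only the per-year totals are used; years with few records rather than none would need a threshold instead of the zero test shown here.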
Co-founder and partner, OpenCore
Head of Plants Division, Natural History Museum, London
Co-founder and full-stack developer, Datafable
Professor of Taxonomy, University of Glasgow
How relevant is the submission for addressing particular gaps or measures of completeness?
How creative and effective is the proposed tool, method or mechanism?
How well does the proposed solution work? How easily and reliably can it be integrated or linked with tools, services and/or workflows connected to GBIF.org and GBIF API?