Data standards in Dandjoo

Summary

Making biodiversity data more searchable and usable.

About data standards

Dandjoo brings data from many different people and organisations together. This means that when data arrives at BIO for ingestion it may contain a variety of different fields and be structured quite differently from submission to submission.

Incoming data needs to be standardised so it can be represented consistently in Dandjoo, and combined and searched by data users. Dandjoo does this by allowing key fields in submitted data to be mapped to widely recognised biodiversity data standards.
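As a rough illustration of what this kind of mapping involves, the sketch below renames a submitter's columns to Darwin Core terms and keeps any unmapped fields alongside. The submitter column names, the sample record and the `map_record` helper are all hypothetical; this is not Dandjoo's ingestion code.

```python
# Hypothetical mapping from one submitter's column names to Darwin Core terms.
COLUMN_MAP = {
    "Species": "scientificName",
    "Obs Date": "eventDate",
    "Lat": "decimalLatitude",
    "Long": "decimalLongitude",
}


def map_record(record, column_map=COLUMN_MAP):
    """Split a record into (mapped, unmapped) parts.

    Mapped fields are re-keyed to Darwin Core terms; unmapped fields keep
    their original names so nothing supplied by the submitter is lost.
    """
    mapped, unmapped = {}, {}
    for column, value in record.items():
        if column in column_map:
            mapped[column_map[column]] = value
        else:
            unmapped[column] = value
    return mapped, unmapped


# Made-up record, for illustration only.
submitted = {
    "Species": "Eucalyptus marginata",
    "Obs Date": "2021-03-14",
    "Lat": "-32.10",
    "Long": "116.05",
    "Observer notes": "flowering",
}
mapped, unmapped = map_record(submitted)
```

Keeping the unmapped fields separate, rather than discarding them, means further mappings can be added later without going back to the submitter.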

 

The Darwin Core standard

Darwin Core is the first biodiversity data standard to be incorporated into Dandjoo, and BIO is currently reviewing other standards (such as VegX) for inclusion in future releases.

Darwin Core is an internationally recognised standard that supports the management and sharing of biodiversity data. It defines a glossary of terms in a flat structure to represent taxa, occurrences, specimens, and samples.

When users export search results from Dandjoo, the export will contain the Darwin Core fields shown in the table below. 

Some fields may be blank for some records - this means that the submitter who provided a record did not collect or upload data for that field, or may not have mapped the field during data submission. However, all submitters must provide certain core fields to show an organism’s name, and to indicate where and when it was observed.
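If you work with exports programmatically, a quick check for blank fields can be useful. The sketch below assumes the export is a CSV file whose column headers are Darwin Core term names, and it assumes a particular set of core terms; neither the header names nor the exact list of mandatory fields is confirmed here, and the two sample rows are made up.

```python
import csv
import io

# Assumed core terms covering name, when and where (not an official list).
CORE_TERMS = ("scientificName", "eventDate", "decimalLatitude", "decimalLongitude")


def blank_fields(row):
    """Return the names of fields in a row that are empty or whitespace-only."""
    return [name for name, value in row.items() if not (value or "").strip()]


# Made-up export with two records, for illustration only.
EXPORT = """scientificName,eventDate,decimalLatitude,decimalLongitude,individualCount,habitat
Eucalyptus marginata,2020-05-01,-32.10,116.05,3,
Banksia grandis,2019-11-12,-32.20,116.10,,Jarrah forest
"""

rows = list(csv.DictReader(io.StringIO(EXPORT)))

# One list of blank field names per record.
report = [blank_fields(row) for row in rows]

# Any assumed core terms that are blank, per record.
missing_core = [[term for term in CORE_TERMS if term in blanks] for blanks in report]
```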

 

Dandjoo Field | Darwin Core Property Name | Darwin Core Property Description
Recognised Scientific Name | dwc:acceptedNameUsage | The full name, with authorship and date information if known, of the currently valid (zoological) or accepted (botanical) taxon. (Note: In Dandjoo, this is BIO’s determination of the most recently known current name of the organism observed.)
Field Scientific Name | dwc:scientificName | The full scientific name. When forming part of an Identification, this should be the full name, including the lowest level taxonomic rank that can be determined. (Note: In Dandjoo, this is the taxonomic name originally provided by the data submitter, after curation to address any errors.)
Date observed/Date collected | dwc:eventDate | The date-time on which an Event occurred. For occurrences, this is the date-time when the event was observed. Not suitable for a time in a geological context.
Submitter | dwc:institutionCode | The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. (Note: In Dandjoo, this is the organisation that had custody of the information in the record, and submitted it to Dandjoo.)
Rights Holder | dwc:rightsHolder | The person or organization owning or managing rights over the record. (Note: In Dandjoo, this is the submitter in most cases.)
Latitude | dwc:decimalLatitude | The geographic latitude (in decimal degrees, using the WGS84 (EPSG:4326) system) of the geographic centre of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive.
Longitude | dwc:decimalLongitude | The geographic longitude (in decimal degrees, using the WGS84 (EPSG:4326) system) of the geographic centre of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive.
Dataset | dc:title (Note: from the Dublin Core standard) | A name given to the resource. (Note: In Dandjoo, this is the dataset name provided by the submitter.)
Count | dwc:individualCount | The number of individuals present at the time of the Occurrence.
Method/Protocol | dwc:samplingProtocol | The names of, references to, or descriptions of the methods or protocols used during an Event.
Conservation Status (authorized users only) | threatStatus (Note: from the GBIF Darwin Core Extension: Species Distribution) | Conservation status of a species. (Note: This is populated and updated in Dandjoo based on the most recent threatened and priority species lists maintained by the Western Australian Government.)
Identification basis (e.g. fossil, live specimen) | dwc:basisOfRecord | The specific nature of the data record, e.g. fossil, live specimen.
Field identification (original field name) | dwc:verbatimIdentification | A string representing the taxonomic identification as it appeared in the original record.
Date identified (if different to occurrence date) | dwc:dateIdentified | The date on which the subject was determined as representing the Taxon.
Identification Ambiguity | dwc:identificationQualifier | A brief phrase or a standard term ("cf.", "aff.") to express the determiner's doubts about the Identification.
Identification notes | dwc:identificationRemarks | Comments or notes about the Identification.
Collector | dwc:recordedBy | A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original occurrence. Recommended best practice is to separate the values in a list with “space vertical bar space” (" | ").
Identified by | dwc:identifiedBy | A list (concatenated and separated) of names of people, groups, or organizations who assigned the Taxon to the subject. Recommended best practice is to separate the values in a list with “space vertical bar space” (" | ").
Human observation ID | dwc:occurrenceID | An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique.
Specimen ID | dwc:materialSampleID | An identifier for the material sample (as opposed to a particular digital record of it). A material sample is a physical result of a sampling (or subsampling) event; in biological collections, it is typically collected, and either preserved or destructively processed.
Dataset ID | dwc:datasetID | An identifier for the set of data. May be a global unique identifier or an identifier specific to a collection or institution. Free text.
Scientific name publisher | dwc:scientificNameAuthorship | The authorship information for the scientific name formatted according to the conventions of the applicable nomenclatural code.
Taxonomic Rank (e.g. species, subspecies, variety) | dwc:taxonRank | The taxonomic rank of the most specific name in the scientificName.
Organism Remarks | dwc:organismRemarks | Comments or notes about the Organism instance.
Presence/Absence | dwc:occurrenceStatus | A statement about the presence or absence of a Taxon at a Location.
Preparations (e.g. ethanol, dried) | dwc:preparations | A list (concatenated and separated) of preparations and preservation methods for a specimen.
Genomic sequence information | dwc:associatedSequences | A list (concatenated and separated) of identifiers (publication, global unique identifier, URI) of genetic sequence information associated with the Occurrence.
Life Stage (e.g. juvenile, nymph) | dwc:lifeStage | The age class or life stage of the Organism(s) at the time the Occurrence was observed.
Reproductive State (e.g. pregnant, flowering) | dwc:reproductiveCondition | The reproductive condition of the biological individual(s) represented in the Occurrence.
Native/introduced/feral | dwc:establishmentMeans | Statement about whether an organism or organisms have been introduced to a given place and time through the direct or indirect activity of modern humans.
Geographic uncertainty (m) | dwc:coordinateUncertaintyInMeters | The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term.
Area/locality of occurrence (e.g. 200km north of Perth) | dwc:locality | The specific description of the place. (Note: This is a free text field that does not correspond to a specific geography or list of regions.)
Habitat | dwc:habitat | A category or description of the habitat in which the Event occurred.
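
Several of the descriptions above include rules that are easy to check in code: the legal ranges for latitude and longitude, the rule that geographic uncertainty may be blank but not zero, and the "space vertical bar space" separator for list fields. The sketch below is one possible way to apply them to a single record held as a dictionary keyed by Darwin Core term. The helper names and the sample record are hypothetical, and treating negative uncertainty as invalid is an assumption.

```python
def split_list(value):
    """Split a Darwin Core list value on the recommended ' | ' separator."""
    return [item.strip() for item in value.split(" | ") if item.strip()]


def coordinate_problems(record):
    """Return a list of problems found in a record's coordinate fields."""
    problems = []
    latitude = float(record["decimalLatitude"])
    longitude = float(record["decimalLongitude"])
    if not -90 <= latitude <= 90:
        problems.append("decimalLatitude out of range")
    if not -180 <= longitude <= 180:
        problems.append("decimalLongitude out of range")
    # Blank uncertainty is allowed; zero is not (negative treated the same way).
    uncertainty = record.get("coordinateUncertaintyInMeters", "")
    if uncertainty != "" and float(uncertainty) <= 0:
        problems.append("coordinateUncertaintyInMeters must be greater than zero")
    return problems


# Made-up record, for illustration only.
record = {
    "decimalLatitude": "-31.95",
    "decimalLongitude": "115.86",
    "coordinateUncertaintyInMeters": "0",
    "recordedBy": "A. Smith | B. Jones",
}
problems = coordinate_problems(record)
collectors = split_list(record["recordedBy"])
```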

Frequently asked questions

Why is my species missing from Dandjoo?

If you are unable to find any data about a particular species, it may be restricted if the species is a threatened or priority one, in line with the Biodiversity Conservation Act 2016.

For more information, see How can I see data about threatened and other sensitive species?

Does Dandjoo accept data for taxa that can’t be identified to genus level?

At present, we’re only ingesting records that relate to organisms that have been identified to a genus level. However, we’re aware that this poses some limitations for invertebrate observations, and it’s something we’re keen to enhance in future releases.

How do I attribute data I’ve sourced from Dandjoo?

When citing information retrieved via Dandjoo, you should attribute it to the Rights Holder. The Rights Holder for each record is identified in any data extracts downloaded from Dandjoo.

You should also reference Dandjoo as the source, citing DBCA as the publisher – for instance ‘Biodiversity, Conservation and Attractions [current year] Dandjoo search accessed on the [date of search]’.
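
If you generate citations in a script, the pattern quoted above can be filled in as follows. This is only a sketch: the `dandjoo_citation` helper is hypothetical, and the date format is an assumption, since the pattern does not prescribe one. Attribution to the Rights Holder of each record still needs to be added separately.

```python
from datetime import date


def dandjoo_citation(search_date, year=None):
    """Fill in the citation pattern with the current year and the search date."""
    if year is None:
        year = date.today().year
    # Assumed date format: day, full month name, year.
    formatted_date = search_date.strftime("%d %B %Y")
    return (
        f"Biodiversity, Conservation and Attractions {year} "
        f"Dandjoo search accessed on the {formatted_date}"
    )


citation = dandjoo_citation(date(2024, 3, 5), year=2024)
```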

How is the data in Dandjoo licensed?

The data in Dandjoo is generally provided under a CC BY 4.0 licence, except where:

  • the record indicates it has been provided by the Department of Water and Environmental Regulation’s Index of Biodiversity Surveys for Assessment (IBSA) program (which allows for bespoke licensing arrangements); or
  • the data relates to a threatened species or ecological community under the Biodiversity Conservation Act 2016, where limitations to data sharing apply.

If you’re uncertain about the licensing conditions that apply, contact us and we’ll help you out.

What do I do if I have a question about a specific record or dataset?

We’ll be happy to help you out if you send us a message. Make a note of the Record ID or dataset name you’re asking about and we’ll look into it for you. If we can’t give you an answer right away, we’ll get in touch with the original data provider on your behalf.

How can I see data about threatened and other sensitive species?

Information relating to threatened species and ecological communities is not publicly available via BIO. BIO is trialling the delivery of this functionality for approved internal users, but for now threatened species information still needs to be requested via DBCA’s Species and Communities Branch.

BIO is also working with other States and Territories to develop a national best-practice approach to sharing threatened species data with the public at reduced geographic precision. When complete, this approach will be implemented in Dandjoo, safely allowing public users to view threatened species records.

Who can submit data to Dandjoo?

At launch, we’re prioritising datasets collected from industry surveys and by the research sector. We recognise the value of all data sources, including citizen science data, and as Dandjoo matures we’ll explore ways to ingest data from a wider variety of sources while allowing users more control over the types of data they want to see.

Do get in touch if you’re interested in providing data – we’re keen to talk to you.

Does Dandjoo contain data from the Department of Water and Environmental Regulation’s Index of Biodiversity Surveys for Assessment (IBSA)?

Dandjoo has been pre-populated with data provided directly by the private sector - this data is considerably richer than that submitted for IBSA and covers a longer time period.

The BIO team is currently working on ingestion of the entire collection of historical IBSA datasets, and these will appear in Dandjoo as each is processed.

Is Dandjoo’s data the same data that DBCA used to provide on the NatureMap platform?

The datasets previously provided via NatureMap are now available in Dandjoo. (We’ve also updated some of these datasets where refreshed data is available, and will continue to work with data custodians to update them periodically.)

Dandjoo’s collection is considerably larger than that previously available in NatureMap, as it also includes new datasets from industry, researchers, and regulatory agencies.

Does Dandjoo contain both terrestrial and marine data?

Most records in Dandjoo relate to terrestrial species, since much of the data is generated by industry processes - for example surveys undertaken for regulatory approvals. However, marine data is not entirely absent - for example, many marine species are represented in records from the Western Australian Museum.

What kinds of data can I find in Dandjoo?

Dandjoo currently accepts three types of data:

Species occurrence data: This is data about where a species was observed. When these datasets are provided to BIO, they contain a list of records by species, with information about the date and place each was observed. (Each record in a dataset may refer to one individual of the species, or may include a count to indicate how many individuals were observed.)

Systematic survey data: This is data that relates to observations of multiple species in a systematic survey. When these datasets are provided to BIO, they generally contain a list of plots, and include information about all the species observed in each plot. In the lead-up to the platform’s launch, we worked with data providers to restructure systematic survey data into species occurrence data where feasible. We appreciate that this approach results in the loss of rich site information - one of our priorities for the future is to enhance Dandjoo’s ability to ingest and visualise systematic survey data.

Vegetation association data: These datasets contain polygons that define the boundaries of vegetation associations. Currently we’re providing these as a simple overlay that can be viewed in the map interface. As with systematic survey data, we’re planning to enhance the way in which this data is presented in future releases.
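
To make the restructuring of systematic survey data more concrete, the sketch below flattens a plot-based dataset into one occurrence record per species per plot. The plot structure, field names and sample values are hypothetical and are not a Dandjoo format. Plot-level attributes that have no place in an occurrence record (here, `soil`) are dropped, which is the loss of site information mentioned above.

```python
def survey_to_occurrences(plots):
    """Flatten plot-based survey data into one occurrence record per species per plot."""
    occurrences = []
    for plot in plots:
        for species in plot["species"]:
            occurrences.append({
                "scientificName": species["name"],
                # Count is optional; None means no count was recorded.
                "individualCount": species.get("count"),
                # Date and location are copied from the plot to each record.
                "eventDate": plot["date"],
                "decimalLatitude": plot["latitude"],
                "decimalLongitude": plot["longitude"],
            })
    return occurrences


# Made-up survey with a single plot, for illustration only.
plots = [
    {
        "plot_id": "P1",
        "date": "2020-09-01",
        "latitude": -30.5,
        "longitude": 118.2,
        "soil": "sandy loam",
        "species": [
            {"name": "Acacia acuminata", "count": 4},
            {"name": "Eucalyptus loxophleba"},
        ],
    }
]
occurrences = survey_to_occurrences(plots)
```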

Can I connect to Dandjoo via an API?

Yes, check out our API documentation for details, and do tell us about what you're working on - we’re keen to hear about how you’re using the platform, and how we can support your project.

I have an idea for a new feature – can you implement it in the next version of Dandjoo?

We want to make sure future development is informed by users, and are keen to have your input. You can also contact us to find out more about our forthcoming User Consultation Committee, and how your sector is represented.

How is Dandjoo different to other data sharing platforms?

For data providers, we’ve taken the approach that you shouldn’t need to use a template, provide a set number of fields, or - where possible - reformat the date and location information in your dataset to meet a prescribed format. We want to make it as easy as possible for you to submit data - if you’re providing species occurrence data, you can even use our self-service quality assurance tools to map columns in your dataset to those recognised by Dandjoo.

We’re also committed to maintaining the integrity of your data; if we have any questions about specific records in your dataset, we’ll let you know so you can decide whether you’d like us to make a correction or redact a record.

For data users, our map-based interface is designed to be user-friendly and provide a familiar experience for those who have used other biodiversity data platforms. In addition, it is underpinned by a number of data quality innovations.

Data is reviewed by our team of curatorial staff prior to publication, and mapped to 33 key fields from the Darwin Core data standard. Dandjoo also retains all the original data fields submitted by the data provider, so we can extend those mappings in the future and even extend the platform to include additional standards.

The platform also contains a number of datasets that have never been released before, including data from the private sector, and the data undergoes routine curation to ensure that taxonomic name and conservation code information is kept up to date.

Does my business have to submit data to Dandjoo as part of a regulatory process?

There are no requirements to submit data directly to Dandjoo.

We’re currently working with the Environment Online team at the Department of Water and Environmental Regulation on the implementation of an integrated data environment. This will mean that data submitted as part of a regulatory process will flow seamlessly into the platform. However, if your organisation has a collection of historical biodiversity data and would like to provide it to BIO, please do let us know.
