Merge pull request #374 from tdwg/rfc2119

Rfc2119
John Wieczorek 2021-08-06 12:43:15 -03:00 committed by GitHub
commit 686fb33dd8
7 changed files with 1048 additions and 260 deletions

@@ -43,13 +43,17 @@ The Darwin Core RDF Guide is targeted toward those who wish to share biodiversit
Sections of this document are explicitly identified as either normative or non-normative. All numbered examples are non-normative, even if they fall within sections designated as normative. Tables may be designated as non-normative, even if they fall within sections designated as normative.
+#### 1.1.1 RFC 2119 key words
+The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119).
### 1.2 Rationale (non-normative)
-Darwin Core is a vocabulary which provides terms that can be used to describe the properties and types of entities (known in RDF as "resources") in the biodiversity realm. Darwin Core is a general purpose vocabulary because its terms can be used as part of a number of data transfer systems. RDF differs in several important ways from other data transfer systems for which DwC usage guides exist ([Text Guide](http://rs.tdwg.org/dwc/terms/guides/text/) and [XML Guide](http://rs.tdwg.org/dwc/terms/guides/xml/)). By its nature, RDF is a distributed system. It is assumed that data from one provider will be linked to data from other providers. This also implies that it is always possible to discover new data properties about a particular resource and that those properties may be described using unfamiliar terms. This differs significantly from other data transfer systems where there must be a pre-existing agreement (in the form of a federation schema or human-understandable document) between the sender and receiver about the format of the data and the organization and interpretation of the terms within records. Because RDF is intended to facilitate data and metadata discovery by machines (actually computer programs known as semantic clients or just "clients"), the meaning and use of terms must be well-defined and discoverable by clients without human intervention. To facilitate cross-referencing of resources among different data providers, resources must be identified using standardized, machine-understandable, and globally unique identifiers known as internationalized resource identifiers (IRIs). Finally, because anyone can make statements about a resource without agreeing to a pre-determined schema, RDF by its nature is a highly normalized network of relationships, in contrast to typical database tables which are by their nature "flat".
-Because of these differences, effective use of RDF requires that its users adhere to what are essentially evolving social conventions about identifiers, data transfer protocols, and application of vocabularies. Some of these conventions will be described in the following sections.
+Darwin Core is a vocabulary that provides terms that can be used to describe the properties and types of entities (known in RDF as "resources") in the biodiversity realm. Darwin Core is a general purpose vocabulary because its terms can be used as part of a number of data transfer systems. RDF differs in several important ways from other data transfer systems for which DwC usage guides exist ([Text Guide](http://rs.tdwg.org/dwc/terms/guides/text/) and [XML Guide](http://rs.tdwg.org/dwc/terms/guides/xml/)). By its nature, RDF is a distributed system. It is assumed that data from one provider will be linked to data from other providers. This also implies that it is always possible to discover new data properties about a particular resource and that those properties may be described using unfamiliar terms. This differs significantly from other data transfer systems where there needs to be a pre-existing agreement (in the form of a federation schema or human-understandable document) between the sender and receiver about the format of the data and the organization and interpretation of the terms within records. Because RDF is intended to facilitate data and metadata discovery by machines (actually computer programs known as semantic clients or just "clients"), the meaning and use of terms needs to be well-defined and discoverable by clients without human intervention. To facilitate cross-referencing of resources among different data providers, resources need to be identified using standardized, machine-understandable, and globally unique identifiers known as internationalized resource identifiers (IRIs). Finally, because anyone can make statements about a resource without agreeing to a pre-determined schema, RDF by its nature is a highly normalized network of relationships, in contrast to typical database tables which are by their nature "flat".
+Because of these differences, effective use of RDF requires that its users adhere to what are essentially evolving social conventions about identifiers, data transfer protocols, and application of vocabularies. Some of these conventions will be described in the following sections.
### 1.3 Features of RDF (non-normative)
-This section describes some of the basic features of RDF. It is not intended as a tutorial on RDF, but rather to provide enough information about the features of RDF to explain why specific guidelines for the use Darwin Core in RDF are necessary and why an additional Darwin Core namespace has been created.
+This section describes some of the basic features of RDF. It is not intended as a tutorial on RDF, but rather to provide enough information about the features of RDF to explain why specific guidelines for the use of Darwin Core in RDF are necessary and why an additional Darwin Core namespace has been created.
#### 1.3.1 Serialization and syntax (non-normative)
@@ -91,27 +95,27 @@ In this document, the following formatting conventions will be used. Full IRIs w
<http://bioimages.vanderbilt.edu/contact/kirchoff#coblea>
```
-Abbreviated UIRIs will be shown as `inline code` in the form `namespace:localName`, e.g., `rdf:type`. Namespace abbreviations when shown by themselves will also be shown in italics, e.g., `dwc:` . Examples will be displayed in code blocks.
+Abbreviated UIRIs will be shown as `inline code` in the form `namespace:localName`, e.g., `rdf:type`. Namespace abbreviations when shown by themselves will also be shown as `inline code`, e.g., `dwc:`. Examples will be displayed in code blocks.
XML is a widely understood form of RDF serialization. Therefore, all examples given here will be shown as RDF/XML. In most cases, they will also be shown in Turtle. For more detailed information about RDF serialization, see [part 3 of the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners3RDFbasics.md) and the references cited there.
#### 1.3.2 Internationalized Resource Identifier (IRI) (non-normative)
-Data providers make use of a variety of identifiers to refer to resources they wish to provide. These identifiers may be locally unique within the provider's database, or they may be globally unique. Providers have sought to make their identifiers globally unique through such means as "Darwin Core Triplets" (institutionCode:collectionCode:catalogNumber) and creation of [UUIDs](http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=62795). However, only identifiers in the form of [IRIs](http://tools.ietf.org/html/rfc3987) can be valid subjects of statements (known as RDF triples) in RDF, so neither "Darwin Core Triples" nor UUIDs can be used in unmodified form for that purpose. IRIs are a superset of a narrower form of identifiers known as Uniform Resource Identifiers (URIs) that can be used in place of [IRIs](http://tools.ietf.org/html/rfc3986). This document will refer exclusively to IRIs with the understanding that URIs may be used in place of IRIs.
+Data providers make use of a variety of identifiers to refer to resources they wish to provide. These identifiers might be locally unique within the provider's database, or they might be globally unique. Providers have sought to make their identifiers globally unique through such means as "Darwin Core Triplets" (institutionCode:collectionCode:catalogNumber) and creation of [UUIDs](http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=62795). However, only identifiers in the form of [IRIs](https://datatracker.ietf.org/doc/html/rfc3987) can be valid subjects of statements (known as RDF triples) in RDF, so neither "Darwin Core Triples" nor UUIDs can be used in unmodified form for that purpose. IRIs are a superset of a narrower form of identifiers known as Uniform Resource Identifiers ([URIs](https://datatracker.ietf.org/doc/html/rfc3986)) that can be used in place of IRIs. This document will refer exclusively to IRIs with the understanding that URIs may be used in place of IRIs.
-The most familiar form of IRI is a Uniform Resource Locator (URL) which not only identifies a resource, but provides information about retrieving an information resource (i.e., a resource that can be transmitted in electronic form) such as text in the form of an HTML web page. However, in general IRIs may identify non-information resources (physical or conceptual entities) that are not transmittable electronically, e.g., `<http://bioimages.vanderbilt.edu/contact/kirchoff#coblea>`, a person. If a client attempts to retrieve a non-information resource by dereferencing its HTTP IRI, a process called [content negotiation](http://tools.ietf.org/html/rfc2616#section-12) is used to refer the client to the IRI of an information resource representation of the non-information resource. For humans, this is usually a web page, while for semantic clients (machines) the representation is a document in the form of RDF/XML. For more detailed information about IRIs see [part 1 of the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners1URIs.md) and the references cited there.
+The most familiar form of IRI is a Uniform Resource Locator (URL) which not only identifies a resource, but also provides information about retrieving an information resource (i.e., a resource that can be transmitted in electronic form), such as text in the form of an HTML web page. In general, IRIs can also identify non-information resources (physical or conceptual entities) that are not transmittable electronically, e.g., `<http://bioimages.vanderbilt.edu/contact/kirchoff#coblea>`, a person. If a client attempts to retrieve a non-information resource by dereferencing its HTTP IRI, a process called [content negotiation](https://datatracker.ietf.org/doc/html/rfc2616#section-12) is used to refer the client to the IRI of an information resource representation of the non-information resource. For humans, this is usually a web page, while for semantic clients (machines) the representation is a document in the form of RDF/XML. For more detailed information about IRIs see [part 1 of the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners1URIs.md) and the references cited there.
##### 1.3.2.1 Persistent Identifiers (normative)
-Best practices dictate that identifiers (known as persistent identifiers or globally unique identifiers: GUIDs) which are used to identify resources of permanent interest be globally unique, referentially consistent, and persistent ([TDWG Globally Unique Identifiers (GUID) applicability statement:](http://www.tdwg.org/standards/150/)). If those identifiers are to be used to identify subject resources in RDF, they must also be in the form of an IRI. This has two implications for data providers.
+Best practices dictate that identifiers (known as persistent identifiers or globally unique identifiers: GUIDs), which are used to identify resources of permanent interest, SHOULD be globally unique, referentially consistent, and persistent ([TDWG Globally Unique Identifiers (GUID) applicability statement:](http://www.tdwg.org/standards/150/)). If those identifiers are to be used to identify subject resources in RDF, they MUST also be in the form of an IRI. This has two implications for data providers.
-First, if a non-IRI globally unique identifier is used to identify a subject resource, it must be converted to an IRI by making it conform to a well-known IRI scheme (e.g., a [URN or HTTP IRI](http://www.iana.org/assignments/uri-schemes.html)). For example, a UUID can be transformed into a [URN](http://tools.ietf.org/html/rfc4122) by prefixing its string representation with `urn:uuid:` as in
+First, if a non-IRI globally unique identifier is used to identify a subject resource, it MUST be converted to an IRI by making it conform to a well-known IRI scheme (e.g., a [URN or HTTP IRI](http://www.iana.org/assignments/uri-schemes.html)). For example, a UUID can be transformed into a [URN](https://datatracker.ietf.org/doc/html/rfc8141) by prefixing its string representation with `urn:uuid:` as in
```
<urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>
```
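As an illustration that is not part of the commit, the UUID-to-URN step can be sketched in Python with the standard library `uuid` module. The helper name `uuid_to_urn` is hypothetical; the `urn` attribute of `uuid.UUID` performs the `urn:uuid:` prefixing described above.

```python
import uuid


def uuid_to_urn(value: str) -> str:
    """Return the urn:uuid: form of a UUID given as a string.

    Hypothetical helper for illustration; uuid.UUID validates the
    input and its .urn attribute adds the 'urn:uuid:' prefix.
    """
    return uuid.UUID(value).urn


# The UUID used in the guide's example.
print(uuid_to_urn("f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
```

Parsing through `uuid.UUID` rather than concatenating strings rejects malformed input and normalizes the hexadecimal digits to lowercase.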
-Similarly, [ISBNs can be converted to URNs](http://tools.ietf.org/html/rfc3187) as shown in Examples 13 and 14. Although these URNs are valid IRIs, they have the disadvantage that they are not actionable (see [Section 1.3.2.2](#1322-http-iris-as-self-resolving-guids-normative)). In contrast, the "Darwin Core triplet"
+Similarly, [ISBNs can be converted to URNs](https://www.ietf.org/rfc/rfc3187.txt) as shown in Examples 13 and 14. Although these URNs are valid IRIs, they have the disadvantage that they are not actionable (see [Section 1.3.2.2](#1322-http-iris-as-self-resolving-guids-normative)). In contrast, the "Darwin Core triplet"
```
MVZ:Mamm:165861
@@ -123,47 +127,47 @@ has been turned into an actionable IRI by appending it to the base string `http:
<http://arctos.database.museum/guid/MVZ:Mamm:165861>
```
-Second, if a provider refers to a resource using an URL that provides data about the resource, the provider should take care to ensure that the URL does not change over time (e.g., if the content moves to a different server, if the directory structure changes, or if a new version of the database is introduced). It may be preferable to use [content negotiation](http://www.w3.org/TR/cooluris/) to redirect the user from a persistent IRI which refers to the resource, to the URL of a web page which describes the resource.
+Second, if a provider refers to a resource using an URL that provides data about the resource, the provider SHOULD ensure that the URL does not change over time (e.g., if the content moves to a different server, if the directory structure changes, or if a new version of the database is introduced). To improve stability, the information provider MAY use [content negotiation](http://www.w3.org/TR/cooluris/) to redirect the user from a persistent IRI that refers to the resource to the URL of a web page which describes the resource.
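The redirect decision behind content negotiation could look roughly like the Python sketch below. It is a simplified illustration, not a complete HTTP implementation: the function name and the two target URLs are hypothetical, only the `text/html` and `application/rdf+xml` media types are considered, and wildcards such as `*/*` are ignored.

```python
def choose_representation(accept: str, html_url: str, rdf_url: str) -> str:
    """Pick a redirect target for a persistent IRI from an Accept header.

    Simplified sketch: returns rdf_url when application/rdf+xml has the
    highest q-value among the two known types, otherwise html_url.
    """
    best_type, best_q = None, 0.0
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if media_type in ("application/rdf+xml", "text/html") and q > best_q:
            best_type, best_q = media_type, q
    return rdf_url if best_type == "application/rdf+xml" else html_url


# Hypothetical URLs of the two representations of one resource.
HTML_URL = "http://example.org/page/specimen1.html"
RDF_URL = "http://example.org/data/specimen1.rdf"
print(choose_representation("application/rdf+xml", HTML_URL, RDF_URL))
print(choose_representation("text/html,application/xhtml+xml", HTML_URL, RDF_URL))
```

A server would typically answer a request for the persistent IRI with a redirect to the chosen URL, so the persistent IRI stays stable even if the page and data URLs later change.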
For a more detailed introduction to persistent identifiers, see the [GBIF Beginner's Guide to Persistent Identifiers](https://www.gbif.org/document/80575).
-Based on the precedent set by the [TDWG LSID Applicability Statement standard](http://www.tdwg.org/standards/150/), it is recommended that URN-based IRIs be related to HTTP-proxied equivalents (if they exist) as described in [Section 2.2.3](#223-associating-a-urn-with-its-http-proxied-equivalent-normative).
+Based on the precedent set by the [TDWG LSID Applicability Statement standard](http://www.tdwg.org/standards/150/), it is RECOMMENDED that URN-based IRIs be related to HTTP-proxied equivalents (if they exist) as described in [Section 2.2.3](#223-associating-a-urn-with-its-http-proxied-equivalent-normative).
##### 1.3.2.2 HTTP IRIs as self-resolving GUIDs (normative)
-Advocates of principles of [Linked Data](http://linkeddata.org/) prefer to use identifiers which follow the [HTTP IRI scheme](http://tools.ietf.org/html/rfc2616), known as "HTTP IRIs". In addition to being globally unique, such identifiers have the advantage of being dereferenceable using a widely implemented protocol . As such, it is possible to implement HTTP IRIs so that a human end user can obtain information about the resource using a conventional web browser. It is also possible to implement HTTP IRIs so that data (in the form of RDF) describing the resource can be discovered by a semantic client. In recognition of the advantages conferred by HTTP IRIs, the [TDWG GUID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies in Recommendation 2 that "HTTP GET resolution must be provided for non-self-resolving GUIDs". For this reason, providers of biodiversity information who intend to make data available through RDF should plan to implement GUIDs that are persistent HTTP IRIs. This is not to the exclusion of other forms of non-self-resolving globally unique identifiers that can be associated with the HTTP IRI using the methods described in [Section 2](#2-implementation-guide) of this guide.
+Advocates of principles of [Linked Data](https://www.w3.org/standards/semanticweb/data) prefer to use identifiers that follow the [HTTP IRI scheme](https://datatracker.ietf.org/doc/html/rfc2616/), known as "HTTP IRIs". In addition to being globally unique, such identifiers have the advantage of being dereferenceable using a widely implemented protocol. As such, it is possible to implement HTTP IRIs so that a human end user can obtain information about the resource using a conventional web browser. It is also possible to implement HTTP IRIs so that data (in the form of RDF) describing the resource can be discovered by a semantic client. In recognition of the advantages conferred by HTTP IRIs, the [TDWG GUID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies in Recommendation 2 that "HTTP GET resolution must be provided for non-self-resolving GUIDs". For this reason, providers of biodiversity information who intend to make data available through RDF SHOULD plan to implement GUIDs that are persistent HTTP IRIs. This is not to the exclusion of other forms of non-self-resolving globally unique identifiers that can be associated with the HTTP IRI using the methods described in [Section 2](#2-implementation-guide) of this guide.
### 1.4 Use of terms in RDF ### 1.4 Use of terms in RDF
#### 1.4.1 Well-known vocabularies (non-normative)
-Because RDF assumes no pre-existing agreement between data providers and consumers about the terms used as properties to describe resources, the likelihood that a consuming client will "understand" the meaning of an RDF triple will be increased if the provider uses terms from a well-known vocabulary. Some well-known general and biodiversity-related vocabularies are listed in the [Introduction to the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners.md#037-biodiversity-related-and-general-vocabularies-and-ontologies). If no well-known term exists to represent a property needed to describe a resource, a data provider may "mint" its own term. In that case, the provider should assign the term an IRI, define the term in RDF, provide clear human-understandable documentation of how the term should be used, provide for dereferencing of the IRI, and commit to the long-term stability of the term IRI and definition.
+Because RDF assumes no pre-existing agreement between data providers and consumers about the terms used as properties to describe resources, the likelihood that a consuming client will "understand" the meaning of an RDF triple will be increased if the provider uses terms from a well-known vocabulary. Some well-known general and biodiversity-related vocabularies are listed in the [Introduction to the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners.md#037-biodiversity-related-and-general-vocabularies-and-ontologies). If no well-known term exists to represent a property needed to describe a resource, a data provider MAY "mint" its own term. In that case, the provider SHOULD assign the term an IRI, define the term in RDF, provide clear human-understandable documentation of how the term should be used, provide for dereferencing of the IRI, and commit to the long-term stability of the term IRI and definition.
#### 1.4.2 Appropriate use of terms (normative)
-Because of the machine-oriented nature of RDF, a provider must assume that a consuming client will not infer any meaning from a statement other than what is directly stated or what can be inferred logically from other statements made about the resources and terms involved in the statement. For example, a provider might use the name of a resource in a triple with the intention that the name represent the resource itself. However, if the term used as the predicate in the triple is designed to refer to resources themselves rather than names of resources, the client may fail to make the connection between the name and the resource itself. Depending on how the term is defined, the client may detect an inconsistency or draw unintended conclusions as described below. Inappropriate use of terms as RDF predicates can have unintended consequences because unlike text-based data transfer protocols, RDF is designed to allow clients to infer additional facts based on information contained in the definitions of the terms. For example, the definition of the term [`foaf:depicts`](http://xmlns.com/foaf/spec/) contains a statement declaring its domain to be `foaf:Image` . Thus a provider which describes a resource using a `foaf:depicts` property is also implicitly (and perhaps unknowingly) declaring the resource to be an image. Terms which are defined using a form of RDF known as [Web Ontology Language (OWL)](http://www.w3.org/TR/owl2-overview/) may have restrictions placed on their use. For example, declaring a term to be an `owl:ObjectProperty` indicates that it is inconsistent for the value of that property to be a string literal (i.e., the value should be an IRI).
+Because of the machine-oriented nature of RDF, a provider MUST assume that a consuming client will not infer any meaning from a statement other than what is directly stated or what can be inferred logically from other statements made about the resources and terms involved in the statement. For example, a provider might use the name of a resource in a triple with the intention that the name represent the resource itself. However, if the term used as the predicate in the triple is designed to refer to resources themselves rather than names of resources, the client may fail to make the connection between the name and the resource itself. Depending on how the term is defined, the client may detect an inconsistency or draw unintended conclusions as described below. Inappropriate use of terms as RDF predicates can have unintended consequences because, unlike text-based data transfer protocols, RDF is designed to allow clients to infer additional facts based on information contained in the definitions of the terms. For example, the definition of the term [`foaf:depicts`](http://xmlns.com/foaf/spec/) contains a statement declaring its domain to be `foaf:Image`. Thus a provider that describes a resource using a `foaf:depicts` property is also implicitly (and perhaps unknowingly) declaring the resource to be an image. Terms that are defined using a form of RDF known as [Web Ontology Language (OWL)](http://www.w3.org/TR/owl2-overview/) can have restrictions placed on their use. For example, declaring a term to be an `owl:ObjectProperty` indicates that it is inconsistent for the value of that property to be a string literal (i.e., the value is expected be an IRI).
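The `foaf:depicts` example can be made concrete with a small Python sketch of domain-based inference. This is an illustration under stated assumptions, not how any particular semantic client is implemented: triples are modelled as plain tuples of abbreviated IRIs, the function name is hypothetical, and the `ex:` resources are invented for the example.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types_from_domain(triples):
    """Return the rdf:type triples entailed by rdfs:domain declarations.

    If a property P has domain C, any subject used with P as the
    predicate is inferred to be an instance of C.
    """
    triples = set(triples)
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred - triples


data = [
    # From the FOAF term definition, as described in the text above.
    ("foaf:depicts", RDFS_DOMAIN, "foaf:Image"),
    # Hypothetical provider statement.
    ("ex:resource1", "foaf:depicts", "ex:tree1"),
]
print(infer_types_from_domain(data))
```

The provider never stated that `ex:resource1` is an image, yet a client applying the domain declaration concludes that it is, which is the unintended consequence the paragraph warns about.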
-For these reasons, terms should be used as predicates in RDF only after the data provider has carefully examined the documentation and usage guidelines associated with the vocabulary or ontology which defines the term and has determined that use of that term is consistent with the meaning which the provider intends to impart to the triple in which the term is to be used as a predicate.
+For these reasons, terms SHOULD be used as predicates in RDF only after the data provider has carefully examined the documentation and usage guidelines associated with the vocabulary or ontology that defines the term and has determined that use of that term is consistent with the meaning the provider intends to impart to the triple in which the term is to be used as a predicate.
For more detailed information about the implications of using terms that have range, domain, and subproperty declarations in RDF, see [part 4 of the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners4Vocabularies.md). For more detailed information about how OWL is used to define complex properties of terms in RDF, see [part 7 of the Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners7OWL.md).
#### 1.4.3 Use of Darwin Core terms in RDF (normative)
-The general Darwin Core vocabulary, whose terms are in the `dwc:` namespace (`http://rs.tdwg.org/dwc/terms/`), is designed primarily to facilitate the transfer of text-based records from relatively "flat" database tables. Because of this, the term recommendations associated with the general vocabulary suggest using text strings to refer to physical and conceptual entities, i.e., names to represent people, citations to represent articles, codes to represent institutions, etc. (The several kinds of roles intended for text strings are detailed in [Section 1.5](#15-roles-of-text-strings-as-values-of-properties-in-dwc-namespace-non-normative).) When a record has multiple values for a property, the general Darwin Core term definitions recommend that the multiple strings be concatenated and delineated in a single field to avoid forcing the creation of a more normalized data structure.
+The general Darwin Core vocabulary, whose terms are in the `dwc:` namespace (`http://rs.tdwg.org/dwc/terms/`), is designed primarily to facilitate the transfer of text-based records from relatively "flat" database tables. Because of this, the term recommendations associated with the general vocabulary suggest using text strings to refer to physical and conceptual entities, i.e., names to represent people, citations to represent articles, codes to represent institutions, etc. The several kinds of roles intended for text strings are detailed in [Section 1.5](#15-roles-of-text-strings-as-values-of-properties-in-dwc-namespace-non-normative). When a record has multiple values for a property, the general Darwin Core term definitions specify that the multiple strings be concatenated and delineated in a single field to avoid forcing the creation of a more normalized data structure.
-However, in RDF, identification of physical and conceptual entities is generally accomplished by IRI references rather than literals. If there are multiple values for a property, each value should be referenced in a separate triple. Thus we face a situation where it is desirable in the context of RDF for certain DwC properties to be represented by one to many triples having non-literal (IRI reference) objects, while the actual definitions of the terms in the general DwC vocabulary which represent those properties specify that a single literal (text string) object be provided.
+However, in RDF, identification of physical and conceptual entities is generally accomplished by IRI references rather than literals. If there are multiple values for a property, each value MUST be referenced in a separate triple. Thus we face a situation where it is desirable in the context of RDF for certain DwC properties to be represented by one to many triples having non-literal (IRI reference) objects, while the actual definitions of the terms in the general DwC vocabulary that represent those properties specify that a single literal (text string) object be provided.
-In a perfect world, all data providers wishing to serve RDF would immediately replace literal references to physical and conceptual resources (names, citations, codes, etc.) with IRI GUIDs that identify those resources and which are reused by other members of the community. However, it is more realistic to assume that at least initially many providers will have few (if any) GUIDs available to use as IRI references, so requiring non-literal resources to be referenced exclusively by IRIs would impede the exposure of data in the form of RDF. Therefore, it would be advantageous to provide an alternative set of DwC terms intended for use in RDF with IRI-referenced objects, while continuing to use the general DwC terms for literal objects where providers are unable to convert their existing (string) database fields to IRI references.
+In a perfect world, all data providers wishing to serve RDF would immediately replace literal references to physical and conceptual resources (names, citations, codes, etc.) with IRI GUIDs that identify those resources and that are reused by other members of the community. However, it is more realistic to assume that, at least initially, many providers will have few (if any) GUIDs available to use as IRI references, so requiring non-literal resources to be referenced exclusively by IRIs would impede the exposure of data in the form of RDF. Therefore, it would be advantageous to provide an alternative set of DwC terms intended for use in RDF with IRI-referenced objects, while continuing to use the general DwC terms for literal objects where providers are unable to convert their existing (string) database fields to IRI references.
This guide introduces the namespace **`dwciri:`** (`http://rs.tdwg.org/dwc/iri/`) whose terms are intended for use with non-literal objects. If a term in the `dwciri:` namespace has an analogue in the `dwc:` namespace having the same local name, the `dwciri:` term will have the same meaning as its `dwc:` counterpart. For example, `dwciri:recordedBy` has the same meaning as `dwc:recordedBy`, but as an RDF predicate `dwciri:recordedBy` is intended to be repeatable and have an IRI-reference object. Providers whose databases include the field `dwc:recordedBy` with records containing concatenated lists of names can publish those values immediately as single RDF triples using `dwc:recordedBy` as the predicate and containing one literal object which is the concatenated list string. In this manner, publication of data as RDF can begin immediately using the `dwc:` terms without the requirement that every resource have an assigned IRI GUID. As the community develops mechanisms for discovering and reusing IRIs, data providers can make the shift to `dwciri:` terms. As a part of their data updating and cleaning process providers or aggregators may eventually parse the strings, search a community IRI repository, and match strings with existing IRIs or create new ones if they do not already exist. This guide introduces the namespace **`dwciri:`** (`http://rs.tdwg.org/dwc/iri/`) whose terms are intended for use with non-literal objects. If a term in the `dwciri:` namespace has an analogue in the `dwc:` namespace with the same local name, the `dwciri:` term SHALL have the same meaning as its `dwc:` counterpart. For example, `dwciri:recordedBy` has the same meaning as `dwc:recordedBy`, but as an RDF predicate `dwciri:recordedBy` is intended to be repeatable and have an IRI-reference object. 
Providers whose databases include the field `dwc:recordedBy` with records containing concatenated lists of names MAY publish those values immediately as single RDF triples using `dwc:recordedBy` as the predicate and containing one literal object, which is the concatenated list string. In this manner, publication of data as RDF can begin immediately using the `dwc:` terms without the requirement that every resource have an assigned IRI GUID. As the community develops mechanisms for discovering and reusing IRIs, data providers can make the shift to `dwciri:` terms. As a part of their data updating and cleaning process providers or aggregators may eventually parse the strings, search a community IRI repository, and match strings with existing IRIs or create new ones if they do not already exist.
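The two publication paths described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `|` separator, the `agent_iris` lookup table, and the `example.org` IRIs are hypothetical and are not defined by this guide. Triples are modeled as plain tuples, and objects are tagged as either a literal or an IRI reference.

```python
DWC = "http://rs.tdwg.org/dwc/terms/"
DWCIRI = "http://rs.tdwg.org/dwc/iri/"


def recorded_by_triples(subject_iri, names, agent_iris, separator="|"):
    """Return (subject, predicate, object) tuples for a recordedBy value.

    names      -- the concatenated list string as stored in the database
    agent_iris -- hypothetical lookup table from a name string to an agent IRI
    separator  -- assumed delimiter of the concatenated list
    Objects are tagged ("literal", text) or ("iri", text).
    """
    # The concatenated string can always be published as one literal triple.
    triples = [(subject_iri, DWC + "recordedBy", ("literal", names))]
    # dwciri:recordedBy is repeatable: one triple per name with a known IRI.
    for name in (part.strip() for part in names.split(separator)):
        iri = agent_iris.get(name)
        if iri is not None:
            triples.append((subject_iri, DWCIRI + "recordedBy", ("iri", iri)))
    return triples


if __name__ == "__main__":
    lookup = {"Jane Doe": "http://example.org/agent/1"}  # hypothetical data
    for triple in recorded_by_triples(
        "http://example.org/occ/1", "Jane Doe | John Roe", lookup
    ):
        print(triple)
```

Names without a known IRI simply remain covered by the literal `dwc:recordedBy` triple, so a provider could add `dwciri:` triples incrementally as IRIs become available.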
#### 1.4.4 Limitations of this guide (non-normative)
This guide provides general guidance about how Darwin Core property terms should be used as RDF predicates and specifies that Darwin Core class terms should be used in `rdf:type` declarations ([Section 2.3.1.5](#2315-classes-to-be-used-for-type-declarations-of-resources-described-using-darwin-core-normative)). However, the Darwin Core standard does not specify precisely which resources should be included as instances of its classes nor does it declare domains for its property terms. Although the [Darwin Core Quick Reference Guide](http://rs.tdwg.org/dwc/terms/index.htm) suggests which properties might be applied to instances of classes by organizing those property terms under class headings, Darwin Core leaves specific decisions about type declaration and property assignment to community consensus. Some examples which show varying approaches to assigning resources to Darwin Core classes and connecting them with object properties defined outside Darwin Core are provided in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). This guide provides general guidance about how Darwin Core property terms can be used as RDF predicates and specifies that Darwin Core class terms should be used in `rdf:type` declarations ([Section 2.3.1.5](#2315-classes-to-be-used-for-type-declarations-of-resources-described-using-darwin-core-normative)). However, the Darwin Core standard does not specify precisely which resources are to be included as instances of its classes nor does it declare domains for its property terms. Although the [Darwin Core Quick Reference Guide](https://dwc.tdwg.org/terms/) suggests which properties might be applied to instances of classes by organizing those property terms under class headings, Darwin Core leaves specific decisions about type declaration and property assignment to community consensus. 
Some examples that show varying approaches to assigning resources to Darwin Core classes and connecting them with object properties defined outside Darwin Core are provided in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md).
### 1.5 Roles of text strings as values of properties in dwc: namespace (non-normative)
When humans communicate in written language, they use strings of text characters to impart meaning. In some cases text strings may have a relatively unambiguous meaning. However, in many cases the exact meaning of a text string will depend on the context in which it is used. Take for example the text string "Germany". That string may be intended to refer to a location bounded by some agreed-upon geographical borders. It may be intended to refer to a political entity, i.e., "the government of Germany". It may be intended to refer to an entity that includes all of the inhabitants living within certain geographical borders. It may also be intended to be a name recorded in a certain language. Is "Germany" the same as "Deutschland"? Does "Germany" mean the same thing as the code "DE"? Is "Germany" the same as "Federal Republic of Germany"? The answers to all of these questions depend on what one means by "Germany". When humans communicate in written language, they use strings of text characters to impart meaning. In some cases text strings may have a relatively unambiguous meaning. However, in many cases the exact meaning of a text string will depend on the context in which it is used. Take, for example, the text string "Germany". That string might be intended to refer to a location bounded by some agreed-upon geographical borders. It might be intended to refer to a political entity, i.e., "the government of Germany". It might be intended to refer to an entity that includes all of the inhabitants living within certain geographical borders, i.e., "the people of Germany". It might also be intended to be a name recorded in a certain language. Is "Germany" the same as "Deutschland"? Does "Germany" mean the same thing as the code "DE"? Is "Germany" the same as "Federal Republic of Germany"? The answers to all of these questions depend on what one means by "Germany".
In text-based data transfer systems, text strings are the predominant means by which information is conveyed. It would be relatively simple to "translate" existing Darwin Core text-based data into RDF by making every string value the literal object of a predicate that is the Darwin Core property. But that would not result in RDF that conveys the kind of "meaning" that RDF was designed to impart.
@ -171,25 +175,25 @@ Because text-based systems depend on predetermined understandings about the mean
#### 1.5.1 Situations where a string is the standard means for encoding a resource (non-normative)
Numbers, dates, and titles are resources which are essentially conceptual, but whose meaning can be imparted rather completely and concisely by a string. This is particularly true if a datatype encoding scheme is specified for the string or if a language attribute is used to indicate the language of the title. In this situation, there is virtually no need to provide additional information about the resource other than the string itself. A literal object is sufficient in itself. Numbers, dates, and titles are resources that are essentially conceptual, but whose meaning can be imparted rather completely and concisely by a string. This is particularly true if a datatype encoding scheme is specified for the string or if a language attribute is used to indicate the language of the title. In this situation, there is virtually no need to provide additional information about the resource other than the string itself. A literal object is sufficient in itself.
#### 1.5.2 Situations where a string value serves as a proxy for a non-information resource (non-normative)
Humans commonly use name strings to represent resources that are physical or conceptual (i.e., non-information) resources. For example, the string "Vincent van Gogh" is used to represent the person whose name was Vincent van Gogh. If we made the statement "Starry Night" createdBy "Vincent van Gogh", we do not mean that "Starry Night" was created by the name "Vincent van Gogh", but rather that "Starry Night" was created by the person whose name was Vincent van Gogh. For that matter "Starry Night" is the name for a painting and we probably actually mean that the person Vincent van Gogh created the painting rather than the name of the painting (although he probably created both!). Unlike the first situation, such name strings themselves do not contain nearly all of the information that one might want to know about that non-information resource (e.g., date of creation, location at during a certain period of time, etc.) but they can serve as an identifier for the resource. In RDF, machine-processable IRIs are preferred over string names as identifiers for resources. Humans commonly use name strings to represent resources that are physical or conceptual (i.e., non-information) resources. For example, the string "Vincent van Gogh" is used to represent the person whose name was Vincent van Gogh. If we made the statement "Starry Night" createdBy "Vincent van Gogh", we do not mean that "Starry Night" was created by the name "Vincent van Gogh", but rather that "Starry Night" was created by the person whose name was Vincent van Gogh. For that matter "Starry Night" is the name for a painting and we probably actually mean that the person Vincent van Gogh created the painting rather than the name of the painting (although he probably created both!). 
Unlike the first situation, such name strings themselves do not contain nearly all of the information that one might want to know about that non-information resource (e.g., date of creation, location during a certain period of time, etc.), but they can serve as an identifier for the resource. In RDF, machine-processable IRIs are preferred over string names as identifiers for resources.
#### 1.5.3 Situations where a string value serves as a keyword to enable searching (non-normative)
Imagine that a person identifies an oak tree as "Quercus alba". The data associated with that identification may provide the property/value pair `dwc:scientificName`=`"Quercus alba"`. This implies that the person asserted that the tree was a representative of a taxon associated with the name _Quercus alba_. The data associated with the identification may also provide the property/value pair `dwc:order`=`"Fagales"`. One might think that this would imply that the person who asserted the identification also asserted that the tree was included in the order Fagales. However, it is likely that the person did not make such an assertion and in fact may have never even heard of the order Fagales. Rather, a database manager subscribing to a particular taxonomic hierarchy asserted that all identifications with a `dwc:scientificName` value of "Quercus alba" should also have a property/value pair of `dwc:order`=`"Fagales`" in order to allow users of the database to search for identifications that were related because they shared the common value for that `dwc:order` property. Imagine that a person identifies an oak tree as "Quercus alba". The data associated with that identification may provide the property/value pair `dwc:scientificName`=`"Quercus alba"`. This implies that the person asserted that the tree was a representative of a taxon associated with the name _Quercus alba_. The data associated with the identification might also provide the property/value pair `dwc:order`=`"Fagales"`. One might think that this would imply that the person who asserted the identification also asserted that the tree was included in the order Fagales. However, it is likely that the person did not make such an assertion and in fact may have never even heard of the order Fagales. 
Rather, a database manager subscribing to a particular taxonomic hierarchy asserted that all identifications with a `dwc:scientificName` value of "Quercus alba" will also have a property/value pair of `dwc:order`=`"Fagales"` in order to allow users of the database to search for identifications that were related because they shared the common value for that `dwc:order` property.
The point is that in order to more accurately describe the real situation, there should be two separate sets of information: one which asserts that the person identified the tree as a representative of a taxon for which the scientific name "Quercus alba" is applied, and one which asserts the relationship between that taxa and higher taxa such as one to which the name "Fagales" is applied. The point is that in order to more accurately describe the real situation, there ought to be two separate sets of information: one that asserts that the person identified the tree as a representative of a taxon for which the scientific name "Quercus alba" is applied, and one that asserts the relationship between that taxon and higher taxa such as one to which the name "Fagales" is applied.
#### 1.5.4 Situations where a string value serves as an identifier (non-normative)
A number of Darwin Core properties specify that their values should be an identifier. There is significant ambiguity in the use of these properties because depending on the situation the value may be an identifier for the subject resource itself or it may be the identifier for a resource that is related in some way to the subject resource. In addition, Darwin Core sometimes recommends (but does not require) a GUID as a value, but does not require that the GUID be either an HTTP IRI or an IRI in general. A number of Darwin Core properties specify that their values ought to be an identifier. There is significant ambiguity in the use of these properties because, depending on the situation, the value may be an identifier for the subject resource itself or it may be the identifier for a resource that is related in some way to the subject resource. In addition, Darwin Core sometimes suggests (but does not require) a GUID as a value, but does not require that the GUID be either an HTTP IRI or an IRI in general.
#### 1.5.5 Implications for expressing Darwin Core string values as RDF (non-normative)
To facilitate achieving the clarity that RDF makes possible, this guide provides different approaches for each of these four situations in which string values are provided. In the first three situations, the existing term from Darwin Core namespace `dwc:` can be used with a literal value to expose the string value as it currently exists in a text-based database. This allows for the rapid deployment of RDF described in [Section 1.4.3](#143-use-of-darwin-core-terms-in-rdf-normative) and is all that is required in the first situation ([Section 1.5.1](#151-situations-where-a-string-is-the-standard-means-for-encoding-a-resource-non-normative)). In the second situation ([Section 1.5.2](#152-situations-where-a-string-value-serves-as-a-proxy-for-a-non-information-resource-non-normative)), analogues of the existing `dwc:` terms have been created in the `dwciri:` namespace which are intended to be used with IRI-references rather than names. In the third situation ([Section 1.5.3](#153-situations-where-a-string-value-serves-as-a-keyword-to-enable-searching-non-normative)), new `dwciri:` terms have been created to relate subject resources to IRI-identified object resources which form part of a hierarchy. If such a hierarchy already exists, the need is eliminated for separate terms ("convenience terms") which relate the subject resource to all parts of the hierarchy, although those terms can still be used if they are convenient for facilitating string searches. The last situation ([Section 1.5.4](#154-situations-where-a-string-value-serves-as-an-identifier-non-normative)) is more complex and a significant part of the implementation guide is devoted to the ways in which RDF should be structured to handle various kinds of identifiers. To facilitate achieving the clarity that RDF makes possible, this guide provides different approaches for each of these four situations in which string values are provided. 
In the first three situations, the existing term from Darwin Core namespace `dwc:` can be used with a literal value to expose the string value as it currently exists in a text-based database. This allows for the rapid deployment of RDF described in [Section 1.4.3](#143-use-of-darwin-core-terms-in-rdf-normative) and is all that is needed in the first situation ([Section 1.5.1](#151-situations-where-a-string-is-the-standard-means-for-encoding-a-resource-non-normative)). In the second situation ([Section 1.5.2](#152-situations-where-a-string-value-serves-as-a-proxy-for-a-non-information-resource-non-normative)), analogues of the existing `dwc:` terms have been created in the `dwciri:` namespace, which are intended to be used with IRI-references rather than names. In the third situation ([Section 1.5.3](#153-situations-where-a-string-value-serves-as-a-keyword-to-enable-searching-non-normative)), new `dwciri:` terms have been created to relate subject resources to IRI-identified object resources that form part of a hierarchy. If such a hierarchy already exists, the need is eliminated for separate terms ("convenience terms") that relate the subject resource to all parts of the hierarchy, although those terms can still be used if they are convenient for facilitating string searches. The last situation ([Section 1.5.4](#154-situations-where-a-string-value-serves-as-an-identifier-non-normative)) is more complex and a significant part of the implementation guide is devoted to the ways in which RDF ought to be structured to handle various kinds of identifiers.
## 2 Implementation guide
@ -255,7 +259,7 @@ Subject | Predicate | Object
<http://dbpedia.org/resource/Starry_night> | <http://xmlns.com/foaf/0.1/maker> | <http://viaf.org/viaf/9854560>
dbres:Starry_night | foaf:maker | viaf:9854560
In the second row of Table 2, the full IRIs are given. In the third row namespace abbreviations are used to shorten the IRIs. The following fragments of RDF shows the triple in RDF/XML and Turtle serializations of the triple shown in Table 2: In the second row of Table 2, the full IRIs are given. In the third row, namespace abbreviations are used to shorten the IRIs. The following fragments of RDF show the triple from Table 2 in RDF/XML and Turtle serializations:
**Example 1:**
@ -274,11 +278,11 @@ Turtle
foaf:maker <http://viaf.org/viaf/9854560>. foaf:maker <http://viaf.org/viaf/9854560>.
``` ```
The [Dublin Core Metadata Initiative (DCMI) Abstract Model](http://dublincore.org/documents/abstract-model/), which was designed to be compatible with RDF, describes subject resources using property-value pairs, which correspond to pairs of predicates and objects. When referring to Dublin Core terms (as well as Darwin Core, which is modeled on Dublin Core) "property" is used synonymously with "predicate" and "value" is used synonymously with "object". In Example 1, `foaf:maker` is a property and `viaf:9854560` is the value associated with that property. Predicates must be identified by IRIs. Objects of triples may be identified in three ways: 1) the object resource can be identified by an IRI reference, 2) the object can be identified by a non-IRI string, in which case it is called a literal, and 3) the object resource can also be left unidentified, in which case it is called a blank node or an anonymous node. Blank nodes are undesirable if it is important that other data providers be able to refer to the resource they represent. However, blank nodes may be preferable if external references to the resource are not relevant, or if the data provider is unable or unwilling to provide a stable IRI to identify the resource. IRI references and blank nodes can be the subjects of RDF triples, but literals cannot. The [Dublin Core Metadata Initiative (DCMI) Abstract Model](https://www.dublincore.org/specifications/dublin-core/abstract-model/), which was designed to be compatible with RDF, describes subject resources using property-value pairs, which correspond to pairs of predicates and objects. When referring to Dublin Core terms (as well as Darwin Core, which is modeled on Dublin Core) "property" is used synonymously with "predicate" and "value" is used synonymously with "object". In Example 1, `foaf:maker` is a property and `viaf:9854560` is the value associated with that property. Predicates have to be identified by IRIs. 
Objects of triples can be identified in three ways: 1) the object resource can be identified by an IRI reference, 2) the object can be identified by a non-IRI string, in which case it is called a literal, and 3) the object resource can also be left unidentified, in which case it is called a blank node or an anonymous node. Blank nodes are undesirable if it is important that other data providers be able to refer to the resource they represent. However, blank nodes might be preferable if external references to the resource are not relevant, or if the data provider is unable or unwilling to provide a stable IRI to identify the resource. IRI references and blank nodes can be the subjects of RDF triples, but literals cannot.
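The three kinds of object, and the rule that literals cannot be subjects, can be modeled with a small Python sketch. The class and function names here are hypothetical and chosen only for illustration; they do not come from any RDF library.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str  # local label only; other providers cannot refer to it


Node = Union[IRI, Literal, BlankNode]


def make_triple(subject: Node, predicate: Node, obj: Node) -> Tuple[Node, Node, Node]:
    """Build a triple, rejecting the combinations the text rules out."""
    if isinstance(subject, Literal):
        raise ValueError("a literal cannot be the subject of a triple")
    if not isinstance(predicate, IRI):
        raise ValueError("a predicate has to be identified by an IRI")
    return (subject, predicate, obj)


if __name__ == "__main__":
    painting = IRI("http://dbpedia.org/resource/Starry_night")
    maker = IRI("http://xmlns.com/foaf/0.1/maker")
    # IRI object, literal object, and blank-node object are all acceptable.
    print(make_triple(painting, maker, IRI("http://viaf.org/viaf/9854560")))
    print(make_triple(painting, maker, Literal("Vincent van Gogh")))
    print(make_triple(painting, maker, BlankNode("b0")))
```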
### 2.2 Subject resources (normative)
If the subject of an RDF triple is identified (i.e., not an anonymous node), it must be referenced by an IRI. This section describes how IRI identifiers are referenced in RDF and how non-IRI identifiers should be associated with the subject of the triple. If the subject of an RDF triple is identified (i.e., not an anonymous node), it MUST be referenced by an IRI. This section describes how IRI identifiers are referenced in RDF and how non-IRI identifiers SHOULD be associated with the subject of the triple.
#### 2.2.1 Identifying subject resources using IRIs (normative)
@ -290,7 +294,7 @@ The `rdf:about` attribute of the `rdf:Description` element is used in RDF/XML to
#### 2.2.2 Associating a string identifier with a subject resource (normative)
The Dublin Core term `dcterms:identifier` should be used to associate a string literal identifier (e.g., UUID, "Darwin Core Triplet", or ARK) with an IRI-identified resource as shown here in RDF/XML: The Dublin Core term `dcterms:identifier` SHOULD be used to associate a string literal identifier (e.g., UUID, "Darwin Core Triplet", or ARK) with an IRI-identified resource as shown here in RDF/XML:
```xml
<dcterms:identifier>58D31D52-713D-44B4-9FE9-CB2D9249C422</dcterms:identifier>
@ -304,13 +308,13 @@ The Dublin Core term `dcterms:identifier` should be used to associate a string l
<dcterms:identifier>ark:/12025/654xz321</dcterms:identifier>
```
If an HTTP IRI is considered to be the identifier for a subject resource, it is acceptable to present it as a string literal value for `dcterms:identifier` in addition to using it in the `rdf:about` attribute of the subject resource, as in Example 2: If an HTTP IRI is considered to be the identifier for a subject resource, it MAY be provided as a string literal value for `dcterms:identifier` in addition to using it in the `rdf:about` attribute of the subject resource, as in Example 2:
**Example 2:**

```xml
<rdf:Description rdf:about="http://bioimages.vanderbilt.edu/kirchoff/b5161">
     <rdf:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/>
     <dcterms:identifier>http://bioimages.vanderbilt.edu/kirchoff/b5161</dcterms:identifier>
</rdf:Description>
```
@ -336,21 +340,21 @@ Turtle
owl:sameAs <urn:lsid:biocol.org:col:35115>. owl:sameAs <urn:lsid:biocol.org:col:35115>.
``` ```
Since LSIDs follow the URN IRI scheme, they can serve as the subject of any RDF triple. However, it is better to use the http-proxied form as the subject (i.e., the value of the `rdf:about` attribute) in the description of the resource. See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information about implementing LSIDs. Since LSIDs follow the URN IRI scheme, they MAY serve as the subject of any RDF triple. However, it is better to use the http-proxied form as the subject (i.e., the value of the `rdf:about` attribute) in the description of the resource. See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information about implementing LSIDs.
This practice can be extended to any URN. For example, `owl:sameAs` can be used to relate the URN `<urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>` to its HTTP-proxied equivalent `<http://provider.org/f81d4fae-7dec-11d0-a765-00a0c91e6bf6>` in a manner analogous to Example 3. This practice MAY be extended to any URN. For example, `owl:sameAs` can be used to relate the URN `<urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>` to its HTTP-proxied equivalent `<http://provider.org/f81d4fae-7dec-11d0-a765-00a0c91e6bf6>` in a manner analogous to Example 3.
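As a rough sketch of the URN case, the following Python builds the `owl:sameAs` triple that links an HTTP-proxied IRI to its `urn:uuid:` form. The proxying rule used here (appending the UUID to a provider base IRI) is an assumption that mirrors the `provider.org` illustration in the text; actual providers may construct proxied IRIs differently. The function names are hypothetical.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
UUID_PREFIX = "urn:uuid:"


def proxied_same_as(urn, proxy_base):
    """Return (http_iri, owl:sameAs, urn) for a urn:uuid: identifier.

    proxy_base is the provider's base IRI (an assumption of this sketch).
    """
    if not urn.lower().startswith(UUID_PREFIX):
        raise ValueError("expected a urn:uuid: IRI")
    http_iri = proxy_base.rstrip("/") + "/" + urn[len(UUID_PREFIX):]
    # The HTTP-proxied form is the subject, as recommended for descriptions.
    return (http_iri, OWL_SAME_AS, urn)


def to_turtle(triple):
    """Serialize a triple of three IRIs as one Turtle statement."""
    subject, predicate, obj = triple
    return f"<{subject}> <{predicate}> <{obj}> ."


if __name__ == "__main__":
    urn = "urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
    print(to_turtle(proxied_same_as(urn, "http://provider.org/")))
```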
### 2.3 Predicates (normative)
Most terms in the Darwin Core vocabulary can be used as predicates in triples to represent properties of subject resources. The full term IRI must be used, although with an appropriate namespace declaration, the namespace can be abbreviated ([Section 2.1.1](#211-namespace-abbreviations-used-in-xml-qualified-names-qnames-in-this-document-non-normative)). RDF does not restrict the source of predicates, therefore Darwin Core terms can be mixed with terms from other vocabularies. This includes the important predicate `rdf:type` which is used to indicate the class of which the subject resource is an instance. There is no prohibition in RDF against repeating properties. Most property terms in the Darwin Core vocabulary can be used as predicates in triples to represent properties of subject resources. The full term IRI MUST be used, although with an appropriate namespace declaration the namespace MAY be abbreviated ([Section 2.1.1](#211-namespace-abbreviations-used-in-xml-qualified-names-qnames-in-this-document-non-normative)). RDF does not restrict the source of predicates, therefore Darwin Core terms MAY be mixed with terms from other vocabularies. This includes the important predicate `rdf:type` which is used to indicate the class of which the subject resource is an instance. There is no prohibition in RDF against repeating properties.
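To illustrate how an abbreviated name stands for the full term IRI, here is a minimal Python sketch that expands a prefixed name using a namespace table. The table below is a partial, illustrative selection and is not a copy of the table in Section 2.1.1; the `expand` helper is hypothetical.

```python
# Partial namespace table for illustration only.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
    "dwciri": "http://rs.tdwg.org/dwc/iri/",
}


def expand(qname, namespaces=NAMESPACES):
    """Expand a prefixed name such as 'dwc:recordedBy' to its full IRI."""
    prefix, sep, local = qname.partition(":")
    if not sep or prefix not in namespaces:
        raise KeyError(f"no namespace declared for {qname!r}")
    return namespaces[prefix] + local


if __name__ == "__main__":
    # Terms from different vocabularies can be mixed in one description.
    for name in ("rdf:type", "dwc:recordedBy", "dwciri:recordedBy", "foaf:maker"):
        print(name, "->", expand(name))
```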
#### 2.3.1 Declaring the type of the resource (non-normative)
In RDF, a resource may be characterized by declaring that it is an instance of a class. Indicating that a resource is an instance of a class provides several benefits. It allows a consumer to narrow the results of a search by limiting the search to certain types of resources. It suggests to data providers what sorts of properties should be used to describe a resource. It allows consumers to anticipate what sorts of properties they might expect to be provided for that resource and allows developers to build applications that exploit those expectations. Because of these benefits, RDF provides several built-in mechanisms for asserting class membership, most notably the [`rdf:type` property](http://www.w3.org/TR/rdf-schema/#ch_type) which is used to state that a resource is an instance of a class. There is nothing that prohibits assigning more than one `rdf:type` property to a resource. In fact, there may be a benefit in describing a resource as a member of both a class which has specific meaning within a narrow community and a more well-known class which has a broader meaning and is therefore more likely to be understood by generic clients. For instance, a resource may be typed as both a `dwc:PreservedSpecimen` and a `dcmitype:PhysicalObject`. In RDF, a resource can be characterized by declaring that it is an instance of a class. Indicating that a resource is an instance of a class provides several benefits. It allows a consumer to narrow the results of a search by limiting the search to certain types of resources. It suggests to data providers what sorts of properties should be used to describe a resource. It allows consumers to anticipate what sorts of properties they might expect to be provided for that resource and allows developers to build applications that exploit those expectations. 
Because of these benefits, RDF provides several built-in mechanisms for asserting class membership, most notably the [`rdf:type` property](http://www.w3.org/TR/rdf-schema/#ch_type), which is used to state that a resource is an instance of a class. There is nothing that prohibits assigning more than one `rdf:type` property to a resource. In fact, there can be a benefit in describing a resource as a member of both a class that has specific meaning within a narrow community and a more well-known class that has a broader meaning and is therefore more likely to be understood by generic clients. For instance, a resource can be typed as both a `dwc:PreservedSpecimen` and a `dcmitype:PhysicalObject`.
##### 2.3.1.1 rdf:type statement (normative)
The predicate `rdf:type` is defined to have an object that is a class. The class should be identified by an IRI reference (not by a literal) as in Example 4: The predicate `rdf:type` is defined to have an object that is a class. The class MUST be identified by an IRI reference (not by a literal) as in Example 4:
**Example 4:**
@ -371,7 +375,7 @@ Turtle:
     dcterms:created "2002-06-11T09:37:33"^^xsd:dateTime.      dcterms:created "2002-06-11T09:37:33"^^xsd:dateTime.
``` ```
In Turtle serialization, `rdf:type` can be abbreviated as "`a`" (Example 4). In XML serialization, the RDF specification provides an abbreviated way to specify the type of a described resource. This method is called a [typed node element](http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-typed-nodes). The `rdf:Description` element is replaced by an element whose name is an XML qualified name that identifies a class of which the described resource is an instance as in Example 5: In Turtle serialization, `rdf:type` MAY be abbreviated as "`a`" (Example 4). In XML serialization, the RDF specification provides an abbreviated way to specify the type of a described resource. This method is called a [typed node element](http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-typed-nodes). The `rdf:Description` element is replaced by an element whose name is an XML qualified name that identifies a class of which the described resource is an instance as in Example 5:
**Example 5:**
@ -390,13 +394,13 @@ The [RDF Schema (RDFS) specification](http://www.w3.org/TR/rdf-schema/) defines
P rdfs:domain C P rdfs:domain C
``` ```
is used to describe a subject resource, a client can infer that the subject resource is an instance of class `C`. For example, the term `dcterms:bibliographicCitation` is assigned the property is used to describe a subject resource, a client MAY infer that the subject resource is an instance of class `C`. For example, the term `dcterms:bibliographicCitation` is assigned the property
```xml
<rdfs:domain rdf:resource="http://purl.org/dc/terms/BibliographicResource"/>
```
in its definition. If that term were used as the property of a specimen, a client could infer that the specimen had `rdf:type` `dcterms:BibliographicResource`: in its definition. If that term were used as the property of a specimen, a client MAY infer that the specimen had `rdf:type` `dcterms:BibliographicResource`:
**Example 6:** **Example 6:**
@ -421,13 +425,13 @@ When a predicate `P` having the property
P rdfs:range C P rdfs:range C
``` ```
is used with a value, a client can infer that the value is an instance of class `C`. The term `dcterms:language` is assigned the property is used with a value, a client MAY infer that the value is an instance of class `C`. The term `dcterms:language` is assigned the property
```xml ```xml
<rdfs:range rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/> <rdfs:range rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
``` ```
in its definition. If the object of that term in an RDF triple is a reference to the IRI for English assigned by the [MARC ISO 639-2 Codes for the Representation of Names of Languages](http://id.loc.gov/vocabulary/iso639-2) (Example 7), then it can be inferred that `<http://id.loc.gov/vocabulary/iso639-2/eng>` is a `dcterms:LinguisticSystem` even though the MARC description in RDF does not assert that directly in its definition. in its definition. If the object of that term in an RDF triple is a reference to the IRI for English assigned by the [MARC ISO 639-2 Codes for the Representation of Names of Languages](http://id.loc.gov/vocabulary/iso639-2) (Example 7), then it MAY be inferred that `<http://id.loc.gov/vocabulary/iso639-2/eng>` is a `dcterms:LinguisticSystem` even though the MARC description in RDF does not assert that directly in its definition.
**Example 7:** **Example 7:**
@ -446,11 +450,11 @@ Turtle
dcterms:language <http://id.loc.gov/vocabulary/iso639-2/eng>. dcterms:language <http://id.loc.gov/vocabulary/iso639-2/eng>.
``` ```
No terms defined within the Darwin Core namespace have range or domain declarations. However, some terms imported into Darwin Core from Dublin Core do have domain or range declarations. [Sections 3.2](#32-imported-dublin-core-terms-for-which-only-literal-objects-are-appropriate-normative) and [3.3](#33-imported-dublin-core-terms-that-have-non-literal-objects-and-corresponding-terms-that-have-literal-objects-normative) of this guide gives the declared ranges and domains when they are asserted for such terms. No terms defined within the Darwin Core namespace have range or domain declarations. However, some terms imported into Darwin Core from Dublin Core do have domain or range declarations. [Sections 3.2](#32-imported-dublin-core-terms-for-which-only-literal-objects-are-appropriate-normative) and [3.3](#33-imported-dublin-core-terms-that-have-non-literal-objects-and-corresponding-terms-that-have-literal-objects-normative) of this guide give the declared ranges and domains when they are asserted for such terms.
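The domain and range inferences described in this section can be illustrated with a minimal sketch. It is not a real RDFS reasoner: triples are plain Python tuples, the two declarations are the ones quoted in this section, and the `ex:` subjects and the citation string are hypothetical placeholders.

```python
# Minimal sketch of RDFS domain/range type inference over plain tuples.
# Assumption: triples are (subject, predicate, object) strings, not a real RDF graph.
RDF_TYPE = "rdf:type"

# Declarations quoted in this section of the guide.
DOMAINS = {"dcterms:bibliographicCitation": "dcterms:BibliographicResource"}
RANGES = {"dcterms:language": "dcterms:LinguisticSystem"}


def infer_types(triples):
    """Return the rdf:type triples entailed by the domain and range declarations."""
    inferred = set()
    for subject, predicate, obj in triples:
        if predicate in DOMAINS:
            # P rdfs:domain C: the subject is an instance of C.
            inferred.add((subject, RDF_TYPE, DOMAINS[predicate]))
        if predicate in RANGES:
            # P rdfs:range C: the object is an instance of C.
            inferred.add((obj, RDF_TYPE, RANGES[predicate]))
    return inferred


# Hypothetical data: ex:specimen1 and ex:document1 are placeholder identifiers.
data = [
    ("ex:specimen1", "dcterms:bibliographicCitation", "Hypothetical citation string"),
    ("ex:document1", "dcterms:language", "<http://id.loc.gov/vocabulary/iso639-2/eng>"),
]
inferred = infer_types(data)
```

A client that does not run this kind of reasoning would see none of the inferred triples, which is why the next section advises providers to state important types explicitly.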
##### 2.3.1.3 Explicit vs. inferred type declarations (normative) ##### 2.3.1.3 Explicit vs. inferred type declarations (normative)
Because the use of a predicate having a range or domain declaration implies the `rdf:type` of a resource, data providers should exercise caution in using any such term in a non-standard way. For example, if the property `foaf:familyName` were used with a specimen (e.g., to indicate the taxonomic family), that use would imply that the specimen was a `foaf:Person` . However, it cannot be assumed that all clients will perform the reasoning necessary to infer the `rdf:type` declarations implied by range and domain declarations. Therefore, if a data provider feels that it is important for a consumer to know that a resource is an instance of a particular class, the provider should type the resource using an explicit `rdf:type` triple even if that asserts the same information that could be inferred from a domain or range declaration. For example, if providers of images want to assure that an image will be found in a query for resources having `rdf:type` `foaf:Image`, they should not assume that describing the image using the property `foaf:depicts` will accomplish that because of the range declaration of `foaf:depicts`. It would be safer to include Because the use of a predicate having a range or domain declaration implies the `rdf:type` of a resource, data providers SHOULD exercise caution in using any such term in a non-standard way. For example, if the property `foaf:familyName` were used with a specimen (e.g., to indicate the taxonomic family), that use would imply that the specimen was a `foaf:Person`. However, it cannot be assumed that all clients will perform the reasoning necessary to infer the `rdf:type` declarations implied by range and domain declarations. Therefore, if a data provider feels that it is important for a consumer to know that a resource is an instance of a particular class, the provider SHOULD type the resource using an explicit `rdf:type` triple even if that asserts the same information that could be inferred from a domain or range declaration. 
For example, if providers of images want to assure that an image will be found in a query for resources having `rdf:type` `foaf:Image`, they SHOULD NOT assume that describing the image using the property `foaf:depicts` will accomplish that because of the range declaration of `foaf:depicts`. It would be safer to include
```xml ```xml
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Image"/> <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Image"/>
@ -466,7 +470,7 @@ in the description so that clients searching for instances of either `foaf:Image
##### 2.3.1.4 Other predicates used to indicate type (normative) ##### 2.3.1.4 Other predicates used to indicate type (normative)
Both the Dublin Core and Darwin Core define terms that can be used to describe the nature of a resource: `dcterms:type` and `dwc:basisOfRecord` respectively. However, using these terms to describe the nature of the subject resource is not a substitute for use of `rdf:type`. The [DCMI notes on RDF semantics](http://dublincore.org/documents/dc-rdf/#sect-5) recommend that "applications implementing this specification primarily use and understand `rdf:type` in place of `dcterms:type` when expressing Dublin Core metadata in RDF, as most RDF processors come with built-in knowledge of `rdf:type`." A similar argument could be made for the use of `rdf:type` over `dwc:basisOfRecord`. Including `dc:type`, `dcterms:type`, and `dwc:basisOfRecord` in an RDF description should be considered optional, while including `rdf:type` should be considered highly recommended. A `dwciri:` analogue ([Section 2.5](#25-terms-in-the-dwciri-namespace-normative)) of `dwc:basisOfRecord` should not be used. Use `rdf:type` instead when the object is an IRI reference. Here is an example that describes a specimen using several of the terms that define the nature of a resource explicitly, including multiple `rdf:type` declarations: Both the Dublin Core and Darwin Core define terms that can be used to describe the nature of a resource: `dcterms:type` and `dwc:basisOfRecord` respectively. However, using these terms to describe the nature of the subject resource is not a substitute for use of `rdf:type`. The [DCMI notes on RDF semantics](https://www.dublincore.org/specifications/dublin-core/dc-rdf/2008-01-14/#sect-5) RECOMMEND that "applications implementing this specification primarily use and understand `rdf:type` in place of `dcterms:type` when expressing Dublin Core metadata in RDF, as most RDF processors come with built-in knowledge of `rdf:type`." A similar argument could be made for the use of `rdf:type` over `dwc:basisOfRecord`. 
Including `dc:type`, `dcterms:type`, and `dwc:basisOfRecord` in an RDF description should be considered OPTIONAL, while including `rdf:type` should be considered strongly RECOMMENDED. A `dwciri:` analogue ([Section 2.5](#25-terms-in-the-dwciri-namespace-normative)) of `dwc:basisOfRecord` SHOULD NOT be used. `rdf:type` SHOULD be used instead when the object is an IRI reference. Here is an example that describes a specimen using several of the terms that define the nature of a resource explicitly, including multiple `rdf:type` declarations:
**Example 8:** **Example 8:**
@ -493,13 +497,13 @@ Turtle
     dwc:basisOfRecord "PreservedSpecimen".      dwc:basisOfRecord "PreservedSpecimen".
``` ```
Refer to [Sections 2.4.3](#243-object-resources-that-have-been-previously-represented-by-literals-but-which-are-actually-non-literal-resources-non-normative) and [2.5](#25-terms-in-the-dwciri-namespace-normative) for an explanation of the distinction between terms in the `dc:`, `dcterms:`, `dwc:`, and `dwciri:` namespaces. Refer to [Sections 2.4.3](#243-object-resources-that-have-been-previously-represented-by-literals-but-which-are-actually-non-literal-resources-non-normative) and [2.5](#25-terms-in-the-dwciri-namespace-normative) for explanations of the distinctions between terms in the `dc:`, `dcterms:`, `dwc:`, and `dwciri:` namespaces.
##### 2.3.1.5 Classes to be used for type declarations of resources described using Darwin Core (normative) ##### 2.3.1.5 Classes to be used for type declarations of resources described using Darwin Core (normative)
The [TDWG GUID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies that an object in the biodiversity domain that is identified by a GUID should be typed using a well-known vocabulary. With this recommendation in mind, it should be considered a best practice to provide information about the type (i.e., class membership) of any resource that is assigned a persistent identifier in the form of an IRI. Since Darwin Core is a well-known vocabulary and a ratified TDWG standard, its classes should be used for typing in preference to classes in parts of the TDWG ontology which are not ratified standards and are effectively deprecated. The human-readable definitions of the Darwin core classes provide guidance for deciding the types to assign to resources, although community consensus may be necessary to classify some of the more complex kinds of resources. ([Section 1.4.4](#144-limitations-of-this-guide-non-normative)) The [TDWG GUID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies that an object in the biodiversity domain that is identified by a GUID SHOULD be typed using a well-known vocabulary. With this prescription in mind, it SHOULD be considered a best practice to provide information about the type (i.e., class membership) of any resource that is assigned a persistent identifier in the form of an IRI. Since Darwin Core is a well-known vocabulary and a ratified TDWG standard, its classes SHOULD be used for typing in preference to classes in parts of the [TDWG ontology](https://github.com/tdwg/ontology), which are not ratified standards and are effectively deprecated. The human-readable definitions of the Darwin Core classes provide guidance for deciding the types to assign to resources, although community consensus may be necessary to classify some of the more complex kinds of resources. ([Section 1.4.4](#144-limitations-of-this-guide-non-normative))
Any Darwin Core class IRI may be used as a value for `rdf:type`, although it is not clear whether `dwc:ResourceRelationship` instances make sense in the context of RDF. The following list summarizes classes included in the Dublin Core type vocabulary (but which are not part of Darwin Core) that should also be used for typing biodiversity-related resources: Any Darwin Core class IRI MAY be used as a value for `rdf:type`, although it is not clear whether `dwc:ResourceRelationship` instances make sense in the context of RDF. The following list summarizes classes included in the Dublin Core type vocabulary that MAY also be used for typing biodiversity-related resources:
```xml ```xml
dcmitype:StillImage dcmitype:StillImage
@ -517,13 +521,17 @@ dcmitype:Sound
dcmitype:PhysicalObject dcmitype:PhysicalObject
``` ```
```xml
dcmitype:Text
```
### 2.4 Object resources (non-normative) ### 2.4 Object resources (non-normative)
[Section 1.3.1](#131-serialization-and-syntax-non-normative) of the Introduction to this guide shows how the object of an RDF triple can be expressed as either a string literal or an IRI reference. It is also possible to have non-literal objects that are not identified by an IRI. These are known as blank or anonymous nodes. This section describes how to express objects in each of these three forms. [Section 1.4.3](#143-use-of-darwin-core-terms-in-rdf-normative) and [Section 1.5](#15-roles-of-text-strings-as-values-of-properties-in-dwc-namespace-non-normative) of the Introduction explains the issues involved in exposing data for which values as string literals (e.g., names, citations, and codes) are used as proxies for non-literal resources. This section also discusses strategies for expressing such data as RDF. [Section 1.3.1](#131-serialization-and-syntax-non-normative) of the Introduction to this guide shows how the object of an RDF triple can be expressed as either a string literal or an IRI reference. It is also possible to have non-literal objects that are not identified by an IRI. These are known as blank or anonymous nodes. This section describes how to express objects in each of these three forms. [Section 1.4.3](#143-use-of-darwin-core-terms-in-rdf-normative) and [Section 1.5](#15-roles-of-text-strings-as-values-of-properties-in-dwc-namespace-non-normative) of the Introduction explain the issues involved in exposing data for which values as string literals (e.g., names, citations, and codes) are used as proxies for non-literal resources. This section also discusses strategies for expressing such data as RDF.
#### 2.4.1 Literal object resources (normative) #### 2.4.1 Literal object resources (normative)
Some resources such as titles, dates, and numbers can be intrinsically expressed as strings. In cases where it is appropriate for the object of a triple to be a string, in RDF/XML the string is placed in a container element whose qualified name is the property: Some resources such as titles, dates, and numbers MAY be intrinsically expressed as strings. In cases where it is appropriate for the object of a triple to be a string, in RDF/XML the string is placed in a container element whose qualified name is the property:
```xml ```xml
<dwc:catalogNumber>s1987-00397</dwc:catalogNumber> <dwc:catalogNumber>s1987-00397</dwc:catalogNumber>
@ -541,7 +549,7 @@ RDF/XML
```xml ```xml
<rdf:Description rdf:about="http://bioimages.vanderbilt.edu/hessd/e5240#loc"> <rdf:Description rdf:about="http://bioimages.vanderbilt.edu/hessd/e5240#loc">
     <rdf:type rdf:resource ="http://purl.org/dc/terms/Location" />      <rdf:type rdf:resource="http://purl.org/dc/terms/Location"/>
     <dwc:decimalLatitude rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">35.857959</dwc:decimalLatitude>      <dwc:decimalLatitude rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">35.857959</dwc:decimalLatitude>
     <dwc:decimalLongitude rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">-86.298055</dwc:decimalLongitude>      <dwc:decimalLongitude rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">-86.298055</dwc:decimalLongitude>
     <dwc:coordinateUncertaintyInMeters rdf:datatype="http://www.w3.org/2001/XMLSchema#int">20</dwc:coordinateUncertaintyInMeters>      <dwc:coordinateUncertaintyInMeters rdf:datatype="http://www.w3.org/2001/XMLSchema#int">20</dwc:coordinateUncertaintyInMeters>
@ -558,7 +566,7 @@ Turtle
     dwc:coordinateUncertaintyInMeters "20"^^xsd:int.      dwc:coordinateUncertaintyInMeters "20"^^xsd:int.
``` ```
If the string expresses information in a particular language, a provider should include an [`xml:lang` attribute](http://www.w3.org/TR/REC-xml/#sec-lang-tag) to indicate the language of the string through the [RFC 4646 language code](http://www.ietf.org/rfc/rfc4646.txt) for that language. In addition to specifying the language of the string, providing a language tag entails that the described resource has the type `rdf:langString`. If the string expresses information in a particular language, a provider SHOULD include an [`xml:lang` attribute](http://www.w3.org/TR/REC-xml/#sec-lang-tag) to indicate the language of the string through the [RFC 4646 language code](http://www.ietf.org/rfc/rfc4646.txt) for that language. In addition to specifying the language of the string, providing a language tag entails that the described resource has the type `rdf:langString`.
**Example 10:** **Example 10:**
@ -579,23 +587,23 @@ Turtle
                                   "S. Claramunt, et al. 2009. Polifilia de Campylorhamphus y la Descripción de un Nuevo Género para C. pucherani (Dendrocolaptinae). The Awk 127(2):430-439."@es.                                    "S. Claramunt, et al. 2009. Polifilia de Campylorhamphus y la Descripción de un Nuevo Género para C. pucherani (Dendrocolaptinae). The Awk 127(2):430-439."@es.
``` ```
In the [RDF 1.1 Semantics](http://www.w3.org/TR/rdf11-mt/#literals-and-datatypes) specification, datatype D-entailment is a direct extension to basic RDF. The [entailment rules in that specification](http://www.w3.org/TR/rdf11-mt/#entailment-rules-informative) establish that literals without explicit datatype attributes or language tags have an implicit datatype `xsd:string`. That also entails that the `rdf:type` of those literals is `xsd:string`. The practical implication of this is that literals that are exposed without datatype attributes or language tags should be interpreted by clients to be a sequence of characters, and not some other abstract or non-information resource that a human might interpret the sequence of characters to represent. This has practical implications in [Section 2.4.3](#243-object-resources-that-have-been-previously-represented-by-literals-but-which-are-actually-non-literal-resources-non-normative) (where untyped literals are value strings intended to represent non-literal resources) and [Section 2.7](#27-darwin-core-convenience-terms-non-normative) (where untyped literals provide a convenient means for facilitating string-based searches). Although it is likely that many providers may initially choose to expose literals without datatype attributes, they should move towards replacing them with URIs or datatyped literals that accurately represent the type and properties of the resource that the untyped literals are intended to represent. In the [RDF 1.1 Semantics](http://www.w3.org/TR/rdf11-mt/#literals-and-datatypes) specification, datatype D-entailment is a direct extension to basic RDF. The [entailment rules in that specification](http://www.w3.org/TR/rdf11-mt/#entailment-rules-informative) establish that literals without explicit datatype attributes or language tags have an implicit datatype `xsd:string`. That also entails that the `rdf:type` of those literals is `xsd:string`. 
The practical implication of this is that literals that are exposed without datatype attributes or language tags SHOULD be interpreted by clients to be a sequence of characters, and not some other abstract or non-information resource that a human might interpret the sequence of characters to represent. This has practical implications in [Section 2.4.3](#243-object-resources-that-have-been-previously-represented-by-literals-but-which-are-actually-non-literal-resources-non-normative) (where untyped literals are value strings intended to represent non-literal resources) and [Section 2.7](#27-darwin-core-convenience-terms-non-normative) (where untyped literals provide a convenient means for facilitating string-based searches). Although it is likely that many providers may initially choose to expose literals without datatype attributes, they should move towards replacing them with URIs or datatyped literals that accurately represent the type and properties of the resource that the untyped literals are intended to represent.
[Section 3.4](#34-terms-defined-by-darwin-core-that-are-expected-to-be-used-only-with-literal-values-normative) indicates which Darwin Core terms would be appropriately used with values having datatype or language attributes. [Section 3.4](#34-terms-defined-by-darwin-core-that-are-expected-to-be-used-only-with-literal-values-normative) indicates which Darwin Core terms would be appropriately used with values having datatype or language attributes.
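The datatype rules summarized above (a language tag implies `rdf:langString`, and a literal with neither a datatype nor a language tag implies `xsd:string`) can be sketched as a small helper. The function name `effective_datatype` and its parameters are hypothetical and exist only for illustration.

```python
# Minimal sketch: the datatype a client would associate with a literal under RDF 1.1.
# Assumption: the literal's datatype IRI and language tag are passed in as optional strings.
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


def effective_datatype(datatype=None, lang=None):
    """Return the datatype IRI for a literal with an optional datatype or language tag."""
    if lang is not None:
        # Language-tagged strings have the datatype rdf:langString.
        return RDF_LANGSTRING
    if datatype is not None:
        # An explicit datatype attribute is used as given.
        return datatype
    # No datatype and no language tag: the implicit datatype is xsd:string.
    return XSD_STRING
```

For instance, the untyped literal in `<dwc:catalogNumber>s1987-00397</dwc:catalogNumber>` would be treated as a plain sequence of characters with datatype `xsd:string`.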
##### 2.4.1.2 Terms intended for use with literal objects (normative) ##### 2.4.1.2 Terms intended for use with literal objects (normative)
The definitions of some terms make it clear that they should be used with literal objects. Darwin Core specifically "imports" several [Dublin Core terms](http://dublincore.org/documents/dcmi-terms/) into its vocabulary for use in describing biodiversity data. In some cases, terms in the `dcterms:` namespace have range declarations of `rdfs:Literal` and are therefore understood to be intended for use with literal objects (strings). In some vocabularies, certain terms are required to have literal objects because in their definitions they are declared to be `owl:Datatype` properties. In the case of Darwin Core terms in the `dwc:` (`http://rs.tdwg.org/dwc/terms/`) namespace, the normative term definitions in RDF do not include any declarations that indicate whether the terms should be used with literal or IRI reference objects. (Exceptions to this are the various date-related terms, which inherit the range `rdfs:Literal` because they are `rdfs:subPropertyOf` `dcterms:date`.) However, because the `dwc:` terms were originally designed to accommodate text and XML data transfer, their definitions generally specify how term values should be expressed as string literals. This guide establishes the convention that terms in the `dwc:` namespace should be restricted to use with literal objects so that their use in RDF will be consistent with their definitions. As discussed in [Section 1.4.3](#143-use-of-darwin-core-terms-in-rdf-normative) and [Section 2.5](#25-terms-in-the-dwciri-namespace-normative), this guide introduces a separate namespace `http://rs.tdwg.org/dwc/iri/` (abbreviated as `dwciri:`) for additional Darwin Core terms which are intended to have objects that are IRI references. The definitions of some terms make it clear that they SHOULD be used with literal objects. Darwin Core specifically "imports" several [Dublin Core terms](http://dublincore.org/documents/dcmi-terms/) into its vocabulary for use in describing biodiversity data. 
In some cases, terms in the `dcterms:` namespace have range declarations of `rdfs:Literal` and are therefore understood to be intended for use with literal objects (strings). In some vocabularies, certain terms are understood to have literal objects because in their definitions they are declared to be `owl:Datatype` properties. In the case of Darwin Core terms in the `dwc:` (`http://rs.tdwg.org/dwc/terms/`) namespace, the normative term definitions in RDF do not include any declarations that indicate whether the terms are intended to be used with literal or IRI reference objects. However, because the `dwc:` terms were originally designed to accommodate text and XML data transfer, their definitions generally specify how term values ought to be expressed as string literals. This guide establishes the convention that terms in the `dwc:` namespace SHOULD be restricted to use with literal objects so that their use in RDF will be consistent with their definitions. As discussed in [Section 1.4.3](#143-use-of-darwin-core-terms-in-rdf-normative) and [Section 2.5](#25-terms-in-the-dwciri-namespace-normative), this guide introduces a separate namespace `http://rs.tdwg.org/dwc/iri/` (abbreviated as `dwciri:`) for additional Darwin Core terms that are intended to have objects that are IRI references.
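A provider could check its own output against this convention before publishing. The sketch below is one possible check, not part of the guide: the function name is hypothetical, predicates are assumed to be full IRIs, and `exampleTerm` is a placeholder rather than a real `dwciri:` term.

```python
# Minimal sketch: flag triples that depart from the dwc:/dwciri: convention.
# Assumption: the caller already knows whether each object is a literal.
DWC = "http://rs.tdwg.org/dwc/terms/"
DWCIRI = "http://rs.tdwg.org/dwc/iri/"


def violates_convention(predicate, object_is_literal):
    """Return True if the object kind does not match the predicate's namespace."""
    if predicate.startswith(DWC):
        # dwc: terms are intended for literal objects.
        return not object_is_literal
    if predicate.startswith(DWCIRI):
        # dwciri: terms are intended for IRI-reference objects.
        return object_is_literal
    # Terms from other namespaces are outside the scope of this check.
    return False


# dwc:catalogNumber with a string value follows the convention.
ok = violates_convention(DWC + "catalogNumber", object_is_literal=True)
# A dwciri: term (placeholder name) with a string value does not.
bad = violates_convention(DWCIRI + "exampleTerm", object_is_literal=True)
```

Because the convention is stated with SHOULD rather than MUST, a result of `True` is better treated as a warning for review than as a hard validation failure.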
#### 2.4.2 Non-literal object resources (normative) #### 2.4.2 Non-literal object resources (normative)
Resources that are physical or conceptual often cannot be intrinsically represented as string literals and if identified, they are referenced in RDF by IRIs. Digital resources (e.g., images, web pages, etc.) could be represented as literals (the encoded content of the resource), but because many characters would be required to do that, they are usually referenced as independent entities through IRIs. In RDF/XML an IRI reference to a non-literal object can be made using the attribute `rdf:resource` in an empty XML element: Resources that are physical or conceptual often cannot be intrinsically represented as string literals and if identified, they are referenced in RDF by IRIs. Digital resources (e.g., images, web pages, etc.) could be represented as literals (the encoded content of the resource), but because many characters would be required to do that, they are usually referenced as independent entities through IRIs. In RDF/XML an IRI reference to a non-literal object MAY be made using the attribute `rdf:resource` in an empty XML element:
```xml ```xml
<dcterms:rightsHolder rdf:resource="http://biocol.org/urn:lsid:biocol.org:col:15666"/> <dcterms:rightsHolder rdf:resource="http://biocol.org/urn:lsid:biocol.org:col:15666"/>
``` ```
A description of the referenced non-literal object may be found within the same document, among data from another provider, or there may be no description of the object. If the RDF document will describe further properties of the non-literal, IRI-identified resource, those properties can be placed within an `rdf:Description` container element having an `rdf:about` attribute whose value is the IRI of the resource: A description of the referenced non-literal object might be found within the same document, among data from another provider, or there might be no description of the object. If the RDF document will describe further properties of the non-literal, IRI-identified resource, those properties MAY be placed within an `rdf:Description` container element having an `rdf:about` attribute whose value is the IRI of the resource:
**Example 11:** **Example 11:**
@ -606,7 +614,7 @@ A description of the referenced non-literal object may be found within the same
<dcterms:rightsHolder> <dcterms:rightsHolder>
``` ```
If the non-literal object is not identified by an IRI (i.e., it is a blank node), its properties can be placed within an `rdf:Description` container element that has no `rdf:about` attribute and which is itself within a container element for the property: If the non-literal object is not identified by an IRI (i.e., it is a blank node), its properties MAY be placed within an `rdf:Description` container element that has no `rdf:about` attribute and which is itself within a container element for the property:
**Example 12:** **Example 12:**
@ -620,7 +628,7 @@ If the non-literal object is not identified by an IRI (i.e., it is a blank node)
##### 2.4.2.1 When should non-literal object resources be described within the same document? (non-normative) ##### 2.4.2.1 When should non-literal object resources be described within the same document? (non-normative)
There are positive and negative aspects to describing a resource within the same document that references it. If the IRI is not dereferenceable, either because the IRI is not a type that can be dereferenced by HTTP (e.g., a URN) or because the issuer of the IRI is temporarily or permanently failing to respond to HTTP calls, then providing minimal information about the resource in the referencing document may be beneficial. For example, if a property referred to a printed book which had an ISBN but no HTTP IRI, as in: There are positive and negative aspects to describing a resource within the same document that references it. If the IRI is not dereferenceable, either because the IRI is not a type that can be dereferenced by HTTP (e.g., a URN) or because the issuer of the IRI is temporarily or permanently failing to respond to HTTP calls, then providing minimal information about the resource in the referencing document may be beneficial. For example, if a property referred to a printed book that had an ISBN but no HTTP IRI, as in:
**Example 13:** **Example 13:**
@ -654,7 +662,7 @@ could be included as part of the document that references the ISBN in URN form.
     dc:creator "Gleason, Henry A. and Arthur Cronquist".      dc:creator "Gleason, Henry A. and Arthur Cronquist".
``` ```
However, if the IRI references an object resource whose data are being actively managed by another provider, then any data which are included in the referencing document may become outdated. In that case, it is probably better to simply link the IRI, and let the consumer of the referring document dereference the IRI to retrieve the most up-to-date data about the object resource. In this example: However, if the IRI references an object resource whose data are being actively managed by another provider, then any data which are included in the referencing document might become outdated. In that case, it is probably better to simply link the IRI, and let the consumer of the referring document dereference the IRI to retrieve the most up-to-date data about the object resource. In this example:
**Example 15:** **Example 15:**
@ -678,7 +686,7 @@ the rights holder object IRI is managed by an institution other than the image o
###### 2.4.2.1.1 Objects identified by LSIDs (normative) ###### 2.4.2.1.1 Objects identified by LSIDs (normative)
In the previous example, the HTTP IRI used as the object of the `dcterms:rightsHolder` property was an HTTP-proxied form of an LSID. Because an LSID is a URN and therefore a type of IRI, the RDF specification does not prohibit the use of an LSID as an IRI referenced object. However, the TDWG LSID Applicability Guide standard dictates that LSIDs must not be used as the object of RDF triples (Recommendation 31 of the [GUID applicability statement](http://www.tdwg.org/standards/150/)) because a client would not necessarily be able to dereference the LSID to discover additional information about the object resource. The HTTP-proxied version of the LSID should be used instead (see [Section 2.2.3](#223-associating-a-urn-with-its-http-proxied-equivalent-normative)). See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information about using LSIDs in RDF. In the previous example, the HTTP IRI used as the object of the `dcterms:rightsHolder` property was an HTTP-proxied form of an LSID. Because an LSID is a URN and therefore a type of IRI, the RDF specification does not prohibit the use of an LSID as an IRI referenced object. However, the TDWG LSID Applicability Guide standard dictates that LSIDs MUST NOT be used as the object of RDF triples (Recommendation 31 of the [GUID applicability statement](http://www.tdwg.org/standards/150/)) because a client would not necessarily be able to dereference the LSID to discover additional information about the object resource. The HTTP-proxied version of the LSID SHOULD be used instead (see [Section 2.2.3](#223-associating-a-urn-with-its-http-proxied-equivalent-normative)). See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information about using LSIDs in RDF.
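The HTTP-proxied form used in the previous example is the LSID appended to a resolver base. A minimal sketch of that substitution follows; the function name `proxy_lsid` is hypothetical, and the choice of resolver base is an assumption that each provider has to settle for its own identifiers.

```python
# Minimal sketch: build an HTTP-proxied IRI from an LSID.
# Assumption: the proxy simply prefixes the LSID with a resolver base URL,
# as in the dcterms:rightsHolder example in this guide.
def proxy_lsid(lsid, proxy_base):
    """Return the HTTP-proxied IRI for an LSID, or raise ValueError if it is not an LSID."""
    if not lsid.lower().startswith("urn:lsid:"):
        raise ValueError(f"not an LSID: {lsid!r}")
    # Normalize the base so exactly one slash separates it from the LSID.
    return proxy_base.rstrip("/") + "/" + lsid


iri = proxy_lsid("urn:lsid:biocol.org:col:15666", "http://biocol.org/")
```

The resulting IRI, rather than the bare `urn:lsid:` form, is what would appear as the object of the triple.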
##### 2.4.2.2 Objects which are blank (anonymous) nodes (non-normative) ##### 2.4.2.2 Objects which are blank (anonymous) nodes (non-normative)
@ -713,7 +721,7 @@ the described subject resource refers to an object resource that is not identifi
#### 2.4.3 Object resources that have been previously represented by literals but which are actually non-literal resources (non-normative) #### 2.4.3 Object resources that have been previously represented by literals but which are actually non-literal resources (non-normative)
In databases the names of entities have frequently been used to represent the entities themselves. For example, the name of a person is often used as a proxy for the person, or an abbreviation for a language has been used to represent the language itself. Prior to the creation of the DCMI Abstract Model, many data which were described using terms in the legacy Dublin Core namespace `dc:` (`http://purl.org/dc/elements/1.1/`) followed the historical practice of using the name of an entity to represent a non-literal entity. Extending this practice to RDF would result in representations such as this: In databases the names of entities have frequently been used to represent the entities themselves. For example, the name of a person is often used as a proxy for the person, or an abbreviation for a language has been used to represent the language itself. Prior to the creation of the DCMI Abstract Model, many data that were described using terms in the legacy Dublin Core namespace `dc:` (`http://purl.org/dc/elements/1.1/`) followed the historical practice of using the name of an entity to represent a non-literal entity. Extending this practice to RDF would result in representations such as this:
**Example 17:** **Example 17:**
@ -736,7 +744,7 @@ However, over time, the community of RDF users has come to consider it a best pr
##### 2.4.3.1 Literal values for non-literal resources in Dublin Core (normative) ##### 2.4.3.1 Literal values for non-literal resources in Dublin Core (normative)
The introduction of the DCMI Abstract Model (DCAM) and subsequent guidelines for the use of [Dublin Core terms in RDF](http://dublincore.org/documents/dc-rdf/#sect-4) were intended to clarify the [use of Dublin Core terms with literal and non-literal objects](http://dublincore.org/documents/2008/01/14/dc-rdf-notes/#sect-3). In particular, ranges were declared for terms in the `dcterms:` namespace (`http://purl.org/dc/terms/`) with the intention of clarifying whether each term was intended for use with a literal or a non-literal value. For example, `dcterms:bibliographicCitation` has the range `rdfs:Literal`, while `dcterms:creator` has range `dcterms:Agent`. Because the term `dcterms:creator` has a non-literal range, it should be used with an object that is an IRI reference as illustrated in the following example: The introduction of the DCMI Abstract Model (DCAM) and subsequent guidelines for the use of [Dublin Core terms in RDF](https://www.dublincore.org/specifications/dublin-core/dc-rdf/2008-01-14/#sect-4) were intended to clarify the [use of Dublin Core terms with literal and non-literal objects](https://www.dublincore.org/specifications/dublin-core/dc-rdf-notes/2008-01-14/#sect-3). In particular, ranges were declared for terms in the `dcterms:` namespace (`http://purl.org/dc/terms/`) with the intention of clarifying whether each term was intended for use with a literal or a non-literal value. For example, `dcterms:bibliographicCitation` has the range `rdfs:Literal`, while `dcterms:creator` has range `dcterms:Agent`. Because the term `dcterms:creator` has a non-literal range, it SHOULD be used with an object that is an IRI reference as illustrated in the following example:
**Example 18:** **Example 18:**
@ -754,11 +762,11 @@ Turtle
<http://dbpedia.org/resource/Starry_night>dcterms:creator <http://viaf.org/viaf/9854560>. <http://dbpedia.org/resource/Starry_night>dcterms:creator <http://viaf.org/viaf/9854560>.
``` ```
The [Dublin Core RDF guidelines](http://dublincore.org/documents/dc-rdf/#sect-4) provided a mechanism using the term `rdf:value` to permit legacy string literal data to be associated with Dublin Core terms in the `dcterms:` namespace that were not intended for use with literal objects. Using this mechanism, a non-literal resource could be represented by a blank node having an `rdf:value` property whose value was the legacy string literal. This value is known as a "value string". However, the mechanism which involves using `rdf:value` as a predicate has not been widely implemented. At the time when the `dcterms:` terms were defined, terms in the `dc:` namespace were left without range declarations. Thus it has been considered [acceptable in the Dublin Core community](http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#Legacy_namespace) to use the `dc:` namespace terms with legacy string literals (i.e., value strings) as shown in Example 15. Many providers of non-RDF data may have used literal values for terms in the `dcterms:` namespace that have non-literal ranges. Note that all terms in the `dcterms:` namespace that have corresponding terms in the `dc:` namespace (i.e., terms with identical local names sensu [Best Practices Recipes](http://www.w3.org/TR/swbp-vocab-pub/#naming)) are [declared to be `rdfs:subPropertyOf` those `dc:` namespace terms](http://dublincore.org/usage/decisions/2008/dcterms-changes/#sect-2). So if a data provider's non-RDF database contains string values for terms in the `dcterms:` namespace having non-literal ranges, it is appropriate to expose those literals in RDF as values of corresponding `dc:` terms. The [Dublin Core RDF guidelines](https://www.dublincore.org/specifications/dublin-core/dc-rdf/2008-01-14/#sect-4) provided a mechanism using the term `rdf:value` to permit legacy string literal data to be associated with Dublin Core terms in the `dcterms:` namespace that were not intended for use with literal objects. 
Using this mechanism, a non-literal resource could be represented by a blank node having an `rdf:value` property whose value was the legacy string literal. This value is known as a "value string". However, the mechanism that involves using `rdf:value` as a predicate has not been widely implemented. At the time when the `dcterms:` terms were defined, terms in the `dc:` namespace were left without range declarations. Thus it has been considered [acceptable in the Dublin Core community](http://web.archive.org/web/20161120174206/http://wiki.dublincore.org/index.php/User_Guide/Publishing_Metadata#Legacy_namespace) to use the `dc:` namespace terms with legacy string literals (i.e., value strings) as shown in Example 15. Many providers of non-RDF data may have used literal values for terms in the `dcterms:` namespace that have non-literal ranges. Note that all terms in the `dcterms:` namespace that have corresponding terms in the `dc:` namespace (i.e., terms with identical local names sensu [Best Practices Recipes](http://www.w3.org/TR/swbp-vocab-pub/#naming)) are [declared to be `rdfs:subPropertyOf` those `dc:` namespace terms](http://web.archive.org/web/20190303000113/http://dublincore.org/usage/decisions/2008/dcterms-changes/#sect-2). So if a data provider's non-RDF database contains string values for terms in the `dcterms:` namespace having non-literal ranges, they SHOULD expose those literals in RDF as values of corresponding `dc:` terms.
##### 2.4.3.2 Literal values for non-literal resources in Darwin Core (normative) ##### 2.4.3.2 Literal values for non-literal resources in Darwin Core (normative)
Because there are many legacy data composed of string values of `dwc:` namespace (`http://rs.tdwg.org/dwc/terms/`) terms whose objects actually represent non-literal entities (i.e., value strings sensu DCAM), it is likely that many providers will at least initially expose such data in RDF as string literals served directly from their existing databases, e.g. [R2RML](http://www.w3.org/TR/r2rml/). To make it possible for the legacy data to be exposed as RDF while also ensuring that the meaning of those data is preserved, in RDF the literal value of an existing Darwin Core term in the `dwc:` namespace should have the same structure as that described in the term's description, as in Example 19. Because there are many legacy data composed of string values of `dwc:` namespace (`http://rs.tdwg.org/dwc/terms/`) terms whose objects actually represent non-literal entities (i.e., value strings sensu DCAM), it is likely that many providers will at least initially expose such data in RDF as string literals served directly from their existing databases, e.g., [R2RML](http://www.w3.org/TR/r2rml/). To make it possible for the legacy data to be exposed as RDF while also ensuring that the meaning of those data is preserved, in RDF the literal value of an existing Darwin Core term in the `dwc:` namespace SHOULD have the same structure as that described in the term's description, as in Example 19.
**Example 19:** **Example 19:**
@ -777,18 +785,18 @@ Turtle
     dwc:recordedBy "Oliver P. Pearson | Anita K. Pearson".      dwc:recordedBy "Oliver P. Pearson | Anita K. Pearson".
``` ```
The terms in the `dwc:` namespace should NOT be used for IRI reference objects, even if the term definition suggests that the object resource is of a non-literal type. Instead, IRI reference objects should be used with terms in the `dwciri:` namespace as defined by this guide in [Section 2.5](#25-terms-in-the-dwciri-namespace-normative). When a string value is provided as the object of a `dwc:` namespace term whose definition suggests that the object is of a non-literal type, that string is understood to be serving as a value string. The terms in the `dwc:` namespace SHOULD NOT be used for IRI reference objects, even if the term definition suggests that the object resource is of a non-literal type. Instead, IRI reference objects SHOULD be used with terms in the `dwciri:` namespace as defined by this guide in [Section 2.5](#25-terms-in-the-dwciri-namespace-normative). When a string value is provided as the object of a `dwc:` namespace term whose definition suggests that the object is of a non-literal type, that string is understood to be serving as a value string.
### 2.5 Terms in the dwciri: namespace (normative) ### 2.5 Terms in the dwciri: namespace (normative)
Terms in the namespace `dwciri:` (`http://rs.tdwg.org/dwc/iri/`) are intended for use with IRI reference objects and should NOT be used with literal objects. They may also be used with blank node objects, although in most cases this will probably be unnecessary. Terms in the namespace `dwciri:` (`http://rs.tdwg.org/dwc/iri/`) are intended for use with IRI reference objects and MUST NOT be used with literal objects. They MAY be used with blank node objects, although in most cases this will probably be unnecessary.
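The constraint on `dwciri:` terms can be checked mechanically. The sketch below is one possible way to do so in plain Python, assuming triples are held as `(subject, predicate, object)` tuples; the classes `Iri`, `Literal`, and `BlankNode` and the function `dwciri_literal_violations` are hypothetical names introduced for illustration and do not come from this guide or from any RDF library.

```python
from dataclasses import dataclass

DWC = "http://rs.tdwg.org/dwc/terms/"
DWCIRI = "http://rs.tdwg.org/dwc/iri/"


@dataclass(frozen=True)
class Iri:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


def dwciri_literal_violations(triples):
    """Return the triples that use a dwciri: predicate with a literal object."""
    return [t for t in triples if t[1].startswith(DWCIRI) and isinstance(t[2], Literal)]


subject = "http://arctos.database.museum/guid/MVZ:Mamm:115956"
triples = [
    # dwc: term with a value string: acceptable
    (subject, DWC + "recordedBy", Literal("Oliver P. Pearson | Anita K. Pearson")),
    # dwciri: term with an IRI reference: acceptable
    (subject, DWCIRI + "recordedBy", Iri("http://viaf.org/viaf/263074474")),
    # dwciri: term with a literal: not permitted
    (subject, DWCIRI + "recordedBy", Literal("Anita K. Pearson")),
]
bad = dwciri_literal_violations(triples)
```

Blank node objects are deliberately not reported, since the text above permits them with `dwciri:` terms.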
#### 2.5.1 Definition of dwciri: terms (normative) #### 2.5.1 Definition of dwciri: terms (normative)
If a term in the `dwciri:` namespace has a corresponding term with the same [local name](http://www.w3.org/TR/swbp-vocab-pub/#naming) in the `dwc:` namespace, the `dwciri:` namespace term is defined to have the same meaning as its `dwc:` namespace term analogue. In defining a `dwciri:` term that has a `dwc:` analogue, the definition of the `dwc:` term is understood to be modified in the following ways: If a term in the `dwciri:` namespace has a corresponding term with the same [local name](http://www.w3.org/TR/swbp-vocab-pub/#naming) in the `dwc:` namespace, the `dwciri:` namespace term is defined to have the same meaning as its `dwc:` namespace term analogue. In defining a `dwciri:` term that has a `dwc:` analogue, the definition of the `dwc:` term is understood to be modified in the following ways:
- when a `dwciri:` term is used as an RDF predicate, its non-literal object will be identified by an IRI reference rather than a string literal - when a `dwciri:` term is used as an RDF predicate, its non-literal object SHALL be identified by an IRI reference rather than a string literal
- the object of the `dwciri:` term predicate will be a single resource. If the `dwc:` term definition specifies that multiple values should be a concatenated list, the resource described by a `dwciri:` property should be the subject of a triple for each value on the list. Alternatively, a single triple can be used to describe the subject if the object is a single resource composed of component resources described using additional RDF triples. - the object of the `dwciri:` term predicate SHOULD be a single resource. If the `dwc:` term definition specifies that multiple values should be a concatenated list, the resource described by a `dwciri:` property SHOULD be the subject of a triple for each value on the list. Alternatively, a single triple MAY be used to describe the subject if the object is a single resource composed of component resources described using additional RDF triples.
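The second point in the list above can be sketched as follows: a concatenated `dwc:` value string is split into its component values, and one `dwciri:` triple is emitted per value for which an IRI is known. The helper names, the `|` separator default, and in particular the name-to-IRI pairing in `agent_iris` are assumptions made for illustration only; a real provider would draw the IRIs from its own authority data.

```python
DWCIRI = "http://rs.tdwg.org/dwc/iri/"


def split_value_string(value: str, separator: str = "|") -> list[str]:
    """Split a concatenated value string into trimmed, non-empty values."""
    return [part.strip() for part in value.split(separator) if part.strip()]


def dwciri_triples(subject: str, local_name: str, value_string: str, iri_lookup: dict) -> list[tuple]:
    """Build one (subject, predicate, object IRI) triple per value with a known IRI."""
    predicate = DWCIRI + local_name
    return [
        (subject, predicate, iri_lookup[name])
        for name in split_value_string(value_string)
        if name in iri_lookup
    ]


# Hypothetical lookup table: the pairing of names and IRIs is assumed, not asserted.
agent_iris = {
    "Oliver P. Pearson": "http://viaf.org/viaf/263074474",
    "Anita K. Pearson": "http://museum-x.org/personnel/akp",
}

result = dwciri_triples(
    "http://arctos.database.museum/guid/MVZ:Mamm:115956",
    "recordedBy",
    "Oliver P. Pearson | Anita K. Pearson",
    agent_iris,
)
```

Values with no known IRI are simply skipped here; they can still be exposed as a literal through the corresponding `dwc:` term.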
Several terms in the `dwciri:` namespace do not have `dwc:` namespace analogues (`dwciri:inCollection`, `dwciri:toTaxon`, `dwciri:inDescribedPlace`, `dwciri:earliestGeochronologicalEra`, `dwciri:latestGeochronologicalEra`, `dwciri:fromLithostratigraphicUnit`, and `dwciri:inDataset`). Their definitions are given in [Section 3.6](#36-dwciri-terms-having-local-names-that-dont-correspond-to-terms-in-the-dwc-darwin-core-namespace-normative). Several terms in the `dwciri:` namespace do not have `dwc:` namespace analogues (`dwciri:inCollection`, `dwciri:toTaxon`, `dwciri:inDescribedPlace`, `dwciri:earliestGeochronologicalEra`, `dwciri:latestGeochronologicalEra`, `dwciri:fromLithostratigraphicUnit`, and `dwciri:inDataset`). Their definitions are given in [Section 3.6](#36-dwciri-terms-having-local-names-that-dont-correspond-to-terms-in-the-dwc-darwin-core-namespace-normative).
@ -802,8 +810,8 @@ RDF/XML
```xml ```xml
<rdf:Description rdf:about="http://arctos.database.museum/guid/MVZ:Mamm:115956"> <rdf:Description rdf:about="http://arctos.database.museum/guid/MVZ:Mamm:115956">
     <dwciri:recordedBy rdf:resource="http://viaf.org/viaf/263074474" />      <dwciri:recordedBy rdf:resource="http://viaf.org/viaf/263074474"/>
     <dwciri:recordedBy rdf:resource="http://museum-x.org/personnel/akp" />      <dwciri:recordedBy rdf:resource="http://museum-x.org/personnel/akp"/>
</rdf:Description> </rdf:Description>
``` ```
@ -818,11 +826,11 @@ where `<http://viaf.org/viaf/263074474>` is a persistent IRI identifier for the
#### 2.5.3 Expectation of clients encountering RDF containing dwc: and dwciri: terms (normative) #### 2.5.3 Expectation of clients encountering RDF containing dwc: and dwciri: terms (normative)
A client that encounters a triple having a term from the `dwciri:` namespace as its predicate can expect the object of the triple to be an IRI reference and subsequently may be able to dereference that IRI to obtain additional information about the entity that it represents. A client encountering a triple having a term from the `dwc:` namespace should be prepared to accept a literal object, although it is possible that some data providers unaware of this guide may have used `dwc:` terms with IRI references as `rdf:resource` attributes. Application developers should be flexible in their expectations for the values of properties from the `dwc:` namespace. A client that encounters a triple having a term from the `dwciri:` namespace as its predicate SHOULD expect the object of the triple to be an IRI reference and subsequently MAY be able to dereference that IRI to obtain additional information about the entity that it represents. A client encountering a triple having a term from the `dwc:` namespace SHOULD be prepared to accept a literal object, although it is possible that some data providers unaware of this guide might have used `dwc:` terms with IRI references as `rdf:resource` attributes. Application developers should be flexible in their expectations for the values of properties from the `dwc:` namespace.
### 2.6 Darwin Core ID terms and RDF (normative) ### 2.6 Darwin Core ID terms and RDF (normative)
Darwin Core contains a number of "ID" terms intended to designate identifiers, e.g., `dwc:occurrenceID`, `dwc:identificationID`, `dwc:locationID`, etc. The "ID" terms provide two functions, specifying the class of the resource and indicating that the value of the term is an identifier. These functions are illustrated by the non-RDF XML below, which is part of an example provided in the [Darwin Core XML Guide](http://rs.tdwg.org/dwc/terms/guides/xml/index.htm): Darwin Core contains a number of "ID" terms intended to designate identifiers, e.g., `dwc:occurrenceID`, `dwc:identificationID`, `dwc:locationID`, etc. The "ID" terms provide two functions, specifying the class of the resource and indicating that the value of the term is an identifier. These functions are illustrated by the non-RDF XML below, which is part of a hypothetical example provided in the [Darwin Core XML Guide](http://rs.tdwg.org/dwc/terms/guides/xml/):
**Example 21:** **Example 21:**
@ -845,7 +853,7 @@ dwc:identificationID | dwc:identifiedBy | dwc:dateIdentified | dwc:taxonID
--- | --- | --- | --- --- | --- | --- | ---
"http://guid.mvz.org/identifications/23459" | "Richard Sage" | "2000" | "urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721" "http://guid.mvz.org/identifications/23459" | "Richard Sage" | "2000" | "urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721"
In Example 21 and Table 3, the "ID" terms are used to specify both the identifier of a resource which is the subject of the record itself (using the term `dwc:identificationID`) and to specify an object resource related to the subject resource by a foreign key (using the term `dwc:taxonID`). Because the ID terms are not designated to be used with subjects or objects specifically, the resource they identify must be made clear by the context in which they are used. In the case of the database table, a pre-existing understanding between the data provider and consumer would indicate that the rows of the table represent `dwc:Identification` instances and therefore the `dwc:identificationID` property would provide the identifier of the subject and any other ID terms would refer to object resources that are related to that particular identification instance. In the XML example, the type of the subject resource is made clear through the record's container element and hence a consumer would know that the `dwc:identificationID` property referred to the subject resource. In RDF, the two functions (specifying type and referencing an identifier) are handled separately using `rdf:type` declarations ([Section 2.3.1](#231-declaring-the-type-of-the-resource-non-normative)) and defined mechanisms for expressing the identifier of the subject resource ([Section 2.2](#22-subject-resources-normative)). Because these mechanisms are well-known best practices outside TDWG, they should be used rather than using the Darwin Core ID terms when data are expressed as RDF. The following example shows an appropriate way to express the information in Example 21 as RDF: In Example 21 and Table 3, the "ID" terms are used to specify both the identifier of a resource that is the subject of the record itself (using the term `dwc:identificationID`) and to specify an object resource related to the subject resource by a foreign key (using the term `dwc:taxonID`). 
Because the ID terms are not designated to be used with subjects or objects specifically, the resource they identify must be made clear by the context in which they are used. In the case of the database table, a pre-existing understanding between the data provider and consumer would indicate that the rows of the table represent `dwc:Identification` instances and therefore the `dwc:identificationID` property would provide the identifier of the subject and any other ID terms would refer to object resources that are related to that particular identification instance. In the XML example, the type of the subject resource is made clear through the record's container element and hence a consumer would know that the `dwc:identificationID` property referred to the subject resource. In RDF, the two functions (specifying type and referencing an identifier) are handled separately using `rdf:type` declarations ([Section 2.3.1](#231-declaring-the-type-of-the-resource-non-normative)) and defined mechanisms for expressing the identifier of the subject resource ([Section 2.2](#22-subject-resources-normative)). Because these mechanisms are well-known best practices outside TDWG, they SHOULD be used rather than using the Darwin Core ID terms, which MUST NOT be used when data are expressed as RDF. The following example shows an appropriate way to express the information in Example 21 as RDF:
**Example 22:** **Example 22:**
@ -873,9 +881,9 @@ Turtle:
The following points about Example 22 should be noted: The following points about Example 22 should be noted:
1. In RDF/XML use the `rdf:about` attribute of an `rdf:Description` element to specify an IRI identifier used with the subject ID term, i.e., the value of `dwc:identificationID` is an HTTP IRI and can therefore be used with the `rdf:about` attribute. The `dcterms:identifier` property can also be used to express the primary identifier for the subject resource as a literal. In Example 22, the HTTP IRI is considered the primary identifier for the Identification instance so it is included as a string value of `dcterms:identifier` as well as the IRI of the subject resource. Refer to [Section 2.2.1](#221-identifying-subject-resources-using-iris-normative) and [2.2.2](#222-associating-a-string-identifier-with-a-subject-resource-normative) for more details. 1. In RDF/XML use the `rdf:about` attribute of an `rdf:Description` element to specify an IRI identifier used with the subject ID term, i.e., the value of `dwc:identificationID` is an HTTP IRI and SHOULD therefore be used with the `rdf:about` attribute. The `dcterms:identifier` property MAY also be used to express the primary identifier for the subject resource as a literal. In Example 22, the HTTP IRI is considered the primary identifier for the Identification instance so it is included as a string value of `dcterms:identifier` as well as the IRI of the subject resource. Refer to [Section 2.2.1](#221-identifying-subject-resources-using-iris-normative) and [2.2.2](#222-associating-a-string-identifier-with-a-subject-resource-normative) for more details.
2. Data providers who want to relate a subject resource to related non-literal resources should use [object properties](http://www.w3.org/TR/owl-primer/#Object_Properties) (i.e., properties which relate IRI-identified instances to other IRI-identified instances) from a well-known vocabulary or ontology. Darwin Core does not generally define object properties that connect its core classes and in those cases users will have to find object properties outside of Darwin Core (see the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for examples). In this example, the term `dwciri:toTaxon` (see [Section 2.7.4](#274-description-of-a-taxonomic-entity-normative)) is used to relate the `dwc:Identification` instance to a taxon instance. (Please note that this is for illustration purposes only and this guide takes no position on the nature of taxa or taxon concepts or whether the resource used in this example is actually a taxon or not.) 2. Data providers who want to relate a subject resource to related non-literal resources SHOULD use [object properties](http://www.w3.org/TR/owl-primer/#Object_Properties) (i.e., properties that relate IRI-identified instances to other IRI-identified instances) from a well-known vocabulary or ontology. Darwin Core does not generally define object properties that connect its core classes and in those cases users will have to find object properties outside of Darwin Core (see the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for examples). In this example, the term `dwciri:toTaxon` (see [Section 2.7.4](#274-description-of-a-taxonomic-entity-normative)) is used to relate the `dwc:Identification` instance to a taxon instance. (Please note that this is for illustration purposes only and this guide takes no position on the nature of taxa or taxon concepts or whether the resource used in this example is actually a taxon or not.)
3. If an identified object of a triple is a non-literal resource ([Section 2.4.2](#242-non-literal-object-resources-normative)), RDF requires that it be referenced by an IRI. Although the UUID "d79c11aa-29c1-102b-9a4a-00304854f820:col20120721" is a globally unique and hopefully persistent identifier for the taxon, it is not an IRI. Catalog of Life has created an IRI from the UUID in the form of an LSID: 3. If an identified object of a triple is a non-literal resource ([Section 2.4.2](#242-non-literal-object-resources-normative)), RDF REQUIRES that it be referenced by an IRI. Although the UUID "d79c11aa-29c1-102b-9a4a-00304854f820:col20120721" is a globally unique and hopefully persistent identifier for the taxon, it is not an IRI. Catalog of Life has created an IRI from the UUID in the form of an LSID:
```xml ```xml
<urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721> <urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721>
@ -887,13 +895,17 @@ so expressing the object reference as
<dwciri:toTaxon rdf:resource="urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721"/> <dwciri:toTaxon rdf:resource="urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721"/>
``` ```
would be valid RDF. However, best practices ([Section 2.4.2.1.1](#24211-objects-identified-by-lsids-normative)) specify that when LSIDs are the objects of triples, they should be in HTTP-proxied form. In Example 22, the LSID is proxied using the TDWG LSID resolver. would be valid RDF. However, best practices ([Section 2.4.2.1.1](#24211-objects-identified-by-lsids-normative)) specify that when LSIDs are the objects of triples, they SHOULD be in HTTP-proxied form. In Example 22, the LSID is proxied using the TDWG LSID resolver.
4. The class of the subject identification instance (`dwc:Identification`) is asserted explicitly using `rdf:type`. The type of the object taxon instance is not stated directly - a client would need to dereference the IRI to discover it. 4. The class of the subject identification instance (`dwc:Identification`) is asserted explicitly using `rdf:type`. The type of the object taxon instance is not stated directly - a client would need to dereference the IRI to discover it.
#### 2.6.1 Unintended consequences of using Darwin Core ID terms in RDF (non-normative) #### 2.6.1 Unintended consequences of using Darwin Core ID terms in RDF (non-normative)
The previous section showed that using a Darwin Core ID term to indicate the identifier associated with the subject resource is not necessary because there are well-known means in RDF (the `rdf:about` attribute and the `dcterms:identifier` property) for exposing the subject's identifier. However, as shown below, using a Darwin Core ID term to identify an object resource (as shown in the non-RDF XML Example 21) would actually be problematic. In its normative definition, each Darwin Core ID term is declared to be `rdfs:subPropertyOf` `dcterms:identifier`. In RDF, the purpose of a subproperty declaration is to allow a client with reasoning capability to infer a triple containing a broader (and presumably more well-known) property. In the terminology of Dublin Core, the "ID" term is a [qualifier](http://dublincore.org/documents/usageguide/qualifiers.shtml) which "refines" a basic Dublin Core term and the process of inferring a broader meaning from a more specific term is called a "dumb-down" operation. If a provider attempting to expose the data of Table 3 as RDF used the `dwc:taxonID` term as a property of the identification as shown in the (incorrect) Example 23: *Note added in 2021-07-15 version:*
*When the Darwin Core vocabulary was brought into conformance with the [Standards Documentation Specification](http://rs.tdwg.org/sds/doc/specification/), in order to comply with [Section 4.4.2.2](http://rs.tdwg.org/sds/doc/specification/#44-vocabularies-term-lists-and-terms), all subproperty declarations were removed from the term metadata. Therefore, the effects described in this section are no longer entailed by RDF metadata describing any Darwin Core terms. This explanation has been retained to explain part of the rationale for why ID terms were not recommended for use with RDF in the original specification.*
The previous section showed that using a Darwin Core ID term to indicate the identifier associated with the subject resource is not necessary because there are well-known means in RDF (the `rdf:about` attribute and the `dcterms:identifier` property) for exposing the subject's identifier. However, as shown below, using a Darwin Core ID term to identify an object resource (as shown in the non-RDF XML Example 21) would actually be problematic. In its normative definition, each Darwin Core ID term is declared to be `rdfs:subPropertyOf` `dcterms:identifier`. In RDF, the purpose of a subproperty declaration is to allow a client with reasoning capability to infer a triple containing a broader (and presumably more well-known) property. In the terminology of Dublin Core, the "ID" term is a [qualifier](https://www.dublincore.org/specifications/dublin-core/usageguide/qualifiers/) which "refines" a basic Dublin Core term and the process of inferring a broader meaning from a more specific term is called a "dumb-down" operation. If a provider attempting to expose the data of Table 3 as RDF used the `dwc:taxonID` term as a property of the identification as shown in the (incorrect) Example 23:
**Example 23:** **Example 23:**
@ -952,14 +964,14 @@ In general, it should not be necessary for a data provider to recreate hierarchi
#### 2.7.2 Literal convenience terms versus a single object property reference (normative) #### 2.7.2 Literal convenience terms versus a single object property reference (normative)
There are several groups of convenience terms in the `dwc:` namespace which may be used to provide literal values for the purposes listed above ([Section 3.6](#36-dwciri-terms-having-local-names-that-dont-correspond-to-terms-in-the-dwc-darwin-core-namespace-normative)). In the case of each of these groups, it is not expected that a provider will link to IRI references for each level in the hierarchy. Therefore, `dwciri:` analogues are not defined for those convenience terms from the `dwc:` namespace. Rather, for each category of convenience terms, there is a single `dwciri:` namespace term (having no analogue in the `dwc:` namespace; [Section 3.7](#37-dwc-namespace-terms-that-have-analogues-in-the-dwciri-namespace-normative)) that can be used to link to the lowest available level in the hierarchy with the understanding that the RDF of the object resource will provide links to other IRIs for higher levels of the hierarchy. Such `dwciri:` terms can refer to any level in the hierarchy if there is uncertainty about the identity of lower levels, or if lower levels do not exist. There are several groups of convenience terms in the `dwc:` namespace which MAY be used to provide literal values for the purposes listed above ([Section 3.6](#36-dwciri-terms-having-local-names-that-dont-correspond-to-terms-in-the-dwc-darwin-core-namespace-normative)). In the case of each of these groups, it is not expected that a provider will link to IRI references for each level in the hierarchy. Therefore, `dwciri:` analogues are not defined for those convenience terms from the `dwc:` namespace. 
Rather, for each category of convenience terms, there is a single `dwciri:` namespace term (having no analogue in the `dwc:` namespace; [Section 3.7](#37-dwc-namespace-terms-that-have-analogues-in-the-dwciri-namespace-normative)) that MAY be used to link to the lowest available level in the hierarchy with the understanding that the RDF of the object resource will provide links to other IRIs for higher levels of the hierarchy. Such `dwciri:` terms MAY refer to any level in the hierarchy if there is uncertainty about the identity of lower levels, or if lower levels do not exist.
**Table 6** **Table 6**
Category of convenience term | Object property | Category of object resource | rdf:type of example resource Category of convenience term | Object property | Category of object resource | rdf:type of example resource
--- | --- | --- | --- --- | --- | --- | ---
ownership of collection item (Section 2.7.3) | dwciri:inCollection | collection | not specified ownership of collection item (Section 2.7.3) | dwciri:inCollection | collection | not specified
description of a taxonomic entity (Section 2.7.4) | dwciri:toTaxon | taxon concept; taxon name use | dwc:Taxon description of a taxonomic entity (Section 2.7.4) | dwciri:toTaxon | taxon concept; taxon name usage | dwc:Taxon
names of geographic subdivisions (Section 2.7.5) | dwciri:inDescribedPlace | geographic place | gn:Feature names of geographic subdivisions (Section 2.7.5) | dwciri:inDescribedPlace | geographic place | gn:Feature
chronostratigraphic (geological timescale) description (Section 2.7.6) | dwciri:earliestGeochronologicalEra dwciri:latestGeochronologicalEra | geochronological time period | gsml:GeochronologicEra chronostratigraphic (geological timescale) description (Section 2.7.6) | dwciri:earliestGeochronologicalEra dwciri:latestGeochronologicalEra | geochronological time period | gsml:GeochronologicEra
lithostratigraphy descriptors (Section 2.7.7) | dwciri:fromLithostratigraphicUnit | lithostratigraphic unit | not specified lithostratigraphy descriptors (Section 2.7.7) | dwciri:fromLithostratigraphicUnit | lithostratigraphic unit | not specified
@ -972,9 +984,9 @@ Historically, the set of values for `dwc:institutionCode`, `dwc:collectionCode`,
dwc:basisOfRecord | dwc:institutionCode | dwc:collectionCode | dwc:catalogNumber | dwc:collectionID dwc:basisOfRecord | dwc:institutionCode | dwc:collectionCode | dwc:catalogNumber | dwc:collectionID
--- | --- | --- | --- | --- --- | --- | --- | --- | ---
"PreservedSpecimen" | "MVZ" | "Mamm" | "115956" | "urn:lsid:biocol.org:col:34904" "PreservedSpecimen" | "MVZ" | "Mamm" | "115956" | "http://grbio.org/cool/0rht-pj95"
In RDF, unique identification of collection items is done through the IRI which acts as a globally unique identifier for that item. The Darwin Core triplet properties may still be provided as literal values, but ownership/control of the collection item should be indicated using `dwciri:inCollection` with an HTTP IRI as the IRI-reference object. For physical specimens, the recommended best practice is to use a collection IRI from a collections registry such as an HTTP-proxied LSID from the [Global Registry of Biorepositories](http://grbio.org/). Example 24 illustrates this for the data from Table 7. In RDF, unique identification of collection items is done through the IRI, which acts as a globally unique identifier for that item. The Darwin Core triplet properties may still be provided as literal values, but ownership/control of the collection item SHOULD be indicated using `dwciri:inCollection` with an HTTP IRI as the IRI-reference object. For physical specimens, the recommended best practice is to use a collection IRI from a collections registry such as an HTTP IRI from the [GBIF Registry of Scientific Collections](https://www.gbif.org/grscicoll). Example 24 illustrates this for the data from Table 7.
**Example 24:** **Example 24:**
@ -986,7 +998,7 @@ RDF/XML
     <dwc:institutionCode>MVZ</dwc:institutionCode>      <dwc:institutionCode>MVZ</dwc:institutionCode>
     <dwc:collectionCode>Mamm</dwc:collectionCode>      <dwc:collectionCode>Mamm</dwc:collectionCode>
     <dwc:catalogNumber>115956</dwc:catalogNumber>      <dwc:catalogNumber>115956</dwc:catalogNumber>
     <dwciri:inCollection rdf:resource="http://biocol.org/urn:lsid:biocol.org:col:34904"/>      <dwciri:inCollection rdf:resource="http://grbio.org/cool/0rht-pj95"/>
</rdf:Description> </rdf:Description>
``` ```
@ -997,16 +1009,16 @@ Turtle
     dwc:collectionCode "Mamm";      dwc:collectionCode "Mamm";
     dwc:institutionCode "MVZ";      dwc:institutionCode "MVZ";
     dwc:catalogNumber "115956";      dwc:catalogNumber "115956";
     dwciri:inCollection <http://biocol.org/urn:lsid:biocol.org:col:34904>.      dwciri:inCollection <http://grbio.org/cool/0rht-pj95>.
``` ```
#### 2.7.4 Description of a taxonomic entity (normative) #### 2.7.4 Description of a taxonomic entity (normative)
The consensus embodied in the [TDWG Taxon Concept Transfer Schema (TCS) standard](http://www.tdwg.org/standards/117/) is that identification instances refer to taxon concept instances. Therefore it would be a best practice to describe taxonomic entities in RDF as taxon concepts sensu TCS. However, because the TCS standard is an XML schema, it is not directly translatable to RDF. It is considered to be out of the scope of this document to specify how taxon concepts should be rendered as RDF. Nevertheless, Darwin Core does define many convenience terms listed under the `dwc:Taxon` class that can be used as properties of `dwc:Identification` instances ([Section 3.5](#35-darwin-core-convenience-terms-that-are-expected-to-be-used-only-with-literal-values-normative)). The consensus embodied in the [TDWG Taxon Concept Transfer Schema (TCS) standard](http://www.tdwg.org/standards/117/) is that identification instances refer to taxon concept instances. Therefore it would be a best practice to describe taxonomic entities in RDF as taxon concepts sensu TCS. However, because the TCS standard is an XML schema, it is not directly translatable to RDF. It is considered to be out of the scope of this document to specify how taxon concepts should be rendered as RDF. Nevertheless, Darwin Core does define many convenience terms listed under the `dwc:Taxon` class that can be used as properties of `dwc:Identification` instances ([Section 3.5](#35-darwin-core-convenience-terms-that-are-expected-to-be-used-only-with-literal-values-normative)).
It might be argued that these convenience terms would more appropriately be properties of a `dwc:Taxon` instance. However, the object properties necessary to relate `dwc:Taxon` instances to name entities, references, parent taxa, and child taxa do not exist and the exact relationship between taxonomic entities such as taxon concepts, protonyms, taxon name uses, etc. has not been established using RDF. So the creation of functional `dwc:Taxon` instances described using RDF is not possible at the present time. Therefore this document establishes the convention that convenience terms for taxonomic entities should be properties of `dwc:Identification`. The task of describing taxonomic entities using RDF must be an effort outside of Darwin Core. This guide does establish the object property `dwciri:toTaxon` for use in relating a Darwin Core identification instance to a taxonomic entity as defined elsewhere. It might be argued that these convenience terms would more appropriately be properties of a `dwc:Taxon` instance. However, the object properties necessary to relate `dwc:Taxon` instances to name entities, references, parent taxa, and child taxa do not exist and the exact relationship between taxonomic entities such as taxon concepts, protonyms, taxon name usages, etc. has not been established using RDF. So the creation of functional `dwc:Taxon` instances described using RDF is not possible at the present time. Therefore this document establishes the convention that convenience terms for taxonomic entities SHOULD be properties of `dwc:Identification`. The task of describing taxonomic entities using RDF will have to be an effort outside of Darwin Core. This guide does establish the object property `dwciri:toTaxon` for use in relating a Darwin Core identification instance to a taxonomic entity as defined elsewhere.
Consider the following example where Takuma Yun identified a spider to the species _Hersilia yaeyamaensis_ using information in Tanikawa (1999). The data about this identification was listed in a database as shown in Table 8. Consider the following hypothetical example where Takuma Yun identified a spider to the species _Hersilia yaeyamaensis_ using information in Tanikawa (1999). The data about this identification was listed in a database as shown in Table 8.
**Table 8** (non-normative) **Table 8** (non-normative)
@ -1051,7 +1063,7 @@ In the example, providing the triple
<http://museum.or.jp/9AC9BD26-8B41-458A-AA35-503A4527D009> dwc:order  "Araneae" <http://museum.or.jp/9AC9BD26-8B41-458A-AA35-503A4527D009> dwc:order  "Araneae"
``` ```
should not be taken to imply that Takuma Yun asserted that the spider he identified was classified within the order Araneae, nor should the RDF be assumed to imply that Takuma Yun asserted that Araneae is the name of a parent taxon of the genus _Hersilia_. Those sorts of assertions would need to be made using more complex RDF and a more expressive vocabulary outside of Darwin Core. The RDF simply makes it easier for users who are looking for spider identifications to search for them by looking for identifications having `dwc:order` of "Araneae", `dwc:genus` of "Hersilia", and a specific epithet of "yaeyamaensis". If it can be determined (perhaps at a later time) that the taxon described by the convenience terms corresponds to a particular IRI-identified instance, the identification instance can be linked to it using an object property, e.g., should not be taken to imply that Takuma Yun asserted that the spider he identified was classified within the order Araneae, nor should the RDF be assumed to imply that Takuma Yun asserted that Araneae is the name of a parent taxon of the genus _Hersilia_. Those sorts of assertions would need to be made using more complex RDF and a more expressive vocabulary outside of Darwin Core. The RDF simply makes it easier to search for spider identifications having `dwc:order` of "Araneae", `dwc:genus` of "Hersilia", and a specific epithet of "yaeyamaensis". If it can be determined (perhaps at a later time) that the taxon described by the convenience terms corresponds to a particular IRI-identified instance, the identification instance can be linked to it using an object property, e.g.,
```xml ```xml
<http://museum.or.jp/9AC9BD26-8B41-458A-AA35-503A4527D009> dwciri:toTaxon <http://zoobank.org/75C9EA16-72B1-44C9-AD40-3C3D41323AB9> <http://museum.or.jp/9AC9BD26-8B41-458A-AA35-503A4527D009> dwciri:toTaxon <http://zoobank.org/75C9EA16-72B1-44C9-AD40-3C3D41323AB9>
@ -1059,7 +1071,7 @@ should not be taken to imply that Takuma Yun asserted that the spider he identif
#### 2.7.5 Names of geographic subdivisions (normative) #### 2.7.5 Names of geographic subdivisions (normative)
The data from Table 5 can be expressed as shown in Example 26. In the example, the term `dwciri:inDescribedPlace` is used as an object property to link the `dcterms:Location` instance to an IRI for the lowest known geographic subdivision which applies to the locality. `dwc:locality` could also be used to provide a string literal description of the specific place. The data from Table 5 can be expressed as shown in Example 26. The term `dwciri:inDescribedPlace` SHOULD be used as an object property to link the `dcterms:Location` instance to an IRI for the lowest known geographic subdivision which applies to the locality. `dwc:locality` MAY also be used to provide a string literal description of the specific place.
**Example 26:** **Example 26:**
@ -1093,11 +1105,11 @@ Turtle
     dwciri:inDescribedPlace <http://sws.geonames.org/4653638/>.      dwciri:inDescribedPlace <http://sws.geonames.org/4653638/>.
``` ```
Because generic RDF places no restrictions on repeating properties, a `dcterms:Location` instance could have multiple `dwciri:inDescribedPlace` properties if the location is included within several described geographic subdivisions. For example, a particular location could be included within `<http://sws.geonames.org/4626068/>` (The Great Smoky Mountains National Park which straddles two states) and `<http://sws.geonames.org/4656568/>` (Sevier County, Tennessee, US, which is the lowest level political subdivision). Because generic RDF places no restrictions on repeating properties, a `dcterms:Location` instance could have multiple `dwciri:inDescribedPlace` properties if the location is included within several described geographic subdivisions. For example, a particular location could be included within `<http://sws.geonames.org/4626068/>` (The Great Smoky Mountains National Park, which straddles two states) and `<http://sws.geonames.org/4656568/>` (Sevier County, Tennessee, US, which is the lowest level political subdivision).
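The repeated-property pattern described above can be sketched in Python as follows. This is only an illustration, not part of the guide: the function name `described_place_turtle` and the subject IRI `http://example.org/loc/1` are hypothetical, while the two GeoNames IRIs are the ones mentioned in the preceding paragraph.

```python
DWCIRI_PREFIX = "@prefix dwciri: <http://rs.tdwg.org/dwc/iri/> ."


def described_place_turtle(location_iri, place_iris):
    """Return Turtle linking one location to one or more described places."""
    if not place_iris:
        raise ValueError("at least one place IRI is needed")
    # In Turtle, a comma separates multiple objects of the same predicate,
    # so the property is written once and the object IRIs are listed after it.
    objects = ",\n        ".join(f"<{iri}>" for iri in place_iris)
    statement = f"<{location_iri}>\n    dwciri:inDescribedPlace\n        {objects} ."
    return f"{DWCIRI_PREFIX}\n\n{statement}"


print(described_place_turtle(
    "http://example.org/loc/1",  # hypothetical subject IRI
    [
        "http://sws.geonames.org/4626068/",  # Great Smoky Mountains National Park
        "http://sws.geonames.org/4656568/",  # Sevier County, Tennessee, US
    ],
))
```

Listing both objects under a single predicate is equivalent to writing two separate triples with the same subject and property.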
#### 2.7.6 Chronostratographic (geological timescale) descriptors (normative) #### 2.7.6 Chronostratographic (geological timescale) descriptors (normative)
The following example is taken from <http://dx.doi.org/10.1098/rsbl.2011.0228>, which involves the geological context of a fossil serving as a holotype for a species description. In this example (Table 9), there is a single value given for the Epoch (Middle Jurassic), so the values for each of the `earliest.../latest...` stratigraphic timescale term pairs are the same. The following example is taken from <http://dx.doi.org/10.1098/rsbl.2011.0228>, which involves the geological context of a fossil serving as a holotype for a species description. In this example (Table 9), there is a single value given for the Epoch (Middle Jurassic), so the values for each of the `earliest.../latest...` stratigraphic timescale term pairs SHOULD be the same.
**Table 9** (non-normative) **Table 9** (non-normative)
@ -1137,7 +1149,7 @@ In this example, the object properties `dwciri:earliestGeochronologicalEra` and
#### 2.7.7 Lithostratigraphy descriptors (normative) #### 2.7.7 Lithostratigraphy descriptors (normative)
Since lithostratigraphic units are hierarchical, the pattern followed with the other hierarchical convenience terms applies to the Darwin Core lithostratigraphic terms categorized under the `dwc:GeologicalContext` class (`dwc:group`, `dwc:formation`, `dwc:member`, and `dwc:bed`). The object property `dwciri:fromLithostratigraphicUnit` can be used to link the IRI for a lithostratigraphic unit at the lowest appropriate level. Since lithostratigraphic units are hierarchical, the pattern followed with the other hierarchical convenience terms applies to the Darwin Core lithostratigraphic terms categorized under the `dwc:GeologicalContext` class (`dwc:group`, `dwc:formation`, `dwc:member`, and `dwc:bed`). The object property `dwciri:fromLithostratigraphicUnit` SHOULD be used to link the IRI for a lithostratigraphic unit at the lowest appropriate level.
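A provider choosing the object of `dwciri:fromLithostratigraphicUnit` needs to find the lowest level for which an IRI is available. The following Python sketch shows one way this selection might be done. The function name `lowest_unit_iri`, the record layout, and the `http://example.org/strat/...` IRIs are assumptions made for illustration; only the four level names come from the Darwin Core terms listed above.

```python
# Ordered from the highest to the lowest level of the hierarchy.
LITHOSTRATIGRAPHIC_LEVELS = ("group", "formation", "member", "bed")


def lowest_unit_iri(unit_iris):
    """Return (level, IRI) for the lowest level that has an IRI, or None."""
    for level in reversed(LITHOSTRATIGRAPHIC_LEVELS):
        iri = unit_iris.get(level)
        if iri:
            return level, iri
    return None


# Hypothetical record: no IRI is known below the formation level.
record = {
    "group": "http://example.org/strat/group/1",
    "formation": "http://example.org/strat/formation/7",
    "member": None,
}

choice = lowest_unit_iri(record)
if choice is not None:
    level, iri = choice
    # Predicate and object only; the subject would be the geological context instance.
    print(f"dwciri:fromLithostratigraphicUnit <{iri}> .  # lowest known level: {level}")
```

The same selection logic would apply to the other hierarchical categories in Table 6, with a different ordered list of levels.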
### 2.8 Darwin Core association terms (non-normative) ### 2.8 Darwin Core association terms (non-normative)
@ -1163,7 +1175,7 @@ dwc:organismID | dwc:associatedOrganisms | dwc:associatedMedia
--- | --- | --- --- | --- | ---
"http://bioimages.vanderbilt.edu/ind-durandp/dd343" | "sibling of AX3467" | "http://bioimages.vanderbilt.edu/durandp/dd343 &#124; http://bioimages.vanderbilt.edu/durandp/dd344" "http://bioimages.vanderbilt.edu/ind-durandp/dd343" | "sibling of AX3467" | "http://bioimages.vanderbilt.edu/durandp/dd343 &#124; http://bioimages.vanderbilt.edu/durandp/dd344"
These data can be serialized as RDF using the `dwc:` namespace literal value terms `dwc:associatedOrganisms` and `dwc:associatedMedia` as shown in Example 28. These data MAY be serialized as RDF using the `dwc:` namespace literal value terms `dwc:associatedOrganisms` and `dwc:associatedMedia` as shown in Example 28.
**Example 28:** **Example 28:**
@ -1187,17 +1199,17 @@ Turtle:
Because the values of the association terms are literals, a consuming client would need to carry out additional processing to determine the identity of the associated resources referenced in the literals. In the case of the value for `dwc:associatedOrganisms`, the client would have to determine that the substring "AX3467" was an identifier for the organism and determine the nature of the relationship represented by the substring "sibling of". In the case of the value for `dwc:associatedMedia`, the client would need to parse the identifiers included in the value string, then determine that those substrings were IRIs. Because the values of the association terms are literals, a consuming client would need to carry out additional processing to determine the identity of the associated resources referenced in the literals. In the case of the value for `dwc:associatedOrganisms`, the client would have to determine that the substring "AX3467" was an identifier for the organism and determine the nature of the relationship represented by the substring "sibling of". In the case of the value for `dwc:associatedMedia`, the client would need to parse the identifiers included in the value string, then determine that those substrings were IRIs.
The advantage of presenting the literal values of Darwin Core association properties via RDF as in Example 28 is that a provider could easily expose existing data with little effort. The disadvantage is that there would be a significant processing burden on consuming clients. It is possible that some literal values would be uninterpretable without additional information from another source. For example, how can the organism identified by "AX3467" be distinguished globally from other organisms that might have also been assigned the identifier "AX3467" ? The advantage of presenting the literal values of Darwin Core association properties via RDF as in Example 28 is that a provider could easily expose existing data with little effort. The disadvantage is that there would be a significant processing burden on consuming clients. It is possible that some literal values would be uninterpretable without additional information from another source. For example, how can the organism identified by "AX3467" be distinguished globally from other organisms that might have also been assigned the identifier "AX3467"?
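The processing burden on a consuming client can be illustrated with a short Python sketch. It assumes that multiple values are separated by a pipe character, as in the `dwc:associatedMedia` value shown in Table 10, and it treats a substring as an IRI only if it has an `http` or `https` scheme and a host. Both the function name `split_associated_media` and that heuristic are assumptions for illustration, not a method prescribed by this guide.

```python
from urllib.parse import urlsplit


def split_associated_media(value):
    """Split a pipe-delimited literal and keep the parts that look like HTTP(S) IRIs."""
    iris = []
    for part in value.split("|"):
        part = part.strip()
        pieces = urlsplit(part)
        # Heuristic check only: a scheme and a host must both be present.
        if pieces.scheme in ("http", "https") and pieces.netloc:
            iris.append(part)
    return iris


media_literal = (
    "http://bioimages.vanderbilt.edu/durandp/dd343 | "
    "http://bioimages.vanderbilt.edu/durandp/dd344"
)
for iri in split_associated_media(media_literal):
    print(iri)

# A value such as "sibling of AX3467" yields no IRIs, so the client
# cannot identify the associated organism from the literal alone.
print(split_associated_media("sibling of AX3467"))
```

Even when the parsing succeeds, the client has only recovered identifiers; the nature of each association still has to be inferred from the property name or from other sources.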
#### 2.8.3 Expressing Darwin Core association terms as RDF with URI references (normative) #### 2.8.3 Expressing Darwin Core association terms as RDF with URI references (normative)
Because a Darwin Core association term property/value pair actually encodes two or more discrete "facts", it is probably better to represent the information contained in that property/value pair by more than a single triple. Because a Darwin Core association term property/value pair actually encodes two or more discrete "facts", it is probably better to represent the information contained in that property/value pair by more than a single triple.
The well-known Dublin Core term `dcterms:relation` (`<http://purl.org/dc/terms/relation>`) can be used to link related resources. `dcterms:relation` does not specify the exact nature of the relationship, although it has a number of declared subproperties that more precisely specify kinds and directions of the relationships (for example: `dcterms:hasPart`, `dcterms:isPartOf`, `dcterms:hasFormat`, `dcterms:isFormatOf`, `dcterms:references`, `dcterms:isVersionOf`, etc.). So `dcterms:relation` is a generic term that may be used to indicate that a resource has an unspecified association with some other resource. The well-known Dublin Core term `dcterms:relation` (`<http://purl.org/dc/terms/relation>`) MAY be used to link related resources. `dcterms:relation` does not specify the exact nature of the relationship, although it has a number of declared subproperties that more precisely specify kinds and directions of the relationships (for example: `dcterms:hasPart`, `dcterms:isPartOf`, `dcterms:hasFormat`, `dcterms:isFormatOf`, `dcterms:references`, `dcterms:isVersionOf`, etc.). So `dcterms:relation` is a generic term that MAY be used to indicate that a resource has an unspecified association with some other resource.
The term `rdf:type` ([Section 2.3.1](#231-declaring-the-type-of-the-resource-non-normative)) is the standard property for indicating the type of a resource in RDF. So it should be used to declare the type of the object resource of a Darwin Core association term property/value pair. The term `rdf:type` ([Section 2.3.1](#231-declaring-the-type-of-the-resource-non-normative)) is the standard property for indicating the type of a resource in RDF. So it SHOULD be used to declare the type of the object resource of a Darwin Core association term property/value pair.
The nature and direction of the association between the subject and object resource can be described more precisely if an appropriate term exists for that purpose. For example, in many cases, the relationship between a subject and object resource that are linked by `dwc:associatedMedia` is depiction. Thus the well-known term `foaf:depiction` can be used to link two resources in lieu of, or in addition to, `dcterms:relation`. The nature and direction of the association between the subject and object resource MAY be described more precisely if an appropriate term exists for that purpose. For example, in many cases, the relationship between a subject and object resource that are linked by `dwc:associatedMedia` is depiction. Thus the well-known term `foaf:depiction` MAY be used to link two resources in lieu of, or in addition to, `dcterms:relation`.
Example 29 shows how the data presented in Table 10 may be expressed as RDF using URI-reference properties instead of Darwin Core association properties. Example 29 shows how the data presented in Table 10 may be expressed as RDF using URI-reference properties instead of Darwin Core association properties.
@ -1247,7 +1259,7 @@ Turtle:
Notes: Notes:
1. Because there is no well-known term for expressing the relationship "sibling of", the nature of the relation between the two associated organisms was not represented in the triples. 1. Because there is no well-known term for expressing the relationship "sibling of", the nature of the relation between the two associated organisms was not represented in the triples.
2. In this example, both of the associated media items were depictions of the subject organism, so `foaf:depiction` was used to indicate that. However, it is likely that many associated media items will not depict the subject resource. For example, specimen labels do not depict specimens with which they are associated nor do still images of habitats depict organisms with which they are associated. Therefore, there is a need for more expressive terms to describe these types of relations more precisely. However, creating such terms is beyond the scope of this guide. 2. In this example, both of the associated media items were depictions of the subject organism, so `foaf:depiction` was used to indicate that. However, it is likely that many associated media items will not depict the subject resource. For example, specimen labels do not depict specimens with which they are associated, nor do still images of habitats depict organisms with which they are associated. Therefore, there is a need for more expressive terms to describe these types of relations more precisely. However, creating such terms is beyond the scope of this guide.
#### 2.8.4 Querying for associated resources (non-normative) #### 2.8.4 Querying for associated resources (non-normative)
@ -1268,9 +1280,9 @@ SELECT ?resource WHERE {
### 2.9 MeasurementOrFact instances (normative) ### 2.9 MeasurementOrFact instances (normative)
Darwin Core provides a mechanism for expressing measurements and factual information associated with resources that are described using Darwin Core. Terms from the `dwc:` namespace that are organized in the `dwc:MeasurementOrFact` class were designed to express this information using string values in "flat" files. The information can also be expressed as RDF using a combination of literal value `dwc:` namespace terms and IRI value `dwciri:` terms. It is likely that this information could also be mapped to more expressive terms. However, that sort of translation is beyond the scope of this guide. Darwin Core provides a mechanism for expressing measurements and factual information associated with resources that are described using Darwin Core. Terms from the `dwc:` namespace that are organized in the `dwc:MeasurementOrFact` class were designed to express this information using string values in "flat" files. The information MAY also be expressed as RDF using a combination of literal value `dwc:` namespace terms and IRI value `dwciri:` terms. It is likely that this information could also be mapped to more expressive terms. However, that sort of translation is beyond the scope of this guide.
Measurement properties can be grouped as part of a `dwc:MeasurementOrFact` instance as shown in Example 31. In order for that instance to have meaning, it must be linked to some other resource that is the measured entity. In Example 31, the measurement took place when the occurrence of an organism was documented by its collection as a preserved specimen. The measurement was of a part of the documented organism. Given that the organism itself was preserved, the measurement also applies to the corresponding part of the specimen. Darwin Core does not provide the object properties that would be required to describe precisely how the `dwc:MeasurementOrFact` instance was related to the `dwc:Occurrence` instance, the `dwc:Organism` whose occurrence was recorded, or the `dwc:PreservedSpecimen` that was collected. In the example, `dcterms:relation` was used to link the `dwc:MeasurementOrFact` instance to the `dwc:Occurrence` instance that was the subject of the database record that served as the source of the data serialized as RDF in Example 31. It is possible for providers to link `dwc:MeasurementOrFact` instances using more expressive object properties outside of Darwin Core. See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information. Measurement properties MAY be grouped as part of a `dwc:MeasurementOrFact` instance as shown in Example 31. In order for that instance to have meaning, it MUST be linked to some other resource that is the measured entity. In Example 31, the measurement took place when the occurrence of an organism was documented by its collection as a preserved specimen. The measurement was of a part of the documented organism. Given that the organism itself was preserved, the measurement also applies to the corresponding part of the specimen. 
Darwin Core does not provide the object properties that would be required to describe precisely how the `dwc:MeasurementOrFact` instance was related to the `dwc:Occurrence` instance, the `dwc:Organism` whose occurrence was recorded, or the `dwc:PreservedSpecimen` that was collected. In the example, `dcterms:relation` was used to link the `dwc:MeasurementOrFact` instance to the `dwc:Occurrence` instance that was the subject of the database record that served as the source of the data serialized as RDF in Example 31. It is possible for providers to link `dwc:MeasurementOrFact` instances using more expressive object properties outside of Darwin Core. See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information.
**Example 31** **Example 31**
@ -1281,14 +1293,14 @@ RDF/XML
<dcterms:relation> <dcterms:relation>
<dwc:MeasurementOrFact> <dwc:MeasurementOrFact>
<dwc:measurementType>tail length</dwc:measurementType> <dwc:measurementType>tail length</dwc:measurementType>
<dwciri:measurementType rdf:resource="http://purl.obolibrary.org/obo/VT_0002758" /> <dwciri:measurementType rdf:resource="http://purl.obolibrary.org/obo/VT_0002758"/>
<dwc:measurementValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">25</dwc:measurementValue> <dwc:measurementValue rdf:datatype="http://www.w3.org/2001/XMLSchema#int">25</dwc:measurementValue>
<dwc:measurementUnit>mm</dwc:measurementUnit> <dwc:measurementUnit>mm</dwc:measurementUnit>
<dwciri:measurementUnit rdf:resource="http://mimi.case.edu/ontologies/2009/1/UnitsOntology#millimeter"/> <dwciri:measurementUnit rdf:resource="https://www.wikidata.org/wiki/Q174789"/>
<dwc:measurementAccuracy rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">0.5</dwc:measurementAccuracy> <dwc:measurementAccuracy rdf:datatype="http://www.w3.org/2001/XMLSchema#decimal">0.5</dwc:measurementAccuracy>
<dwc:measurementDeterminedDate rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2009-08-22</dwc:measurementDeterminedDate> <dwc:measurementDeterminedDate rdf:datatype="http://www.w3.org/2001/XMLSchema#date">2009-08-22</dwc:measurementDeterminedDate>
<dwc:measurementDeterminedBy>Ryan B Stephens</dwc:measurementDeterminedBy> <dwc:measurementDeterminedBy>Ryan B Stephens</dwc:measurementDeterminedBy>
<dwciri:measurementDeterminedBy rdf:resource="http://scholar.google.com/citations?user=RAsUdjoAAAAJ" /> <dwciri:measurementDeterminedBy rdf:resource="https://orcid.org/0000-0001-8524-9873" />
<dwc:measurementMethod>unspecified</dwc:measurementMethod> <dwc:measurementMethod>unspecified</dwc:measurementMethod>
<dwciri:measurementMethod rdf:resource="http://purl.obolibrary.org/obo/MMO_0000160" /> <dwciri:measurementMethod rdf:resource="http://purl.obolibrary.org/obo/MMO_0000160" />
<dwc:measurementRemarks xml:lang="en">Accuracy from significant digits.</dwc:measurementRemarks> <dwc:measurementRemarks xml:lang="en">Accuracy from significant digits.</dwc:measurementRemarks>
@ -1306,11 +1318,11 @@ Turtle
          dwciri:measurementType <http://purl.obolibrary.org/obo/VT_0002758> ;           dwciri:measurementType <http://purl.obolibrary.org/obo/VT_0002758> ;
          dwc:measurementValue "25"^^xsd:int ;           dwc:measurementValue "25"^^xsd:int ;
          dwc:measurementUnit "mm" ;           dwc:measurementUnit "mm" ;
          dwciri:measurementUnit <http://mimi.case.edu/ontologies/2009/1/UnitsOntology#millimeter> ;           dwciri:measurementUnit <https://www.wikidata.org/wiki/Q174789> ;
          dwc:measurementAccuracy "0.5"^^xsd:decimal ;           dwc:measurementAccuracy "0.5"^^xsd:decimal ;
          dwc:measurementDeterminedDate "2009-08-22"^^xsd:date ;           dwc:measurementDeterminedDate "2009-08-22"^^xsd:date ;
          dwc:measurementDeterminedBy "Ryan B Stephens" ;           dwc:measurementDeterminedBy "Ryan B Stephens" ;
          dwciri:measurementDeterminedBy <http://scholar.google.com/citations?user=RAsUdjoAAAAJ> ;           dwciri:measurementDeterminedBy <https://orcid.org/0000-0001-8524-9873> ;
          dwc:measurementMethod "unspecified" ;           dwc:measurementMethod "unspecified" ;
          dwciri:measurementMethod <http://purl.obolibrary.org/obo/MMO_0000160> ;           dwciri:measurementMethod <http://purl.obolibrary.org/obo/MMO_0000160> ;
          dwc:measurementRemarks "Accuracy from significant digits."@en].           dwc:measurementRemarks "Accuracy from significant digits."@en].
@ -1318,14 +1330,14 @@ Turtle
## 3 Term reference (normative) ## 3 Term reference (normative)
This section organizes terms from Darwin Core and other key vocabularies according to their use in RDF. If the use of a term has additional restrictions or implications (e.g., domain and range assertions), they are noted. Recommended formats and values are given when appropriate. This section organizes terms from Darwin Core and other key vocabularies according to their use in RDF. If the use of a term has additional restrictions or implications (e.g., domain and range assertions), they are noted. RECOMMENDED formats and values are given when appropriate.
### 3.1 Non-Darwin Core terms needed to express fundamental properties in RDF (normative) ### 3.1 Non-Darwin Core terms needed to express fundamental properties in RDF (normative)
term | Notes term | Notes
--- | --- --- | ---
rdf:type | Used to indicate the class of which the resource is an instance. It is considered a best practice to type resources using rdf:type, whereas type declarations using `dcterms:type` and `dwc:basisOfRecord` are optional. See [Section 2.3.1.5](#2315-classes-to-be-used-for-type-declarations-of-resources-described-using-darwin-core-normative) for recommended classes to be used with biodiversity resources. rdf:type | Used to indicate the class of which the resource is an instance. Resources SHOULD be typed using `rdf:type`, whereas type declarations using `dcterms:type` and `dwc:basisOfRecord` are OPTIONAL. See [Section 2.3.1.5](#2315-classes-to-be-used-for-type-declarations-of-resources-described-using-darwin-core-normative) for recommended classes to be used with biodiversity resources.
dcterms:identifier | Used to relate string literal identifiers to the subject resource. This can include string representations of IRIs if they are considered the identifier for the resource. dcterms:identifier | Used to relate string literal identifiers to the subject resource. This MAY include string representations of IRIs if they are considered the identifier for the resource.
dcterms:relation | Used to link subject and object resources that have an unspecified association. dcterms:relation | Used to link subject and object resources that have an unspecified association.
### 3.2 Imported Dublin Core terms for which only literal objects are appropriate (normative) ### 3.2 Imported Dublin Core terms for which only literal objects are appropriate (normative)
@ -1339,13 +1351,13 @@ dcterms:bibliographicCitation | dcterms:BibliographicResource | rdfs:Literal
### 3.3 Imported Dublin Core terms that have non-literal objects and corresponding terms that have literal objects (normative)
| Term intended for use in RDF with non-literal objects[^2] | range | recommended values[^3] | Term intended for use in RDF with literal objects | | Term intended for use in RDF with non-literal objects[^2] | range | RECOMMENDED values[^3] | Term intended for use in RDF with literal objects |
| --- | --- | --- | --- |
| dcterms:language | dcterms:LinguisticSystem | MARC ISO 639-2 language IRI | dc:language |
| dcterms:license[^4] | dcterms:LicenseDocument | Creative Commons license IRI | xmpRights:UsageTerms[^5] |
| dcterms:type | rdfs:Class | DCMI Type Vocabulary | dc:type |
| dcterms:rightsHolder | dcterms:Agent | IRI for the agent owning or managing the rights. | xmpRights:Owner[^6] |
| dcterms:accessRights | dcterms:RightsStatement | A custom RDF rights statement could be created describing who can access the resource or an indication of its security status. | No literal object analogue exists for this term. The string value can be expressed as a property of a blank node.[^7] | | dcterms:accessRights | dcterms:RightsStatement | A custom RDF rights statement MAY be created describing who can access the resource or an indication of its security status. | No literal object analogue exists for this term. The string value MAY be expressed as a property of a blank node.[^7] |
| dcterms:references | --- | IRI for a publication (preferably an HTTP-proxied DOI) related to the subject resource. | Use `dwc:identificationReferences` for a reference consulted in making a taxonomic identification and `dwc:associatedReferences` for references related to occurrences. |
[^2]: None of these `dcterms:` namespace terms have domain declarations.
@ -1364,16 +1376,14 @@ dcterms:bibliographicCitation | dcterms:BibliographicResource | rdfs:Literal
Darwin Core term | Notes on expressing as RDF
--- | ---
`dwc:eventDate`<br>`dwc:georeferencedDate`<br>`dwc:dateIdentified`<br>`dwc:relationshipEstablishedDate`<br>`dwc:measurementDeterminedDate` | These date terms have range `rdfs:Literal`[^8]. Best practice as specified in the term definition recommends that they should be formatted according to [ISO 8601-1:2019](https://en.wikipedia.org/wiki/ISO_8601). There is no defined [XML Schema datatype](https://www.w3.org/TR/xmlschema11-2/) that corresponds exactly to ISO 8601-1:2019; therefore, the entire set of possible values cannot be specified using an `rdf:datatype` attribute. The [XML Schema dateTime datatype](https://www.w3.org/TR/xmlschema11-2/#dateTime) (`xsd:dateTime`), which is effectively a subset of ISO 8601-1:2019, may be used as an `rdf:datatype` attribute. However, `xsd:dateTime` requires the complete series of year, month, day, hour, minute, second (e.g., 2002-10-10T12:00:00) and does not permit listing only part of this hierarchy (e.g., only the year) as is allowed in ISO 8601-1:2019. `dwc:eventDate`<br>`dwc:georeferencedDate`<br>`dwc:dateIdentified`<br>`dwc:relationshipEstablishedDate`<br>`dwc:measurementDeterminedDate` | Best practice as specified in the term definition recommends that they SHOULD be formatted according to [ISO 8601-1:2019](https://en.wikipedia.org/wiki/ISO_8601). There is no defined [XML Schema datatype](https://www.w3.org/TR/xmlschema11-2/) that corresponds exactly to ISO 8601-1:2019; therefore, the entire set of possible values cannot be specified using an `rdf:datatype` attribute. The [XML Schema dateTime datatype](https://www.w3.org/TR/xmlschema11-2/#dateTime) (`xsd:dateTime`), which is effectively a subset of ISO 8601-1:2019, MAY be used as an `rdf:datatype` attribute. However, `xsd:dateTime` requires the complete series of year, month, day, hour, minute, second (e.g., 2002-10-10T12:00:00) and does not permit listing only part of this hierarchy (e.g., only the year) as is allowed in ISO 8601-1:2019.
`dwc:eventTime` | It is recommended that the format described by [ISO 8601-1:2019](https://en.wikipedia.org/wiki/ISO_8601) be used. As with the date terms, there is no [XML Schema datatype](https://www.w3.org/TR/xmlschema11-2/) that includes all of the possible values allowed in ISO 8601-1:2019, so there is no generic `rdf:datatype` attribute that would apply to all possible instances. The [XML Schema time datatype](https://www.w3.org/TR/xmlschema11-2/#time) (`xsd:time`), which is effectively a subset of ISO 8601-1:2019, may be used as an `rdf:datatype` attribute, although it is limited to values that include hours, minutes, and seconds (e.g., 13:07:56-05:00). `dwc:eventTime` | It is RECOMMENDED that the format described by [ISO 8601-1:2019](https://en.wikipedia.org/wiki/ISO_8601) be used. As with the date terms, there is no [XML Schema datatype](https://www.w3.org/TR/xmlschema11-2/) that includes all of the possible values allowed in ISO 8601-1:2019, so there is no generic `rdf:datatype` attribute that would apply to all possible instances. The [XML Schema time datatype](https://www.w3.org/TR/xmlschema11-2/#time) (`xsd:time`), which is effectively a subset of ISO 8601-1:2019, MAY be used as an `rdf:datatype` attribute, although it is limited to values that include hours, minutes, and seconds (e.g., 13:07:56-05:00).
`dwc:individualCount`<br>`dwc:decimalLatitude`<br>`dwc:decimalLongitude`<br>`dwc:coordinatePrecision`<br>`dwc:pointRadiusSpatialFit`<br>`dwc:coordinateUncertaintyInMeters`<br>`dwc:minimumElevationInMeters`<br>`dwc:maximumElevationInMeters`<br>`dwc:minimumDepthInMeters`<br>`dwc:maximumDepthInMeters`<br>`dwc:minimumDistanceAboveSurfaceInMeters`<br>`dwc:maximumDistanceAboveSurfaceInMeters`<br>`dwc:startDayOfYear`<br>`dwc:endDayOfYear`<br>`dwc:year`<br>`dwc:month`<br>`dwc:day`<br>`dwc:footprintSpatialFit`<br>`dwc:measurementAccuracy` | These terms are expected to have literal values that are numeric. Therefore, an `rdf:datatype` attribute describing the form of the number should be used. `dwc:individualCount`<br>`dwc:decimalLatitude`<br>`dwc:decimalLongitude`<br>`dwc:coordinatePrecision`<br>`dwc:pointRadiusSpatialFit`<br>`dwc:coordinateUncertaintyInMeters`<br>`dwc:minimumElevationInMeters`<br>`dwc:maximumElevationInMeters`<br>`dwc:minimumDepthInMeters`<br>`dwc:maximumDepthInMeters`<br>`dwc:minimumDistanceAboveSurfaceInMeters`<br>`dwc:maximumDistanceAboveSurfaceInMeters`<br>`dwc:startDayOfYear`<br>`dwc:endDayOfYear`<br>`dwc:year`<br>`dwc:month`<br>`dwc:day`<br>`dwc:footprintSpatialFit`<br>`dwc:measurementAccuracy` | These terms are expected to have literal values that are numeric. Therefore, an `rdf:datatype` attribute describing the form of the number SHOULD be used.
`dwc:occurrenceRemarks`<br>`dwc:eventRemarks`<br>`dwc:locationRemarks`<br>`dwc:georeferenceRemarks`<br>`dwc:identificationRemarks`<br>`dwc:taxonRemarks`<br>`dwc:organismRemarks`<br>`dwc:relationshipRemarks`<br>`dwc:measurementRemarks` | Because these are remarks, they are expected to have literal values with an `xml:lang` attribute. `dwc:occurrenceRemarks`<br>`dwc:eventRemarks`<br>`dwc:locationRemarks`<br>`dwc:georeferenceRemarks`<br>`dwc:identificationRemarks`<br>`dwc:taxonRemarks`<br>`dwc:organismRemarks`<br>`dwc:relationshipRemarks`<br>`dwc:measurementRemarks` | Because these are remarks, they SHOULD have literal values with an `xml:lang` attribute.
`dwc:catalogNumber`<br>`dwc:samplingEffort`<br>`dwc:organismName`<br>`dwc:verbatimIdentification`<br>`dwc:verbatimEventDate`<br>`dwc:verbatimLocality`<br>`dwc:verbatimElevation`<br>`dwc:verbatimCoordinates`<br>`dwc:verbatimLatitude`<br>`dwc:verbatimLongitude`<br>`dwc:verbatimDepth`<br>`dwc:verbatimTaxonRank` | Based on their term definitions, these terms are expected to have untyped literal values. `dwc:catalogNumber`<br>`dwc:samplingEffort`<br>`dwc:organismName`<br>`dwc:verbatimIdentification`<br>`dwc:verbatimEventDate`<br>`dwc:verbatimLocality`<br>`dwc:verbatimElevation`<br>`dwc:verbatimCoordinates`<br>`dwc:verbatimLatitude`<br>`dwc:verbatimLongitude`<br>`dwc:verbatimDepth`<br>`dwc:verbatimTaxonRank` | Based on their term definitions, these terms SHOULD have untyped literal values.
`dwc:otherCatalogNumbers` | There is no simple mapping because of the kinds of identifiers people use and variety of relationships that there may be among identifiers. For non-IRI identifiers expressed as string values, the string may be provided as a literal value of `dwc:otherCatalogNumbers`. Whether this is preferable to providing multiple `dwc:catalogNumber` properties may depend on community practice. `owl:sameAs` may be used to associate other IRI identifiers with the subject IRI if that is appropriate. `dwc:otherCatalogNumbers` | There is no simple mapping because of the kinds of identifiers people use and variety of relationships that there may be among identifiers. For non-IRI identifiers expressed as string values, the string MAY be provided as a literal value of `dwc:otherCatalogNumbers`. Whether this is preferable to providing multiple `dwc:catalogNumber` properties may depend on community practice. `owl:sameAs` MAY be used to associate other IRI identifiers with the subject IRI if that is appropriate.
`dwc:basisOfRecord` | Use only with literal value strings consisting of the local name component of Darwin Core class IRIs. Use `rdf:type` to refer to IRIs that describe the type of the resource. `dwc:basisOfRecord` | MUST be used only with literal value strings consisting of the local name component of Darwin Core class IRIs. Use `rdf:type` to refer to IRIs that describe the type of the resource.
`dwc:dynamicProperties` | Expected to contain JSON as a literal. Communities of practice might choose to use other vocabularies or develop their own vocabularies to express this sort of content directly as RDF. `dwc:dynamicProperties` | Expected to contain JSON as a literal. Communities of practice MAY choose to use other vocabularies or develop their own vocabularies to express this sort of content directly as RDF.
[^8]: No Darwin Core terms defined by Darwin Core (as opposed to those imported from Dublin Core) have domain or range declarations as a part of their definitions. However, the five terms in the `dwc:` namespace listed in the table above are defined to be `rdfs:subPropertyOf` of `dcterms:date`, which has the range `rdfs:Literal`. Under the extensional entailment rule ext4 listed in section 7.3.1 of the [RDF Semantics 2004 W3C Recommendation](http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#RDFSExtRules), these terms can be inferred to have the range `rdfs:Literal`. However, the [RDF 1.1 Semantics W3C Recommendation](http://www.w3.org/TR/rdf11-mt/) does not include these extensional entailment rules. Nevertheless, it is reasonable to expect that date properties should have literal values, with datatype attributes whenever possible.
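
The datatype guidance for the date terms can be sketched as a small decision function: attach `xsd:dateTime` only when the value has the complete year-through-second form, and otherwise emit a plain literal. This is a minimal sketch, not a validator. The name `date_literal` is hypothetical, and the regular expression is an assumption that checks only the lexical shape of the value (it does not reject impossible dates such as month 13).

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Lexical shape of a complete date-time: year-month-dayThour:minute:second,
# with optional fractional seconds and an optional time zone.
# Assumption: shape check only, no calendar validation.
_DATETIME = re.compile(
    r"-?\d{4,}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)


def date_literal(value):
    """Return an N-Triples literal for a date value (hypothetical helper).

    Complete date-times get an xsd:dateTime datatype; partial ISO 8601 values
    (a bare year, a year-month, an interval) are left as plain literals.
    """
    if _DATETIME.fullmatch(value):
        return f'"{value}"^^<{XSD}dateTime>'
    return f'"{value}"'


if __name__ == "__main__":
    print(date_literal("2002-10-10T12:00:00"))  # complete series, typed
    print(date_literal("2002"))                 # year only, plain literal
    print(date_literal("2002-10/2002-11"))      # interval, plain literal
```

Leaving partial values untyped reflects the point made in the table: ISO 8601-1:2019 allows them, but `xsd:dateTime` does not.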
### 3.5 Darwin Core convenience terms that are expected to be used only with literal values (normative)
@ -1381,32 +1391,32 @@ See [Section 2.7](#27-darwin-core-convenience-terms-non-normative) for more info
Darwin Core term | Notes
--- | ---
`dwc:collectionCode`<br>`dwc:institutionCode`<br>`dwc:ownerInstitutionCode` | The subject resource can be any resource that is part of a collection. As an alternative, use the object property `dwciri:inCollection` to link the subject resource to an IRI for the collection containing the institution that owns or controls the resource. `dwc:collectionCode`<br>`dwc:institutionCode`<br>`dwc:ownerInstitutionCode` | The subject resource MAY be any resource that is part of a collection. As an alternative, the object property `dwciri:inCollection` MAY be used to link the subject resource to an IRI for the collection containing the institution that owns or controls the resource.
`dwc:kingdom`<br>`dwc:phylum`<br>`dwc:class`<br>`dwc:order`<br>`dwc:family`<br>`dwc:subfamily`<br>`dwc:genus`<br>`dwc:genericName`<br>`dwc:subgenus`<br>`dwc:infragenericEpithet`<br>`dwc:specificEpithet`<br>`dwc:infraspecificEpithet`<br>`dwc:cultivarEpithet`<br>`dwc:higherClassification`<br>`dwc:vernacularName`<br>`dwc:nameAccordingTo`<br>`dwc:scientificName`<br>`dwc:taxonRank`<br>`dwc:scientificNameAuthorship`<br>`dwc:nomenclaturalStatus`<br>`dwc:namePublishedIn`<br>`dwc:namePublishedInYear`<br>`dwc:nomenclaturalCode`<br>`dwc:originalNameUsage`<br>`dwc:taxonomicStatus`<br>`dwc:parentNameUsage`<br>`dwc:acceptedNameUsage` | The subject resource should be a `dwc:Identification` instance. See [Section 2.7.4](#274-description-of-a-taxonomic-entity-normative) for a discussion of why it is not recommended to use these as properties of `dwc:Taxon` instances. As an alternative, use the object property `dwciri:toTaxon` to link the subject `dwc:Identification` instance to a taxonomic entity such as a taxon, taxon concept, or taxon name use. It is likely that these taxonomic entities will have a complex structure which differentiates among name entities, name strings, application of a name to a concept, which expresses parent/child and set relationships among entities, and which tracks provenance information about the names, references, and concepts. The flat nature of text-based Darwin Core cannot represent such a complex structure and it is beyond the scope of this guide to describe them.
`dwc:kingdom`<br>`dwc:phylum`<br>`dwc:class`<br>`dwc:order`<br>`dwc:family`<br>`dwc:subfamily`<br>`dwc:genus`<br>`dwc:genericName`<br>`dwc:subgenus`<br>`dwc:infragenericEpithet`<br>`dwc:specificEpithet`<br>`dwc:infraspecificEpithet`<br>`dwc:cultivarEpithet`<br>`dwc:higherClassification`<br>`dwc:vernacularName`<br>`dwc:nameAccordingTo`<br>`dwc:scientificName`<br>`dwc:taxonRank`<br>`dwc:scientificNameAuthorship`<br>`dwc:nomenclaturalStatus`<br>`dwc:namePublishedIn`<br>`dwc:namePublishedInYear`<br>`dwc:nomenclaturalCode`<br>`dwc:originalNameUsage`<br>`dwc:taxonomicStatus`<br>`dwc:parentNameUsage`<br>`dwc:acceptedNameUsage` | The subject resource SHOULD be a `dwc:Identification` instance. See [Section 2.7.4](#274-description-of-a-taxonomic-entity-normative) for a discussion of why it is NOT RECOMMENDED to use these as properties of `dwc:Taxon` instances. The object property `dwciri:toTaxon` MAY be used as an alternative to link the subject `dwc:Identification` instance to a taxonomic entity such as a taxon, taxon concept, or taxon name usage. It is likely that these taxonomic entities will have a complex structure that differentiates among name entities, name strings, application of a name to a concept, that expresses parent/child and set relationships among entities, and that tracks provenance information about the names, references, and concepts. The flat nature of text-based Darwin Core cannot represent such a complex structure and it is beyond the scope of this guide to describe them.
`dwc:higherGeography`<br>`dwc:continent`<br>`dwc:waterBody`<br>`dwc:islandGroup`<br>`dwc:island`<br>`dwc:countryCode`<br>`dwc:country`<br>`dwc:stateProvince`<br>`dwc:county`<br>`dwc:municipality`<br>`dwc:locality` | The subject resource should be a `dcterms:Location` instance. As an alternative, use the object property `dwciri:inDescribedPlace` to link the subject resource to a standardized place described as part of a hierarchy. See [Section 2.7.5](#275-names-of-geographic-subdivisions-normative) for details. It is likely that providers will want to provide a text value for `dwc:locality` even if `dwciri:inDescribedPlace` is used to replace the other hierarchical convenience terms in this category. This is because it is unlikely that a place description at this most specific level (e.g., "15 km N of Essen") would be represented by a standardized IRI-identified place instance. There is no `dwciri:` analogue of `dwc:locality` because if an IRI-identified place were available to represent the locality, the term `dwciri:inDescribedPlace` would be used to link to it. `dwc:higherGeography`<br>`dwc:continent`<br>`dwc:waterBody`<br>`dwc:islandGroup`<br>`dwc:island`<br>`dwc:countryCode`<br>`dwc:country`<br>`dwc:stateProvince`<br>`dwc:county`<br>`dwc:municipality`<br>`dwc:locality` | The subject resource SHOULD be a `dcterms:Location` instance. The object property `dwciri:inDescribedPlace` MAY be used as an alternative to link the subject resource to a standardized place described as part of a hierarchy. See [Section 2.7.5](#275-names-of-geographic-subdivisions-normative) for details. It is likely that providers will want to provide a text value for `dwc:locality` even if `dwciri:inDescribedPlace` is used to replace the other hierarchical convenience terms in this category. This is because it is unlikely that a place description at this most specific level (e.g., "15 km N of Essen") would be represented by a standardized IRI-identified place instance. 
There is no `dwciri:` analogue of `dwc:locality` because if an IRI-identified place were available to represent the locality, the term `dwciri:inDescribedPlace` would be used to link to it.
`dwc:earliestEonOrLowestEonothem`<br>`dwc:latestEonOrHighestEonothem`<br>`dwc:earliestEraOrLowestErathem`<br>`dwc:latestEraOrHighestErathem`<br>`dwc:earliestPeriodOrLowestSystem`<br>`dwc:latestPeriodOrHighestSystem`<br>`dwc:earliestEpochOrLowestSeries`<br>`dwc:latestEpochOrHighestSeries`<br>`dwc:earliestAgeOrLowestStage`<br>`dwc:latestAgeOrHighestStage` | The subject resource should be a `dwc:GeologicalContext` instance. As an alternative, use the object properties `dwciri:earliestGeochronologicalEra` and `dwciri:latestGeochronologicalEra` as described in [Section 3.6](#36-dwciri-terms-having-local-names-that-dont-correspond-to-terms-in-the-dwc-darwin-core-namespace-normative). See [Section 2.7.6](#276-chronostratographic-geological-timescale-descriptors-normative) for details. `dwc:earliestEonOrLowestEonothem`<br>`dwc:latestEonOrHighestEonothem`<br>`dwc:earliestEraOrLowestErathem`<br>`dwc:latestEraOrHighestErathem`<br>`dwc:earliestPeriodOrLowestSystem`<br>`dwc:latestPeriodOrHighestSystem`<br>`dwc:earliestEpochOrLowestSeries`<br>`dwc:latestEpochOrHighestSeries`<br>`dwc:earliestAgeOrLowestStage`<br>`dwc:latestAgeOrHighestStage` | The subject resource SHOULD be a `dwc:GeologicalContext` instance. The object properties `dwciri:earliestGeochronologicalEra` and `dwciri:latestGeochronologicalEra` MAY be used as an alternative as described in [Section 3.6](#36-dwciri-terms-having-local-names-that-dont-correspond-to-terms-in-the-dwc-darwin-core-namespace-normative). See [Section 2.7.6](#276-chronostratographic-geological-timescale-descriptors-normative) for details.
`dwc:lithostratigraphicTerms`<br>`dwc:group`<br>`dwc:formation`<br>`dwc:member`<br>`dwc:bed` | The subject resource should be a `dwc:GeologicalContext` instance. As an alternative, use the object property `dwciri:fromLithostratigraphicUnit` to link the subject resource to the lowest appropriate unit of a lithostratigraphic hierarchy. `dwc:lithostratigraphicTerms`<br>`dwc:group`<br>`dwc:formation`<br>`dwc:member`<br>`dwc:bed` | The subject resource SHOULD be a `dwc:GeologicalContext` instance. The object property `dwciri:fromLithostratigraphicUnit` MAY be used as an alternative to link the subject resource to the lowest appropriate unit of a lithostratigraphic hierarchy.
### 3.6 `dwciri:` terms having local names that don't correspond to terms in the `dwc:` Darwin Core namespace (normative)
Darwin Core term | Notes
--- | ---
`dwciri:inCollection` | Use to link any subject resource that is part of a collection to the collection containing the resource. Recommended best practice is to use IRIs from [Global Registry of Biorepositories](http://grbio.org/). For details, see the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). See [Section 2.7.3](#273-ownership-of-a-collection-item-normative) for usage details. `dwciri:inCollection` | MAY be used to link any subject resource that is part of a scientific collection to the collection containing the resource. RECOMMENDED best practice is to use IRIs from the [GBIF Registry of Scientific Collections](https://www.gbif.org/grscicoll). See [Section 2.7.3](#273-ownership-of-a-collection-item-normative) for usage details.
`dwciri:toTaxon` | Use to link a `dwc:Identification` instance subject to a taxonomic entity such as a taxon, taxon concept, or taxon name use. See [Section 2.7.4](#274-description-of-a-taxonomic-entity-normative) for usage details. `dwciri:toTaxon` | MAY be used to link a `dwc:Identification` instance subject to a taxonomic entity such as a taxon, taxon concept, or taxon name usage. See [Section 2.7.4](#274-description-of-a-taxonomic-entity-normative) for usage details.
`dwciri:inDescribedPlace` | Use to link a `dcterms:Location` instance subject to the lowest level standardized hierarchically-described resource. It is expected that such resources will be linked to higher levels in the hierarchy by the organization minting the IRI. Recommended best practice is to use IRIs from the [GeoNames geographical database](https://www.geonames.org/). For details, see the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). See [Section 2.7.5](#275-names-of-geographic-subdivisions-normative) for usage details. `dwciri:inDescribedPlace` | MAY be used to link a `dcterms:Location` instance subject to the lowest level standardized hierarchically-described resource. It is expected that such resources will be linked to higher levels in the hierarchy by the organization minting the IRI. RECOMMENDED best practice is to use IRIs from the [GeoNames geographical database](https://www.geonames.org/). See [Section 2.7.5](#275-names-of-geographic-subdivisions-normative) for usage details.
`dwciri:earliestGeochronologicalEra`<br>`dwciri:latestGeochronologicalEra` | Use to link a `dwc:GeologicalContext` instance to chronostratigraphic time periods at the lowest possible level in a standardized hierarchy. Use `dwciri:earliestGeochronologicalEra` to point to the earliest possible geological time period from which the cataloged item was collected and the object property `dwciri:latestGeochronologicalEra` to point to the latest possible geological time period from which the cataloged item was collected. The organization minting the IRI should link those time periods to higher levels in the hierarchy. Recommended best practice is to use IRIs defined by the [International Commission on Stratigraphy](http://www.stratigraphy.org/). For details, see the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). See [Section 2.7.6](#276-chronostratographic-geological-timescale-descriptors-normative) for usage details. `dwciri:earliestGeochronologicalEra`<br>`dwciri:latestGeochronologicalEra` | MAY be used to link a `dwc:GeologicalContext` instance to chronostratigraphic time periods at the lowest possible level in a standardized hierarchy. Use `dwciri:earliestGeochronologicalEra` to point to the earliest possible geological time period from which the cataloged item was collected and the object property `dwciri:latestGeochronologicalEra` to point to the latest possible geological time period from which the cataloged item was collected. The organization minting the IRI should link those time periods to higher levels in the hierarchy. RECOMMENDED best practice is to use IRIs defined by the [International Commission on Stratigraphy](http://www.stratigraphy.org/). See [Section 2.7.6](#276-chronostratographic-geological-timescale-descriptors-normative) for usage details.
`dwciri:fromLithostratigraphicUnit` | Use to link a `dwc:GeologicalContext` instance to an IRI-identified lithostratigraphic unit at the lowest possible level in a hierarchy. It is expected that such resources will be linked to higher levels in the hierarchy by the organization minting the IRI. See [Section 2.7.7](#277-lithostratigraphy-descriptors-normative) for usage details. `dwciri:fromLithostratigraphicUnit` | MAY be used to link a `dwc:GeologicalContext` instance to an IRI-identified lithostratigraphic unit at the lowest possible level in a hierarchy. It is expected that such resources will be linked to higher levels in the hierarchy by the organization minting the IRI. See [Section 2.7.7](#277-lithostratigraphy-descriptors-normative) for usage details.
`dwciri:inDataset` | This object property is provided to link a subject dataset record to the dataset which contains it. A string literal name of the dataset can be provided using the term `dwc:datasetName`. `dwciri:inDataset` | This object property is provided to link a subject dataset record to the dataset which contains it. A string literal name of the dataset MAY be provided using the term `dwc:datasetName`.
### 3.7 `dwc:` namespace terms that have analogues in the `dwciri:` namespace (normative)
Darwin Core term having a `dwciri:` analogue with the same local name | Notes on the `dwciri:` analogues
--- | ---
`dwc:recordedBy`<br>`dwc:identifiedBy`<br>`dwc:georeferencedBy`<br>`dwc:measurementDeterminedBy` | The object is an agent; use a well-known IRI such as those referenced in the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). `dwc:recordedBy`<br>`dwc:identifiedBy`<br>`dwc:georeferencedBy`<br>`dwc:measurementDeterminedBy` | The object is an agent; the value SHOULD be a well-known IRI such as those referenced in the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md).
`dwc:locationAccordingTo` | The object is an agent or publication. Use a well-known IRI when possible, such as an HTTP-proxied DOI. `dwc:locationAccordingTo` | The object is an agent or publication. The value SHOULD be a well-known IRI, such as an HTTP-proxied DOI.
`dwc:georeferenceProtocol`<br>`dwc:georeferenceSources`<br>`dwc:samplingProtocol` | The object is a published or well-known reference; use an IRI version (preferably HTTP proxied) of doi, isbn, issn, etc. if available. `dwc:georeferenceProtocol`<br>`dwc:georeferenceSources`<br>`dwc:samplingProtocol` | The object is a published or well-known reference; the value SHOULD be an IRI version (preferably HTTP proxied) of doi, isbn, issn, etc. if available.
`dwc:sex`<br>`dwc:lifeStage`<br>`dwc:reproductiveCondition`<br>`dwc:establishmentMeans`<br>`dwc:behavior`<br>`dwc:occurrenceStatus`<br>`dwc:disposition`<br>`dwc:verbatimCoordinateSystem`<br>`dwc:verbatimSRS`<br>`dwc:geodeticDatum`<br>`dwc:verticalDatum`<br>`dwc:georeferenceVerificationStatus`<br>`dwc:footprintWKT`<br>`dwc:footprintSRS`<br>`dwc:lowestBiostratigraphicZone`<br>`dwc:highestBiostratigraphicZone`<br>`dwc:identificationVerificationStatus`<br>`dwc:identificationQualifier`<br>`dwc:preparations`<br>`dwc:typeStatus`<br>`dwc:measurementType`<br>`dwc:measurementValue`<br>`dwc:measurementUnit`<br>`dwc:measurementMethod` | Recommended best practice is to use a controlled vocabulary if one is available. `dwc:sex`<br>`dwc:lifeStage`<br>`dwc:reproductiveCondition`<br>`dwc:establishmentMeans`<br>`dwc:behavior`<br>`dwc:occurrenceStatus`<br>`dwc:disposition`<br>`dwc:verbatimCoordinateSystem`<br>`dwc:verbatimSRS`<br>`dwc:geodeticDatum`<br>`dwc:verticalDatum`<br>`dwc:georeferenceVerificationStatus`<br>`dwc:footprintWKT`<br>`dwc:footprintSRS`<br>`dwc:lowestBiostratigraphicZone`<br>`dwc:highestBiostratigraphicZone`<br>`dwc:identificationVerificationStatus`<br>`dwc:identificationQualifier`<br>`dwc:preparations`<br>`dwc:typeStatus`<br>`dwc:measurementType`<br>`dwc:measurementValue`<br>`dwc:measurementUnit`<br>`dwc:measurementMethod` | RECOMMENDED best practice is to use a controlled vocabulary if one is available.
`dwc:informationWithheld`<br>`dwc:dataGeneralizations`<br>`dwc:habitat` | If the object property (`dwciri:` analogue) is used rather than the `dwc:` property, the object property should point to a stable resource which might be a controlled vocabulary. `dwc:informationWithheld`<br>`dwc:dataGeneralizations`<br>`dwc:habitat` | If the object property (`dwciri:` analogue) is used rather than the `dwc:` property, the object property SHOULD point to a stable resource, which might be a controlled vocabulary.
`dwc:fieldNumber`<br>`dwc:fieldNotes` | `dwciri:fieldNumber` is an object property whose subject is a (possibly IRI-identified) resource that is the field notes and whose object is a `dwc:Event` instance. `dwciri:fieldNotes` is an object property whose subject is a `dwc:Event` instance and whose object is a (possibly IRI-identified) resource that is the field notes.
`dwc:recordNumber` | `dwciri:recordNumber` is an object property whose subject is an occurrence and whose object is a (possibly IRI-identified) resource that is the field notes.
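
The pairing of `dwc:` terms with `dwciri:` analogues of the same local name can be sketched as a choice based on the kind of object: a string value goes with the `dwc:` property as a literal, and an IRI goes with the `dwciri:` property as a non-literal object. This is only an illustration under stated assumptions: the function name `term_triple` is hypothetical, the `http://example.org/` IRIs are placeholders, and treating any value that begins with `http://` or `https://` as an IRI is a simplifying heuristic rather than a rule from this guide.

```python
DWC = "http://rs.tdwg.org/dwc/terms/"
DWCIRI = "http://rs.tdwg.org/dwc/iri/"


def term_triple(subject_iri, local_name, value):
    """Return one N-Triples line, choosing the namespace by object kind.

    Hypothetical helper. Heuristic assumption: values starting with
    http:// or https:// are IRIs; everything else is a string literal.
    """
    if value.startswith(("http://", "https://")):
        # IRI object: use the dwciri: analogue with the same local name.
        return f"<{subject_iri}> <{DWCIRI}{local_name}> <{value}> ."
    # String object: use the dwc: term with an escaped plain literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject_iri}> <{DWC}{local_name}> "{escaped}" .'


if __name__ == "__main__":
    occ = "http://example.org/occ/1"  # hypothetical subject IRI
    print(term_triple(occ, "recordedBy", "A. Collector"))
    print(term_triple(occ, "recordedBy", "http://example.org/agent/42"))
```

Both triples may be published for the same subject, since the literal and IRI forms carry complementary information.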
@ -1414,17 +1424,17 @@ Darwin Core term having a `dwciri:` analogue with the same local name | Notes on
Darwin Core term | Notes
--- | ---
`dwc:relationshipOfResource`<br>`dwc:relationshipAccordingTo` | The non-RDF use of terms organized under the `dwc:ResourceRelationship` class depends on values for `dwc:resourceID` and `dwc:relatedResourceID`, terms which cannot be used in RDF for reasons discussed in [Section 2.6](#26-darwin-core-id-terms-and-rdf-normative). As of November 2014, the RDF/OWL Task Group is seeking a way to express resource relationships as RDF. For the present, `dwciri:` analogues have not been adopted for these two terms. See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for further discussion. `dwc:relationshipOfResource`<br>`dwc:relationshipAccordingTo` | The non-RDF use of terms organized under the `dwc:ResourceRelationship` class depends on values for `dwc:resourceID`, `dwc:relationshipOfResourceID`, and `dwc:relatedResourceID`, terms which cannot be used in RDF for reasons discussed in [Section 2.6](#26-darwin-core-id-terms-and-rdf-normative). As of November 2014, the RDF/OWL Task Group is seeking a way to express resource relationships as RDF. For the present, `dwciri:` analogues have not been adopted for these two terms. See the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for further discussion.
`dwc:associatedOccurrences`<br>`dwc:associatedMedia`<br>`dwc:associatedSequences`<br>`dwc:associatedTaxa`<br>`dwc:associatedOrganisms` | Use `dcterms:relation` and `rdf:type`, or terms that indicate more specific relationships as described in [Section 2.8](#28-darwin-core-association-terms-non-normative) (Darwin Core association terms). `dwc:associatedOccurrences`<br>`dwc:associatedMedia`<br>`dwc:associatedSequences`<br>`dwc:associatedTaxa`<br>`dwc:associatedOrganisms` | Properties used MAY be `dcterms:relation` and `rdf:type`, or terms that indicate more specific relationships as described in [Section 2.8](#28-darwin-core-association-terms-non-normative) (Darwin Core association terms).
`dwc:previousIdentifications` | There is no consensus object property for associating identifications with resources of other classes. From whatever scheme you have chosen to provide object properties, use the same object property as used for the most recent identification but provide an earlier dwc:dateIdentified value. `dwc:previousIdentifications` | There is no consensus object property for associating identifications with resources of other classes. From whatever scheme you have chosen to provide object properties, you SHOULD use the same object property as used for the most recent identification but provide an earlier dwc:dateIdentified value.
`dwc:organismScope` | Use `rdf:type` with a non-literal object. See the comment at <http://rs.tdwg.org/dwc/terms/organismScope>. `dwc:organismScope` | Use `rdf:type` with a non-literal object. See the comment at <http://rs.tdwg.org/dwc/terms/organismScope>.
### 3.9 Chronometric Age extension `chrono:` terms that have analogues in the `chronoiri:` namespace (normative) ### 3.9 Chronometric Age extension `chrono:` terms that have analogues in the `chronoiri:` namespace (normative)
The [Chronometric Age vocabulary](http://rs.tdwg.org/dwc/doc/chrono/) extends the core Darwin Core vocabulary. It has a second namespace for IRI-valued terms, `chronoiri:`, which operates analogously to the `dwciri:` namespace. The [Chronometric Age vocabulary](http://rs.tdwg.org/dwc/doc/chrono/) extends the core Darwin Core vocabulary. It has a separate namespace for IRI-valued terms, `chronoiri:`, which operates analogously to the `dwciri:` namespace.
Chronometric Age term having a `chronoiri:` analogue with the same local name | Notes on the `chronoiri:` analogues Chronometric Age term having a `chronoiri:` analogue with the same local name | Notes on the `chronoiri:` analogues
--- | --- --- | ---
`chrono:chronometricAgeDeterminedBy` | The object is an agent; use a well-known IRI such as those referenced in the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). `chrono:chronometricAgeDeterminedBy` | The object is an agent; the value SHOULD be a well-known IRI such as those referenced in the list of sources of controlled values in the [Darwin Core informative ancillary web page](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md).
`chrono:chronometricAgeConversionProtocol`<br>`chrono:chronometricAgeProtocol`<br>`chrono:chronometricAgeUncertaintyMethod` | The object is a published or well-known reference; use an IRI version (preferably HTTP proxied) of doi, isbn, issn, etc. if available. `chrono:chronometricAgeConversionProtocol`<br>`chrono:chronometricAgeProtocol`<br>`chrono:chronometricAgeUncertaintyMethod` | The object is a published or well-known reference; the value SHOULD be an IRI version (preferably HTTP proxied) of doi, isbn, issn, etc. if available.
`chrono:earliestChronometricAgeReferenceSystem`<br>`chrono:latestChronometricAgeReferenceSystem`<br>`chrono:materialDated` | Recommended best practice is to use a controlled vocabulary if one is available. `chrono:earliestChronometricAgeReferenceSystem`<br>`chrono:latestChronometricAgeReferenceSystem`<br>`chrono:materialDated` | RECOMMENDED best practice is to use a controlled vocabulary if one is available.

docs/simple/2014-11-08.md Normal file

@ -0,0 +1,194 @@
# Simple Darwin Core
Title
: Simple Darwin Core
Date version issued
: 2015-06-02
Date created
: 2009-04-21
Part of TDWG Standard
: <http://www.tdwg.org/standards/450/>
This version
: <http://rs.tdwg.org/dwc/terms/simple/2014-11-08>
Latest Version
: <http://rs.tdwg.org/dwc/terms/simple/>
Previous version
: <http://rs.tdwg.org/dwc/terms/simple/2013-10-22>
Replaced by
: <http://rs.tdwg.org/dwc/terms/simple/2021-07-15>
Abstract
: This document is a reference for the Simple Darwin Core standard.
Contributors
: John Wieczorek (MVZ), Markus Döring (GBIF), Renato De Giovanni (CRIA), Tim Robertson (GBIF), Dave Vieglais (KUNHM)
Creator
: Darwin Core Task Group
Bibliographic citation
: Darwin Core Task Group. 2014. Simple Darwin Core. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/simple/2014-11-08>
## 1 Introduction
Simple Darwin Core is a predefined subset of the terms that have common use across a wide variety of biodiversity applications. The terms used in Simple Darwin Core are those that are found at the cross-section of taxonomic names, places, and events that document biological occurrences on the planet. The two driving principles are simplicity and flexibility.
### 1.1 Status of the content of this document
All sections of this document are normative, except for examples, which are explicitly marked as non-normative.
## 2 Audience
This document is targeted toward those who want to share biodiversity information using the simplest methods and structure: Simple Darwin Core. It explains the uses and limitations of this structure and how to expand upon it.
## 3 What makes it simple?
Simple Darwin Core is simple in that it assumes (and allows) no structure beyond the concept of rows and columns, which might be thought of as attributes and their values, or fields and records. The words field and record will be used throughout the rest of the document to refer to the two dimensions of the Simple Darwin Core structure. Think of the term names as the field names. In other words, a Simple Darwin Core record could be captured in a spreadsheet or in a single database table.
## 4 What makes it flexible?
Simple Darwin Core has minimal restrictions on which fields are required (none). You might argue that there should be more required fields, that there isn't anything useful you can do without them. That is partially true. A record with no fields in it wouldn't be very interesting, but there is a difference between requiring that there be a field in a record and requiring that a particular field be in all records. By having no required field restriction, Simple Darwin Core can be used to share any meaningful combination of fields - for example, to share "just names", or "just places", or observations of individuals detected in the wild at a given place and time following a method (an occurrence). This flexibility promotes the reuse of the terms and sharing mechanisms for a wide variety of services.
## 5 Are there any rules?
There are just a few general guiding principles on how to make the best use of Simple Darwin Core:
1. Any Darwin Core term name can be used as a field name.
2. No field name may be repeated in a record.
3. Do not use a _Class_ ([`Occurrence`](http://rs.tdwg.org/dwc/terms/Occurrence), [`Organism`](http://rs.tdwg.org/dwc/terms/Organism), [`MaterialSample`](http://rs.tdwg.org/dwc/terms/MaterialSample), [`LivingSpecimen`](http://rs.tdwg.org/dwc/terms/LivingSpecimen), [`PreservedSpecimen`](http://rs.tdwg.org/dwc/terms/PreservedSpecimen), [`FossilSpecimen`](http://rs.tdwg.org/dwc/terms/FossilSpecimen), [`Event`](http://rs.tdwg.org/dwc/terms/Event), [`HumanObservation`](http://rs.tdwg.org/dwc/terms/HumanObservation), [`MachineObservation`](http://rs.tdwg.org/dwc/terms/MachineObservation), [`Location`](http://rs.tdwg.org/dwc/terms/Location), [`GeologicalContext`](http://rs.tdwg.org/dwc/terms/GeologicalContext), [`Identification`](http://rs.tdwg.org/dwc/terms/Identification), [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon)) as a field.
4. Provide data in as many fields as you can.
5. Use the [`dcterms:type`](http://rs.tdwg.org/dwc/terms/dcterms:type) field to provide the name of the Dublin Core type class (`PhysicalObject`, `StillImage`, `MovingImage`, `Sound`, `Text`) the record represents.
6. Use the [`basisOfRecord`](http://rs.tdwg.org/dwc/terms/basisOfRecord) field to provide the name of the most specific Darwin Core class (`LivingSpecimen`, `PreservedSpecimen`, `FossilSpecimen`, `MaterialSample`, `HumanObservation`, `MachineObservation`, `Event`, `Occurrence`, `Taxon`, `Identification`, `Organism`, `Location`, `GeologicalContext`, `MeasurementOrFact`, `ResourceRelationship`) the record represents.
7. Populate fields with data that match the definition of the field.
8. Use the controlled vocabulary for the values of fields that recommend them.
9. If data are withheld, use [`informationWithheld`](http://rs.tdwg.org/dwc/terms/informationWithheld) to say so.
10. If data are shared in lower quality than the original, use [`dataGeneralizations`](http://rs.tdwg.org/dwc/terms/dataGeneralizations) to say so.
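Rules 2 and 3 are mechanical enough to check automatically. The following is a minimal sketch, not part of the standard, assuming the field names come from the first line of a CSV file; the function names `check_field_names` and `check_csv_header` are hypothetical, and the class list is copied from rule 3 above.

```python
import csv
import io

# Class names listed in rule 3; these must not be used as field names.
DWC_CLASS_NAMES = {
    "Occurrence", "Organism", "MaterialSample", "LivingSpecimen",
    "PreservedSpecimen", "FossilSpecimen", "Event", "HumanObservation",
    "MachineObservation", "Location", "GeologicalContext",
    "Identification", "Taxon",
}


def check_field_names(field_names):
    """Return a list of problems found in a sequence of field names."""
    problems = []
    seen = set()
    for name in field_names:
        # Rule 2: no field name may be repeated in a record.
        if name in seen:
            problems.append(f"repeated field name: {name}")
        seen.add(name)
        # Rule 3: a class must not be used as a field.
        if name in DWC_CLASS_NAMES:
            problems.append(f"class used as field name: {name}")
    return problems


def check_csv_header(text):
    """Check the first line of CSV text, assumed to hold the field names."""
    header = next(csv.reader(io.StringIO(text)))
    return check_field_names(header)
```

An empty list means the header passed both checks. The remaining rules concern the meaning of the data, so they are not covered by this sketch.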
Every field in Simple Darwin Core may appear either once or not at all in a single record - otherwise how could you distinguish one [`scientificName`](http://rs.tdwg.org/dwc/terms/scientificName) field from another one? Think of a database table. It will not allow you to have the same name for two different fields. Because of this design restriction (lack of flexibility for the sake of simplicity), the auxiliary fields from the [`MeasurementOrFact`](http://rs.tdwg.org/dwc/terms/MeasurementOrFact) and [`ResourceRelationship`](http://rs.tdwg.org/dwc/terms/ResourceRelationship) classes are of somewhat limited utility here - you could only share one `MeasurementOrFact` and one `ResourceRelationship` per record. You might argue then that there is no way to share information that requires related structures, such as a history of identifications of a specimen. That is mostly true. The only recourse within Simple Darwin Core is to force the data into one of the catch all "list" terms such as [`recordedBy`](http://rs.tdwg.org/dwc/terms/recordedBy), [`preparations`](http://rs.tdwg.org/dwc/terms/preparations), [`otherCatalogNumbers`](http://rs.tdwg.org/dwc/terms/otherCatalogNumbers), [`associatedMedia`](http://rs.tdwg.org/dwc/terms/associatedMedia), [`associatedReferences`](http://rs.tdwg.org/dwc/terms/associatedReferences), [`associatedSequences`](http://rs.tdwg.org/dwc/terms/associatedSequences), [`associatedTaxa`](http://rs.tdwg.org/dwc/terms/associatedTaxa), [`associatedOccurrences`](http://rs.tdwg.org/dwc/terms/associatedOccurrences), [`associatedOrganisms`](http://rs.tdwg.org/dwc/terms/associatedOrganisms), [`previousIdentifications`](http://rs.tdwg.org/dwc/terms/previousIdentifications), [`higherGeography`](http://rs.tdwg.org/dwc/terms/higherGeography), [`georeferencedBy`](http://rs.tdwg.org/dwc/terms/georeferencedBy), [`georeferenceSources`](http://rs.tdwg.org/dwc/terms/georeferenceSources), [`identifiedBy`](http://rs.tdwg.org/dwc/terms/identifiedBy), 
[`identificationReferences`](http://rs.tdwg.org/dwc/terms/identificationReferences), and [`higherClassification`](http://rs.tdwg.org/dwc/terms/higherClassification).
There is a difference between having data in a field and requiring that field to have a value from among a legal set of values. Darwin Core is simple in that it has minimal restrictions on the contents of fields. The term comments give recommendations about the use of controlled vocabularies and how to structure content wherever appropriate. Data contributors are encouraged to follow these recommendations as well as possible. You might argue that having no restrictions will promote "dirty" data (data of low quality or dubious value). Consider the simple axiom "It's not what you have, but what you do with it that matters." If data restrictions were in place at the fundamental level, then a record having any non-compliant data in any of its fields could not be shared via the standard. Not only would there be a dearth of shared data in that case (or an unused standard), but also there would be no way to use the standard to build shared data cleaning tools to actually improve the situation, nor to use data services to look up alternative representations (language translations, for example) to serve a broader audience. The rest is up to how the records will be used - in other words, it is up to applications to enforce further restrictions if appropriate, and it is up to the stakeholders of those applications to decide what the restrictions will be for the purpose the application is trying to serve.
## 6 How do I use Simple Darwin Core?
Darwin Core is simple in that data "complying with" Simple Darwin Core can be easily shared in a variety of ways, including, but not limited to, text files and XML documents. Equivalent ways of sharing the same data are described in the sections [Simple Darwin Core as Text](#61-simple-darwin-core-as-text) and [Simple Darwin Core as XML](#62-simple-darwin-core-as-xml).
What you need to do as a contributor of data via Simple Darwin Core depends on the requirements of the ones who are going to consume those data. For example, if you have a collaborator who wants to share data via Simple Darwin Core, then it may be sufficient to create a spreadsheet that contains column headers matching as many of the Darwin Core term names as you are both interested in sharing - just to be sure you both understand the meaning of the fields you share, and therefore hopefully something about their content. You might create a table in a database using Simple Darwin Core as a model (if it met all of your needs), and then connect that database with services for sharing via the web. You might use that same database (or spreadsheet) to export a comma-separated value (CSV) file for upload into a hosted service that could serve the data on your behalf. Or you might use that same file to upload into a service that would allow you to add value (such as a georeference) or quality (with a data cleaning tool), or to see your data in the context of other shared data.
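As an illustration of the CSV route described above, here is a minimal sketch that writes records to CSV text with Darwin Core term names as the column headers. The records and the helper name `to_csv` are hypothetical and serve only to show the shape of the output; a field missing from a record is left empty, which is allowed because no field is required.

```python
import csv
import io

# Hypothetical records; each key is a Darwin Core term name used as a field name.
records = [
    {"scientificName": "Ctenomys sociabilis", "country": "Argentina",
     "basisOfRecord": "HumanObservation"},
    {"scientificName": "Centropyge flavicauda", "basisOfRecord": "Taxon"},
]


def to_csv(records):
    """Serialize records to CSV text with one header line of field names."""
    # Collect field names in first-seen order so each appears exactly once.
    field_names = []
    for record in records:
        for name in record:
            if name not in field_names:
                field_names.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_csv(records))
```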
### 6.1 Simple Darwin Core as text
The [Text guide](../text/) describes how to construct and format a text file using a simplified subset of the [Fielded Text](http://www.fieldedtext.org/) specification, which allows the contributor to describe the contents of a text file, or set of text files (related or not) through a separate configuration file (called a metafile). The metafile allows the contributor to communicate the structure of the content of the file or files and any relationships between them. Though it is good practice to describe a Simple Darwin Core file with such a metafile, it isn't strictly necessary if the file follows the CSV file specification and the first line of the file contains the field names. A `Fielded Text` metafile for any text file based on Simple Darwin Core can be created by customizing the [example metafile](../text/example_text_simpledwc_complete.xml), which includes references to all Darwin Core terms. Refer to the comments in the file itself as well as the metafile specification in the [Text guide](../text/) for more information.
### 6.2 Simple Darwin Core as XML
The [XML guide](../xml/) describes how to construct XML schemas to share data based on Darwin Core terms. Looking at the [Simple Darwin Core XML Schema](../xml/tdwg_dwc_simple.xsd) using the XML guide as a reference you will be able to see that the schema supports the notion of a `SimpleDarwinRecord`, which is just a grouping of up to one of each of the Darwin Core terms that are `Properties` (not `Classes`).
#### 6.2.1 Example of Simple Darwin Core as XML (non-normative)
The following example shows a `SimpleDarwinRecordSet` containing one `SimpleDarwinRecord` for a `Taxon`:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet
xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
<SimpleDarwinRecord>
<dc:modified>2006-05-04T18:13:51.0Z</dc:modified>
<dc:language>en</dc:language>
<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
<dwc:scientificNameID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:scientificNameID>
<dwc:acceptedNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=22010</dwc:acceptedNameUsageID>
<dwc:originalNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:originalNameUsageID>
<dwc:nameAccordingToID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=22764</dwc:nameAccordingToID>
<dwc:namePublishedInID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=671</dwc:namePublishedInID>
<dwc:scientificName>Centropyge flavicauda Fraser-Brunner 1933</dwc:scientificName>
<dwc:acceptedNameUsage>Centropyge fisheri (Snyder 1904)</dwc:acceptedNameUsage>
<dwc:parentNameUsage>Centropyge Kaup, 1860</dwc:parentNameUsage>
<dwc:originalNameUsage>Centropyge flavicauda Fraser-Brunner 1933</dwc:originalNameUsage>
<dwc:nameAccordingTo>Allen, G.R. 1980. Butterfly and angelfishes of the world. Volume II. Mergus Publishers. Pp. 149-352.</dwc:nameAccordingTo>
<dwc:namePublishedIn>Fraser-Brunner, A. 1933. A revision of the chaetodont fishes of the subfamily Pomacanthinae. Proceedings of the General
Meetings for Scientific Business of the Zoological Society of London 1933 (pt 3, no.30): 543-599, Pl. 1.</dwc:namePublishedIn>
<dwc:higherClassification>Animalia;Chordata;Vertebrata;Osteichthyes;Actinopterygii;Neopterygii;Teleostei;Acanthopterygii;Perciformes;
Percoidei;Pomacanthidae;Centropyge</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Osteichthyes</dwc:class>
<dwc:order>Perciformes</dwc:order>
<dwc:family>Pomacanthidae</dwc:family>
<dwc:genus>Centropyge</dwc:genus>
<dwc:specificEpithet>flavicauda</dwc:specificEpithet>
<dwc:scientificNameAuthorship>Fraser-Brunner 1933</dwc:scientificNameAuthorship>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:taxonomicStatus>accepted</dwc:taxonomicStatus>
</SimpleDarwinRecord>
</SimpleDarwinRecordSet>
```
The `SimpleDarwinRecord` acts as a `Class` in implementation, because all of the terms are properties of it. The Simple Darwin Core schema has just one other level of structure, the `SimpleDarwinRecordSet`, which is a grouping of one or more `SimpleDarwinRecords`. The `SimpleDarwinRecordSet` acts as a `Class` to define a data set during implementation.
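Because a `SimpleDarwinRecord` is a flat grouping of properties, reading one back amounts to collecting its child elements. The sketch below uses Python's standard `xml.etree.ElementTree` module and the namespace URIs from the example above; the function name `read_records` and the prefix `sdr` are assumptions made for illustration. It keys each value by the local term name, which discards the namespace and so assumes no two namespaces contribute the same local name to a record.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as declared in the example document; "sdr" is an arbitrary prefix.
NAMESPACES = {
    "sdr": "http://rs.tdwg.org/dwc/xsd/simpledarwincore/",
    "dc": "http://purl.org/dc/terms/",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
}


def read_records(xml_text):
    """Return one dict per SimpleDarwinRecord, keyed by local term name."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("sdr:SimpleDarwinRecord", NAMESPACES):
        fields = {}
        for element in record:
            # ElementTree tags look like "{namespace-uri}localName".
            local_name = element.tag.split("}", 1)[-1]
            fields[local_name] = (element.text or "").strip()
        records.append(fields)
    return records
```

This sketch does not validate the document against the Simple Darwin Core XML Schema; it only extracts the field values.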
## 7 Doing more with Simple Darwin Core
Sooner or later you may want to share more information than Simple Darwin Core seems to allow. For example, you and your colleagues might decide that it would be useful to have a standard way to exchange additional information relevant to questions in Conservation. How would you do it?
One way would be to try to "overload" existing terms by using them to hold information other than what was intended based on the definition of the terms. Please don't do this. If an existing term has close to the same meaning as one you want to use, but just doesn't quite fit because of the way the definition is worded, it would be better to request an amendment to the term definition so that it will be clear for your community how to use it. You can request such a change by submitting an issue in the [Darwin Core repository](https://github.com/tdwg/dwc).
### 7.1 Structured content using dynamicProperties
Another way to get more out of Darwin Core without adding a term is to "payload" the [`dynamicProperties`](http://rs.tdwg.org/dwc/terms/dynamicProperties) term with structured content, as shown in the example below, using JavaScript Object Notation (JSON). This is perfectly legal, since it doesn't compromise the meaning of the term. One of the weaknesses of payloading data in this way is that it is subject to a lack of stable or well-defined semantics. Also, it is highly recommended to flatten the content into a single string with no non-printing characters (such as line feeds) to facilitate use in the widest variety of data sharing contexts. Still, this might be a reasonable way to at least allow you to share all of your data, even if there might be problems with people using it reliably.
#### 7.1.1 Example of structured JSON content within XML (non-normative)
```xml
<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet
xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
<SimpleDarwinRecord>
<dc:modified>2009-02-12T12:43:31</dc:modified>
<dc:language>en</dc:language>
<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
<dwc:scientificName>Ctenomys sociabilis</dwc:scientificName>
<dwc:acceptedNameUsage>Ctenomys sociabilis Pearson and Christie, 1985</dwc:acceptedNameUsage>
<dwc:parentNameUsage>Ctenomys Blainville, 1826</dwc:parentNameUsage>
<dwc:higherClassification>Animalia; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Hystricognatha; Hystricognathi; Ctenomyidae; Ctenomyini; Ctenomys</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Mammalia</dwc:class>
<dwc:order>Rodentia</dwc:order>
<dwc:family>Ctenomyidae</dwc:family>
<dwc:genus>Ctenomys</dwc:genus>
<dwc:specificEpithet>sociabilis</dwc:specificEpithet>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:scientificNameAuthorship>Pearson and Christie, 1985</dwc:scientificNameAuthorship>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:namePublishedIn>Pearson O. P., and M. I. Christie. 1985. Historia Natural, 5(37):388</dwc:namePublishedIn>
<dwc:taxonomicStatus>valid</dwc:taxonomicStatus>
<dwc:dynamicProperties>{"iucnStatus":"vulnerable", "distribution":"Neuquén, Argentina"}</dwc:dynamicProperties>
</SimpleDarwinRecord>
</SimpleDarwinRecordSet>
```
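A `dynamicProperties` value like the one in this example can be produced with any JSON serializer that emits a single line. The following is a minimal sketch using Python's standard `json` module; the helper name `flatten_properties` is hypothetical, and the separator choice simply mirrors the spacing used in the example.

```python
import json


def flatten_properties(properties):
    """Serialize a dict to a single-line JSON string for dynamicProperties."""
    # Without an indent argument, json.dumps emits no line breaks, and any
    # line feed inside a string value is escaped as the two characters \n.
    # ensure_ascii=False keeps characters such as "é" readable.
    return json.dumps(properties, ensure_ascii=False, separators=(", ", ":"))


value = flatten_properties(
    {"iucnStatus": "vulnerable", "distribution": "Neuquén, Argentina"}
)
print(value)
```

A consumer can recover the structure with `json.loads`, but the keys remain a private convention between the parties sharing the data.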
### 7.2 Extending Darwin Core by adding terms
If you were using just CSV text files to exchange information, then you might be tempted to just add the new fields to the files. This approach suffers most of the same problems as payloading - no one aside from those with whom you communicated would know what those new fields were or how to use them. Sharing in this way via XML would be an even bigger problem, because the [Simple Darwin Core XML Schema](../xml/tdwg_dwc_simple.xsd) defines the terms that it supports and the new fields would not correspond with any terms understood by the schema. In other words, the XML with your fields in it would not be a valid Simple Darwin Core XML document.
So, if you really need to extend the capabilities of Darwin Core, the best first step is to follow the standards process to add the terms you need. See the [Contributing guide](https://github.com/tdwg/dwc/blob/master/.github/CONTRIBUTING.md) to understand how to suggest a new term.
## 8 Going beyond Simple Darwin Core
For cases where rich data require rich (non-simple) structure, Simple Darwin Core alone is not suitable. When sharing information via [Fielded Text](http://www.fieldedtext.org/), the solution is to use Simple Darwin Core as a core record with one or more associated extensions for the additional information. See the [Text guide](../text/) for an explanation and examples.
When sharing information via [XML](http://www.w3.org/XML/), the solution is to use a richer structure such as the Access to Biological Collections Data schema ([ABCD](https://github.com/tdwg/abcd)), the [Generic Darwin Core](../xml/tdwg_dwcterms.xsd) schema, or another schema built from Darwin Core terms to suit the use of the data in a particular context. See the [XML guide](../xml/) for examples and references to model schemas.


@ -4,7 +4,7 @@ Title
: Simple Darwin Core : Simple Darwin Core
Date version issued Date version issued
: 2015-06-02 : 2021-07-15
Date created Date created
: 2009-04-21 : 2009-04-21
@ -13,13 +13,13 @@ Part of TDWG Standard
: <http://www.tdwg.org/standards/450/> : <http://www.tdwg.org/standards/450/>
This version This version
: <http://rs.tdwg.org/dwc/terms/simple/2014-11-08> : <http://rs.tdwg.org/dwc/terms/simple/2021-07-15>
Latest Version Latest Version
: <http://rs.tdwg.org/dwc/terms/simple/> : <http://rs.tdwg.org/dwc/terms/simple/>
Previous version Previous version
: <http://rs.tdwg.org/dwc/terms/simple/2013-10-22> : <http://rs.tdwg.org/dwc/terms/simple/2014-11-08>
Abstract Abstract
: This document is a reference for the Simple Darwin Core standard. : This document is a reference for the Simple Darwin Core standard.
@ -31,7 +31,7 @@ Creator
: Darwin Core Task Group : Darwin Core Task Group
Bibliographic citation Bibliographic citation
: Darwin Core Task Group. 2009. Simple Darwin Core. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/simple/> : Darwin Core Maintenance Group. 2021. Simple Darwin Core. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/simple/2021-07-15>
## 1 Introduction ## 1 Introduction
@ -39,7 +39,11 @@ Simple Darwin Core is a predefined subset of the terms that have common use acro
### 1.1 Status of the content of this document ### 1.1 Status of the content of this document
All sections of this document are normative, except for examples, which are explicitly marked as non-normative. All sections of this document are non-normative (explanatory), except for Section 5.
#### 1.1.1 RFC 2119 key words
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119).
## 2 Audience ## 2 Audience
@ -51,24 +55,24 @@ Simple Darwin Core is simple in that it assumes (and allows) no structure beyond
## 4 What makes it flexible? ## 4 What makes it flexible?
Simple Darwin Core has minimal restrictions on which fields are required (none). You might argue that there should be more required fields, that there isn't anything useful you can do without them. That is partially true. A record with no fields in it wouldn't be very interesting, but there is a difference between requiring that there be a field in a record and requiring that a particular field be in all records. By having no required field restriction, Simple Darwin Core can be used to share any meaningful combination of fields - for example, to share "just names", or "just places", or observations of individuals detected in the wild at a given place and time following a method (an occurrence). This flexibility promotes the reuse of the terms and sharing mechanisms for a wide variety of services. Simple Darwin Core has minimal restrictions on which fields are mandatory (none). You might argue that there should be more mandatory fields, that there isn't anything useful you can do without them. That is partially true. A record with no fields in it wouldn't be very interesting, but there is a difference between requiring that there be a field in a record and requiring that a particular field be in all records. By having no mandatory field restriction, Simple Darwin Core can be used to share any meaningful combination of fields - for example, to share "just names", or "just places", or observations of individuals detected in the wild at a given place and time following a method (an occurrence). This flexibility promotes the reuse of the terms and sharing mechanisms for a wide variety of services.
## 5 Are there any rules? ## 5 Are there any rules? (Normative)
There are just a few general guiding principles on how to make the best use of Simple Darwin Core: There are just a few general guiding principles on how to make the best use of Simple Darwin Core:
1. Any Darwin Core term name can be used as a field name. 1. Any Darwin Core term name can be used as a field name.
2. No field name may be repeated in a record. 2. A field name MUST NOT be repeated in a record.
3. Do not use a _Class_ ([`Occurrence`](http://rs.tdwg.org/dwc/terms/Occurrence), [`Organism`](http://rs.tdwg.org/dwc/terms/Organism), [`MaterialSample`](http://rs.tdwg.org/dwc/terms/MaterialSample), [`LivingSpecimen`](http://rs.tdwg.org/dwc/terms/LivingSpecimen), [`PreservedSpecimen`](http://rs.tdwg.org/dwc/terms/PreservedSpecimen), [`FossilSpecimen`](http://rs.tdwg.org/dwc/terms/FossilSpecimen), [`Event`](http://rs.tdwg.org/dwc/terms/Event), [`HumanObservation`](http://rs.tdwg.org/dwc/terms/HumanObservation), [`MachineObservation`](http://rs.tdwg.org/dwc/terms/MachineObservation), [`Location`](http://rs.tdwg.org/dwc/terms/Location), [`GeologicalContext`](http://rs.tdwg.org/dwc/terms/GeologicalContext), [`Identification`](http://rs.tdwg.org/dwc/terms/Identification), [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon)) as a field. 3. Class names (e.g., `Occurrence`, `Organism`) MUST NOT be used as field names.
4. Provide data in as many fields as you can. 4. Data SHOULD be provided in as many fields as possible.
5. Use the [`dcterms:type`](http://rs.tdwg.org/dwc/terms/dcterms:type) field to provide the name of the what Dublin Core type class (`PhysicalObject`, `StillImage`, `MovingImage`, `Sound`, `Text`) the record represents. 5. The [`dc:type`](http://purl.org/dc/elements/1.1/type) field SHOULD be populated with the name of the most appropriate Dublin Core type class (`PhysicalObject`, `StillImage`, `MovingImage`, `Sound`, `Text`) the record represents.
6. Use the [`basisOfRecord`](http://rs.tdwg.org/dwc/terms/basisOfRecord) field to provide the name of the most specific Darwin Core class (`LivingSpecimen`, `PreservedSpecimen`, `FossilSpecimen`, `MaterialSample`, `HumanObservation`, `MachineObservation`, `Event`, `Occurrence`, `Taxon`, `Identification`, `Organism`, `Location`, `GeologicalContext`, `MeasurementOrFact`, `ResourceRelationship`) the record represents. 6. The [`basisOfRecord`](http://rs.tdwg.org/dwc/terms/basisOfRecord) SHOULD be populated with the name of the most specific Darwin Core class ([`LivingSpecimen`](http://rs.tdwg.org/dwc/terms/LivingSpecimen), [`PreservedSpecimen`](http://rs.tdwg.org/dwc/terms/PreservedSpecimen), [`FossilSpecimen`](http://rs.tdwg.org/dwc/terms/FossilSpecimen), [`MaterialSample`](http://rs.tdwg.org/dwc/terms/MaterialSample), [`HumanObservation`](http://rs.tdwg.org/dwc/terms/HumanObservation), [`MachineObservation`](http://rs.tdwg.org/dwc/terms/MachineObservation), [`MaterialCitation`](http://rs.tdwg.org/dwc/terms/MaterialCitation), [`Event`](http://rs.tdwg.org/dwc/terms/Event), [`Occurrence`](http://rs.tdwg.org/dwc/terms/Occurrence), [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon), [`Organism`](http://rs.tdwg.org/dwc/terms/Organism), [`Location`](http://purl.org/dc/terms/Location), [`GeologicalContext`](http://rs.tdwg.org/dwc/terms/GeologicalContext)) the record represents.
7. Populate fields with data that match the definition of the field. 7. Fields SHOULD be populated with data that match the definition of the field.
8. Use the controlled vocabulary for the values of fields that recommend them. 8. Values from a recommended controlled vocabulary SHOULD be used for the values of a field that recommends it.
9. If data are withheld, use [`informationWithheld`](http://rs.tdwg.org/dwc/terms/informationWithheld) to say so. 9. If data are withheld, the field [`informationWithheld`](http://rs.tdwg.org/dwc/terms/informationWithheld) SHOULD be populated to say so.
10. If data are shared in lower quality than the original, use [`dataGeneralizations`](http://rs.tdwg.org/dwc/terms/dataGeneralizations) to say so. 10. If data are shared in lower quality than the original, the field [`dataGeneralizations`](http://rs.tdwg.org/dwc/terms/dataGeneralizations) SHOULD be populated to say so.
Every field in Simple Darwin Core may appear either once or not at all in a single record - otherwise how could you distinguish one [`scientificName`](http://rs.tdwg.org/dwc/terms/scientificName) field from another one? Think of a database table. It will not allow you to have the same name for two different fields. Because of this design restriction (lack of flexibility for the sake of simplicity), the auxiliary fields from the [`MeasurementOrFact`](http://rs.tdwg.org/dwc/terms/MeasurementOrFact) and [`ResourceRelationship`](http://rs.tdwg.org/dwc/terms/ResourceRelationship) classes are of somewhat limited utility here - you could only share one `MeasurementOrFact` and one `ResourceRelationship` per record. You might argue then that there is no way to share information that requires related structures, such as a history of identifications of a specimen. That is mostly true. The only recourse within Simple Darwin Core is to force the data into one of the catch all "list" terms such as [`recordedBy`](http://rs.tdwg.org/dwc/terms/recordedBy), [`preparations`](http://rs.tdwg.org/dwc/terms/preparations), [`otherCatalogNumbers`](http://rs.tdwg.org/dwc/terms/otherCatalogNumbers), [`associatedMedia`](http://rs.tdwg.org/dwc/terms/associatedMedia), [`associatedReferences`](http://rs.tdwg.org/dwc/terms/associatedReferences), [`associatedSequences`](http://rs.tdwg.org/dwc/terms/associatedSequences), [`associatedTaxa`](http://rs.tdwg.org/dwc/terms/associatedTaxa), [`associatedOccurrences`](http://rs.tdwg.org/dwc/terms/associatedOccurrences), [`associatedOrganisms`](http://rs.tdwg.org/dwc/terms/associatedOrganisms), [`previousIdentifications`](http://rs.tdwg.org/dwc/terms/previousIdentifications), [`higherGeography`](http://rs.tdwg.org/dwc/terms/higherGeography), [`georeferencedBy`](http://rs.tdwg.org/dwc/terms/georeferencedBy), [`georeferenceSources`](http://rs.tdwg.org/dwc/terms/georeferenceSources), [`identifiedBy`](http://rs.tdwg.org/dwc/terms/identifiedBy), 
[`identificationReferences`](http://rs.tdwg.org/dwc/terms/identificationReferences), and [`higherClassification`](http://rs.tdwg.org/dwc/terms/higherClassification). Every field in Simple Darwin Core MAY appear either once or not at all in a single record - otherwise how could you distinguish one [`scientificName`](http://rs.tdwg.org/dwc/terms/scientificName) field from another one? Think of a database table. It will not allow you to have the same name for two different fields. Because of this design restriction (lack of flexibility for the sake of simplicity), the auxiliary fields from the [`MeasurementOrFact`](http://rs.tdwg.org/dwc/terms/MeasurementOrFact) and [`ResourceRelationship`](http://rs.tdwg.org/dwc/terms/ResourceRelationship) classes are of somewhat limited utility here - you could only share one `MeasurementOrFact` and one `ResourceRelationship` per record. You might argue then that there is no way to share information that requires related structures, such as a history of identifications of a specimen. That is mostly true. 
The only recourse within Simple Darwin Core is to force the data into one of the catch all "list" terms such as [`recordedBy`](http://rs.tdwg.org/dwc/terms/recordedBy), [`preparations`](http://rs.tdwg.org/dwc/terms/preparations), [`otherCatalogNumbers`](http://rs.tdwg.org/dwc/terms/otherCatalogNumbers), [`associatedMedia`](http://rs.tdwg.org/dwc/terms/associatedMedia), [`associatedReferences`](http://rs.tdwg.org/dwc/terms/associatedReferences), [`associatedSequences`](http://rs.tdwg.org/dwc/terms/associatedSequences), [`associatedTaxa`](http://rs.tdwg.org/dwc/terms/associatedTaxa), [`associatedOccurrences`](http://rs.tdwg.org/dwc/terms/associatedOccurrences), [`associatedOrganisms`](http://rs.tdwg.org/dwc/terms/associatedOrganisms), [`previousIdentifications`](http://rs.tdwg.org/dwc/terms/previousIdentifications), [`higherGeography`](http://rs.tdwg.org/dwc/terms/higherGeography), [`georeferencedBy`](http://rs.tdwg.org/dwc/terms/georeferencedBy), [`georeferenceSources`](http://rs.tdwg.org/dwc/terms/georeferenceSources), [`identifiedBy`](http://rs.tdwg.org/dwc/terms/identifiedBy), [`identificationReferences`](http://rs.tdwg.org/dwc/terms/identificationReferences), and [`higherClassification`](http://rs.tdwg.org/dwc/terms/higherClassification).
There is a difference between having data in a field and requiring that field to have a value from among a legal set of values. Darwin Core is simple in that it has minimal restrictions on the contents of fields. The term comments give recommendations about the use of controlled vocabularies and how to structure content wherever appropriate. Data contributors are encouraged to follow these recommendations as well as possible. You might argue that having no restrictions will promote "dirty" data (data of low quality or dubious value). Consider the simple axiom "It's not what you have, but what you do with it that matters." If data restrictions were in place at the fundamental level, then a record having any non-compliant data in any of its fields could not be shared via the standard. Not only would there be a dearth of shared data in that case (or an unused standard), but also there would be no way to use the standard to build shared data cleaning tools to actually improve the situation, nor to use data services to look up alternative representations (language translations, for example) to serve a broader audience. The rest is up to how the records will be used - in other words, it is up to applications to enforce further restrictions if appropriate, and it is up to the stakeholders of those applications to decide what the restrictions will be for the purpose the application is trying to serve. There is a difference between having data in a field and requiring that field to have a value from among a legal set of values. Darwin Core is simple in that it has minimal restrictions on the contents of fields. The term comments give recommendations about the use of controlled vocabularies and how to structure content wherever appropriate. Data contributors are encouraged to follow these recommendations as well as possible. You might argue that having no restrictions will promote "dirty" data (data of low quality or dubious value). 
Consider the simple axiom "It's not what you have, but what you do with it that matters." If data restrictions were in place at the fundamental level, then a record having any non-compliant data in any of its fields could not be shared via the standard. Not only would there be a dearth of shared data in that case (or an unused standard), but also there would be no way to use the standard to build shared data cleaning tools to actually improve the situation, nor to use data services to look up alternative representations (language translations, for example) to serve a broader audience. The rest is up to how the records will be used - in other words, it is up to applications to enforce further restrictions if appropriate, and it is up to the stakeholders of those applications to decide what the restrictions will be for the purpose the application is trying to serve.
@ -86,7 +90,7 @@ The [Text guide](../text/) describes how to construct and format a text file usi
The [XML guide](../xml/) describes how to construct XML schemas to share data based on Darwin Core terms. Looking at the [Simple Darwin Core XML Schema](../xml/tdwg_dwc_simple.xsd) using the XML guide as a reference you will be able to see that the schema supports the notion of a `SimpleDarwinRecord`, which is just a grouping of up to one of each of the Darwin Core terms that are `Properties` (not `Classes`). The [XML guide](../xml/) describes how to construct XML schemas to share data based on Darwin Core terms. Looking at the [Simple Darwin Core XML Schema](../xml/tdwg_dwc_simple.xsd) using the XML guide as a reference you will be able to see that the schema supports the notion of a `SimpleDarwinRecord`, which is just a grouping of up to one of each of the Darwin Core terms that are `Properties` (not `Classes`).
#### 6.2.1 Example of Simple Darwin Core as XML (non-normative) #### 6.2.1 Example of Simple Darwin Core as XML
The following example shows a `SimpleDarwinRecordSet` containing one `SimpleDarwinRecord` for a `Taxon`: The following example shows a `SimpleDarwinRecordSet` containing one `SimpleDarwinRecord` for a `Taxon`:
@ -141,9 +145,9 @@ One way would be to try to "overload" existing terms by using them to hold infor
### 7.1 Structured content using dynamicProperties ### 7.1 Structured content using dynamicProperties
Another way to get more out of Darwin Core without adding a term is to "payload" the [`dynamicProperties`](http://rs.tdwg.org/dwc/terms/dynamicProperties) term with structured content, as shown in the example below, using JavaScript Object Notation (JSON). This is perfectly legal, since it doesn't compromise the meaning of the term. One of the weaknesses of payloading data in this way is that it is subject to a lack of stable or well-defined semantics. Also, it is highly recommended to flatten the content into a single string with no non-printing characters (such as line feeds) to facilitate use in the widest variety of data sharing contexts. Still, this might be a reasonable way to at least allow you to share all of your data, even if there might be problems with people using it reliably. Another way to get more out of Darwin Core without adding a term is to "payload" the [`dynamicProperties`](http://rs.tdwg.org/dwc/terms/dynamicProperties) term with structured content, as shown in the example below, using JavaScript Object Notation (JSON). This is perfectly legal, since it doesn't compromise the meaning of the term. One of the weaknesses of payloading data in this way is that it is subject to a lack of stable or well-defined semantics. Also, it is strongly suggested to flatten the content into a single string with no non-printing characters (such as line feeds) to facilitate use in the widest variety of data sharing contexts. Still, this might be a reasonable way to at least allow you to share all of your data, even if there might be problems with people using it reliably.
#### 7.1.1 Example of structured JSON content within XML (non-normative) #### 7.1.1 Example of structured JSON content within XML
```xml ```xml
<?xml version="1.0" encoding="UTF-8"?> <?xml version="1.0" encoding="UTF-8"?>

docs/text/2020-09-05.md Normal file

@ -0,0 +1,233 @@
# Darwin Core text guide
Title
: Darwin Core text guide
Date version issued
: 2020-09-05
Date created
: 2009-02-12
Part of TDWG Standard
: <http://www.tdwg.org/standards/450/>
This version
: <http://rs.tdwg.org/dwc/terms/guides/text/2020-09-05>
Latest version
: <http://rs.tdwg.org/dwc/terms/guides/text/>
Previous version
: <http://rs.tdwg.org/dwc/terms/guides/text/2014-11-08>
Replaced by
: <http://rs.tdwg.org/dwc/terms/guides/text/2021-07-15>
Abstract
: Guidelines for implementing Darwin Core in Text files.
Contributors
: Tim Robertson (GBIF), Markus Döring (GBIF), John Wieczorek (MVZ), Renato De Giovanni (CRIA), Dave Vieglais (KUNHM)
Creator
: Darwin Core Task Group
Bibliographic citation
: Darwin Core Maintenance Group. 2020. Darwin Core text guide. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/guides/text/2020-09-05>
## 1 Introduction
This document provides guidelines for formatting and sharing [Darwin Core terms](http://rs.tdwg.org/dwc/terms) in _fielded text_ formats, such as one or more comma separated value (CSV) files. Data conforming to the [Simple Darwin Core](../simple/) (CSV format and having the first row include Darwin Core standard term names) can be shared in a single file, while a non-standard text file can be understood using an [XML](http://www.w3.org/XML/) metafile to describe its contents and formatting.
![Usage](usage.png)
More complex structure can be shared in multiple related files. The description of content and relationships between files can be achieved using the metafile. This guideline makes recommendations for the simple case of a _core_ file, upon which Darwin Core _records_ are based, and _extensions_ that are linked to records in that core file. Specifically, extension records have a _many-to-one_ relationship with records in the core file. For example, a core file might contain specimen records, with one specimen per row in the file, while an extension file contains one or more identifications for those specimens, with one identification per row in the extension file, and with an identifier to the specimen for each identification row. This example would allow many identifications to be associated with each specimen.
### 1.1 Status of the content of this document
All sections of this document are normative, except for examples, whose sections are marked as non-normative.
### 1.2 Simple example metafile content (non-normative)
A simple comma separated values (CSV) data file with the following content:
```csv
ID,Species,Count
123,"Cryptantha gypsophila Reveal & C.R. Broome",12
124,"Buxbaumia piperi",2
```
can be described with the following Darwin Core metafile:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://rs.tdwg.org/dwc/text/ http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd">
<core rowType="http://rs.tdwg.org/dwc/xsd/simpledarwincore/SimpleDarwinRecord" ignoreHeaderLines="1">
<files>
<location>http://data.gbif.org/download/specimens.csv</location>
</files>
<field index="0" term="http://rs.tdwg.org/dwc/terms/catalogNumber" />
<field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName" />
<field index="2" term="http://rs.tdwg.org/dwc/terms/individualCount" />
<!-- A constant value has no index, but applies to all rows -->
<field term="http://rs.tdwg.org/dwc/terms/datasetID" default="urn:lsid:tim.lsid.tdwg.org:collections:1"/>
</core>
</archive>
```
These same data could be understood without the metafile if the first row of the CSV file contained the term names:
```csv
type,institutionCode,collectionCode,catalogNumber,scientificName,individualCount,datasetID
PhysicalObject,ANSP,PH,123,"Cryptantha gypsophila Reveal & C.R. Broome",12,urn:lsid:tim.lsid.tdwg.org:collections:1
PhysicalObject,ANSP,PH,124,"Buxbaumia piperi",2,urn:lsid:tim.lsid.tdwg.org:collections:1
```
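As a rough illustration of why the header row makes the metafile unnecessary, the following Python sketch reads the CSV above with the standard library `csv` module and keys every value by its term name. The function name `read_simple_dwc` and the in-memory `SAMPLE` string are assumptions made for this example only; a real file would be opened from disk.

```python
import csv
import io

# Sample data copied from the CSV above (kept in memory for the sketch).
SAMPLE = '''type,institutionCode,collectionCode,catalogNumber,scientificName,individualCount,datasetID
PhysicalObject,ANSP,PH,123,"Cryptantha gypsophila Reveal & C.R. Broome",12,urn:lsid:tim.lsid.tdwg.org:collections:1
PhysicalObject,ANSP,PH,124,"Buxbaumia piperi",2,urn:lsid:tim.lsid.tdwg.org:collections:1
'''


def read_simple_dwc(text):
    """Return one dict per data row, keyed by the term names in the header row."""
    return list(csv.DictReader(io.StringIO(text)))


records = read_simple_dwc(SAMPLE)
print(records[0]["scientificName"])
```

All values are returned as strings; interpreting `individualCount` as a number is left to the consumer.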
### 1.3 XML versus fielded text
Many resources exist on the web describing the advantages of Extensible Markup Language ([XML](http://www.w3.org/XML/)) over less structured content such as _fielded text_. The Darwin Core text guide (this document) is not meant to promote the use of fielded text over XML for data exchange, but rather to provide recommendations for how to handle such data files when necessary.
Two scenarios that might benefit from the use of fielded text are:
- The transfer of large numbers of Darwin Core records and related data from one database to another. Typically databases are very efficient at exporting and importing comma separated text files.
- The description of legacy data existing in a fielded text format, such that it might be automatically understood and loaded into another system. It could be that this system would then serve the data in another format such as XML.
## 2 Metafile content
The [text metafile schema](tdwg_dwc_text.xsd) provides technical details for the structure of a metafile by defining the elements and attributes required to describe the contents and relationships between text files. These elements and attributes, with descriptions and specifications for their use in a metafile, are described in the following table.
### 2.1 The `<archive>` element
The `<archive>` element is the container for the list of related files (one core and zero or more extensions). The `<archive>` element has just one attribute, `metadata`.
#### 2.1.1 Attributes
Attribute | Description | Required | Default
--- | --- | --- | ---
`metadata` | Contains a qualified Uniform Resource Locator (URL) defining the location of a metadata description of the entire archive. The format of the metadata is not prescribed, but a standardized format such as Ecological Metadata Language (EML), Federal Geographic Data Committee (FGDC), or ISO 19115 family is recommended. | no |
#### 2.1.2 Elements
Element | Description
--- | ---
`<core>` | An `<archive>` must contain exactly one `<core>` element, representing the data entity (the actual file and its column header mappings to Darwin Core terms) upon which records are based. If extensions are being used, each record in the core data must have a unique identifier. The field for this identifier must be specified in an explicit `<id>` field in order to associate extension records with the core record.
`<extension>` | An `<archive>` may define zero or more `<extension>` elements, each representing an individual extension entity directly related to the core. In addition to the general file attributes described below, every extension entity must have an explicit `<coreid>` field to relate the extension record to a row in the core entity. The extension itself does not have to have a unique ID field and many rows can point to the same core record.
### 2.2 The `<core>` or `<extension>` element
#### 2.2.1 Attributes
Attribute | Description | Required | Default
--- | --- | --- | ---
`rowType` | A Uniform Resource Identifier (URI) for the term identifying the class of data represented by each row, for example, <http://rs.tdwg.org/dwc/terms/Occurrence> for Occurrence records or <http://rs.tdwg.org/dwc/terms/Taxon> for Taxon records. Additional classes may be referenced by URI and defined outside the Darwin Core specification. The row type is required. For convenience the URIs for classes defined by the Darwin Core are: `Occurrence`: <http://rs.tdwg.org/dwc/terms/Occurrence>, `Event`: <http://rs.tdwg.org/dwc/terms/Event>, `Location`: <http://purl.org/dc/terms/Location>, `GeologicalContext`: <http://rs.tdwg.org/dwc/terms/GeologicalContext>, `Identification`: <http://rs.tdwg.org/dwc/terms/Identification>, `Taxon`: <http://rs.tdwg.org/dwc/terms/Taxon>, `ResourceRelationship`: <http://rs.tdwg.org/dwc/terms/ResourceRelationship>, `MeasurementOrFact`: <http://rs.tdwg.org/dwc/terms/MeasurementOrFact> | yes |
`fieldsTerminatedBy` | Specifies the delimiter between fields. Typical values might be `,` or `\t` for CSV or Tab files respectively. | no | `,`
`linesTerminatedBy` | Specifies the row separator character. | no | `\n`
`fieldsEnclosedBy` | Specifies the character used to enclose (mark the start and end of) each field. CSV files frequently use the double quote character (`"`), which is the default value if none is explicitly provided. Note that a comma separated value file that has commas within the content of any field must have an enclosing character. | no | `"`
`encoding` | Specifies the [character encoding](http://en.wikipedia.org/wiki/Character_encoding) for the data file. The encoding is extremely important, but often ignored. The most frequently used encodings are: `UTF-8`: 8-bit Unicode Transformation Format, `UTF-16`: 16-bit Unicode Transformation Format, `ISO-8859-1`: commonly known as "Latin-1" and a common default on systems configured for a single western European language, `Windows-1252`: commonly known as "WinLatin" and a common default of legacy versions of Microsoft Windows based operating systems. | no | `UTF-8`
`ignoreHeaderLines` | Specifies the number of lines to ignore from the beginning of the file. This can be used to skip column headings or preamble comments, for example. | no | `0`
`dateFormat` | When verbatim dates are consistent in format, this field can be used to indicate the format represented. It is recommended to use the date, dateTime and time for field formats wherever possible, but where verbatim dates are required, a format may be specified here. This should be considered a 'hint' for consumers. It is recommended that consumers support the minimum combinations of `DD` `MM` and `YYYY` with the separators `/` and `-`. Examples: `DDMMYYYY`: for dates of the form 21121978, `DD-MM-YYYY`: for dates of the form 21-12-1978, `MMDDYYYY`: for dates of the form 12211978, `MM-DD-YYYY`: for dates of the form 12-21-1978, `YYYYMMDD`: for dates of the form 19781221. | no | `YYYY-MM-DD`
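The attribute defaults in the table above can be applied mechanically by a consumer. The following Python sketch is one possible reading, not a reference implementation: the helper names (`unescape`, `file_settings`, `read_rows`) and the sample taxon row are made up for illustration, and it assumes that `\t`, `\n` and `\r` are written literally in the attribute values, as in the examples in this guide. Splitting on the line terminator before parsing means that line breaks inside quoted fields are not handled.

```python
import csv
import xml.etree.ElementTree as ET

# Defaults taken from the "Default" column of the attribute table.
DEFAULTS = {
    "fieldsTerminatedBy": ",",
    "linesTerminatedBy": "\\n",
    "fieldsEnclosedBy": '"',
    "encoding": "UTF-8",
    "ignoreHeaderLines": "0",
}


def unescape(value):
    """Turn the written forms of tab, newline and carriage return into real characters."""
    return value.replace("\\t", "\t").replace("\\n", "\n").replace("\\r", "\r")


def file_settings(element):
    """Read the file attributes of a <core> or <extension> element, applying defaults."""
    raw = {name: element.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "delimiter": unescape(raw["fieldsTerminatedBy"]),
        "line_end": unescape(raw["linesTerminatedBy"]),
        "quotechar": raw["fieldsEnclosedBy"],
        "encoding": raw["encoding"],
        "skip": int(raw["ignoreHeaderLines"]),
    }


def read_rows(text, settings):
    """Split already-decoded text into rows, skipping the declared header lines."""
    lines = text.split(settings["line_end"])[settings["skip"]:]
    options = {"delimiter": settings["delimiter"]}
    if settings["quotechar"]:
        options["quotechar"] = settings["quotechar"]
    else:
        options["quoting"] = csv.QUOTE_NONE  # an empty attribute means no enclosing character
    return list(csv.reader([line for line in lines if line], **options))


# Hypothetical <core> element for a tab-delimited file with one header line.
core = ET.fromstring(
    '<core fieldsTerminatedBy="\\t" ignoreHeaderLines="1" '
    'rowType="http://rs.tdwg.org/dwc/terms/Taxon"/>'
)
settings = file_settings(core)
rows = read_rows("taxonID\tscientificName\n1\tBalaenoptera musculus\n", settings)
print(rows)
```

The `encoding` setting is returned but not used here; it would be passed to `open()` when reading the file from disk.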
#### 2.2.2 Elements
Element | Description
--- | ---
`<files>` | A `<core>` or `<extension>` element must contain one `<files>` element to locate the data being described.
`<id>` | If extensions are being used, the `<core>` element must contain an `<id>` element that indicates the column in the core file that contains the identifier for a record.
`<coreid>` | If extensions are being used, the `<extension>` element must contain a `<coreid>` element that indicates the column in the extension file that contains the core record identifier (the matching `<id>` in the core file).
`<field>` | A `<core>` or `<extension>` element must contain one or more `<field>` elements, each representing a 'column' in the row.
### 2.3 `<files>` element
The `<files>` element must contain one or more `<location>` elements, each defining where a file resides. Each core or extension entity can be composed from one or more files. If an entity has data in more than one file, use the `<location>` element multiple times, once for each file that makes up the entity.
#### 2.3.1 Elements
Element | Description
--- | ---
`<location>` | Specifies the location of the file being described, which may take either of the following forms: 1) a web accessible URL such as `http://www.gbif.org/data/specimen.csv` or `ftp://ftp.gbif.org/tim/specimen.txt`, 2) a filepath relative to the location of the metafile such as `specimen.txt`, `./specimen.txt`, `data/specimen.txt`.
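Both forms of `<location>` can be handled by the same resolution step. The Python sketch below uses `urllib.parse.urljoin` from the standard library; the function name `resolve_location` and the `http://example.org/archive/meta.xml` metafile address are assumptions for illustration only.

```python
from urllib.parse import urljoin


def resolve_location(metafile_url, location):
    """Resolve a <location> value against the address of the metafile.

    Absolute URLs are returned unchanged; relative paths are interpreted
    relative to the directory that contains the metafile.
    """
    return urljoin(metafile_url, location)


base = "http://example.org/archive/meta.xml"  # hypothetical metafile address
print(resolve_location(base, "data/specimen.txt"))
print(resolve_location(base, "./specimen.txt"))
print(resolve_location(base, "ftp://ftp.gbif.org/tim/specimen.txt"))
```

For metafiles read from a local disk, the same idea applies with file system paths instead of URLs.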
### 2.4 The `<field>` element
The field element is used to specify the location and content of data within a file. There must be one field element for every term being shared for the entity, whether explicitly or through the use of a default value for all rows in the file.
#### 2.4.1 Attributes
Attribute | Description | Required | Default
--- | --- | --- | ---
`index` | Specifies the position of the column in the row. The first column has an index of 0, the second column 1, etc. If no column index is specified, then the term and the default may be used to define a constant value for all rows. | no |
`term` | A Uniform Resource Identifier (URI) for the term represented by this field. For example, a field containing the scientific name would have `term="http://rs.tdwg.org/dwc/terms/scientificName"`. Terms outside of the Darwin Core specification may be used, such as those from the Dublin Core Metadata Initiative, for example, `dcterms:modified` would be `term="http://purl.org/dc/terms/modified"`. | yes |
`default` | Specifies the value to use if one is not supplied for the field in a given row. If no index is supplied, the default can be used to define a constant for all rows for a field that is not in the data file. | no |
`vocabulary` | A Uniform Resource Identifier (URI) for a vocabulary that the source values for this field are based on. The URI ideally should resolve to some machine readable definition like SKOS, RDF or at least some simple text or html file often found for ISO or RFC standards. For example <http://rs.gbif.org/vocabulary/gbif/nomenclatural_code.xml>, <http://www.ietf.org/rfc/rfc3066.txt> or <http://www.iso.org/iso/list-en1-semic-3.txt>. | no |
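To make the interplay of `index`, `term` and `default` concrete, the following Python sketch applies the `<field>` elements of the metafile from section 1.2 to a single data row. It is a minimal sketch under stated assumptions: the function name `map_row` and the namespace prefix `m` are invented here, and an empty string is treated as "not supplied".

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://rs.tdwg.org/dwc/text/"}  # the prefix "m" is arbitrary

# Trimmed copy of the metafile from section 1.2.
METAFILE = """<archive xmlns="http://rs.tdwg.org/dwc/text/">
  <core rowType="http://rs.tdwg.org/dwc/xsd/simpledarwincore/SimpleDarwinRecord" ignoreHeaderLines="1">
    <files><location>http://data.gbif.org/download/specimens.csv</location></files>
    <field index="0" term="http://rs.tdwg.org/dwc/terms/catalogNumber" />
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName" />
    <field index="2" term="http://rs.tdwg.org/dwc/terms/individualCount" />
    <field term="http://rs.tdwg.org/dwc/terms/datasetID" default="urn:lsid:tim.lsid.tdwg.org:collections:1"/>
  </core>
</archive>"""


def map_row(entity, row):
    """Return a dict of term URI -> value for one row of a core or extension file."""
    record = {}
    for field in entity.findall("m:field", NS):
        index = field.get("index")
        value = ""
        if index is not None and int(index) < len(row):
            value = row[int(index)]
        if value == "":
            # No column, or an empty cell: fall back to the default, if any.
            value = field.get("default", "")
        record[field.get("term")] = value
    return record


core = ET.fromstring(METAFILE).find("m:core", NS)
record = map_row(core, ["124", "Buxbaumia piperi", "2"])
print(record["http://rs.tdwg.org/dwc/terms/datasetID"])
```

The `datasetID` field has no `index`, so its `default` acts as a constant for every row.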
## 3 Implementation guide
### 3.1 Extension example (non-normative)
The following example illustrates the use of extensions. In this example there are three files in the archive, all of which are located in the same directory as the metafile. The `whales.txt` file acts as a core file of Taxon records. It is extended by two other files, `types.csv` and `distribution.csv`. The `types.csv` file contains records of a type specified in an external definition at <http://rs.gbif.org/terms/1.0/Types> and consists of Dublin Core and Darwin Core terms, while the `distribution.csv` file contains records of a type specified at <http://rs.gbif.org/terms/1.0/Distribution> and consists of Darwin Core terms plus an additional term for `threatStatus`. Both extension files are related to the core file through their `<coreid>` column, which holds the `taxonID` of the matching core record. Presumably, this archive contains information about whale species, type specimen records for those species, and lists of countries and the threat status for those species.
![Extension](extension.png)
```xml
<?xml version="1.0" encoding="UTF-8"?>
<archive xmlns="http://rs.tdwg.org/dwc/text/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xsi:schemaLocation="http://rs.tdwg.org/dwc/text/ http://rs.tdwg.org/dwc/text/tdwg_dwc_text.xsd">
<core encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n" ignoreHeaderLines="1" rowType="http://rs.tdwg.org/dwc/terms/Taxon">
<files>
<location>whales.txt</location>
</files>
<id index="0" />
<field index="0" term="http://rs.tdwg.org/dwc/terms/taxonID" />
<field index="1" term="http://purl.org/dc/terms/modified" />
<field index="2" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
<field index="3" term="http://rs.tdwg.org/dwc/terms/acceptedNameUsageID"/>
<field index="4" term="http://rs.tdwg.org/dwc/terms/parentNameUsageID"/>
<field index="5" term="http://rs.tdwg.org/dwc/terms/originalNameUsageID"/>
</core>
<extension encoding="UTF-8" fieldsTerminatedBy="," linesTerminatedBy="\n" fieldsEnclosedBy='"' ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/Types">
<files>
<location>types.csv</location>
</files>
<coreid index="0" />
<field index="1" term="http://purl.org/dc/terms/bibliographicCitation"/>
<field index="2" term="http://rs.tdwg.org/dwc/terms/catalogNumber"/>
<field index="3" term="http://rs.tdwg.org/dwc/terms/collectionCode"/>
<field index="4" term="http://rs.tdwg.org/dwc/terms/institutionCode"/>
<field index="5" term="http://rs.tdwg.org/dwc/terms/typeStatus"/>
</extension>
<extension encoding="UTF-8" fieldsTerminatedBy="," linesTerminatedBy="\n" fieldsEnclosedBy='"' ignoreHeaderLines="1" rowType="http://rs.gbif.org/terms/1.0/Distribution">
<files>
<location>distribution.csv</location>
</files>
<coreid index="0" />
<field index="1" term="http://rs.tdwg.org/dwc/terms/countryCode"/>
<field index="2" term="http://rs.gbif.org/terms/1.0/threatStatus"/>
<field index="3" term="http://rs.tdwg.org/dwc/terms/occurrenceStatus"/>
</extension>
</archive>
```
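The many-to-one relationship declared by `<id>` and `<coreid>` in the metafile above can be resolved with a simple lookup. The Python sketch below assumes the rows have already been parsed into lists of strings; the function names and the sample rows are invented for illustration and do not come from a real archive.

```python
from collections import defaultdict


def group_by_coreid(extension_rows, coreid_index=0):
    """Group extension rows by the value in their <coreid> column."""
    grouped = defaultdict(list)
    for row in extension_rows:
        grouped[row[coreid_index]].append(row)
    return grouped


def attach(core_rows, extension_rows, id_index=0, coreid_index=0):
    """Pair each core row with the (possibly empty) list of extension rows pointing at it."""
    grouped = group_by_coreid(extension_rows, coreid_index)
    return [(core, grouped.get(core[id_index], [])) for core in core_rows]


# Made-up rows: core columns are taxonID, scientificName;
# extension columns are coreid, countryCode, threatStatus, occurrenceStatus.
whales = [["1", "Balaenoptera musculus"], ["2", "Physeter macrocephalus"]]
distribution = [
    ["1", "IS", "endangered", "present"],
    ["1", "NO", "endangered", "present"],
]

for core, extensions in attach(whales, distribution):
    print(core[1], len(extensions))
```

A core record without any matching extension rows is still returned, with an empty list, since extensions are optional for any given record.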
## 4 Database example (non-normative)
### 4.1 MySQL
It is very easy to produce fielded text using the `SELECT ... INTO OUTFILE` statement in MySQL. The encoding of the resulting file will depend on the server variables and collations used, and might need to be modified before the operation is done. Note that MySQL will export `NULL` values as `\N` by default. Use the `IFNULL()` function as shown in the following example to avoid this.
```sql
SELECT
IFNULL(id, ''), IFNULL(scientific_name, ''), IFNULL(count,'')
INTO outfile '/tmp/dwc.txt'
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM
dwc;
```
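If a file has already been exported without `IFNULL()`, a consumer can normalise the `\N` markers after the fact. The Python sketch below is an assumption-laden illustration: the function name `null_to_empty` and the sample text are made up, and it only covers the `NULL` marker, not the other backslash escapes MySQL may write.

```python
import csv
import io


def null_to_empty(rows, null_marker="\\N"):
    """Replace MySQL's NULL marker with an empty string in every cell."""
    return [["" if value == null_marker else value for value in row] for row in rows]


# Made-up export in the format produced by the statement above, without IFNULL().
exported = '123,"Cryptantha gypsophila Reveal & C.R. Broome",12\n124,"Buxbaumia piperi",\\N\n'
rows = null_to_empty(csv.reader(io.StringIO(exported)))
print(rows)
```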


@ -4,7 +4,7 @@ Title
: Darwin Core text guide : Darwin Core text guide
Date version issued Date version issued
: 2020-09-05 : 2021-07-15
Date created Date created
: 2009-02-12 : 2009-02-12
@ -13,13 +13,13 @@ Part of TDWG Standard
: <http://www.tdwg.org/standards/450/> : <http://www.tdwg.org/standards/450/>
This version This version
: <http://rs.tdwg.org/dwc/terms/guides/text/2020-09-05> : <http://rs.tdwg.org/dwc/terms/guides/text/2021-07-15>
Latest version Latest version
: <http://rs.tdwg.org/dwc/terms/guides/text/> : <http://rs.tdwg.org/dwc/terms/guides/text/>
Previous version Previous version
: <http://rs.tdwg.org/dwc/terms/guides/text/2014-11-08> : <http://rs.tdwg.org/dwc/terms/guides/text/2020-09-05>
Abstract Abstract
: Guidelines for implementing Darwin Core in Text files. : Guidelines for implementing Darwin Core in Text files.
@ -31,7 +31,7 @@ Creator
: Darwin Core Task Group : Darwin Core Task Group
Bibliographic citation Bibliographic citation
: Darwin Core Task Group. 2009. Darwin Core text guide. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/guides/text/> : Darwin Core Maintenance Group. 2021. Darwin Core text guide. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/guides/text/2021-07-15>
## 1 Introduction ## 1 Introduction
@ -45,6 +45,10 @@ More complex structure can be shared in multiple related files. The description
All sections of this document are normative, except for examples, whose sections are marked as non-normative. All sections of this document are normative, except for examples, whose sections are marked as non-normative.
#### 1.1.1 RFC 2119 key words
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119).
### 1.2 Simple example metafile content (non-normative) ### 1.2 Simple example metafile content (non-normative)
A simple comma separated values (CSV) data file with the following content: A simple comma separated values (CSV) data file with the following content:
@ -95,24 +99,24 @@ Two scenarios that might benefit from the use of fielded text are:
## 2 Metafile content ## 2 Metafile content
The [text metafile schema](tdwg_dwc_text.xsd) provides technical details for the structure of a metafile by defining the elements and attributes required to describe the contents and relationships between text files. These elements and attributes, with descriptions and specifications for their use in a metafile, are described in the following table. The [text metafile schema](tdwg_dwc_text.xsd) provides technical details for the structure of a metafile by defining the elements and attributes necessary to describe the contents and relationships between text files. These elements and attributes, with descriptions and specifications for their use in a metafile, are described in the following table.
### 2.1 The `<archive>` element ### 2.1 The `<archive>` element
The `<archive>` element is the container for the list of related files (one core and zero or more extensions). The `<archive>` element has just one attribute, `metadata`. The `<archive>` element is the container for the list of related files (one core and zero or more extensions). The `<archive>` element MUST have one attribute, `metadata`.
#### 2.1.1 Attributes #### 2.1.1 Attributes
Attribute | Description | Required | Default Attribute | Description | Required | Default
--- | --- | --- | --- --- | --- | --- | ---
`metadata` | Contains a qualified Uniform Resource Locator (URL) defining the location of a metadata description of the entire archive. The format of the metadata is not prescribed, but a standardized format such as Ecological Metadata Language (EML), Federal Geographic Data Committee (FGDC), or ISO 19115 family is recommended. | no | `metadata` | If used, the value MUST be a qualified Uniform Resource Locator (URL) defining the location of a metadata description of the entire archive. The format of the metadata is not prescribed, but a standardized format such as Ecological Metadata Language (EML), Federal Geographic Data Committee (FGDC), or ISO 19115 family is RECOMMENDED. | no |
#### 2.1.2 Elements #### 2.1.2 Elements
Element | Description Element | Description
--- | --- --- | ---
`<core>` | An `<archive>` must contain exactly one `<core>` element, representing the data entity (the actual file and its column header mappings to Darwin Core terms) upon which records are based. If extensions are being used, each record in the core data must have a unique identifier. The field for this identifier must be specified in an explicit `<id>` field in order to associate extension records with the core record. `<core>` | An `<archive>` MUST contain exactly one `<core>` element, representing the data entity (the actual file and its column header mappings to Darwin Core terms) upon which records are based. If extensions are being used, each record in the core data MUST have a unique identifier. The field for this identifier MUST be specified in an explicit `<id>` field in order to associate extension records with the core record.
`<extension>` | An `<archive>` may define zero or more `<extension>` elements, each representing an individual extension entity directly related to the core. In addition to the general file attributes described below, every extension entity must have an explicit `<coreid>` field to relate the extension record to a row in the core entity. The extension itself does not have to have a unique ID field and many rows can point to the same core record. `<extension>` | An `<archive>` MAY define zero or more `<extension>` elements, each representing an individual extension entity directly related to the core. In addition to the general file attributes described below, every extension entity MUST have an explicit `<coreid>` field to relate the extension record to a row in the core entity. The extension itself does not have to have a unique ID field and many rows can point to the same core record.
### 2.2 The `<core>` or `<extension>` element ### 2.2 The `<core>` or `<extension>` element
@ -120,45 +124,45 @@ Element | Description
Attribute | Description | Required | Default Attribute | Description | Required | Default
--- | --- | --- | --- --- | --- | --- | ---
`rowType` | A Unified Resource Identifier (URI) for the term identifying the class of data represented by each row, for example, <http://rs.tdwg.org/dwc/terms/Occurrence> for Occurrence records or <http://rs.tdwg.org/dwc/terms/Taxon> for Taxon records. Additional classes may be referenced by URI and defined outside the Darwin Core specification. The row type is required. For convenience the URIs for classes defined by the Darwin Core are: `Occurrence`: <http://rs.tdwg.org/dwc/terms/Occurrence>, `Event`: <http://rs.tdwg.org/dwc/terms/Event>, `Location`: <http://purl.org/dc/terms/Location>, `GeologicalContext`: <http://purl.org/dc/terms/GeologicalContext>, `Identification`: <http://rs.tdwg.org/dwc/terms/Identification>, `Taxon`: <http://rs.tdwg.org/dwc/terms/Taxon>, `ResourceRelationship`: <http://rs.tdwg.org/dwc/terms/ResourceRelationship>, `MeasurementOrFact`: <http://rs.tdwg.org/dwc/terms/MeasurementOrFact> | yes | `rowType` | MUST be a Unified Resource Identifier (URI) for the term identifying the class of data represented by each row, for example, <http://rs.tdwg.org/dwc/terms/Occurrence> for Occurrence records or <http://rs.tdwg.org/dwc/terms/Taxon> for Taxon records. Additional classes MAY be defined outside the Darwin Core specification if denoted by a URI. The row type is REQUIRED. For convenience the URIs for classes defined by the Darwin Core are: `Occurrence`: <http://rs.tdwg.org/dwc/terms/Occurrence>, `Event`: <http://rs.tdwg.org/dwc/terms/Event>, `Location`: <http://purl.org/dc/terms/Location>, `GeologicalContext`: <http://purl.org/dc/terms/GeologicalContext>, `Identification`: <http://rs.tdwg.org/dwc/terms/Identification>, `Taxon`: <http://rs.tdwg.org/dwc/terms/Taxon>, `ResourceRelationship`: <http://rs.tdwg.org/dwc/terms/ResourceRelationship>, `MeasurementOrFact`: <http://rs.tdwg.org/dwc/terms/MeasurementOrFact> | yes |
`fieldsTerminatedBy` | Specifies the delimiter between fields. Typical values might be `,` or `\t` for CSV or Tab files respectively. | no | `,` `fieldsTerminatedBy` | Specifies the delimiter between fields. Typical values MAY be `,` or `\t` for CSV or Tab files respectively. | no | `,`
`linesTerminatedBy` | Specifies the row separator character. | no | `\n` `linesTerminatedBy` | Specifies the row separator character. | no | `\n`
`fieldsEnclosedBy` | Specifies the character used to enclose (mark the start and end of) each field. CSV files frequently use the double quote character (`"`), which is the default value if none is explicitly provided. Note that a comma separated value file that has commas within the content of any field must have an enclosing character. | no | `"` `fieldsEnclosedBy` | Specifies the character used to enclose (mark the start and end of) each field. CSV files frequently use the double quote character (`"`), which is the default value if none is explicitly provided. Note that a comma separated value file that has commas within the content of any field MUST have an enclosing character. | no | `"`
`encoding` | Specifies the [character encoding](http://en.wikipedia.org/wiki/Character_encoding) for the data file. The encoding is extremely important, but often ignored. The most frequently used encodings are: `UTF-8`: 8-bit Unicode Transformation Format, `UTF-16`: 16-bit Unicode Transformation Format, `ISO-8859-1`: commonly known as "Latin-1" and a common default on systems configured for a single western European language, `Windows-1252`: commonly known as "WinLatin" and a common default of legacy versions of Microsoft Windows based operating systems. | no | `UTF-8` `encoding` | Specifies the [character encoding](http://en.wikipedia.org/wiki/Character_encoding) for the data file. The encoding is extremely important, but often ignored. The most frequently used encodings are: `UTF-8`: 8-bit Unicode Transformation Format, `UTF-16`: 16-bit Unicode Transformation Format, `ISO-8859-1`: commonly known as "Latin-1" and a common default on systems configured for a single western European language, `Windows-1252`: commonly known as "WinLatin" and a common default of legacy versions of Microsoft Windows based operating systems. | no | `UTF-8`
`ignoreHeaderLines` | Specifies the number lines to ignore from the beginning of the file. This can be used to ignore files with column headings or preamble comments for example. | no | `0` `ignoreHeaderLines` | Specifies the number lines to ignore from the beginning of the file. This MAY be used to ignore files with column headings or preamble comments for example. | no | `0`
`dateFormat` | When verbatim dates are consistent in format, this field can be used to indicate the format represented. It is recommended to use the date, dateTime and time for field formats wherever possible, but where verbatim dates are required, a format may be specified here. This should be considered a 'hint' for consumers. It is recommended that consumers support the minimum combinations of `DD` `MM` and `YYYY` with the separators `/` and `-`. Examples: `DDMMYYYY`: for dates of the form 21121978, `DD-MM-YYYY`: for dates of the form 21-12-1978, `MMDDYYYY`: for dates of the form 12211978, `MM-DD-YYYY`: for dates of the form 12-21-1978, `YYYYMMDD`: for dates of the form 19781221. | no | `YYYY-MM-DD` `dateFormat` | When verbatim dates are consistent in format, this field MAY be used to indicate the format represented. It is RECOMMENDED to use the date, dateTime and time for field formats wherever possible, but where verbatim dates are required, a format MAY be specified here. This should be considered a 'hint' for consumers. It is RECOMMENDED that consumers support the minimum combinations of `DD` `MM` and `YYYY` with the separators `/` and `-`. Examples: `DDMMYYYY`: for dates of the form 21121978, `DD-MM-YYYY`: for dates of the form 21-12-1978, `MMDDYYYY`: for dates of the form 12211978, `MM-DD-YYYY`: for dates of the form 12-21-1978, `YYYYMMDD`: for dates of the form 19781221. | no | `YYYY-MM-DD`
#### 2.2.2 Elements #### 2.2.2 Elements
Element | Description Element | Description
--- | --- --- | ---
`<files>` | `<core>` or `<extension>` element must contain one `<files>` element to locate the data being described. `<files>` | `<core>` or `<extension>` element MUST contain one `<files>` element to locate the data being described.
`<id>` | If extensions are being used, the `<core>` must contain an <id> element that indicates the identifier for a record. `<id>` | If extensions are being used, the `<core>` MUST contain an <id> element that indicates the identifier for a record.
`<coreid>` | If extensions are being used, the `<extension>` element must contain a `<coreid>` element that indicates the column in the extension file that contains the core record identifier (the matching `<id>` in the core file). `<coreid>` | If extensions are being used, the `<extension>` element MUST contain a `<coreid>` element that indicates the column in the extension file that contains the core record identifier (the matching `<id>` in the core file).
`<field>` | A `<core>` or `<extension>` element must contain one or more <field> elements, each representing a 'column' in the row. `<field>` | A `<core>` or `<extension>` element MUST contain one or more <field> elements, each representing a 'column' in the row.
### 2.3 `<files>` element ### 2.3 `<files>` element
The files element must contain one or more <location> elements, each defining where a file resides. Each core or extension entity can be composed from one or more files. If an entity has data in more than one file, use the `<location>` element multiple times, once for each file that makes up the entity. The files element MUST contain one or more <location> elements, each defining where a file resides. Each core or extension entity can be composed from one or more files. If an entity has data in more than one file, use the `<location>` element multiple times, once for each file that makes up the entity.
#### 2.3.1 Elements #### 2.3.1 Elements
Element | Description Element | Description
--- | --- --- | ---
`<location>` | Specifies the location of the file being described, which may take either of the following forms: 1) a web accessible URL such as `http://www.gbif.org/data/specimen.csv` or `ftp://ftp.gbif.org/tim/specimen.txt`, 2) a filepath relative to the location of the metafile such as `specimen.txt`, `./specimen.txt`, `data/specimen.txt`. `<location>` | Specifies the location of the file being described, which MUST take one of the following forms: 1) a web accessible URL such as `http://www.gbif.org/data/specimen.csv` or `ftp://ftp.gbif.org/tim/specimen.txt`, 2) a filepath relative to the location of the metafile such as `specimen.txt`, `./specimen.txt`, `data/specimen.txt`.
### 2.4 The `<field>` element ### 2.4 The `<field>` element
The field element is used to specify the location and content of data within a file. There must be one field element for every term being shared for the entity, whether explicitly or through the use of a default value for all rows in the file. The field element is used to specify the location and content of data within a file. There MUST be one field element for every term being shared for the entity, whether explicitly or through the use of a default value for all rows in the file.
#### 2.4.1 Attributes #### 2.4.1 Attributes
Attribute | Description | Required | Default Attribute | Description | Required | Default
--- | --- | --- | --- --- | --- | --- | ---
`index` | Specifies the position of the column in the row. The first column has an index of 0, the second column 1, etc. If no column index is specified, then the term and the default may be used to define a constant value for all rows. | no | `index` | Specifies the position of the column in the row. The first column has an index of 0, the second column 1, etc. If no column index is specified, then the term and the default MAY be used to define a constant value for all rows. | no |
`term` | A Unified Resource Identifier (URI) for the term represented by this field. For example, a field containing the scientific name would have `term="http://rs.tdwg.org/dwc/terms/scientificName"`. Terms outside of the Darwin Core specification may be used, such as those from the Dublin Core Metadata Initative, for example, `dcterms:modified` would be `term="http://purl.org/dc/terms/modified"`. | yes | `term` | MUST be a Unified Resource Identifier (URI) for the term represented by this field. For example, a field containing the scientific name would have `term="http://rs.tdwg.org/dwc/terms/scientificName"`. Terms outside of the Darwin Core specification MAY be used, such as those from the Dublin Core Metadata Initative, for example, `dcterms:modified` would be `term="http://purl.org/dc/terms/modified"`. | yes |
`default` | Specifies value to use if one is not supplied for the field in a given row. If no index is supplied, the default can be used to define a constant for all rows for a field that is not in the data file. | no | `default` | Specifies value to use if one is not supplied for the field in a given row. If no index is supplied, the default MAY be used to define a constant for all rows for a field that is not in the data file. | no |
`vocabulary` | A Unified Resource Identifier (URI) for a vocabulary that the source values for this field are based on. The URI ideally should resolve to some machine readable definition like SKOS, RDF or at least some simple text or html file often found for ISO or RFC standards. For example <http://rs.gbif.org/vocabulary/gbif/nomenclatural_code.xml>, <http://www.ietf.org/rfc/rfc3066.txt> or <http://www.iso.org/iso/list-en1-semic-3.txt>. | no | `vocabulary` | When present, MUST be a Unified Resource Identifier (URI) for a vocabulary that the source values for this field are based on. The URI ideally should resolve to some machine readable definition like SKOS, RDF or at least some simple text or html file often found for ISO or RFC standards. For example <http://rs.gbif.org/vocabulary/gbif/nomenclatural_code.xml>, <http://www.ietf.org/rfc/rfc3066.txt> or <http://www.iso.org/iso/list-en1-semic-3.txt>. | no |
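As a non-normative sketch, the elements and attributes described in sections 2.1 through 2.4 might be combined in a single metafile as shown below. The metadata URL, the file names (`specimens.csv`, `specimens_part2.csv`, `measurements.csv`), the column positions, and the default value are hypothetical and chosen only for illustration. The `http://rs.tdwg.org/dwc/text/` namespace is assumed to be the one used for Darwin Core text metafiles.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only: URLs, file names and column indexes are hypothetical. -->
<archive xmlns="http://rs.tdwg.org/dwc/text/"
         metadata="http://example.org/archive/eml.xml">
  <!-- Exactly one core; rowType is required. -->
  <core rowType="http://rs.tdwg.org/dwc/terms/Occurrence"
        encoding="UTF-8" fieldsTerminatedBy="," linesTerminatedBy="\n"
        fieldsEnclosedBy="&quot;" ignoreHeaderLines="1">
    <files>
      <!-- One location per file; these paths are relative to the metafile. -->
      <location>specimens.csv</location>
      <location>specimens_part2.csv</location>
    </files>
    <!-- An id is needed here because an extension is present. -->
    <id index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/scientificName"/>
    <field index="2" term="http://purl.org/dc/terms/modified"/>
    <!-- No index: the default acts as a constant value for every row. -->
    <field term="http://rs.tdwg.org/dwc/terms/basisOfRecord" default="PreservedSpecimen"/>
  </core>
  <!-- Zero or more extensions; each points back to the core through coreid. -->
  <extension rowType="http://rs.tdwg.org/dwc/terms/MeasurementOrFact"
             encoding="UTF-8" fieldsTerminatedBy="\t" linesTerminatedBy="\n"
             fieldsEnclosedBy="" ignoreHeaderLines="1">
    <files>
      <location>measurements.csv</location>
    </files>
    <coreid index="0"/>
    <field index="1" term="http://rs.tdwg.org/dwc/terms/measurementType"/>
    <field index="2" term="http://rs.tdwg.org/dwc/terms/measurementValue"/>
  </extension>
</archive>
```

In this sketch many rows of `measurements.csv` can carry the same value in column 0, since the extension does not need a unique identifier of its own.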
## 3 Implementation guide ## 3 Implementation guide

docs/xml/2014-11-08.md Normal file

@ -0,0 +1,339 @@
# Darwin Core XML guide
Title
: Darwin Core XML guide
Date version issued
: 2015-06-02
Date created
: 2009-02-12
Part of TDWG Standard
: <http://www.tdwg.org/standards/450/>
This version
: <http://rs.tdwg.org/dwc/terms/guides/xml/2014-11-08>
Latest version
: <http://rs.tdwg.org/dwc/terms/guides/xml/>
Previous version
: <http://rs.tdwg.org/dwc/terms/guides/xml/2010-05-23>
Replaced by
: <http://rs.tdwg.org/dwc/terms/guides/xml/2021-07-15>
Abstract
: Guidelines for the implementation of Darwin Core in XML.
Contributors
: John Wieczorek (MVZ), Markus Döring (GBIF), Renato De Giovanni (CRIA), Tim Robertson (GBIF), Dave Vieglais (KUNHM)
Creator
: Darwin Core Task Group
Bibliographic citation
: Darwin Core Task Group. 2014. Darwin Core XML guide. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/guides/xml/2014-11-08>
## 1 Introduction
This document provides guidelines for implementing application schemas based on [Darwin Core terms](../../terms/) using [XML](http://www.w3.org/XML/). The underlying metadata model is described (in a syntax neutral way), followed by some specific guidelines for XML implementations. Some guidance on the use of non-Darwin Core terms is also provided.
This document does not provide guidelines for encoding Darwin Core in RDF/XML. Nor does it take a position on the relative merits of encoding metadata in "plain" XML rather than RDF/XML. This document provides guidelines in those cases where RDF/XML is not considered appropriate.
### 1.1 Status of the content of this document
All sections of this document are normative, except for sections that are explicitly marked as non-normative.
### 1.2 Audience
This document is targeted toward those who wish to use or construct application schemas using Darwin Core terms in XML. It includes explanations of existing schemas such as [Simple Darwin Core](../simple/) and how to build new schemas to meet specific models of information.
## 2 Implementation guide
### 2.1 XML schema
Implementors should base their XML applications on [XML Schemas](http://www.w3.org/XML/Schema) rather than _XML DTDs_. Approaches based on _XML Schemas_ are more flexible and are more easily re-used within other XML applications.
### 2.2 XML namespaces
Implementors should use [XML Namespaces](http://www.w3.org/TR/1999/REC-xml-names-19990114/) to uniquely identify elements. Darwin Core namespaces are defined in the [Darwin Core Namespace Policy](../../namespace/), while Dublin Core namespaces are defined in the [DCMI Namespace Recommendation](http://dublincore.org/documents/dcmi-namespace/).
### 2.3 Abstract model
The Darwin Core follows the [Dublin Core Metadata Initiative Abstract Model](http://dublincore.org/documents/abstract-model/) except that the Darwin Core _record_ is roughly equivalent to the Dublin Core _resource_.
- Darwin Core terms are either `classes` or `properties`.
- Each `property` has at most one `class` as its domain (describes no more than one `class`).
- A `Darwin Core record` is made up of zero or more `classes` and one or more `properties` with their associated `values`.
- Each `value` is a literal string.
- The `values` of `properties` within a `Darwin Core record` describe that record.
- A `Darwin Core record` must include all required `properties`, if any, and their associated `values`.
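The model above can be pictured with a small, non-normative fragment: one `class` (`Taxon`) grouping three `properties`, each holding a literal string `value`. The identifier `urn:example:taxon:1` is hypothetical, and the fragment is meant only to illustrate the model, not to be a complete document valid against any particular schema.

```xml
<!-- Sketch only: a class element containing property elements with literal values. -->
<dwc:Taxon xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <!-- Hypothetical identifier used for illustration. -->
  <dwc:taxonID>urn:example:taxon:1</dwc:taxonID>
  <dwc:scientificName>Ctenomys sociabilis</dwc:scientificName>
  <dwc:taxonRank>species</dwc:taxonRank>
</dwc:Taxon>
```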
### 2.4 Properties and values
Darwin Core follows the guidelines for expressing [Dublin Core metadata using XML](http://dublincore.org/documents/dc-xml/), with one exception: Darwin Core implementors should encode `properties` as XML elements and `values` as the content of those elements, rather than having each property contain a value representation and its associated value. The name of the XML element should be an XML qualified name (QName), which associates the value given in the `Term name` attribute in the [Darwin Core Terms](../../terms/) recommendation with the appropriate namespace name. For example, use:
```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://rs.tdwg.org/dwc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
...
<dwc:basisOfRecord>HumanObservation</dwc:basisOfRecord>
```
rather than:
```xml
<dwc:basisOfRecord value="HumanObservation"/>
```
### 2.5 Null values
Elements for which the value is null should be omitted from the document or explicitly coded using the attribute `xsi:nil="true"`.
```xml
<dwc:locality xsi:nil="true"/>
```
Do not use an empty string (an element with no content):
```xml
<dwc:locality></dwc:locality>
```
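For `xsi:nil` to be usable, the `xsi` prefix needs to be bound to the XML Schema instance namespace somewhere in scope, typically on the root element. The following minimal sketch reuses the record set element and namespaces from the Simple Darwin Core example in section 2.6.1. Whether a given element accepts `xsi:nil` depends on that element being declared nillable in the governing schema, which is assumed here rather than verified.

```xml
<?xml version="1.0"?>
<dwr:SimpleDarwinRecordSet
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
    xmlns:dwr="http://rs.tdwg.org/dwc/xsd/simpledarwincore/">
  <dwr:SimpleDarwinRecord>
    <dwc:country>Argentina</dwc:country>
    <!-- Explicitly null: the xsi prefix is declared on the root element. -->
    <dwc:locality xsi:nil="true"/>
    <!-- dwc:stateProvince is also null in this record and is simply omitted. -->
  </dwr:SimpleDarwinRecord>
</dwr:SimpleDarwinRecordSet>
```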
### 2.6 Simple Darwin Core
[Simple Darwin Core](tdwg_dwc_simple.xsd) most closely models the "flat" nature of many data sets. It is a ready-made schema for sharing information with no structure beyond properties of a _record_ (equivalent to fields in a table, or columns in a spreadsheet). It is meant to accommodate all properties except those that require further structure to be meaningful (auxiliary terms in the classes [ResourceRelationship](http://rs.tdwg.org/dwc/terms/ResourceRelationship) and [MeasurementOrFact](http://rs.tdwg.org/dwc/terms/MeasurementOrFact)). The schema has no required terms and no term is repeated within a given _record_. Refer to [Simple Darwin Core](../simple/) for the rationale behind this schema.
The term [`dcterms:type`](http://rs.tdwg.org/dwc/terms/dcterms:type) (which is controlled by the [Dublin Core Type Vocabulary](http://dublincore.org/documents/dcmi-type-vocabulary/)), gives the basic category of object (`PhysicalObject`, `StillImage`, `MovingImage`, `Sound`, `Text`) the record is about. The term [`basisOfRecord`](http://rs.tdwg.org/dwc/terms/basisOfRecord), which has a controlled vocabulary distinct from that of `dcterms:type`, shows the name of the Darwin Core class (e.g., [`LivingSpecimen`](http://rs.tdwg.org/dwc/terms/LivingSpecimen), [`PreservedSpecimen`](http://rs.tdwg.org/dwc/terms/PreservedSpecimen), [`FossilSpecimen`](http://rs.tdwg.org/dwc/terms/FossilSpecimen), [`HumanObservation`](http://rs.tdwg.org/dwc/terms/HumanObservation), [`MachineObservation`](http://rs.tdwg.org/dwc/terms/MachineObservation), [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon)) the record is about.
#### 2.6.1 Simple Darwin Core example (non-normative)
Following is a brief example of an XML document for a single specimen complying with the [Simple Darwin Core Schema](tdwg_dwc_simple.xsd). The [Simple Darwin Core XML example document](example_simple.xml) (if this link shows a blank page in your browser, use the View Source option to see the XML document) shows detail for a single record having a more complete set of elements.
```xml
<?xml version="1.0"?>
<dwr:SimpleDarwinRecordSet
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:dwr="http://rs.tdwg.org/dwc/xsd/simpledarwincore/">
<dwr:SimpleDarwinRecord>
<dcterms:type>PhysicalObject</dcterms:type>
<dcterms:modified>2009-02-12T12:43:31</dcterms:modified>
<dcterms:rightsHolder>Museum of Vertebrate Zoology</dcterms:rightsHolder>
<dcterms:rights>Creative Commons License</dcterms:rights>
<dwc:institutionCode>MVZ</dwc:institutionCode>
<dwc:collectionCode>Mammals</dwc:collectionCode>
<dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
<dwc:basisOfRecord>PreservedSpecimen</dwc:basisOfRecord>
<dwc:country>Argentina</dwc:country>
<dwc:countryCode>AR</dwc:countryCode>
<dwc:stateProvince>Neuquén</dwc:stateProvince>
<dwc:locality>25 km al NNE de Bariloche por Ruta 40 (=237)</dwc:locality>
</dwr:SimpleDarwinRecord>
</dwr:SimpleDarwinRecordSet>
```
### 2.7 Classes and containment
Many Darwin Core terms (`properties`) are defined as being associated with another term (a `class`). For example, [`scientificName`](http://rs.tdwg.org/dwc/terms/scientificName) and [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon) are both Darwin Core terms, but `scientificName` is a property associated with the `Taxon` class. When constructing schemas that take advantage of classes in structures, implementors are encouraged to maintain the property/class relationships defined by the terms whenever possible (refer to the `Class` attribute of the term as given in the [Quick Reference Guide](../../terms/) or the attribute `dwcattributes:organizedInClass` in the term declaration in the [`dcterms.rdf`](../rdf/dcterms.rdf) file). To promote reuse, Darwin Core provides a set of XML schemas to use as the basis of additional schemas:
- [Terms XML Schema](tdwg_dwcterms.xsd) - property term definitions as typed global elements and named groups for all terms for a given class to be referenced. The schema makes use of substitution groups `anyClass`, `anyProperty`, `anyIdentifier` and `anyXYZTerm` for each class, e.g. `anyTaxonTerm`. This is the schema upon which the [Simple Darwin Core XML Schema](tdwg_dwc_simple.xsd) is based.
- [Class Terms XML Schema](tdwg_dwc_class_terms.xsd) - class term definitions as typed global elements with subelements referencing all corresponding property terms via their substitution group.
Implementors are encouraged to use classes in a normalized way to avoid deep nesting. A [Darwin Core Tools and Applications page](https://github.com/tdwg/dwc-documentation/blob/master/documentation/resources.md) has been created as an index to example schemas for the purpose of community discussions and development. An [XML schema](tdwg_dwc_classes.xsd) is provided to freely mix any Darwin Core Class in a global list and allow them to reference each other using the respective class identifier terms.
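As a rough, non-normative sketch of how an additional schema might build on the Terms XML Schema, the following imports the Darwin Core terms namespace and allows any member of the `anyTaxonTerm` substitution group inside a container element. The `http://example.org/taxonlist/` namespace and the `TaxonList` and `TaxonEntry` names are hypothetical. The schema location is an assumption that follows the pattern of the `xsi:schemaLocation` values used in this guide, and it is also assumed that `anyTaxonTerm` is declared as a global element that can be referenced this way.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: the example.org namespace and the TaxonList/TaxonEntry names are hypothetical. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
           xmlns:ex="http://example.org/taxonlist/"
           targetNamespace="http://example.org/taxonlist/"
           elementFormDefault="qualified">
  <!-- Assumed location of the Terms XML Schema. -->
  <xs:import namespace="http://rs.tdwg.org/dwc/terms/"
             schemaLocation="http://rs.tdwg.org/dwc/xsd/tdwg_dwcterms.xsd"/>
  <xs:element name="TaxonList">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="TaxonEntry" minOccurs="0" maxOccurs="unbounded">
          <xs:complexType>
            <xs:sequence>
              <!-- Any element in the anyTaxonTerm substitution group may appear here. -->
              <xs:element ref="dwc:anyTaxonTerm" minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Keeping each entry flat, with related records linked by identifier terms rather than nested, stays in line with the normalized approach encouraged above.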
#### 2.7.1 Normalized classes examples (non-normative)
Following is an example of using normalized classes to represent two related specimen occurrences (one of which has had a second identification) at one location following this class-based schema. Note that you can reuse the location definition here by referring to it via locationID:
```xml
<?xml version="1.0"?>
<dwr:DarwinRecordSet
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/dwcrecord/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_classes.xsd"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:dwr="http://rs.tdwg.org/dwc/dwcrecord/">
<dcterms:Location>
<dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
<dwc:country>Argentina</dwc:country>
<dwc:countryCode>AR</dwc:countryCode>
<dwc:stateProvince>Neuquén</dwc:stateProvince>
<dwc:locality>25 km al NNE de Bariloche por Ruta 40 (=237)</dwc:locality>
</dcterms:Location>
<dwc:Occurrence>
<dcterms:type>PhysicalObject</dcterms:type>
<dcterms:modified>2009-02-12T12:43:31</dcterms:modified>
<dwc:institutionCode>MVZ</dwc:institutionCode>
<dwc:collectionCode>Mammals</dwc:collectionCode>
<dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
<dwc:basisOfRecord>PreservedSpecimen</dwc:basisOfRecord>
<dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
</dwc:Occurrence>
<dwc:Identification>
<dwc:identificationID>http://guid.mvz.org/identifications/23459</dwc:identificationID>
<dwc:identifiedBy>Richard Sage</dwc:identifiedBy>
<dwc:dateIdentified>2000</dwc:dateIdentified>
<dwc:identificationQualifier>sp.</dwc:identificationQualifier>
<dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Identification>
<dwc:Taxon>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
<dwc:scientificName>Ctenomys</dwc:scientificName>
<dwc:taxonRank>genus</dwc:taxonRank>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:genus>Ctenomys</dwc:genus>
</dwc:Taxon>
<dwc:Identification>
<dwc:identificationID>http://guid.mvz.org/identifications/94752</dwc:identificationID>
<dwc:identifiedBy>James L Patton</dwc:identifiedBy>
<dwc:dateIdentified>2001-09-14</dwc:dateIdentified>
<dwc:occurrenceID>urn:catalog:MVZ:Mammals:14523</dwc:occurrenceID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:df0a797c-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Identification>
<dwc:Taxon>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:df0a797c-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
<dwc:parentNameUsageID>urn:lsid:catalogueoflife.org:taxon:d79c11aa-29c1-102b-9a4a-00304854f820:col20120721</dwc:parentNameUsageID>
<dwc:scientificName>Ctenomys sociabilis</dwc:scientificName>
<dwc:scientificNameAuthorship>Pearson and Christie, 1985</dwc:scientificNameAuthorship>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:higherClassification>Animalia; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Hystricognatha; Hystricognathi; Ctenomyidae; Ctenomyini; Ctenomys</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Mammalia</dwc:class>
<dwc:order>Rodentia</dwc:order>
<dwc:family>Ctenomyidae</dwc:family>
<dwc:genus>Ctenomys</dwc:genus>
<dwc:specificEpithet>sociabilis</dwc:specificEpithet>
</dwc:Taxon>
<dwc:Occurrence>
<dcterms:type>PhysicalObject</dcterms:type>
<dcterms:modified>2009-02-12T12:43:31</dcterms:modified>
<dwc:institutionCode>MVZ</dwc:institutionCode>
<dwc:collectionCode>Mammals</dwc:collectionCode>
<dwc:occurrenceID>urn:catalog:MVZ:Mammals:14524</dwc:occurrenceID>
<dwc:basisOfRecord>PreservedSpecimen</dwc:basisOfRecord>
<dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
</dwc:Occurrence>
<dwc:Identification>
<dwc:identificationID>http://guid.mvz.org/identifications/94753</dwc:identificationID>
<dwc:identifiedBy>James L Patton</dwc:identifiedBy>
<dwc:dateIdentified>2001-09-14</dwc:dateIdentified>
<dwc:occurrenceID>urn:catalog:MVZ:Mammals:14524</dwc:occurrenceID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:df0a797c-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Identification>
<dwc:ResourceRelationship>
<dwc:resourceRelationshipID>http://guid.mvz.org/relations/23423</dwc:resourceRelationshipID>
<dwc:resourceID>urn:catalog:MVZ:Mammals:14523</dwc:resourceID>
<dwc:relatedResourceID>urn:catalog:MVZ:Mammals:14524</dwc:relatedResourceID>
<dwc:relationshipOfResource>offspring of</dwc:relationshipOfResource>
</dwc:ResourceRelationship>
<dwc:ResourceRelationship>
<dwc:resourceRelationshipID>http://guid.mvz.org/relations/23424</dwc:resourceRelationshipID>
<dwc:resourceID>urn:catalog:MVZ:Mammals:14524</dwc:resourceID>
<dwc:relatedResourceID>urn:catalog:MVZ:Mammals:14523</dwc:relatedResourceID>
<dwc:relationshipOfResource>mother of</dwc:relationshipOfResource>
</dwc:ResourceRelationship>
</dwr:DarwinRecordSet>
```
Here is a different example demonstrating area count observations for events on two different days at one location. Note that we omit the identification class here, as there is no identification-related data, and link via the `taxonID` directly:
```xml
<?xml version="1.0"?>
<dwr:DarwinRecordSet
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/dwcrecord/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_classes.xsd"
xmlns:dcterms="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:dwr="http://rs.tdwg.org/dwc/dwcrecord/">
<dcterms:Location>
<dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
<dwc:country>Argentina</dwc:country>
<dwc:countryCode>AR</dwc:countryCode>
<dwc:stateProvince>Neuquén</dwc:stateProvince>
<dwc:locality>Valle Limay, Estancia Rincon Grande, 48 ha area with centroid at this point</dwc:locality>
<dwc:decimalLatitude>-40.97467</dwc:decimalLatitude>
<dwc:decimalLongitude>-71.0734</dwc:decimalLongitude>
<dwc:geodeticDatum>WGS84</dwc:geodeticDatum>
<dwc:coordinateUncertaintyInMeters>200</dwc:coordinateUncertaintyInMeters>
</dcterms:Location>
<dwc:Event>
<dwc:eventID>http://guid.mvz.org/events/2006/11/26/17</dwc:eventID>
<dwc:samplingProtocol>area count</dwc:samplingProtocol>
<dwc:eventDate>2006-11-26</dwc:eventDate>
<dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
</dwc:Event>
<dwc:Occurrence>
<dwc:occurrenceID>urn:catalog:AUDCLO:EBIRD:OBS64515288</dwc:occurrenceID>
<dcterms:type>Event</dcterms:type>
<dcterms:modified>2009-02-17T07:33:04Z</dcterms:modified>
<dwc:institutionCode>AUDCLO</dwc:institutionCode>
<dwc:collectionCode>EBIRD</dwc:collectionCode>
<dwc:basisOfRecord>HumanObservation</dwc:basisOfRecord>
<dwc:individualCount>2</dwc:individualCount>
<dwc:eventID>http://guid.mvz.org/events/2006/11/26/17</dwc:eventID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000ee00-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Occurrence>
<dwc:Taxon>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000ee00-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
<dwc:scientificName>Anthus hellmayri Hartert, 1909</dwc:scientificName>
<dwc:class>Aves</dwc:class>
<dwc:genus>Anthus</dwc:genus>
<dwc:specificEpithet>hellmayri</dwc:specificEpithet>
</dwc:Taxon>
<dwc:Occurrence>
<dwc:occurrenceID>urn:catalog:AUDCLO:EBIRD:OBS64515286</dwc:occurrenceID>
<dcterms:type>Event</dcterms:type>
<dcterms:modified>2009-02-17T07:33:04Z</dcterms:modified>
<dwc:institutionCode>AUDCLO</dwc:institutionCode>
<dwc:collectionCode>EBIRD</dwc:collectionCode>
<dwc:basisOfRecord>HumanObservation</dwc:basisOfRecord>
<dwc:individualCount>1</dwc:individualCount>
<dwc:eventID>http://guid.mvz.org/events/2006/11/26/17</dwc:eventID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000e838-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Occurrence>
<dwc:Taxon>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000e838-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
<dwc:scientificName>Anthus correndera Vieillot, 1818</dwc:scientificName>
<dwc:class>Aves</dwc:class>
<dwc:genus>Anthus</dwc:genus>
<dwc:specificEpithet>correndera</dwc:specificEpithet>
</dwc:Taxon>
<dwc:Event>
<dwc:eventID>http://guid.mvz.org/events/2006/11/27/6</dwc:eventID>
<dwc:samplingProtocol>area count</dwc:samplingProtocol>
<dwc:eventDate>2006-11-27</dwc:eventDate>
<dwc:locationID>http://guid.mvz.org/sites/arg/127</dwc:locationID>
</dwc:Event>
<dwc:Occurrence>
<dwc:occurrenceID>urn:catalog:AUDCLO:EBIRD:OBS64515333</dwc:occurrenceID>
<dcterms:type>Event</dcterms:type>
<dcterms:modified>2009-02-17T07:33:04Z</dcterms:modified>
<dwc:institutionCode>AUDCLO</dwc:institutionCode>
<dwc:collectionCode>EBIRD</dwc:collectionCode>
<dwc:basisOfRecord>HumanObservation</dwc:basisOfRecord>
<dwc:individualCount>1</dwc:individualCount>
<dwc:eventID>http://guid.mvz.org/events/2006/11/27/6</dwc:eventID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000ee00-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Occurrence>
<dwc:Occurrence>
<dwc:occurrenceID>urn:catalog:AUDCLO:EBIRD:OBS64515331</dwc:occurrenceID>
<dcterms:type>Event</dcterms:type>
<dcterms:modified>2009-02-17T07:33:04Z</dcterms:modified>
<dwc:institutionCode>AUDCLO</dwc:institutionCode>
<dwc:collectionCode>EBIRD</dwc:collectionCode>
<dwc:basisOfRecord>HumanObservation</dwc:basisOfRecord>
<dwc:individualCount>2</dwc:individualCount>
<dwc:eventID>http://guid.mvz.org/events/2006/11/27/6</dwc:eventID>
<dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000ee00-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
</dwc:Occurrence>
</dwr:DarwinRecordSet>
```

@ -4,7 +4,7 @@ Title
: Darwin Core XML guide : Darwin Core XML guide
Date version issued Date version issued
: 2015-06-02 : 2021-07-15
Date created Date created
: 2009-02-12 : 2009-02-12
@ -13,13 +13,13 @@ Part of TDWG Standard
: <http://www.tdwg.org/standards/450/> : <http://www.tdwg.org/standards/450/>
This version This version
: <http://rs.tdwg.org/dwc/terms/guides/xml/2014-11-08> : <http://rs.tdwg.org/dwc/terms/guides/xml/2021-07-15>
Latest version Latest version
: <http://rs.tdwg.org/dwc/terms/guides/xml/> : <http://rs.tdwg.org/dwc/terms/guides/xml/>
Previous version Previous version
: <http://rs.tdwg.org/dwc/terms/guides/xml/2010-05-23> : <http://rs.tdwg.org/dwc/terms/guides/xml/2014-11-08>
Abstract Abstract
: Guidelines for the implementation of Darwin Core in XML. : Guidelines for the implementation of Darwin Core in XML.
@ -31,18 +31,22 @@ Creator
: Darwin Core Task Group : Darwin Core Task Group
Bibliographic citation Bibliographic citation
: Darwin Core Task Group. 2009. Darwin Core XML guide. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/guides/xml/> : Darwin Core Maintenance Group. 2021. Darwin Core XML guide. Biodiversity Information Standards (TDWG). <http://rs.tdwg.org/dwc/terms/guides/xml/2021-07-15>
## 1 Introduction ## 1 Introduction
This document provides guidelines for implementing application schemas based on [Darwin Core terms](../../terms/) using [XML](http://www.w3.org/XML/). The underlying metadata model is described (in a syntax neutral way), followed by some specific guidelines for XML implementations. Some guidance on the use of non-Darwin Core terms is also provided. This document provides guidelines for implementing application schemas based on [Darwin Core terms](../../terms/) using [XML](http://www.w3.org/XML/). The underlying metadata model is described (in a syntax neutral way), followed by some specific guidelines for XML implementations. Some guidance on the use of non-Darwin Core terms is also provided.
This document does not provide guidelines for encoding Darwin Core in RDF/XML. Nor does it take a position on the relative merits of encoding metadata in "plain" XML rather than RDF/XML. This document provides guidelines in those cases where RDF/XML is not considered appropriate. This document does not provide guidelines for encoding Darwin Core in RDF/XML. Nor does it take a position on the relative merits of encoding metadata in "plain" XML rather than RDF/XML. This document provides guidelines in those cases where RDF/XML is not considered appropriate. For information about implementing Darwin Core as RDF, see the Darwin Core RDF Guide, <http://rs.tdwg.org/dwc/terms/guides/rdf/>.
### 1.1 Status of the content of this document ### 1.1 Status of the content of this document
All sections of this document are normative, except for sections that are explicitly marked as non-normative. All sections of this document are normative, except for sections that are explicitly marked as non-normative.
#### 1.1.1 RFC 2119 key words
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in [RFC 2119](https://tools.ietf.org/html/rfc2119).
### 1.2 Audience ### 1.2 Audience
This document is targeted toward those who wish to use or construct application schemas using Darwin Core terms in XML. It includes explanations of existing schemas such as [Simple Darwin Core](../simple/) and how to build new schemas to meet specific models of information. This document is targeted toward those who wish to use or construct application schemas using Darwin Core terms in XML. It includes explanations of existing schemas such as [Simple Darwin Core](../simple/) and how to build new schemas to meet specific models of information.
@ -51,11 +55,11 @@ This document is targeted toward those who wish to use or construct application
### 2.1 XML schema ### 2.1 XML schema
Implementors should base their XML applications on [XML Schemas](http://www.w3.org/XML/Schema) rather than _XML DTDs_. Approaches based on _XML Schemas_ are more flexible and are more easily re-used within other XML applications. Implementors SHOULD base their XML applications on [XML Schemas](http://www.w3.org/XML/Schema) rather than _XML DTDs_. Approaches based on _XML Schemas_ are more flexible and are more easily re-used within other XML applications.
### 2.2 XML namespaces ### 2.2 XML namespaces
Implementors should use [XML Namespaces](http://www.w3.org/TR/1999/REC-xml-names-19990114/) to uniquely identify elements. Darwin Core namespaces are defined in the [Darwin Core Namespace Policy](../../namespace/), while Dublin Core namespaces are defined in the [DCMI Namespace Recommendation](http://dublincore.org/documents/dcmi-namespace/). Implementors SHOULD use [XML Namespaces](http://www.w3.org/TR/1999/REC-xml-names-19990114/) to uniquely identify elements. Darwin Core namespaces are defined in the [Darwin Core Namespace Policy](../../namespace/), while Dublin Core namespaces are defined in the [DCMI Namespace Recommendation](http://dublincore.org/documents/dcmi-namespace/).
### 2.3 Abstract model ### 2.3 Abstract model
@ -66,11 +70,11 @@ The Darwin Core follows the [Dublin Core Metadata Initiative Abstract Model](htt
- A `Darwin Core record` is made up of zero or more `classes` and one or more `properties` with their associated `values`. - A `Darwin Core record` is made up of zero or more `classes` and one or more `properties` with their associated `values`.
- Each `value` is a literal string. - Each `value` is a literal string.
- The `values` of `properties` within a `Darwin Core record` describe that record. - The `values` of `properties` within a `Darwin Core record` describe that record.
- A `Darwin Core record` must include all required `properties`, if any, and their associated `values`. - A `Darwin Core record` MUST include all required `properties`, if any, and their associated `values`.
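As a minimal sketch of the model above, the fragment below shows one `class` (`dwc:Taxon`) grouping several `properties`, each carrying a literal string `value`. The element names and values are taken from the normalized example earlier in this changeset. The namespace declaration on the class element is an assumption added only so that the fragment is well-formed on its own; a real document would declare it on the root element.

```xml
<!-- One class element; each child element is a property whose content is a literal string value -->
<dwc:Taxon xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <dwc:taxonID>urn:lsid:catalogueoflife.org:taxon:f000e838-29c1-102b-9a4a-00304854f820:col20120721</dwc:taxonID>
  <dwc:scientificName>Anthus correndera Vieillot, 1818</dwc:scientificName>
  <dwc:genus>Anthus</dwc:genus>
  <dwc:specificEpithet>correndera</dwc:specificEpithet>
</dwc:Taxon>
```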
### 2.4 Properties and values ### 2.4 Properties and values
Darwin Core follows the guidelines for expressing [Dublin Core metadata using XML](http://dublincore.org/documents/dc-xml/) except in that Darwin Core implementors should encode `properties` as XML elements and `values` as the content of those elements instead of having each property contain a value representation and its associated value. The name of the XML element should be an XML qualified name (QName), which associates the value given in the `Term name` attribute in the [Darwin Core Terms](../../terms/) recommendation with the appropriate namespace name. For example, use: Darwin Core follows the guidelines for expressing [Dublin Core metadata using XML](http://dublincore.org/documents/dc-xml/) except in that Darwin Core implementors MUST encode `properties` as XML elements and `values` as the content of those elements instead of having each property contain a value representation and its associated value. The name of the XML element SHOULD be an XML qualified name (QName), which associates the value given in the `Term name` attribute in the [Darwin Core Terms](../../terms/) recommendation with the appropriate namespace name. For example, use:
```xml ```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
@ -88,13 +92,13 @@ rather than:
### 2.5 Null values ### 2.5 Null values
Elements for which the value is null should be omitted from the document or explicitly coded using the attribute `xsi:nil="true"`. Elements for which the value is null SHOULD be omitted from the document or OPTIONALLY be explicitly coded using the attribute `xsi:nil="true"`.
```xml ```xml
<dwc:locality xsi:nil="true"/> <dwc:locality xsi:nil="true"/>
``` ```
Do not use an empty string - an element with no content: Implementers MUST NOT use an empty string - an element with no content:
```xml ```xml
<dwc:locality></dwc:locality> <dwc:locality></dwc:locality>
@ -102,7 +106,7 @@ Do not use an empty string - an element with no content:
### 2.6 Simple Darwin Core ### 2.6 Simple Darwin Core
[Simple Darwin Core](tdwg_dwc_simple.xsd) most closely models the "flat" nature of many data sets. It is a ready-made schema for sharing information with no structure beyond properties of a _record_ (equivalent to fields in a table, or columns in a spreadsheet). It is meant to accommodate all properties except those that require further structure to be meaningful (auxiliary terms in the classes [ResourceRelationship](http://rs.tdwg.org/dwc/terms/ResourceRelationship) and [MeasurementOrFact](http://rs.tdwg.org/dwc/terms/MeasurementOrFact)). The schema has no required terms and no term is repeated within a given _record_. Refer to [Simple Darwin Core](../simple/) for the rationale behind this schema. [Simple Darwin Core](tdwg_dwc_simple.xsd) most closely models the "flat" nature of many data sets. It is a ready-made schema for sharing information with no structure beyond properties of a _record_ (equivalent to fields in a table, or columns in a spreadsheet). It is meant to accommodate all properties except those that require further structure to be meaningful (auxiliary terms in the classes [ResourceRelationship](http://rs.tdwg.org/dwc/terms/ResourceRelationship) and [MeasurementOrFact](http://rs.tdwg.org/dwc/terms/MeasurementOrFact)). The schema has no required terms and terms SHOULD NOT be repeated within a given _record_. Refer to [Simple Darwin Core](../simple/) for the rationale behind this schema.
The term [`dcterms:type`](http://rs.tdwg.org/dwc/terms/dcterms:type) (which is controlled by the [Dublin Core Type Vocabulary](http://dublincore.org/documents/dcmi-type-vocabulary/)), gives the basic category of object (`PhysicalObject`, `StillImage`, `MovingImage`, `Sound`, `Text`) the record is about. The term [`basisOfRecord`](http://rs.tdwg.org/dwc/terms/basisOfRecord), which has a controlled vocabulary distinct from that of `dcterms:type`, shows the name of the Darwin Core class (e.g., [`LivingSpecimen`](http://rs.tdwg.org/dwc/terms/LivingSpecimen), [`PreservedSpecimen`](http://rs.tdwg.org/dwc/terms/PreservedSpecimen), [`FossilSpecimen`](http://rs.tdwg.org/dwc/terms/FossilSpecimen), [`HumanObservation`](http://rs.tdwg.org/dwc/terms/HumanObservation), [`MachineObservation`](http://rs.tdwg.org/dwc/terms/MachineObservation), [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon)) the record is about. The term [`dcterms:type`](http://rs.tdwg.org/dwc/terms/dcterms:type) (which is controlled by the [Dublin Core Type Vocabulary](http://dublincore.org/documents/dcmi-type-vocabulary/)), gives the basic category of object (`PhysicalObject`, `StillImage`, `MovingImage`, `Sound`, `Text`) the record is about. The term [`basisOfRecord`](http://rs.tdwg.org/dwc/terms/basisOfRecord), which has a controlled vocabulary distinct from that of `dcterms:type`, shows the name of the Darwin Core class (e.g., [`LivingSpecimen`](http://rs.tdwg.org/dwc/terms/LivingSpecimen), [`PreservedSpecimen`](http://rs.tdwg.org/dwc/terms/PreservedSpecimen), [`FossilSpecimen`](http://rs.tdwg.org/dwc/terms/FossilSpecimen), [`HumanObservation`](http://rs.tdwg.org/dwc/terms/HumanObservation), [`MachineObservation`](http://rs.tdwg.org/dwc/terms/MachineObservation), [`Taxon`](http://rs.tdwg.org/dwc/terms/Taxon)) the record is about.
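A flat record using both terms might look like the sketch below. The root and record element names (`SimpleDarwinRecordSet`, `SimpleDarwinRecord`) and the default namespace URI are assumptions that do not appear in this section and should be checked against `tdwg_dwc_simple.xsd`. The `dwc:catalogNumber` value is illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed root element and default namespace; verify against tdwg_dwc_simple.xsd -->
<SimpleDarwinRecordSet
    xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dwc="http://rs.tdwg.org/dwc/terms/">
  <SimpleDarwinRecord>
    <!-- dcterms:type: basic category of object, from the Dublin Core Type Vocabulary -->
    <dcterms:type>PhysicalObject</dcterms:type>
    <!-- basisOfRecord: name of the Darwin Core class the record is about -->
    <dwc:basisOfRecord>PreservedSpecimen</dwc:basisOfRecord>
    <!-- Illustrative value, not taken from a real data set -->
    <dwc:catalogNumber>EXAMPLE-0001</dwc:catalogNumber>
    <dwc:scientificName>Anthus hellmayri Hartert, 1909</dwc:scientificName>
  </SimpleDarwinRecord>
</SimpleDarwinRecordSet>
```

Each term appears at most once in the record and no element is nested inside another property, which is what keeps the record "flat".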
@ -142,7 +146,7 @@ Many Darwin Core terms (`properties`) are defined as being associated with anoth
- [Terms XML Schema](tdwg_dwcterms.xsd) - property term definitions as typed global elements and named groups for all terms for a given class to be referenced. The schema makes use of substitution groups `anyClass`, `anyProperty`, `anyIdentifier` and `anyXYZTerm` for each class, e.g. `anyTaxonTerm`. This is the schema upon which the [Simple Darwin Core XML Schema](tdwg_dwc_simple.xsd) is based. - [Terms XML Schema](tdwg_dwcterms.xsd) - property term definitions as typed global elements and named groups for all terms for a given class to be referenced. The schema makes use of substitution groups `anyClass`, `anyProperty`, `anyIdentifier` and `anyXYZTerm` for each class, e.g. `anyTaxonTerm`. This is the schema upon which the [Simple Darwin Core XML Schema](tdwg_dwc_simple.xsd) is based.
- [Class Terms XML Schema](tdwg_dwc_class_terms.xsd) - class term definitions as typed global elements with subelements referencing all corresponding property terms via their substitution group. - [Class Terms XML Schema](tdwg_dwc_class_terms.xsd) - class term definitions as typed global elements with subelements referencing all corresponding property terms via their substitution group.
It is encouraged to use classes in a normalized way to avoid deep nesting. A [Darwin Core Tools and Applications page](https://github.com/tdwg/dwc-documentation/blob/master/documentation/resources.md) has been created as an index to example schemas for the purpose of community discussions and development. An [XML schema](tdwg_dwc_classes.xsd) is provided to freely mix any Darwin Core Class in a global list and allow them to reference each other using the respective class identifier terms. It is RECOMMENDED to use classes in a normalized way to avoid deep nesting. A [Darwin Core Tools and Applications page](https://github.com/tdwg/dwc-documentation/blob/master/documentation/resources.md) has been created as an index to example schemas for the purpose of community discussions and development. An [XML schema](tdwg_dwc_classes.xsd) is provided to freely mix any Darwin Core Class in a global list and allow them to reference each other using the respective class identifier terms.
#### 2.7.1 Normalized classes examples (non-normative) #### 2.7.1 Normalized classes examples (non-normative)