diff --git a/docs/rdf/index.md b/docs/rdf/index.md index 9ccc812..d8429bb 100644 --- a/docs/rdf/index.md +++ b/docs/rdf/index.md @@ -22,7 +22,7 @@ Previous version : http://rs.tdwg.org/dwc/terms/guides/rdf/2015-03-27 Abstract -: This guide is intended to facilitate the use of Darwin Core terms in the Resource Description Framework (RDF). It explains basic features of RDF and provides details of how to expose data in the form of RDF using Darwin Core terms and terms from other key vocabularies. It defines terms in the namespace ```http://rs.tdwg.org/dwc/iri/``` which are intended for use excusively with non-literal objects. +: This guide is intended to facilitate the use of Darwin Core terms in the Resource Description Framework (RDF). It explains basic features of RDF and provides details of how to expose data in the form of RDF using Darwin Core terms and terms from other key vocabularies. It defines terms in the namespace ```http://rs.tdwg.org/dwc/iri/``` which are intended for use exclusively with non-literal objects. Contributors : Steve Baskauf (TDWG RDF/OWL Task Group), John Wieczorek (TDWG Darwin Core Task Group), John Deck (Genomic Biodiversity Working Group), Campbell Webb (TDWG RDF/OWL Task Group), Paul J. Morris (Harvard University Herbaria/Museum of Comparative Zoölogy), Mark Schildhauer (National Center for Ecological Analysis and Synthesis) (KUNHM) @@ -35,16 +35,16 @@ Bibliographic citation ## 1 Introduction (non-normative) -Each method of encoding information using Darwin Core [[DWC](http://rs.tdwg.org/dwc/index.htm)] has a guide explaining how to use the Darwin Core terms in that situation. This is the guide for encoding biodiversity data using the Resource Description Framework [[RDF](http://www.w3.org/TR/rdf11-concepts/)]. +Each method of encoding information using [Darwin Core](http://rs.tdwg.org/dwc/index.htm) has a guide explaining how to use the Darwin Core terms in that situation. 
This is the guide for encoding biodiversity data using the [Resource Description Framework](http://www.w3.org/TR/rdf11-concepts/). -The Darwin Core RDF Guide is targeted toward those who wish to share biodiversity data described by Darwin Core (DwC) properties using RDF. It describes how community best practices for expressing fundamental information about resources using RDF relate to Darwin Core terms, and clarifies how Darwin Core terms should be used in RDF with literal (string) and non-literal (IRI reference) objects. It is not intended to explain the model and syntax of RDF. For a general introduction to RDF in a biodiversity context, see the [[RDF-BEGINNERS-GUIDE](https://github.com/tdwg/rdf/blob/master/Beginners.md)]. For a more detailed introduction to RDF, see the [[RDF-PRIMER](http://www.w3.org/TR/rdf11-primer/)]. +The Darwin Core RDF Guide is targeted toward those who wish to share biodiversity data described by Darwin Core (DwC) properties using RDF. It describes how community best practices for expressing fundamental information about resources using RDF relate to Darwin Core terms, and clarifies how Darwin Core terms should be used in RDF with literal (string) and non-literal (IRI reference) objects. It is not intended to explain the model and syntax of RDF. For a general introduction to RDF in a biodiversity context, see the [Beginner's Guide to RDF](https://github.com/tdwg/rdf/blob/master/Beginners.md). For a more detailed introduction to RDF, see the [RDF Primer](http://www.w3.org/TR/rdf11-primer/). ### 1.1 Status of the content of this document Sections of this document are explicitly identified as either normative or non-normative. All numbered examples are non-normative, even if they fall within sections designated as normative. Tables may be designated as non-normative, even if they fall within sections designated as normative. 
### 1.2 Rationale (non-normative) -Darwin Core is a vocabulary which provides terms that can be used to describe the properties and types of entities (known in RDF as "resources") in the biodiversity realm. Darwin Core is a general purpose vocabulary because its terms can be used as part of a number of data transfer systems. RDF differs in several important ways from other data transfer systems for which DwC usage guides exist ([[TEXTGUIDE](http://rs.tdwg.org/dwc/terms/guides/text/)] and [[XMLGUIDE](http://rs.tdwg.org/dwc/terms/guides/xml/)]). By its nature, RDF is a distributed system. It is assumed that data from one provider will be linked to data from other providers. This also implies that it is always possible to discover new data properties about a particular resource and that those properties may be described using unfamiliar terms. This differs significantly from other data transfer systems where there must be a pre-existing agreement (in the form of a federation schema or human-understandable document) between the sender and receiver about the format of the data and the organization and interpretation of the terms within records. Because RDF is intended to facilitate data and metadata discovery by machines (actually computer programs known as semantic clients or just "clients"), the meaning and use of terms must be well-defined and discoverable by clients without human intervention. To facilitate cross-referencing of resources among different data providers, resources must be identified using standardized, machine-understandable, and globally unique identifiers known as internationalized resource identifiers (IRIs). Finally, because anyone can make statements about a resource without agreeing to a pre-determined schema, RDF by its nature is a highly normalized network of relationships, in contrast to typical database tables which are by their nature "flat". 
Because of these differences, effective use of RDF requires that its users adhere to what are essentially evolving social conventions about identifiers, data transfer protocols, and application of vocabularies. Some of these conventions will be described in the following sections. +Darwin Core is a vocabulary which provides terms that can be used to describe the properties and types of entities (known in RDF as "resources") in the biodiversity realm. Darwin Core is a general purpose vocabulary because its terms can be used as part of a number of data transfer systems. RDF differs in several important ways from other data transfer systems for which DwC usage guides exist ([Text Guide](http://rs.tdwg.org/dwc/terms/guides/text/) and [XML Guide](http://rs.tdwg.org/dwc/terms/guides/xml/)). By its nature, RDF is a distributed system. It is assumed that data from one provider will be linked to data from other providers. This also implies that it is always possible to discover new data properties about a particular resource and that those properties may be described using unfamiliar terms. This differs significantly from other data transfer systems where there must be a pre-existing agreement (in the form of a federation schema or human-understandable document) between the sender and receiver about the format of the data and the organization and interpretation of the terms within records. Because RDF is intended to facilitate data and metadata discovery by machines (actually computer programs known as semantic clients or just "clients"), the meaning and use of terms must be well-defined and discoverable by clients without human intervention. To facilitate cross-referencing of resources among different data providers, resources must be identified using standardized, machine-understandable, and globally unique identifiers known as internationalized resource identifiers (IRIs). 
Finally, because anyone can make statements about a resource without agreeing to a pre-determined schema, RDF by its nature is a highly normalized network of relationships, in contrast to typical database tables which are by their nature "flat". Because of these differences, effective use of RDF requires that its users adhere to what are essentially evolving social conventions about identifiers, data transfer protocols, and application of vocabularies. Some of these conventions will be described in the following sections. ### 1.3 Features of RDF (non-normative) @@ -58,7 +58,7 @@ The RDF model itself is independent of any specific serialization syntax. The fo Each arrow represents a statement about the image, called a "triple" in RDF. The set of triples is called an RDF graph. Resources (represented by ovals) are identified by IRIs. The described resource (in this example the image http://bioimages.vanderbilt.edu/kirchoff/ac1490 at the tail of the arrow) is called the subject of the triple. Properties of the subject resource are identified by term IRIs shown here with their namespaces abbreviated (e.g., ```dcterms:``` = "http://purl.org/dc/terms/"). The property is called the predicate of the triple. The values of the properties are called the object of the statement, with literal values (consisting of text) represented by rectangles. -This RDF graph can be serialized in a somewhat human-friendly syntax called Terse RDF Triple Language (Turtle) [[TURTLE](http://www.w3.org/TR/turtle/)]: +This RDF graph can be serialized in a somewhat human-friendly syntax called [Terse RDF Triple Language (Turtle)](http://www.w3.org/TR/turtle/): ```turtle @prefix rdf: . @@ -69,7 +69,7 @@ This RDF graph can be serialized in a somewhat human-friendly syntax called Ters        dcterms:creator . 
``` -Here is the graph in RDF/XML syntax [[RDF-XML-SYNTAX](http://www.w3.org/TR/rdf-syntax-grammar/)]: +Here is the graph in [RDF/XML syntax](http://www.w3.org/TR/rdf-syntax-grammar/): ```rdf @@ -92,25 +92,25 @@ http://bioimages.vanderbilt.edu/contact/kirchoff#coblea Abbreviated UIRIs will be shown as ```inline code``` in the form ```namespace:localName```, e.g., ```rdf:type```. Namespace abbreviations when shown by themselves will also be shown in italics, e.g., ```dwc:``` . Examples will be displayed in Courier type. -XML is a widely understood form of RDF serialization. Therefore, all examples given here will be shown as RDF/XML. In most cases, they will also be shown in Turtle. For more detailed information about RDF serialization, see part 3 of the Beginner's Guide to RDF [[RDF-BEGINNERS-GUIDE](http://code.google.com/p/tdwg-rdf/wiki/Beginners)] and the references cited there. +XML is a widely understood form of RDF serialization. Therefore, all examples given here will be shown as RDF/XML. In most cases, they will also be shown in Turtle. For more detailed information about RDF serialization, see part 3 of the [Beginner's Guide to RDF](http://code.google.com/p/tdwg-rdf/wiki/Beginners) and the references cited there. #### 1.3.2 Internationalized Resource Identifier (IRI) (non-normative) -Data providers make use of a variety of identifiers to refer to resources they wish to provide. These identifiers may be locally unique within the provider's database, or they may be globally unique. Providers have sought to make their identifiers globally unique through such means as "Darwin Core Triplets" (institutionCode:collectionCode:catalogNumber) and creation of UUIDs [[UUID](http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=62795)]. 
However, only identifiers in the form of IRIs [[IRI](http://tools.ietf.org/html/rfc3987)] can be valid subjects of statements (known as RDF triples) in RDF, so neither “Darwin Core Triples” nor UUIDs can be used in unmodified form for that purpose. IRIs are a superset of a narrower form of identifiers known as Uniform Resource Identifiers (URIs) that can be used in place of IRIs [[URI](http://tools.ietf.org/html/rfc3986)]. This document will refer exclusively to IRIs with the understanding that URIs may be used in place of IRIs. +Data providers make use of a variety of identifiers to refer to resources they wish to provide. These identifiers may be locally unique within the provider's database, or they may be globally unique. Providers have sought to make their identifiers globally unique through such means as "Darwin Core Triplets" (institutionCode:collectionCode:catalogNumber) and creation of [UUIDs](http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=62795). However, only identifiers in the form of [IRIs](http://tools.ietf.org/html/rfc3987) can be valid subjects of statements (known as RDF triples) in RDF, so neither "Darwin Core Triplets" nor UUIDs can be used in unmodified form for that purpose. IRIs are a superset of a narrower form of identifiers known as [Uniform Resource Identifiers (URIs)](http://tools.ietf.org/html/rfc3986) that can be used in place of IRIs. This document will refer exclusively to IRIs with the understanding that URIs may be used in place of IRIs. -The most familiar form of IRI is a Uniform Resource Locator (URL) which not only identifies a resource, but provides information about retrieving an information resource (i.e., a resource that can be transmitted in electronic form) such as text in the form of an HTML web page. However, in general IRIs may identify non-information resources (physical or conceptual entities) that are not transmittable electronically, e.g., , a person.
If a client attempts to retrieve a non-information resource by dereferencing its HTTP IRI, a process called content negotiation [[HTTP-CONTENT-NEGOTIATION](http://tools.ietf.org/html/rfc2616#section-12)] is used to refer the client to the IRI of an information resource representation of the non-information resource. For humans, this is usually a web page, while for semantic clients (machines) the representation is a document in the form of RDF/XML. For more detailed information about IRIs see part 1 of the Beginner's Guide to RDF [[RDF-BEGINNERS-GUIDE](http://code.google.com/p/tdwg-rdf/wiki/Beginners)] and the references cited there. +The most familiar form of IRI is a Uniform Resource Locator (URL) which not only identifies a resource, but provides information about retrieving an information resource (i.e., a resource that can be transmitted in electronic form) such as text in the form of an HTML web page. However, in general IRIs may identify non-information resources (physical or conceptual entities) that are not transmittable electronically, e.g., , a person. If a client attempts to retrieve a non-information resource by dereferencing its HTTP IRI, a process called [content negotiation](http://tools.ietf.org/html/rfc2616#section-12) is used to refer the client to the IRI of an information resource representation of the non-information resource. For humans, this is usually a web page, while for semantic clients (machines) the representation is a document in the form of RDF/XML. For more detailed information about IRIs see part 1 of the [Beginner's Guide to RDF](http://code.google.com/p/tdwg-rdf/wiki/Beginners) and the references cited there. 
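
As a non-normative illustration of the content negotiation described above, the following Python sketch picks a representation for a client based on its `Accept` header. The function names, the media-type table, and the `example.org` URLs are hypothetical and are not part of this guide; a real server would typically answer with an HTTP redirect to the chosen URL rather than return it from a function.

```python
# Hypothetical table: one non-information resource, two representations of it.
REPRESENTATIONS = {
    "text/html": "http://example.org/specimen/123.htm",            # for humans
    "application/rdf+xml": "http://example.org/specimen/123.rdf",  # for semantic clients
}


def parse_accept(accept_header):
    """Return acceptable media types, ordered by descending q value."""
    entries = []
    for position, part in enumerate(accept_header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            # Negative quality sorts high q first; position keeps ties stable.
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def choose_representation(accept_header, representations, default_type="text/html"):
    """Return the URL of the representation that best matches the Accept header."""
    for media_type in parse_accept(accept_header):
        if media_type in representations:
            return representations[media_type]
        if media_type == "*/*":
            return representations[default_type]
    # Nothing matched: fall back to the human-readable page.
    return representations[default_type]


if __name__ == "__main__":
    print(choose_representation("application/rdf+xml", REPRESENTATIONS))
    print(choose_representation("text/html,application/xhtml+xml;q=0.9", REPRESENTATIONS))
```

The sketch only handles exact media types and the `*/*` wildcard; partial wildcards such as `text/*` are left out to keep it short.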
##### 1.3.2.1 Persistent Identifiers (normative) -Best practices dictate that identifiers (known as persistent identifiers or globally unique identifiers: GUIDs) which are used to identify resources of permanent interest be globally unique, referentially consistent, and persistent [[GUID-STANDARD](http://www.tdwg.org/standards/150/)]. If those identifiers are to be used to identify subject resources in RDF, they must also be in the form of an IRI. This has two implications for data providers. +Best practices dictate that identifiers (known as persistent identifiers or globally unique identifiers: GUIDs) which are used to identify resources of permanent interest be globally unique, referentially consistent, and persistent ([TDWG Globally Unique Identifiers (GUID) applicability statement](http://www.tdwg.org/standards/150/)). If those identifiers are to be used to identify subject resources in RDF, they must also be in the form of an IRI. This has two implications for data providers. -First, if a non-IRI globally unique identifier is used to identify a subject resource, it must be converted to an IRI by making it conform to a well-known IRI scheme (e.g., a URN or HTTP IRI) [[URI-SCHEMES](http://www.iana.org/assignments/uri-schemes.html)]. For example, a UUID can be transformed into a URN [[UUID-URN-NAMESPACE](http://tools.ietf.org/html/rfc4122)] by prefixing its string representation with "urn:uuid:" as in +First, if a non-IRI globally unique identifier is used to identify a subject resource, it must be converted to an IRI by making it conform to a well-known IRI scheme (e.g., a [URN or HTTP IRI](http://www.iana.org/assignments/uri-schemes.html)).
For example, a UUID can be transformed into a [URN](http://tools.ietf.org/html/rfc4122) by prefixing its string representation with "urn:uuid:" as in ``` urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6 ``` -Similarly, ISBNs can be converted to URNs [[ISBN-AS-URN](http://tools.ietf.org/html/rfc3187)] as shown in Examples 13 and 14. Although these URNs are valid IRIs, they have the disadvantage that they are not actionable (see [Section 1.3.2.2](./index.htm#1.3.2.2_HTTP_IRIs_as_self-resolving_GUIDs)). In contrast, the "Darwin Core triplet" +Similarly, [ISBNs can be converted to URNs](http://tools.ietf.org/html/rfc3187) as shown in Examples 13 and 14. Although these URNs are valid IRIs, they have the disadvantage that they are not actionable (see [Section 1.3.2.2](./index.htm#1.3.2.2_HTTP_IRIs_as_self-resolving_GUIDs)). In contrast, the "Darwin Core triplet" ``` MVZ:Mamm:165861 @@ -122,29 +122,29 @@ has been turned into an actionable IRI by appending it to the base string "http: http://arctos.database.museum/guid/MVZ:Mamm:165861 ``` -Second, if a provider refers to a resource using an URL that provides data about the resource, the provider should take care to ensure that the URL does not change over time (e.g., if the content moves to a different server, if the directory structure changes, or if a new version of the database is introduced). +Second, if a provider refers to a resource using a URL that provides data about the resource, the provider should take care to ensure that the URL does not change over time (e.g., if the content moves to a different server, if the directory structure changes, or if a new version of the database is introduced).
It may be preferable to use [content negotiation](http://www.w3.org/TR/cooluris/) to redirect the user from a persistent IRI which refers to the resource, to the URL of a web page which describes the resource. -For a more detailed introduction to persistent identifiers, see the GBIF Beginner's Guide to Persistent Identifiers [[GUID-GUIDE-GBIF](http://www.gbif.org/resources/2575)]. +For a more detailed introduction to persistent identifiers, see the [GBIF Beginner's Guide to Persistent Identifiers](http://www.gbif.org/resources/2575). -Based on the precedent set by the TDWG LSID Applicability Statement standard [[GUID-STANDARD](http://www.tdwg.org/standards/150/)], it is recommended that URN-based IRIs be related to HTTP-proxied equivalents (if they exist) as described in [Section 2.2.3](./index.htm#2.2.3_Associating_a_URN_with_its_HTTP-proxied_equivalent). +Based on the precedent set by the [TDWG LSID Applicability Statement standard](http://www.tdwg.org/standards/150/), it is recommended that URN-based IRIs be related to HTTP-proxied equivalents (if they exist) as described in [Section 2.2.3](./index.htm#2.2.3_Associating_a_URN_with_its_HTTP-proxied_equivalent). ##### 1.3.2.2 HTTP IRIs as self-resolving GUIDs (normative) -Advocates of principles of Linked Data [[LINKED-DATA](http://linkeddata.org/)] prefer to use identifiers which follow the HTTP IRI scheme [[HTTP](http://tools.ietf.org/html/rfc2616)], known as "HTTP IRIs". In addition to being globally unique, such identifiers have the advantage of being dereferenceable using a widely implemented protocol . As such, it is possible to implement HTTP IRIs so that a human end user can obtain information about the resource using a conventional web browser. It is also possible to implement HTTP IRIs so that data (in the form of RDF) describing the resource can be discovered by a semantic client. 
In recognition of the advantages conferred by HTTP IRIs, the TDWG GUID Applicability Statement standard [[GUID-STANDARD](http://www.tdwg.org/standards/150/)] specifies in Recommendation 2 that "HTTP GET resolution must be provided for non-self-resolving GUIDs". For this reason, providers of biodiversity information who intend to make data available through RDF should plan to implement GUIDs that are persistent HTTP IRIs. This is not to the exclusion of other forms of non-self-resolving globally unique identifiers that can be associated with the HTTP IRI using the methods described in [Section 2](./index.htm#2_Implementation_Guide) of this guide. +Advocates of principles of [Linked Data](http://linkeddata.org/) prefer to use identifiers which follow the [HTTP IRI scheme](http://tools.ietf.org/html/rfc2616), known as "HTTP IRIs". In addition to being globally unique, such identifiers have the advantage of being dereferenceable using a widely implemented protocol . As such, it is possible to implement HTTP IRIs so that a human end user can obtain information about the resource using a conventional web browser. It is also possible to implement HTTP IRIs so that data (in the form of RDF) describing the resource can be discovered by a semantic client. In recognition of the advantages conferred by HTTP IRIs, the [TDWG GUID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies in Recommendation 2 that "HTTP GET resolution must be provided for non-self-resolving GUIDs". For this reason, providers of biodiversity information who intend to make data available through RDF should plan to implement GUIDs that are persistent HTTP IRIs. This is not to the exclusion of other forms of non-self-resolving globally unique identifiers that can be associated with the HTTP IRI using the methods described in [Section 2](./index.htm#2_Implementation_Guide) of this guide. 
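
The two identifier conversions described in Section 1.3.2.1 (a UUID into a `urn:uuid:` URN, and a "Darwin Core triplet" into an HTTP IRI) can be sketched in Python as follows. This is a non-normative illustration: the function names are hypothetical, and the base string is an assumption read off the Arctos example IRI shown in that section.

```python
import uuid
from urllib.parse import quote

# Assumed base string, taken from the example IRI
# http://arctos.database.museum/guid/MVZ:Mamm:165861
ARCTOS_GUID_BASE = "http://arctos.database.museum/guid/"


def uuid_to_urn(value):
    """Turn a UUID string into a URN by prefixing it with 'urn:uuid:'."""
    # uuid.UUID validates the string; .urn yields the lowercase URN form.
    return uuid.UUID(value).urn


def triplet_to_http_iri(triplet, base=ARCTOS_GUID_BASE):
    """Append institutionCode:collectionCode:catalogNumber to a base string."""
    parts = triplet.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected institutionCode:collectionCode:catalogNumber")
    # Keep the colons as-is; percent-encode anything unsafe in a path segment.
    return base + quote(triplet, safe=":")


if __name__ == "__main__":
    print(uuid_to_urn("f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
    print(triplet_to_http_iri("MVZ:Mamm:165861"))
```

Only the second result is actionable in the sense of this section: the URN is a valid IRI, but a client cannot dereference it over HTTP without a proxy.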
### 1.4 Use of terms in RDF #### 1.4.1 Well-known vocabularies (non-normative) -Because RDF assumes no pre-existing agreement between data providers and consumers about the terms used as properties to describe resources, the likelihood that a consuming client will "understand" the meaning of an RDF triple will be increased if the provider uses terms from a well-known vocabulary. Some well-known general and biodiversity-related vocabularies are listed in the Introduction to the Beginner's Guide to RDF [[RDF-BEGINNERS-GUIDE](http://code.google.com/p/tdwg-rdf/wiki/Beginners#0.3.7._Biodiversity-related_and_General_vocabularies_and_ontolog)]. If no well-known term exists to represent a property needed to describe a resource, a data provider may "mint" its own term. In that case, the provider should assign the term an IRI, define the term in RDF, provide clear human-understandable documentation of how the term should be used, provide for dereferencing of the IRI, and commit to the long-term stability of the term IRI and definition. +Because RDF assumes no pre-existing agreement between data providers and consumers about the terms used as properties to describe resources, the likelihood that a consuming client will "understand" the meaning of an RDF triple will be increased if the provider uses terms from a well-known vocabulary. Some well-known general and biodiversity-related vocabularies are listed in the [Introduction to the Beginner's Guide to RDF](http://code.google.com/p/tdwg-rdf/wiki/Beginners#0.3.7._Biodiversity-related_and_General_vocabularies_and_ontolog). If no well-known term exists to represent a property needed to describe a resource, a data provider may "mint" its own term. In that case, the provider should assign the term an IRI, define the term in RDF, provide clear human-understandable documentation of how the term should be used, provide for dereferencing of the IRI, and commit to the long-term stability of the term IRI and definition. 
#### 1.4.2 Appropriate use of terms (normative) -Because of the machine-oriented nature of RDF, a provider must assume that a consuming client will not infer any meaning from a statement other than what is directly stated or what can be inferred logically from other statements made about the resources and terms involved in the statement. For example, a provider might use the name of a resource in a triple with the intention that the name represent the resource itself. However, if the term used as the predicate in the triple is designed to refer to resources themselves rather than names of resources, the client may fail to make the connection between the name and the resource itself. Depending on how the term is defined, the client may detect an inconsistency or draw unintended conclusions as described below. Inappropriate use of terms as RDF predicates can have unintended consequences because unlike text-based data transfer protocols, RDF is designed to allow clients to infer additional facts based on information contained in the definitions of the terms. For example, the definition of the term ```foaf:depicts``` [[FOAF](http://xmlns.com/foaf/spec/)] contains a statement declaring its domain to be ```foaf:Image``` . Thus a provider which describes a resource using a ```foaf:depicts``` property is also implicitly (and perhaps unknowingly) declaring the resource to be an image. Terms which are defined using a form of RDF known as Web Ontology Language (OWL) [[OWL](http://www.w3.org/TR/owl2-overview/)] may have restrictions placed on their use. For example, declaring a term to be an ```owl:ObjectProperty``` indicates that it is inconsistent for the value of that property to be a string literal (i.e., the value should be an IRI). 
+Because of the machine-oriented nature of RDF, a provider must assume that a consuming client will not infer any meaning from a statement other than what is directly stated or what can be inferred logically from other statements made about the resources and terms involved in the statement. For example, a provider might use the name of a resource in a triple with the intention that the name represent the resource itself. However, if the term used as the predicate in the triple is designed to refer to resources themselves rather than names of resources, the client may fail to make the connection between the name and the resource itself. Depending on how the term is defined, the client may detect an inconsistency or draw unintended conclusions as described below. Inappropriate use of terms as RDF predicates can have unintended consequences because unlike text-based data transfer protocols, RDF is designed to allow clients to infer additional facts based on information contained in the definitions of the terms. For example, the definition of the term [```foaf:depicts```](http://xmlns.com/foaf/spec/) contains a statement declaring its domain to be ```foaf:Image``` . Thus a provider which describes a resource using a ```foaf:depicts``` property is also implicitly (and perhaps unknowingly) declaring the resource to be an image. Terms which are defined using a form of RDF known as [Web Ontology Language (OWL)](http://www.w3.org/TR/owl2-overview/) may have restrictions placed on their use. For example, declaring a term to be an ```owl:ObjectProperty``` indicates that it is inconsistent for the value of that property to be a string literal (i.e., the value should be an IRI). 
For these reasons, terms should be used as predicates in RDF only after the data provider has carefully examined the documentation and usage guidelines associated with the vocabulary or ontology which defines the term and has determined that use of that term is consistent with the meaning which the provider intends to impart to the triple in which the term is to be used as a predicate. -For more detailed information about the implications of using terms that have range, domain, and subproperty declarations in RDF, see part 4 of the Beginner's Guide to RDF [[RDF-BEGINNERS-GUIDE](http://code.google.com/p/tdwg-rdf/wiki/Beginners)]. For more detailed information about how OWL is used to define complex properties of terms in RDF, see part 7 of the Beginner's Guide to RDF [[RDF-BEGINNERS-GUIDE](http://code.google.com/p/tdwg-rdf/wiki/Beginners)]. +For more detailed information about the implications of using terms that have range, domain, and subproperty declarations in RDF, see part 4 of the [Beginner's Guide to RDF](http://code.google.com/p/tdwg-rdf/wiki/Beginners). For more detailed information about how OWL is used to define complex properties of terms in RDF, see part 7 of the [Beginner's Guide to RDF](http://code.google.com/p/tdwg-rdf/wiki/Beginners). #### 1.4.3 Use of Darwin Core terms in RDF (normative) @@ -158,7 +158,7 @@ This guide introduces the namespace **```dwciri:```** (http://rs.tdwg.org/dwc/ir #### 1.4.4 Limitations of this guide (non-normative) -This guide provides general guidance about how Darwin Core property terms should be used as RDF predicates and specifies that Darwin Core class terms should be used in ```rdf:type``` declarations ([Section 2.3.1.5](./index.htm#2.3.1.5_Classes_to_be_used_for_type_declarations_of_resources_de)). However, the Darwin Core standard does not specify precisely which resources should be included as instances of its classes nor does it declare domains for its property terms. 
Although the Darwin Core Quick Reference Guide [[DWC-GUIDE](http://rs.tdwg.org/dwc/terms/index.htm)] suggests which properties might be applied to instances of classes by organizing those property terms under class headings, Darwin Core leaves specific decisions about type declaration and property assignment to community consensus. Some examples which show varying approaches to assigning resources to Darwin Core classes and connecting them with object properties defined outside Darwin Core are provided in the Darwin Core informative ancillary web pages [[DWC-RDF-ANCILLARY](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md)]. +This guide provides general guidance about how Darwin Core property terms should be used as RDF predicates and specifies that Darwin Core class terms should be used in ```rdf:type``` declarations ([Section 2.3.1.5](./index.htm#2.3.1.5_Classes_to_be_used_for_type_declarations_of_resources_de)). However, the Darwin Core standard does not specify precisely which resources should be included as instances of its classes nor does it declare domains for its property terms. Although the [Darwin Core Quick Reference Guide](http://rs.tdwg.org/dwc/terms/index.htm) suggests which properties might be applied to instances of classes by organizing those property terms under class headings, Darwin Core leaves specific decisions about type declaration and property assignment to community consensus. Some examples which show varying approaches to assigning resources to Darwin Core classes and connecting them with object properties defined outside Darwin Core are provided in the [Darwin Core informative ancillary web pages](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md). 
### 1.5 Roles of text strings as values of properties in dwc: namespace (non-normative) @@ -220,7 +220,7 @@ For brevity, the examples do not include namespace declarations, nor an ```rdf:R #### 2.1.2 Generating graphical diagrams and triple tables for the examples (non-normative) -The W3C RDF Validation Service [[W3C-RDF-VALIDATOR](http://www.w3.org/RDF/Validator/)] can be used to generate both a tabular listing and a graphical diagram of the triples that are included in the example XML serializations. Text from the examples can be placed inside the ```rdf:RDF``` container element below, then pasted into the validator box to generate the desired output. +The [W3C RDF Validation Service](http://www.w3.org/RDF/Validator/) can be used to generate both a tabular listing and a graphical diagram of the triples that are included in the example XML serializations. Text from the examples can be placed inside the ```rdf:RDF``` container element below, then pasted into the validator box to generate the desired output. ```rdf @@ -245,7 +245,7 @@ xmlns:viaf="http://viaf.org/viaf/" #### 2.1.3 Terminology (non-normative) -"Resource" is a general term for any kind of entity that can be described using RDF. Resources can be physical, conceptual, or digital entities. Statements about resources are made in RDF in the form of "triples" [[RDF-TRIPLES](http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/#section-triple)]. A triple consists of a subject, a predicate, and an object: +"Resource" is a general term for any kind of entity that can be described using RDF. Resources can be physical, conceptual, or digital entities. Statements about resources are made in RDF in the form of ["triples"](http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/#section-triple). A triple consists of a subject, a predicate, and an object: **Table 2** @@ -273,7 +273,7 @@ Turtle foaf:maker . 
``` -The Dublin Core Metadata Initiative (DCMI) Abstract Model [[DCAM](http://dublincore.org/documents/abstract-model/)], which was designed to be compatible with RDF, describes subject resources using property-value pairs, which correspond to pairs of predicates and objects. When referring to Dublin Core terms (as well as Darwin Core, which is modeled on Dublin Core) "property" is used synonymously with "predicate" and "value" is used synonymously with "object". In Example 1, ```foaf:maker``` is a property and ```viaf:9854560``` is the value associated with that property. Predicates must be identified by IRIs. Objects of triples may be identified in three ways: 1) the object resource can be identified by an IRI reference, 2) the object can be identified by a non-IRI string, in which case it is called a literal, and 3) the object resource can also be left unidentified, in which case it is called a blank node or an anonymous node. Blank nodes are undesirable if it is important that other data providers be able to refer to the resource they represent. However, blank nodes may be preferable if external references to the resource are not relevant, or if the data provider is unable or unwilling to provide a stable IRI to identify the resource. IRI references and blank nodes can be the subjects of RDF triples, but literals cannot. +The [Dublin Core Metadata Initiative (DCMI) Abstract Model](http://dublincore.org/documents/abstract-model/), which was designed to be compatible with RDF, describes subject resources using property-value pairs, which correspond to pairs of predicates and objects. When referring to Dublin Core terms (as well as Darwin Core, which is modeled on Dublin Core) "property" is used synonymously with "predicate" and "value" is used synonymously with "object". In Example 1, ```foaf:maker``` is a property and ```viaf:9854560``` is the value associated with that property. Predicates must be identified by IRIs. 
Objects of triples may be identified in three ways: 1) the object resource can be identified by an IRI reference, 2) the object can be identified by a non-IRI string, in which case it is called a literal, and 3) the object resource can also be left unidentified, in which case it is called a blank node or an anonymous node. Blank nodes are undesirable if it is important that other data providers be able to refer to the resource they represent. However, blank nodes may be preferable if external references to the resource are not relevant, or if the data provider is unable or unwilling to provide a stable IRI to identify the resource. IRI references and blank nodes can be the subjects of RDF triples, but literals cannot. ### 2.2 Subject resources (normative) @@ -312,7 +312,7 @@ If an HTTP IRI is considered to be the identifier for a subject resource, it is #### 2.2.3 Associating a URN with its HTTP-proxied equivalent (normative) -The TDWG LSID Applicability Statement standard [[GUID-STANDARD](http://www.tdwg.org/standards/150/)] specifies in Recommendation 30 that "The description of all objects identified by an LSID **must** contain an ```owl:sameAs```, ```owl:equivalentProperty``` or ```owl:equivalentClass``` statement expressing the equivalence between the object identifier in its standard form and its proxy version". This is illustrated by Example 3: +The [TDWG LSID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies in Recommendation 30 that "The description of all objects identified by an LSID **must** contain an ```owl:sameAs```, ```owl:equivalentProperty``` or ```owl:equivalentClass``` statement expressing the equivalence between the object identifier in its standard form and its proxy version". This is illustrated by Example 3: **Example 3:** @@ -331,7 +331,7 @@ Turtle owl:sameAs . ``` -Since LSIDs follow the URN IRI scheme, they can serve as the subject of any RDF triple. 
However, it is better to use the http-proxied form as the subject (i.e., the value of the ```rdf:about``` attribute) in the description of the resource. See the Darwin Core informative ancillary web pages [[DWC-RDF-ANCILLARY](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md)] for more information about implementing LSIDs. +Since LSIDs follow the URN IRI scheme, they can serve as the subject of any RDF triple. However, it is better to use the http-proxied form as the subject (i.e., the value of the ```rdf:about``` attribute) in the description of the resource. See the [Darwin Core informative ancillary web pages](https://github.com/tdwg/rdf/blob/master/DwCAncillary.md) for more information about implementing LSIDs. This practice can be extended to any URN. For example, ```owl:sameAs``` can be used to relate the URN to its HTTP-proxied equivalent in a manner analogous to Example 3. @@ -341,7 +341,7 @@ Most terms in the Darwin Core vocabulary can be used as predicates in triples to #### 2.3.1 Declaring the type of the resource (non-normative) -In RDF, a resource may be characterized by declaring that it is an instance of a class. Indicating that a resource is an instance of a class provides several benefits. It allows a consumer to narrow the results of a search by limiting the search to certain types of resources. It suggests to data providers what sorts of properties should be used to describe a resource. It allows consumers to anticipate what sorts of properties they might expect to be provided for that resource and allows developers to build applications that exploit those expectations. Because of these benefits, RDF provides several built-in mechanisms for asserting class membership, most notably the ```rdf:type``` property [[RDF-TYPE](http://www.w3.org/TR/rdf-schema/#ch_type)] which is used to state that a resource is an instance of a class. There is nothing that prohibits assigning more than one ```rdf:type``` property to a resource. 
In fact, there may be a benefit in describing a resource as a member of both a class which has specific meaning within a narrow community and a more well-known class which has a broader meaning and is therefore more likely to be understood by generic clients. For instance, a resource may be typed as both a ```dwc:PreservedSpecimen``` and a ```dcmitype:PhysicalObject```. +In RDF, a resource may be characterized by declaring that it is an instance of a class. Indicating that a resource is an instance of a class provides several benefits. It allows a consumer to narrow the results of a search by limiting the search to certain types of resources. It suggests to data providers what sorts of properties should be used to describe a resource. It allows consumers to anticipate what sorts of properties they might expect to be provided for that resource and allows developers to build applications that exploit those expectations. Because of these benefits, RDF provides several built-in mechanisms for asserting class membership, most notably the [```rdf:type``` property](http://www.w3.org/TR/rdf-schema/#ch_type) which is used to state that a resource is an instance of a class. There is nothing that prohibits assigning more than one ```rdf:type``` property to a resource. In fact, there may be a benefit in describing a resource as a member of both a class which has specific meaning within a narrow community and a more well-known class which has a broader meaning and is therefore more likely to be understood by generic clients. For instance, a resource may be typed as both a ```dwc:PreservedSpecimen``` and a ```dcmitype:PhysicalObject```. ##### 2.3.1.1 rdf:type statement (normative) @@ -366,7 +366,7 @@ Turtle:      dcterms:created "2002-06-11T09:37:33"^^xsd:dateTime. ``` -In Turtle serialization, ```rdf:type``` can be abbreviated as "```a```" (Example 4). In XML serialization, the RDF specification provides an abbreviated way to specify the type of a described resource. 
This method is called a typed node element [[TYPED-NODE](http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-typed-nodes)]. The ```rdf:Description``` element is replaced by an element whose name is an XML qualified name that identifies a class of which the described resource is an instance as in Example 5: +In Turtle serialization, ```rdf:type``` can be abbreviated as "```a```" (Example 4). In XML serialization, the RDF specification provides an abbreviated way to specify the type of a described resource. This method is called a [typed node element](http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-typed-nodes). The ```rdf:Description``` element is replaced by an element whose name is an XML qualified name that identifies a class of which the described resource is an instance as in Example 5: **Example 5:** @@ -379,7 +379,7 @@ This example serializes the exact same two triples as Example 4. The ```rdf:type ##### 2.3.1.2 rdf:type assertion through domain and range declarations (normative) -The RDF Schema (RDFS) specification [[RDFS](http://www.w3.org/TR/rdf-schema/)] defines two terms that assert ```rdf:type``` implicitly when certain predicates are used. When a predicate ```P``` having the property +The [RDF Schema (RDFS) specification](http://www.w3.org/TR/rdf-schema/) defines two terms that assert ```rdf:type``` implicitly when certain predicates are used. When a predicate ```P``` having the property ``` P rdfs:domain C @@ -422,7 +422,7 @@ is used with a value, a client can infer that the value is an instance of class ``` -in its definition. 
If the object of that term in an RDF triple is a reference to the IRI for English assigned by the MARC ISO 639-2 Codes for the Representation of Names of Languages [[MARC-LANGUAGES](http://id.loc.gov/vocabulary/iso639-2)] (Example 7), then it can be inferred that http://id.loc.gov/vocabulary/iso639-2/eng is a ```dcterms:LingisticSystem``` even though the MARC description in RDF does not assert that directly in its definition. +in its definition. If the object of that term in an RDF triple is a reference to the IRI for English assigned by the [MARC ISO 639-2 Codes for the Representation of Names of Languages](http://id.loc.gov/vocabulary/iso639-2) (Example 7), then it can be inferred that http://id.loc.gov/vocabulary/iso639-2/eng is a ```dcterms:LinguisticSystem``` even though the MARC description in RDF does not assert that directly in its definition. **Example 7:** @@ -461,7 +461,7 @@ in the description so that clients searching for instances of either ```foaf:Ima ##### 2.3.1.4 Other predicates used to indicate type (normative) -Both the Dublin Core and Darwin Core define terms that can be used to describe the nature of a resource: ```dcterms:type``` and ```dwc:basisOfRecord``` respectively. However, using these terms to describe the nature of the subject resource is not a substitute for use of ```rdf:type```. The DCMI notes on RDF semantics [[DC-RDF-SEMANTICS](http://dublincore.org/documents/dc-rdf/#sect-5)] recommend that "applications implementing this specification primarily use and understand ```rdf:type``` in place of ```dcterms:type``` when expressing Dublin Core metadata in RDF, as most RDF processors come with built-in knowledge of ```rdf:type```." A similar argument could be made for the use of ```rdf:type``` over ```dwc:basisOfRecord```. Including ```dc:type```, ```dcterms:type```, and ```dwc:basisOfRecord``` in an RDF description should be considered optional, while including ```rdf:type``` should be considered highly recommended.
A ```dwciri:``` analogue ([Section 2.5](./index.htm#2.5_Terms_in_the_dwciri:_namespace)) of ```dwc:basisOfRecord``` should not be used. Use ```rdf:type``` instead when the object is an IRI reference. Here is an example that describes a specimen using several of the terms that define the nature of a resource explicitly, including multiple ```rdf:type``` declarations: +Both the Dublin Core and Darwin Core define terms that can be used to describe the nature of a resource: ```dcterms:type``` and ```dwc:basisOfRecord``` respectively. However, using these terms to describe the nature of the subject resource is not a substitute for use of ```rdf:type```. The [DCMI notes on RDF semantics](http://dublincore.org/documents/dc-rdf/#sect-5) recommend that "applications implementing this specification primarily use and understand ```rdf:type``` in place of ```dcterms:type``` when expressing Dublin Core metadata in RDF, as most RDF processors come with built-in knowledge of ```rdf:type```." A similar argument could be made for the use of ```rdf:type``` over ```dwc:basisOfRecord```. Including ```dc:type```, ```dcterms:type```, and ```dwc:basisOfRecord``` in an RDF description should be considered optional, while including ```rdf:type``` should be considered highly recommended. A ```dwciri:``` analogue ([Section 2.5](./index.htm#2.5_Terms_in_the_dwciri:_namespace)) of ```dwc:basisOfRecord``` should not be used. Use ```rdf:type``` instead when the object is an IRI reference. 
Here is an example that describes a specimen using several of the terms that define the nature of a resource explicitly, including multiple ```rdf:type``` declarations: **Example 8:** @@ -492,7 +492,7 @@ Refer to [Sections 2.4.3](./index.htm#2.4.3_Object_resources_that_have_been_prev ##### 2.3.1.5 Classes to be used for type declarations of resources described using Darwin Core (normative) -The TDWG GUID Applicability Statement standard [[GUID-STANDARD](http://www.tdwg.org/standards/150/)] specifies that an object in the biodiversity domain that is identified by a GUID should be typed using a well-known vocabulary. With this recommendation in mind, it should be considered a best practice to provide information about the type (i.e., class membership) of any resource that is assigned a persistent identifier in the form of an IRI. Since Darwin Core is a well-known vocabulary and a ratified TDWG standard, its classes should be used for typing in preference to classes in parts of the TDWG ontology which are not ratified standards and are effectively deprecated. The human-readable definitions of the Darwin core classes provide guidance for deciding the types to assign to resources, although community consensus may be necessary to classify some of the more complex kinds of resources. ([Section 1.4.4](./index.htm#1.4.4_Limitations_of_this_guide)) +The [TDWG GUID Applicability Statement standard](http://www.tdwg.org/standards/150/) specifies that an object in the biodiversity domain that is identified by a GUID should be typed using a well-known vocabulary. With this recommendation in mind, it should be considered a best practice to provide information about the type (i.e., class membership) of any resource that is assigned a persistent identifier in the form of an IRI. 
Since Darwin Core is a well-known vocabulary and a ratified TDWG standard, its classes should be used for typing in preference to classes in parts of the TDWG ontology which are not ratified standards and are effectively deprecated. The human-readable definitions of the Darwin Core classes provide guidance for deciding the types to assign to resources, although community consensus may be necessary to classify some of the more complex kinds of resources. ([Section 1.4.4](./index.htm#1.4.4_Limitations_of_this_guide)) Any Darwin Core class IRI may be used as a value for ```rdf:type```, although it is not clear whether ```dwc:ResourceRelationship``` instances make sense in the context of RDF. The following list summarizes classes included in the Dublin Core type vocabulary (but which are not part of Darwin Core) that should also be used for typing biodiversity-related resources: