From 65588a7157125572cdd82e6541edce6ae5502c44 Mon Sep 17 00:00:00 2001
From: Peter Desmet
Date: Wed, 14 Jan 2015 11:00:42 +0100
Subject: [PATCH] Added numbers to headings

---
 change_policy.html   | 16 ++++++++--------
 examples/index.html  | 10 +++++-----
 index.html           | 14 +++++++-------
 resources/index.html |  8 ++++----
 simple_dwc.html      |  2 +-
 terms/decisions.html |  4 ++--
 6 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/change_policy.html b/change_policy.html
index 470cb13..cd2397e 100644
--- a/change_policy.html
+++ b/change_policy.html
@@ -69,10 +69,10 @@

Change policy

-

Introduction

+

1. Introduction

This document and the policies contained herein are modeled on the Dublin Core Metadata Initiative Namespace Policy [DCMINAMESPACEPOLICY]. All terms in the Darwin Core must be identified with a unique Uniform Resource Identifier (URI). For convenience, the term URIs are grouped into collections known as Darwin Core namespaces. This document describes the policies associated with Darwin Core namespaces and how term URIs are allocated by the Darwin Core Task Group [DWC-TASK].

-

Namespace URIs

+

2. Namespace URIs

The Darwin Core namespace URI for the collection of all Darwin Core properties, classes, and encoding schemes is:

http://rs.tdwg.org/dwc/terms/

The term identifier for the current (recommended) version of a term is a URI based on the namespace and the term name without version information. Some example Darwin Core term identifiers follow:

@@ -83,32 +83,32 @@

All Darwin Core identifiers will dereference to a Darwin Core term declaration for the identified term.
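
The identifier rule described above (namespace URI plus the term name, without version information) can be sketched in Python. This is only an illustration: the helper name `term_uri` is hypothetical, and it does not check that a term actually exists in the vocabulary.

```python
# Namespace URI quoted from the policy text above.
DWC_NAMESPACE = "http://rs.tdwg.org/dwc/terms/"


def term_uri(term_name: str) -> str:
    """Return the unversioned identifier for a Darwin Core term name.

    Hypothetical helper: it only concatenates strings and does not
    verify that the term is declared in the Darwin Core vocabulary.
    """
    if not term_name or "/" in term_name:
        raise ValueError(f"not a plain term name: {term_name!r}")
    return DWC_NAMESPACE + term_name


print(term_uri("scientificName"))
print(term_uri("basisOfRecord"))
```

Because the current version of a term carries no version information in its URI, the same call keeps producing the recommended identifier after a term is revised.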

-

Term change policy

+

3. Term change policy

Changes to Darwin Core recommendations or term declarations will occur from time to time for a variety of reasons. Changes have varying implications for the decision making process and for versioning Darwin Core documents and term URIs. The types of changes and appropriate processes are identified and explained in the following sections.

Reporting issues: In all cases the proposed change should be reported to the Technical Architecture Group [TDWG-TAG] and the outcome of the proposal should be announced on the Darwin Core mailing list [TDWG-CONTENT].

Decisions: In cases where an Executive decision is required, the Technical Architecture Group will conduct a minimum 30-day public comment period on the Darwin Core mailing list [TDWG-CONTENT], during which the proposal can be refined based on discussion in an effort to reach consensus (no dissenting opinion expressed publicly on the mailing list for 30 days from the most recent iteration). If a consensus is reached, the proposal will be presented by the Technical Architecture Group to the Executive Committee for a decision [DECISIONS] within 30 days.

Versions: In cases where a decision requires a version change, the attributes Status (recommended, superseded, or deprecated), Date Issued, Date Modified, Decision, Version, Replaces, and Is Replaced By will be modified in the affected terms as appropriate.

-

Minor editorial errata

+

3.1 Minor editorial errata

An error in spelling, punctuation, grammar, or other clerical mistake discovered in a Darwin Core recommendation or term declaration may be corrected without a public comment period or Executive Decision. Minor editorial changes of this type do not require a version change for the affected term and/or documents.

-

Substantive editorial errata

+

3.2 Substantive editorial errata

A substantive error is one that compromises the usefulness or accuracy of systems based on Darwin Core. Those that are unequivocal (for example, an incorrect URI or reference) may be treated as minor editorial errata (Section 3.1).

Otherwise, the Technical Architecture Group will conduct a public comment period and seek an Executive decision to find a solution that minimizes adverse effects on existing applications. Changes of this nature require a version change for the affected term and/or documents.

-

Semantic changes to Darwin Core terms

+

3.3 Semantic changes to Darwin Core terms

Darwin Core terms may be changed based on public demand and consensus. A request to the Technical Architecture Group for a term change should consist of proposed values for the complete list of attributes given in the term definitions section of the Darwin Core Terms Complete History [HISTORY] along with a statement of justification for the change.

Term changes that are likely to have substantial functional impact on human understanding or machine processing will be posted for public commentary and an Executive decision. Changes of this nature require a version change for the affected term and/or documents.

-

Addition of Darwin Core terms

+

3.4 Addition of Darwin Core terms

New terms may be added to the Darwin Core namespaces based on public demand and consensus. A request to the Technical Architecture Group [TDWG-TAG] for a new term should consist of proposed values for the complete list of attributes given in the term definitions section of the Darwin Core Terms Complete History [HISTORY] along with a statement of justification for the new term, including an explanation of why no existing term will suffice.

New term proposals will be posted for public commentary and an Executive decision. Changes of this nature require a version change for the affected term and/or documents.

-

Persistence policy

+

4. Persistence policy

TDWG recognizes that people and applications depend on the persistence of formal documents and machine processable schemas that have been made publicly available. In particular, the stability of Darwin Core term URIs and Darwin Core namespace URIs is critical to interoperability over time. Thus, the wide promulgation of this set of URIs dictates that they be maintained to support legacy applications that have adopted them.

diff --git a/examples/index.html b/examples/index.html
index 089542e..63a9c1f 100644
--- a/examples/index.html
+++ b/examples/index.html
@@ -71,27 +71,27 @@

The following are links to examples of marking up data in Darwin Core:

-

Specimens

+

1. Specimens

-

Observations

+

2. Observations

-

Material samples

+

3. Material samples

-

Taxon

+

4. Taxon

-

RDFa

+

5. RDFa

diff --git a/index.html b/index.html
index 79eec65..6749f91 100644
--- a/index.html
+++ b/index.html
@@ -72,25 +72,25 @@
-

Introduction

+

1. Introduction

The Darwin Core is a body of standards. It includes a glossary of terms (in other contexts these might be called properties, elements, fields, columns, attributes, or concepts) intended to facilitate the sharing of information about biological diversity by providing reference definitions, examples, and commentaries. The Darwin Core is primarily based on taxa, their occurrence in nature as documented by observations, specimens, samples, and related information. Included are documents describing how these terms are managed, how the set of terms can be extended for new purposes, and how the terms can be used. The normative document for the terms [RDF-NORMATIVE] is written in the Resource Description Framework (RDF) and is the definitive resource to understand the term definitions and their relationships to each other. The Simple Darwin Core [SIMPLEDWC] is a specification for one particular way to use the terms - to share data about taxa and their occurrences in a simply structured way - and is probably what is meant if someone suggests that you "format your data according to the Darwin Core".

-

Motivation

+

2. Motivation

The Darwin Core standard was originally conceived to facilitate the discovery, retrieval, and integration of information about modern biological specimens, their spatiotemporal occurrence, and their supporting evidence housed in collections (physical or digital). The Darwin Core today is broader in scope and more versatile. It is meant to provide a stable standard reference for sharing information on biological diversity. As a glossary of terms, the Darwin Core is meant to provide stable semantic definitions with the goal of being maximally reusable in a variety of contexts.

-

Rationale

+

3. Rationale

The Darwin Core is based on the standards developed by the Dublin Core Metadata Initiative [DCMI] and can be viewed as an extension of the Dublin Core for biodiversity information. The purpose of these terms is to facilitate data sharing by providing a well-defined standard core vocabulary in a flexible framework to minimize the barriers to adoption and to maximize reusability. The terms described in this standard are a part of a larger set of vocabularies and technical specifications under development [TDWG-DEV] and maintained by Biodiversity Information Standards (TDWG) [TDWG-STANDARDS].

-

Guiding principles

+

4. Guiding principles

Each term has a definition and commentaries that are meant to promote the consistent use of the terms across applications and disciplines. Evolving commentaries that discuss, refine, expand, or translate the definitions and examples are referred to through links in the Comments attribute of each term. This means of documentation allows the standard to adapt to new purposes without disrupting existing applications. There is meant to be a clear separation between the terms defined in this standard and the applications that make use of them. For example, though the data types and constraints are not provided in the term definitions, recommendations are made about how to restrict the values where appropriate.

-

Content

+

5. Content

The standard consists of a vocabulary of terms (properties, elements, fields, concepts) [TERMS], the policy governing the maintenance of these terms [NAMESPACEPOLICY], the decisions that resulted in changes to terms [DECISIONS], the complete history of terms including detailed attributes [HISTORY], a Generic Darwin Core XML schema [TERMSXMLSCHEMA] from which other schemas can be constructed, a Simple Darwin Core XML schema [SIMPLEXMLSCHEMA] as a complete schema ready for use, a schema to allow Darwin Core data transfer in text files [TEXTSCHEMA], and associated reference schemas for the construction of more structured content. These pages also describe mappings between the current standard and pre-standard historical versions [VERSIONS], including mappings [DWCTOABCD] to concepts in the Access to Biological Collections Data standard [ABCD].

-

Extension

+

6. Extension

Though the Darwin Core is insufficient for the needs of all biological disciplines, it can be adapted to serve new purposes. Darwin Core can be extended by adding new terms to share additional information. To do so you should be familiar with the recommendations and procedures defined in the Darwin Core Namespace Policy [NAMESPACEPOLICY]. Basically, before proposing a new term, consider the existing terms in this and other compatible standards to determine if the new concept can be accommodated by a simple revision of the description and comments for an existing term, without losing the existing meaning of that term.

-

Participation

+

7. Participation

To receive notification of activity or participate in discussions about Darwin Core, join the tdwg-content mailing list [TDWG-CONTENT] and watch the Darwin Core Project [DWC-PROJECT]. For discussion or commentary on the definition of recommended terms, consult the link inside the Comment section in the listing for the term in the Quick Reference Guide [TERMS] or search for the relevant content in the auxiliary Darwin Core Documentation [DWC-WIKI].

To make a formal request for a change to or addition of a term to the Darwin Core, read and follow the recommendations in the Darwin Core Namespace Policy [NAMESPACEPOLICY]. For those who wish to construct and submit as a standard any application profile, such as an XML schema, that extends the capabilities of the Darwin Core, adding new terms to the Darwin Core vocabulary that don't already exist in a compatible vocabulary will be a prerequisite. Consult the appropriate guideline, such as the XML Guide [XMLGUIDE], for information about the construction of a new application profile. The rules of submission of proposed standards can be found in the Biodiversity Information Standards (TDWG) process document [PROCESS].

diff --git a/resources/index.html b/resources/index.html
index 6f44699..2ae4e17 100644
--- a/resources/index.html
+++ b/resources/index.html
@@ -71,7 +71,7 @@

This entire repository can be downloaded from GitHub.

-

Downloads

+

1. Downloads

-

Vocabularies

+

2. Vocabularies

-

Tools

+

3. Tools

-

References

+

4. References

diff --git a/simple_dwc.html b/simple_dwc.html
index c56490c..4d9b697 100644
--- a/simple_dwc.html
+++ b/simple_dwc.html
@@ -1 +1 @@
- Simple Darwin Core

Introduction

What is Simple Darwin Core?

The Simple Darwin Core is a predefined subset of the terms that have common use across a wide variety of biodiversity applications. The terms used in the Simple Darwin Core are those that are found at the cross-section of taxonomic names, places, and events that document biological occurrences on the planet. The two driving principles are simplicity and flexibility.

What makes it simple?

The Simple Darwin Core is simple in that it assumes (and allows) no structure beyond the concept of rows and columns, which might be thought of as attributes and their values, or fields and records. The words field and record will be used throughout the rest of the document to refer to the two dimensions of the Simple Darwin Core structure. Think of the term names as the field names. In other words, a Simple Darwin Core record could be captured in a spreadsheet or in a single database table.

What makes it flexible?

The Simple Darwin Core has minimal restrictions on which fields are required (none). You might argue that there should be more required fields, that there isn't anything useful you can do without them. That is partially true. A record with no fields in it wouldn't be very interesting, but there is a difference between requiring that there be a field in a record and requiring that a particular field be in all records. By having no required field restriction, the Simple Darwin Core can be used to share any meaningful combination of fields - for example, to share "just names", or "just places", or observations of individuals detected in the wild at a given place and time following a method (an occurrence). This flexibility promotes the reuse of the terms and sharing mechanisms for a wide variety of services.

Are there any rules?

There are just a few general guiding principles on how to make the best use of the Simple Darwin Core:

  1. Any Darwin Core term name can be used as a field name.
  2. No field name may be repeated in a record.
  3. Do not use a Class (Occurrence, Organism, MaterialSample, LivingSpecimen, PreservedSpecimen, FossilSpecimen, Event, HumanObservation, MachineObservation, Location, GeologicalContext, Identification, Taxon) as a field.
  4. Provide data in as many fields as you can.
  5. Use the type field to provide the name of the Dublin Core type class (PhysicalObject, StillImage, MovingImage, Sound, Text) the record represents.
  6. Use the basisOfRecord field to provide the name of the most specific Darwin Core class (LivingSpecimen, PreservedSpecimen, FossilSpecimen, MaterialSample, HumanObservation, MachineObservation, Event, Occurrence, Taxon, Identification, Organism, Location, GeologicalContext, MeasurementOrFact, ResourceRelationship) the record represents.
  7. Populate fields with data that match the definition of the field.
  8. Use the controlled vocabulary for the values of fields that recommend them.
  9. If data are withheld, use informationWithheld to say so.
  10. If data are shared in lower quality than the original, use dataGeneralizations to say so.

Every field in the Simple Darwin Core may appear either once or not at all in a single record - otherwise how could you distinguish one scientificName field from another one? Think of a database table. It will not allow you to have the same name for two different fields. Because of this design restriction (lack of flexibility for the sake of simplicity), the auxiliary fields from the MeasurementOrFact and ResourceRelationship classes are of somewhat limited utility here - you could only share one MeasurementOrFact and one ResourceRelationship per record. You might argue then that there is no way to share information that requires related structures, such as a history of identifications of a specimen. That is mostly true. The only recourse within the Simple Darwin Core is to force the data into one of the catch all "list" terms such as recordedBy, preparations, otherCatalogNumbers, associatedMedia, associatedReferences, associatedSequences, associatedTaxa, associatedOccurrences, associatedOrganisms, previousIdentifications, higherGeography, georeferencedBy, georeferenceSources, identifiedBy, identificationReferences, and higherClassification.
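
Two of the rules above lend themselves to a mechanical check: no field name may be repeated (rule 2) and no Class may be used as a field (rule 3). The following Python sketch illustrates such a check on a header row. The function name `check_field_names` is hypothetical, the class list is copied from rule 3, and rule 1 (that every name is a real Darwin Core term) is not checked because the full term list is not reproduced here.

```python
# Class names copied from rule 3; they must not be used as field names.
DWC_CLASSES = {
    "Occurrence", "Organism", "MaterialSample", "LivingSpecimen",
    "PreservedSpecimen", "FossilSpecimen", "Event", "HumanObservation",
    "MachineObservation", "Location", "GeologicalContext",
    "Identification", "Taxon",
}


def check_field_names(field_names):
    """Return a list of problems found in a header row (rules 2 and 3)."""
    problems = []
    seen = set()
    for name in field_names:
        if name in seen:
            problems.append(f"repeated field name: {name}")
        seen.add(name)
        if name in DWC_CLASSES:
            problems.append(f"class used as a field: {name}")
    return problems


# A header that breaks both rules, followed by one that breaks neither.
print(check_field_names(["scientificName", "Taxon", "scientificName"]))
print(check_field_names(["scientificName", "basisOfRecord", "kingdom"]))
```

An empty result means only that these two rules hold; whether the values match the term definitions (rule 7) still has to be judged by the data contributor.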

There is a difference between having data in a field and requiring that field to have a value from among a legal set of values. The Darwin Core is simple in that it has minimal restrictions on the contents of fields. The term comments give recommendations about the use of controlled vocabularies and how to structure content wherever appropriate. Data contributors are encouraged to follow these recommendations as well as possible. You might argue that having no restrictions will promote "dirty" data (data of low quality or dubious value). Consider the simple axiom "It's not what you have, but what you do with it that matters." If data restrictions were in place at the fundamental level, then a record having any non-compliant data in any of its fields could not be shared via the standard. Not only would there be a dearth of shared data in that case (or an unused standard), but also there would be no way to use the standard to build shared data cleaning tools to actually improve the situation, nor to use data services to look up alternative representations (language translations, for example) to serve a broader audience. The rest is up to how the records will be used - in other words, it is up to applications to enforce further restrictions if appropriate, and it is up to the stakeholders of those applications to decide what the restrictions will be for the purpose the application is trying to serve.

How do I use Simple Darwin Core?

The Darwin Core is simple in that data "complying with" the Simple Darwin Core can be easily shared in a variety of ways, including, but not limited to, text files and XML documents.

What you need to do as a contributor of data via the Simple Darwin Core depends on the requirements of the ones who are going to consume those data. For example, if you have a collaborator who wants to share data via the Simple Darwin Core, then it may be sufficient to create a spreadsheet that contains column headers matching as many of the Darwin Core term names as you are both interested in sharing - just to be sure you both understand the meaning of the fields you share, and therefore hopefully something about their content. You might create a table in a database using the Simple Darwin Core as a model (if it met all of your needs), and then connect that database with services for sharing via the web. You might use that same database (or spreadsheet) to export a comma-separated value (CSV) file for upload into a hosted service that could serve the data on your behalf. Or you might use that same file to upload into a service that would allow you to add value (such as a georeference) or quality (with a data cleaning tool), or to see your data in the context of other shared data.

Simple Darwin Core as Text

The Text Guide [TEXTGUIDE] describes how to construct and format a text file using a simplified subset of the Fielded Text [FIELDEDTEXT] specification, which allows the contributor to describe the contents of a text file, or set of text files (related or not) through a separate configuration file (called a metafile). The metafile allows the contributor to communicate the structure of the content of the file or files and any relationships between them. Though it is good practice to describe a Simple Darwin Core file with such a metafile, it isn't strictly necessary if the file follows the CSV file specification and the first line of the file contains the field names. A Fielded Text metafile for any text file based on the Simple Darwin Core can be created by customizing the example metafile [SIMPLEMETAFILE] (if this link shows a blank page in your browser, use the View Source option to see the XML document), which includes references to all Darwin Core terms. Refer to the comments in the file itself as well as the metafile specification in the Text Guide [TEXTGUIDE] for more information.
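
As a minimal sketch of the simplest case described above (a CSV file whose first line contains the field names, shared without a metafile), the Python standard library can produce such a file. The function name `records_to_csv` and the choice of fields are assumptions made for illustration; the values are taken from the XML examples in this document.

```python
import csv
import io


def records_to_csv(records, field_names):
    """Serialize records as CSV text whose first line holds the field names."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=field_names, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Fields missing from a record are written as empty strings,
        # which fits a format where no field is required.
        writer.writerow(record)
    return buffer.getvalue()


fields = ["scientificName", "kingdom", "taxonRank"]
records = [
    {
        "scientificName": "Centropyge flavicauda Fraser-Brunner 1933",
        "kingdom": "Animalia",
        "taxonRank": "species",
    },
    # This record has no taxonRank; the column is simply left empty.
    {"scientificName": "Ctenomys sociabilis", "kingdom": "Animalia"},
]
csv_text = records_to_csv(records, fields)
print(csv_text)
```

`csv.DictWriter` raises `ValueError` if a record contains a key that is not in `field_names`, so an unexpected column cannot slip into the file unnoticed.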

Simple Darwin Core as XML

The XML Guide [XMLGUIDE] describes how to construct XML schemas to share data based on Darwin Core terms. Looking at the Simple Darwin Core XML Schema [SIMPLEXMLSCHEMA] using the XML Guide as a reference you will be able to see that the schema supports the notion of a SimpleDarwinRecord, which is just a grouping of up to one of each of the Darwin Core terms that are Properties (not Classes). The following example shows a SimpleDarwinRecordSet containing one SimpleDarwinRecord for a Taxon:

<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet
xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
<SimpleDarwinRecord>
<dc:modified>2006-05-04T18:13:51.0Z</dc:modified>
<dc:language>en</dc:language>
<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
<dwc:scientificNameID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:scientificNameID>
<dwc:acceptedNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=22010</dwc:acceptedNameUsageID>
<dwc:originalNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:originalNameUsageID>
<dwc:nameAccordingToID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=22764</dwc:nameAccordingToID>
<dwc:namePublishedInID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=671</dwc:namePublishedInID>
<dwc:scientificName>Centropyge flavicauda Fraser-Brunner 1933</dwc:scientificName>
<dwc:acceptedNameUsage>Centropyge fisheri (Snyder 1904)</dwc:acceptedNameUsage>
<dwc:parentNameUsage>Centropyge  Kaup, 1860</dwc:parentNameUsage>
<dwc:originalNameUsage>Centropyge flavicauda Fraser-Brunner 1933</dwc:originalNameUsage>
<dwc:nameAccordingTo>Allen, G.R. 1980. Butterfly and angelfishes of the world. Volume II. Mergus Publishers. Pp. 149-352.</dwc:nameAccordingTo>
<dwc:namePublishedIn>Fraser-Brunner, A. 1933. A revision of the chaetodont fishes of the subfamily Pomacanthinae. Proceedings of the General 
      Meetings for Scientific Business of the Zoological Society of London 1933 (pt 3, no.30): 543-599, Pl. 1.</dwc:namePublishedIn>
<dwc:higherClassification>Animalia;Chordata;Vertebrata;Osteichthyes;Actinopterygii;Neopterygii;Teleostei;Acanthopterygii;Perciformes;
      Percoidei;Pomacanthidae;Centropyge</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Osteichthyes</dwc:class>
<dwc:order>Perciformes</dwc:order>
<dwc:family>Pomacanthidae</dwc:family>
<dwc:genus>Centropyge</dwc:genus>
<dwc:specificEpithet>flavicauda</dwc:specificEpithet>
<dwc:scientificNameAuthorship>Fraser-Brunner 1933</dwc:scientificNameAuthorship>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:taxonomicStatus>accepted</dwc:taxonomicStatus>
</SimpleDarwinRecord>
</SimpleDarwinRecordSet>

The SimpleDarwinRecord acts as a Class in implementation, because all of the terms are properties of it. The Simple Darwin Core schema has just one other level of structure, the SimpleDarwinRecordSet, which is a grouping of one or more SimpleDarwinRecords. The SimpleDarwinRecordSet acts as a Class to define a data set during implementation.
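
The two-level structure described above (a SimpleDarwinRecordSet grouping SimpleDarwinRecords, each holding at most one of each property) can be sketched with Python's standard `xml.etree.ElementTree` module. This is an illustration under stated assumptions: the function name `build_record_set` is hypothetical, the output is not validated against the Simple Darwin Core XML Schema, and the `xsi:schemaLocation` attribute and any element ordering the schema may require are left out.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the XML example above.
SDWC_NS = "http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
DWC_NS = "http://rs.tdwg.org/dwc/terms/"

# Serialize with the same prefixes as the example (default namespace + dwc).
ET.register_namespace("", SDWC_NS)
ET.register_namespace("dwc", DWC_NS)


def build_record_set(records):
    """Build a SimpleDarwinRecordSet element from a list of dicts.

    Each dict maps a dwc term name to its value; because dict keys are
    unique, a record can hold each term at most once.
    """
    root = ET.Element(f"{{{SDWC_NS}}}SimpleDarwinRecordSet")
    for record in records:
        record_el = ET.SubElement(root, f"{{{SDWC_NS}}}SimpleDarwinRecord")
        for term, value in record.items():
            ET.SubElement(record_el, f"{{{DWC_NS}}}{term}").text = value
    return root


root = build_record_set(
    [{"basisOfRecord": "Taxon", "scientificName": "Ctenomys sociabilis"}]
)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Terms from other namespaces, such as `dc:modified` in the example above, would need their own namespace constant and are omitted to keep the sketch short.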

Doing more with Simple Darwin Core

Sooner or later you may want to share more information than the Simple Darwin Core seems to allow. For example, you and your colleagues might decide that it would be useful to have a standard way to exchange additional information relevant to questions in Conservation. How would you do it?

One way would be to try to "overload" existing terms by using them to hold information other than what was intended based on the definition of the terms. Please don't do this. If an existing term has close to the same meaning as one you want to use, but just doesn't quite fit because of the way the definition is worded, it would be better to request an amendment to the term definition so that it will be clear for your community how to use it. You can request such a change by submitting an issue in the Darwin Core Project [DWC-PROJECT].

Another way to get more out of the Darwin Core without adding a term is to "payload" the dynamicProperties term with structured content, as shown in the example below, using JavaScript Object Notation (JSON). This is perfectly legal, since it doesn't compromise the meaning of the term. One of the weaknesses of payloading data in this way is that it is subject to a lack of stable or well-defined semantics. Also, it is highly recommended to flatten the content into a single string with no non-printing characters (such as line feeds) to facilitate use in the widest variety of data sharing contexts. Still, this might be a reasonable way to at least allow you to share all of your data, even if there might be problems with people using it reliably.

<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet
xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
<SimpleDarwinRecord>
<dc:modified>2009-02-12T12:43:31</dc:modified>
<dc:language>en</dc:language>
<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
<dwc:scientificName>Ctenomys sociabilis</dwc:scientificName>
<dwc:acceptedNameUsage>Ctenomys sociabilis Pearson and Christie, 1985</dwc:acceptedNameUsage>
<dwc:parentNameUsage>Ctenomys Blainville, 1826</dwc:parentNameUsage>
<dwc:higherClassification>Animalia; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Hystricognatha; Hystricognathi; Ctenomyidae; Ctenomyini; Ctenomys</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Mammalia</dwc:class>
<dwc:order>Rodentia</dwc:order>
<dwc:family>Ctenomyidae</dwc:family>
<dwc:genus>Ctenomys</dwc:genus>
<dwc:specificEpithet>sociabilis</dwc:specificEpithet>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:scientificNameAuthorship>Pearson and Christie, 1985</dwc:scientificNameAuthorship>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:namePublishedIn>Pearson O. P., and M. I. Christie. 1985. Historia Natural, 5(37):388</dwc:namePublishedIn>
<dwc:taxonomicStatus>valid</dwc:taxonomicStatus>
  <dwc:dynamicProperties>{"iucnStatus":"vulnerable", "distribution":"Neuquén, Argentina"}</dwc:dynamicProperties> 
</SimpleDarwinRecord>
</SimpleDarwinRecordSet>
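
The recommendation to flatten the payload into a single string without line feeds can be met with the standard `json` module, as in the sketch below. The function name `flatten_properties` is hypothetical; the keys `iucnStatus` and `distribution` come from the example above and are not Darwin Core terms.

```python
import json


def flatten_properties(properties: dict) -> str:
    """Encode properties as one-line JSON suitable for dynamicProperties."""
    # json.dumps emits no line breaks unless `indent` is set, and it
    # escapes control characters (such as line feeds) inside string values.
    return json.dumps(properties, ensure_ascii=False, separators=(", ", ":"))


payload = flatten_properties(
    {"iucnStatus": "vulnerable", "distribution": "Neuquén, Argentina"}
)
print(payload)

# A consumer can recover the structure, but must already know what
# the keys mean; nothing in the payload defines them.
print(json.loads(payload)["iucnStatus"])
```

Passing `ensure_ascii=False` keeps characters such as "é" readable in the output; dropping that argument would escape them instead, which is equally valid JSON.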

If you were using just CSV text files to exchange information, then you might be tempted to just add the new fields to the files. This approach suffers most of the same problems as payloading - no one aside from those with whom you communicated would know what those new fields were or how to use them. Sharing in this way via XML would be an even bigger problem, because the Simple Darwin Core XML Schema [SIMPLEXMLSCHEMA] defines the terms that it supports and the new fields would not correspond with any terms understood by the schema. In other words, the XML with your fields in it would not be a valid Simple Darwin Core XML document.

So, if you really need to extend the capabilities of Darwin Core, the best first step is to follow the standards process to add the terms you need. The mechanisms for pursuing this are explained in the Darwin Core Namespace Policy [NAMESPACEPOLICY]. The process will help to assure that the new terms are well conceived, that they don't conflict with existing terms, and that they are properly defined in the broader context of biological diversity information.

Going beyond Simple Darwin Core

For cases where rich data require rich (non-simple) structure, the Simple Darwin Core alone is not suitable. When sharing information via fielded text [FIELDEDTEXT], the solution is to use the Simple Darwin Core as a core record with one or more associated extensions for the additional information. See the Darwin Core Text Guide [TEXTGUIDE] for an explanation and examples.

When sharing information via XML [XML], use a richer structure such as the Access to Biological Collections Data schema [ABCD], the Generic Darwin Core [GENERICXMLSCHEMA], or another schema built from the Darwin Core terms to suit the use of the data in a particular context. See the Darwin Core XML Guide [XMLGUIDE] for examples and references to model schemas.

\ No newline at end of file
+ Simple Darwin Core

1. Introduction

1.1 What is Simple Darwin Core?

The Simple Darwin Core is a predefined subset of the terms that have common use across a wide variety of biodiversity applications. The terms used in the Simple Darwin Core are those that are found at the cross-section of taxonomic names, places, and events that document biological occurrences on the planet. The two driving principles are simplicity and flexibility.

1.2 What makes it simple?

The Simple Darwin Core is simple in that it assumes (and allows) no structure beyond the concept of rows and columns, which might be thought of as attributes and their values, or fields and records. The words field and record will be used throughout the rest of the document to refer to the two dimensions of the Simple Darwin Core structure. Think of the term names as the field names. In other words, a Simple Darwin Core record could be captured in a spreadsheet or in a single database table.

1.3 What makes it flexible?

The Simple Darwin Core has minimal restrictions on which fields are required (none). You might argue that there should be more required fields, that there isn't anything useful you can do without them. That is partially true. A record with no fields in it wouldn't be very interesting, but there is a difference between requiring that there be a field in a record and requiring that a particular field be in all records. By having no required field restriction, the Simple Darwin Core can be used to share any meaningful combination of fields - for example, to share "just names", or "just places", or observations of individuals detected in the wild at a given place and time following a method (an occurrence). This flexibility promotes the reuse of the terms and sharing mechanisms for a wide variety of services.

1.4 Are there any rules?

There are just a few general guiding principles on how to make the best use of the Simple Darwin Core:

  1. Any Darwin Core term name can be used as a field name.
  2. No field name may be repeated in a record.
  3. Do not use a Class (Occurrence, Organism, MaterialSample, LivingSpecimen, PreservedSpecimen, FossilSpecimen, Event, HumanObservation, MachineObservation, Location, GeologicalContext, Identification, Taxon) as a field.
  4. Provide data in as many fields as you can.
  5. Use the type field to provide the name of the Dublin Core type class (PhysicalObject, StillImage, MovingImage, Sound, Text) the record represents.
  6. Use the basisOfRecord field to provide the name of the most specific Darwin Core class (LivingSpecimen, PreservedSpecimen, FossilSpecimen, MaterialSample, HumanObservation, MachineObservation, Event, Occurrence, Taxon, Identification, Organism, Location, GeologicalContext, MeasurementOrFact, ResourceRelationship) the record represents.
  7. Populate fields with data that match the definition of the field.
  8. Use the controlled vocabulary for the values of fields that recommend them.
  9. If data are withheld, use informationWithheld to say so.
  10. If data are shared in lower quality than the original, use dataGeneralizations to say so.
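Rules 2 and 3 above are mechanical enough to check automatically. The following is a minimal sketch, not an official validator: the function name check_field_names is hypothetical, and the class list is copied from rule 3.

```python
# Class names listed in rule 3; they describe kinds of records and are not fields.
DWC_CLASSES = {
    "Occurrence", "Organism", "MaterialSample", "LivingSpecimen",
    "PreservedSpecimen", "FossilSpecimen", "Event", "HumanObservation",
    "MachineObservation", "Location", "GeologicalContext",
    "Identification", "Taxon",
}


def check_field_names(field_names):
    """Return a list of problems found in the field names of one record.

    Hypothetical helper: it only covers rule 2 (no repeated field names)
    and rule 3 (no class used as a field).
    """
    problems = []
    seen = set()
    for name in field_names:
        if name in seen:
            problems.append(f"field name repeated: {name}")
        seen.add(name)
        if name in DWC_CLASSES:
            problems.append(f"class used as a field: {name}")
    return problems


if __name__ == "__main__":
    for problem in check_field_names(["scientificName", "Taxon", "scientificName"]):
        print(problem)
```

An empty list means the field names pass these two checks; the remaining rules concern the content of the fields and need human judgment or vocabulary lookups.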

Every field in the Simple Darwin Core may appear either once or not at all in a single record - otherwise how could you distinguish one scientificName field from another one? Think of a database table. It will not allow you to have the same name for two different fields. Because of this design restriction (lack of flexibility for the sake of simplicity), the auxiliary fields from the MeasurementOrFact and ResourceRelationship classes are of somewhat limited utility here - you could only share one MeasurementOrFact and one ResourceRelationship per record. You might argue then that there is no way to share information that requires related structures, such as a history of identifications of a specimen. That is mostly true. The only recourse within the Simple Darwin Core is to force the data into one of the catch-all "list" terms such as recordedBy, preparations, otherCatalogNumbers, associatedMedia, associatedReferences, associatedSequences, associatedTaxa, associatedOccurrences, associatedOrganisms, previousIdentifications, higherGeography, georeferencedBy, georeferenceSources, identifiedBy, identificationReferences, and higherClassification.
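Forcing several values into one of the "list" terms amounts to joining them with a delimiter that the consumer can split on again. The sketch below assumes " | " as the delimiter; that choice, and the helper names join_values and split_values, are assumptions made for illustration, so check the comments of the term you are using for its recommended separator.

```python
SEPARATOR = " | "  # assumed delimiter; agree on one with your data consumers


def join_values(values, separator=SEPARATOR):
    """Flatten several values into a single string for a "list" term."""
    cleaned = [value.strip() for value in values if value and value.strip()]
    for value in cleaned:
        # A value containing the delimiter could not be split apart again.
        if separator.strip() in value:
            raise ValueError(f"value contains the delimiter: {value!r}")
    return separator.join(cleaned)


def split_values(text, separator=SEPARATOR):
    """Recover the individual values from a "list" term."""
    return [part.strip() for part in text.split(separator.strip()) if part.strip()]


if __name__ == "__main__":
    recorded_by = join_values(["A. Smith", " B. Jones "])
    print(recorded_by)
    print(split_values(recorded_by))
```

Note what is lost: each value is a bare string, so anything that belongs to an individual entry (who made a previous identification, and when) has to be squeezed into that string or left out.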

There is a difference between having data in a field and requiring that field to have a value from among a legal set of values. The Darwin Core is simple in that it has minimal restrictions on the contents of fields. The term comments give recommendations about the use of controlled vocabularies and how to structure content wherever appropriate. Data contributors are encouraged to follow these recommendations as well as possible. You might argue that having no restrictions will promote "dirty" data (data of low quality or dubious value). Consider the simple axiom "It's not what you have, but what you do with it that matters." If data restrictions were in place at the fundamental level, then a record having any non-compliant data in any of its fields could not be shared via the standard. Not only would there be a dearth of shared data in that case (or an unused standard), but also there would be no way to use the standard to build shared data cleaning tools to actually improve the situation, nor to use data services to look up alternative representations (language translations, for example) to serve a broader audience. The rest is up to how the records will be used - in other words, it is up to applications to enforce further restrictions if appropriate, and it is up to the stakeholders of those applications to decide what the restrictions will be for the purpose the application is trying to serve.

1.5 How do I use Simple Darwin Core?

The Darwin Core is simple in that data "complying with" the Simple Darwin Core can be easily shared in a variety of ways, including, but not limited to, text files and XML documents.

What you need to do as a contributor of data via the Simple Darwin Core depends on the requirements of those who will consume the data. For example, if you have a collaborator who wants to share data via the Simple Darwin Core, then it may be sufficient to create a spreadsheet that contains column headers matching as many of the Darwin Core term names as you are both interested in sharing - just to be sure you both understand the meaning of the fields you share, and therefore hopefully something about their content. You might create a table in a database using the Simple Darwin Core as a model (if it met all of your needs), and then connect that database with services for sharing via the web. You might use that same database (or spreadsheet) to export a comma-separated values (CSV) file for upload into a hosted service that could serve the data on your behalf. Or you might use that same file to upload into a service that would allow you to add value (such as a georeference) or quality (with a data cleaning tool), or to see your data in the context of other shared data.
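As an illustration of the CSV export mentioned above, here is a minimal sketch using Python's standard csv module. The function name records_to_csv is hypothetical and the sample records are made up; the only thing taken from the Darwin Core is the use of term names as column headers.

```python
import csv
import io


def records_to_csv(records, field_names):
    """Serialize records (dicts keyed by Darwin Core term name) as CSV text.

    The first line holds the field names. Fields missing from a record are
    written as empty values, since no field is required.
    """
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=field_names, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [
        {"scientificName": "Ctenomys sociabilis", "kingdom": "Animalia"},
        {"scientificName": "Centropyge flavicauda"},
    ]
    print(records_to_csv(sample, ["scientificName", "kingdom", "country"]))
```

With the default settings, DictWriter raises a ValueError if a record contains a key that is not in field_names, which is a convenient guard against columns slipping in unannounced.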

1.5.1 Simple Darwin Core as Text

The Text Guide [TEXTGUIDE] describes how to construct and format a text file using a simplified subset of the Fielded Text [FIELDEDTEXT] specification, which allows the contributor to describe the contents of a text file, or set of text files (related or not) through a separate configuration file (called a metafile). The metafile allows the contributor to communicate the structure of the content of the file or files and any relationships between them. Though it is good practice to describe a Simple Darwin Core file with such a metafile, it isn't strictly necessary if the file follows the CSV file specification and the first line of the file contains the field names. A Fielded Text metafile for any text file based on the Simple Darwin Core can be created by customizing the example metafile [SIMPLEMETAFILE] (if this link shows a blank page in your browser, use the View Source option to see the XML document), which includes references to all Darwin Core terms. Refer to the comments in the file itself as well as the metafile specification in the Text Guide [TEXTGUIDE] for more information.
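When a file has no metafile, the consumer relies on the first line to learn the field names. A minimal sketch of reading such a file with Python's standard csv module follows; the function name read_simple_dwc is hypothetical, and the sketch assumes a comma-delimited file whose first line contains the field names.

```python
import csv
import io


def read_simple_dwc(text):
    """Parse CSV text whose first line contains the field names.

    Returns the list of field names and a list of records. Empty values are
    dropped, so an absent field and a blank field look the same.
    """
    reader = csv.DictReader(io.StringIO(text, newline=""))
    records = []
    for row in reader:
        records.append(
            {name: value for name, value in row.items() if name is not None and value}
        )
    return reader.fieldnames, records


if __name__ == "__main__":
    text = (
        "scientificName,kingdom\n"
        "Ctenomys sociabilis,Animalia\n"
        "Centropyge flavicauda,\n"
    )
    fields, records = read_simple_dwc(text)
    print(fields)
    print(records)
```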

1.5.2 Simple Darwin Core as XML

The XML Guide [XMLGUIDE] describes how to construct XML schemas to share data based on Darwin Core terms. Looking at the Simple Darwin Core XML Schema [SIMPLEXMLSCHEMA] using the XML Guide as a reference you will be able to see that the schema supports the notion of a SimpleDarwinRecord, which is just a grouping of up to one of each of the Darwin Core terms that are Properties (not Classes). The following example shows a SimpleDarwinRecordSet containing one SimpleDarwinRecord for a Taxon:

<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet
xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
<SimpleDarwinRecord>
<dc:modified>2006-05-04T18:13:51.0Z</dc:modified>
<dc:language>en</dc:language>
<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
<dwc:scientificNameID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:scientificNameID>
<dwc:acceptedNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=22010</dwc:acceptedNameUsageID>
<dwc:originalNameUsageID>http://research.calacademy.org/research/ichthyology/catalog/fishcatget.asp?spid=53548</dwc:originalNameUsageID>
<dwc:nameAccordingToID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=22764</dwc:nameAccordingToID>
<dwc:namePublishedInID>http://research.calacademy.org/research/ichthyology/catalog/getref.asp?id=671</dwc:namePublishedInID>
<dwc:scientificName>Centropyge flavicauda Fraser-Brunner 1933</dwc:scientificName>
<dwc:acceptedNameUsage>Centropyge fisheri (Snyder 1904)</dwc:acceptedNameUsage>
<dwc:parentNameUsage>Centropyge Kaup, 1860</dwc:parentNameUsage>
<dwc:originalNameUsage>Centropyge flavicauda Fraser-Brunner 1933</dwc:originalNameUsage>
<dwc:nameAccordingTo>Allen, G.R. 1980. Butterfly and angelfishes of the world. Volume II. Mergus Publishers. Pp. 149-352.</dwc:nameAccordingTo>
<dwc:namePublishedIn>Fraser-Brunner, A. 1933. A revision of the chaetodont fishes of the subfamily Pomacanthinae. Proceedings of the General Meetings for Scientific Business of the Zoological Society of London 1933 (pt 3, no.30): 543-599, Pl. 1.</dwc:namePublishedIn>
<dwc:higherClassification>Animalia;Chordata;Vertebrata;Osteichthyes;Actinopterygii;Neopterygii;Teleostei;Acanthopterygii;Perciformes;Percoidei;Pomacanthidae;Centropyge</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Osteichthyes</dwc:class>
<dwc:order>Perciformes</dwc:order>
<dwc:family>Pomacanthidae</dwc:family>
<dwc:genus>Centropyge</dwc:genus>
<dwc:specificEpithet>flavicauda</dwc:specificEpithet>
<dwc:scientificNameAuthorship>Fraser-Brunner 1933</dwc:scientificNameAuthorship>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:taxonomicStatus>accepted</dwc:taxonomicStatus>
</SimpleDarwinRecord>
</SimpleDarwinRecordSet>

The SimpleDarwinRecord acts as a Class in implementation, because all of the terms are properties of it. The Simple Darwin Core schema has just one other level of structure, the SimpleDarwinRecordSet, which is a grouping of one or more SimpleDarwinRecords. The SimpleDarwinRecordSet acts as a Class to define a data set during implementation.
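Because the structure is only two levels deep (a record set containing records, each containing properties), reading such a document takes very little code. The following sketch uses Python's standard xml.etree.ElementTree; the function name parse_record_set is hypothetical, and the sketch does not validate against the schema.

```python
import xml.etree.ElementTree as ET

# Default namespace of the record set, as declared in the example above.
SDR_NS = "http://rs.tdwg.org/dwc/xsd/simpledarwincore/"


def parse_record_set(xml_text):
    """Return one dict per SimpleDarwinRecord, keyed by local term name.

    Simplification: the namespace is dropped from each key, so two terms
    with the same local name in different namespaces would collide.
    """
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(f"{{{SDR_NS}}}SimpleDarwinRecord"):
        fields = {}
        for child in record:
            # ElementTree tags look like "{namespace}localName".
            local_name = child.tag.rsplit("}", 1)[-1]
            fields[local_name] = (child.text or "").strip()
        records.append(fields)
    return records


if __name__ == "__main__":
    sample = (
        '<SimpleDarwinRecordSet xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/" '
        'xmlns:dwc="http://rs.tdwg.org/dwc/terms/">'
        "<SimpleDarwinRecord>"
        "<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>"
        "<dwc:genus>Centropyge</dwc:genus>"
        "</SimpleDarwinRecord>"
        "</SimpleDarwinRecordSet>"
    )
    print(parse_record_set(sample))
```

Each resulting dict is a flat record, which is the same shape a row of a spreadsheet or CSV file would have.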

1.6 Doing more with Simple Darwin Core

Sooner or later you may want to share more information than the Simple Darwin Core seems to allow. For example, you and your colleagues might decide that it would be useful to have a standard way to exchange additional information relevant to questions in Conservation. How would you do it?

One way would be to try to "overload" existing terms by using them to hold information other than what was intended based on the definition of the terms. Please don't do this. If an existing term has close to the same meaning as one you want to use, but just doesn't quite fit because of the way the definition is worded, it would be better to request an amendment to the term definition so that it will be clear for your community how to use it. You can request such a change by submitting an issue in the Darwin Core Project [DWC-PROJECT].

Another way to get more out of the Darwin Core without adding a term is to "payload" the dynamicProperties term with structured content, as shown in the example below, using JavaScript Object Notation (JSON). This is perfectly legal, since it doesn't compromise the meaning of the term. One weakness of payloading data in this way is that the content lacks stable or well-defined semantics. Also, it is highly recommended to flatten the content into a single string with no non-printing characters (such as line feeds) to facilitate use in the widest variety of data sharing contexts. Still, this might be a reasonable way to at least allow you to share all of your data, even if there might be problems with people using it reliably.

<?xml version="1.0" encoding="UTF-8"?>
<SimpleDarwinRecordSet
xmlns="http://rs.tdwg.org/dwc/xsd/simpledarwincore/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:dwc="http://rs.tdwg.org/dwc/terms/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://rs.tdwg.org/dwc/xsd/simpledarwincore/ http://rs.tdwg.org/dwc/xsd/tdwg_dwc_simple.xsd">
<SimpleDarwinRecord>
<dc:modified>2009-02-12T12:43:31</dc:modified>
<dc:language>en</dc:language>
<dwc:basisOfRecord>Taxon</dwc:basisOfRecord>
<dwc:scientificName>Ctenomys sociabilis</dwc:scientificName>
<dwc:acceptedNameUsage>Ctenomys sociabilis Pearson and Christie, 1985</dwc:acceptedNameUsage>
<dwc:parentNameUsage>Ctenomys Blainville, 1826</dwc:parentNameUsage>
<dwc:higherClassification>Animalia; Chordata; Vertebrata; Mammalia; Theria; Eutheria; Rodentia; Hystricognatha; Hystricognathi; Ctenomyidae; Ctenomyini; Ctenomys</dwc:higherClassification>
<dwc:kingdom>Animalia</dwc:kingdom>
<dwc:phylum>Chordata</dwc:phylum>
<dwc:class>Mammalia</dwc:class>
<dwc:order>Rodentia</dwc:order>
<dwc:family>Ctenomyidae</dwc:family>
<dwc:genus>Ctenomys</dwc:genus>
<dwc:specificEpithet>sociabilis</dwc:specificEpithet>
<dwc:taxonRank>species</dwc:taxonRank>
<dwc:scientificNameAuthorship>Pearson and Christie, 1985</dwc:scientificNameAuthorship>
<dwc:nomenclaturalCode>ICZN</dwc:nomenclaturalCode>
<dwc:namePublishedIn>Pearson O. P., and M. I. Christie. 1985. Historia Natural, 5(37):388</dwc:namePublishedIn>
<dwc:taxonomicStatus>valid</dwc:taxonomicStatus>
  <dwc:dynamicProperties>{"iucnStatus":"vulnerable", "distribution":"Neuquén, Argentina"}</dwc:dynamicProperties> 
</SimpleDarwinRecord>
</SimpleDarwinRecordSet>
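Producing a flattened, single-line payload like the dynamicProperties value above is straightforward with a JSON library. The sketch below uses Python's standard json module; the function names are hypothetical, and the keys iucnStatus and distribution are simply the ones from the example, not Darwin Core terms.

```python
import json


def pack_dynamic_properties(properties):
    """Serialize a dict as single-line JSON for the dynamicProperties field.

    Without an indent argument, json.dumps emits no raw line feeds; any
    line feed inside a value is escaped as the two characters backslash-n.
    """
    return json.dumps(properties, ensure_ascii=False, sort_keys=True)


def unpack_dynamic_properties(text):
    """Parse a dynamicProperties value back into a dict (empty if blank)."""
    if not text or not text.strip():
        return {}
    return json.loads(text)


if __name__ == "__main__":
    payload = pack_dynamic_properties(
        {"iucnStatus": "vulnerable", "distribution": "Neuquén, Argentina"}
    )
    print(payload)
    print(unpack_dynamic_properties(payload))
```

The consumer still has to know what the keys mean; the code only guarantees that the value is well-formed JSON on one line.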

If you were using just CSV text files to exchange information, then you might be tempted to just add the new fields to the files. This approach suffers most of the same problems as payloading - no one aside from those with whom you communicated would know what those new fields were or how to use them. Sharing in this way via XML would be an even bigger problem, because the Simple Darwin Core XML Schema [SIMPLEXMLSCHEMA] defines the terms that it supports and the new fields would not correspond with any terms understood by the schema. In other words, the XML with your fields in it would not be a valid Simple Darwin Core XML document.
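A consumer can at least detect such unannounced fields by comparing the column headers with the term names it knows. The following is a sketch under stated assumptions: KNOWN_TERMS is a small illustrative subset of term names taken from the examples in this document, not the full list, and unknown_fields is a hypothetical helper.

```python
# Illustrative subset only; a real check would load the full list of
# Darwin Core term names.
KNOWN_TERMS = {
    "basisOfRecord", "scientificName", "kingdom", "phylum", "class",
    "order", "family", "genus", "specificEpithet", "taxonRank",
    "dynamicProperties",
}


def unknown_fields(header, known_terms=KNOWN_TERMS):
    """Return the header names that are not recognized term names."""
    return [name for name in header if name not in known_terms]


if __name__ == "__main__":
    print(unknown_fields(["scientificName", "iucnStatus", "kingdom"]))
```

Detecting an unknown field does not tell the consumer what it means, which is the underlying problem with adding fields outside the standards process.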

So, if you really need to extend the capabilities of Darwin Core, the best first step is to follow the standards process to add the terms you need. The mechanisms for pursuing this are explained in the Darwin Core Namespace Policy [NAMESPACEPOLICY]. The process will help to assure that the new terms are well conceived, that they don't conflict with existing terms, and that they are properly defined in the broader context of biological diversity information.

1.7 Going beyond Simple Darwin Core

For cases where rich data require rich (non-simple) structure, the Simple Darwin Core alone is not suitable. When sharing information via fielded text [FIELDEDTEXT], the solution is to use the Simple Darwin Core as a core record with one or more associated extensions for the additional information. See the Darwin Core Text Guide [TEXTGUIDE] for an explanation and examples.
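The idea of a core record with extensions can be sketched as splitting a nested record into one flat core row plus any number of extension rows that point back to it. In the sketch below, the linking column name coreid, the use of occurrenceID as the identifier, the nested "identifications" key, and the function name are all assumptions for illustration; the Text Guide is the authority on how the link is actually declared.

```python
def split_core_and_extension(specimens):
    """Split nested specimen dicts into flat core rows and extension rows.

    Each specimen is assumed to carry an "occurrenceID" and an optional
    "identifications" list (a hypothetical input shape for this sketch).
    """
    core_rows = []
    extension_rows = []
    for specimen in specimens:
        # Everything except the nested list fits in a simple core record.
        core = {k: v for k, v in specimen.items() if k != "identifications"}
        core_rows.append(core)
        for identification in specimen.get("identifications", []):
            # Each extension row refers back to its core record.
            row = {"coreid": specimen["occurrenceID"]}
            row.update(identification)
            extension_rows.append(row)
    return core_rows, extension_rows


if __name__ == "__main__":
    specimens = [
        {
            "occurrenceID": "urn:example:1",  # made-up identifier
            "catalogNumber": "12345",
            "identifications": [
                {"scientificName": "Ctenomys sp.", "dateIdentified": "1985"},
                {"scientificName": "Ctenomys sociabilis", "dateIdentified": "1990"},
            ],
        }
    ]
    core, extension = split_core_and_extension(specimens)
    print(core)
    print(extension)
```

The core rows remain valid Simple Darwin Core records on their own, while the identification history, which could not be repeated within a single record, moves into its own rows.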

When sharing information via XML [XML], the solution is to use a richer structure such as the Access to Biological Collections Data schema [ABCD], the Generic Darwin Core [GENERICXMLSCHEMA], or another schema built from the Darwin Core terms to suit the use of the data in a particular context. See the Darwin Core XML Guide [XMLGUIDE] for examples and references to model schemas.

\ No newline at end of file
diff --git a/terms/decisions.html b/terms/decisions.html
index 6a10820..94e4b9b 100644
--- a/terms/decisions.html
+++ b/terms/decisions.html
@@ -69,10 +69,10 @@

Term decisions

-

Introduction

+

1. Introduction

From time to time changes are proposed to Darwin Core terms through the process described in the Term Change Policy section of the Darwin Core Namespace Policy [NAMESPACEPOLICY]. This document shows the outcome of decisions based on officially proposed changes.

-

Decisions

+

2. Decisions