wiki-archive/twiki/data/SDD/Charter.txt,v

head 1.22;
access;
symbols;
locks; strict;
comment @# @;
1.22
date 2009.11.25.03.45.40; author GarryJolleyRogers; state Exp;
branches;
next 1.21;
1.21
date 2009.11.20.02.45.22; author LeeBelbin; state Exp;
branches;
next 1.20;
1.20
date 2007.05.02.10.17.35; author GregorHagedorn; state Exp;
branches;
next 1.19;
1.19
date 2007.03.06.17.30.00; author TWikiGuest; state Exp;
branches;
next 1.18;
1.18
date 2006.12.29.21.28.25; author BobMorris; state Exp;
branches;
next 1.17;
1.17
date 2006.12.29.20.40.24; author GregorHagedorn; state Exp;
branches;
next 1.16;
1.16
date 2006.12.29.19.44.20; author BobMorris; state Exp;
branches;
next 1.15;
1.15
date 2006.11.29.16.16.19; author GregorHagedorn; state Exp;
branches;
next 1.14;
1.14
date 2006.11.29.03.27.23; author BobMorris; state Exp;
branches;
next 1.13;
1.13
date 2006.11.28.11.21.55; author GregorHagedorn; state Exp;
branches;
next 1.12;
1.12
date 2006.11.25.17.06.37; author BobMorris; state Exp;
branches;
next 1.11;
1.11
date 2006.11.22.15.05.28; author GregorHagedorn; state Exp;
branches;
next 1.10;
1.10
date 2006.05.03.13.24.28; author GregorHagedorn; state Exp;
branches;
next 1.9;
1.9
date 2006.05.02.15.05.52; author GregorHagedorn; state Exp;
branches;
next 1.8;
1.8
date 2006.04.24.09.48.35; author GregorHagedorn; state Exp;
branches;
next 1.7;
1.7
date 2006.04.11.08.32.08; author GregorHagedorn; state Exp;
branches;
next 1.6;
1.6
date 2006.04.09.18.03.36; author GregorHagedorn; state Exp;
branches;
next 1.5;
1.5
date 2006.04.06.13.20.18; author GregorHagedorn; state Exp;
branches;
next 1.4;
1.4
date 2006.04.06.10.16.39; author KevinThiele; state Exp;
branches;
next 1.3;
1.3
date 2006.02.26.21.14.03; author BobMorris; state Exp;
branches;
next 1.2;
1.2
date 2006.02.23.11.25.00; author GregorHagedorn; state Exp;
branches;
next 1.1;
1.1
date 2006.02.23.09.04.53; author GregorHagedorn; state Exp;
branches;
next ;
desc
@none
@
1.22
log
@none
@
text
@%META:TOPICINFO{author="GarryJolleyRogers" date="1259120740" format="1.1" reprev="1.22" version="1.22"}%
%META:TOPICPARENT{name="WebHome"}%
---+ Charter for the <br />"Structured Descriptive Data" interest group
As required by the [[http://www.tdwg.org/about-tdwg/process/][process]], we have two different charters. First the general
* BDI SDD [[Interest Group Charter]]
and charter of task groups to advance the previous SDD standard under:
* BDI SDD [[Schema Task Group Charter]]
Other task groups may be formed in the future; initiatives are welcome (see [[http://www.tdwg.org/activities/tasks/ojs/how-to-submit-a-charter-for-review/][How to submit a charter]]).
@
1.21
log
@none
@
text
@d1 2
a2 2
%META:TOPICINFO{author="LeeBelbin" date="1258685122" format="1.1" reprev="1.21" version="1.21"}%
%META:TOPICPARENT{name="BDI.SDD"}%
d6 3
a8 3
* BDI.SDD [[Interest Group Charter]]
and charter of task groups to advance the previous BDI.SDD standard under:
* BDI.SDD [[Schema Task Group Charter]]
@
1.20
log
@none
@
text
@d1 2
a2 2
%META:TOPICINFO{author="GregorHagedorn" date="1178101055" format="1.1" reprev="1.20" version="1.20"}%
%META:TOPICPARENT{name="WebHome"}%
d6 3
a8 3
* SDD [[Interest Group Charter]]
and charter of task groups to advance the previous SDD standard under:
* SDD [[Schema Task Group Charter]]
@
1.19
log
@Added topic name via script
@
text
@d1 1
a1 3
---+!! %TOPIC%
%META:TOPICINFO{author="BobMorris" date="1167427705" format="1.1" version="1.18"}%
d3 1
a3 150
---+ CHARTER for the "TDWG Structured Descriptive Data" group
*As of 22. Nov. we have been asked to revise the Charter to follow a new template. In the coming days, we will reorganize our charter into the following template:*
*1. Convener*
* Gregor Hagedorn, Biologische Bundesanstalt für Land und Forstwirtschaft, Königin-Luise-Str. 19, 14195 Berlin, Germany. Email: name [at] bba.de, replace "name" with "g.hagedorn"
* (The convener is the principal point of contact for group members or external people interested in collaboration. It is the person responsible for reporting to TDWG and the TDWG Executive Committee on the group's activities.)
*2. Core members of the group are:*
* Gregor Hagedorn (BBA, Germany),
* Robert Morris (UMass, Boston, USA),
* Kevin Thiele (University of Queensland, Australia),
* Bryan Heidorn (University of Illinois, USA)
* (Core members could explain to an outsider the purpose, justification and current activities of the group. Any core member could substitute for the Convener when required.)
* (Further contributors are acknowledged under [[#SddBackground]["Background"]], below.)
*3. Motivation*
<!-- Commented out:
* TEMPLATE QUESTION: Why is this group needed? What is its niche?
* TEMPLATE QUESTION: Motivation should differentiate the group's activities from other TDWG groups and from groups in other standards organizations.
* TEMPLATE QUESTION: Any historical reasoning should be placed in the 'History' field.
-->
In taxonomy, descriptive data take a number of very different forms. Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon, or occasionally of an individual specimen. They may be simple, short and written in plain language, as when used for a popular field guide, or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment. Dichotomous keys are specialised identification tools comprising fragments of descriptive data arranged in couplets forming a branching tree. Each fragment (traditionally called a "lead") usually comprises a short natural-language description. Trees in which a node may have more than two leads, more properly called "polytomous keys", are colloquially included in this notion as well. Coded descriptions comprise highly structured data used in computer identification and analysis programs such as Lucid ([[http://www.lucidcentral.org][www.lucidcentral.org]]), DELTA and a suite of phylogenetic analysis programs such as PAUP. Raw data descriptions (Box 1.2.4) usually comprise repeated measurements of parts of individual specimens, and are the basis from which the more abstracted descriptions in natural language and coded descriptions are derived. Few taxonomists consistently record and archive their raw data in a standardised format. The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms mentioned above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless exchange of data between existing and future software platforms, including identification, data-mining and analysis tools, and federated databases.
Scope of SDD: SDD documents may be used to express descriptions of biological taxa, specimens, and non-biological objects or classes.
* SDD documents may include all or some of the following:
* terminologies (e. g., characters and states, modifiers, or character trees with higher concepts)
* character ontologies (currently through character trees, but more fundamental ontologies are planned for the future)
* structured (coded) data, typically such as is found in databases or taxon-by-character matrices
* sample data (e. g., measurements)
* unstructured natural language data
* natural language data with markup
* dichotomous or polytomous keys
* resources associated with descriptions (e. g., images, references, links)
* SDD is currently not designed to accommodate:
* molecular sequence and other genetic data (although these may be considered in future versions)
* occurrence and specimen data and representations of these (e. g., distribution maps)
* complex ecological data such as models and ecological observations
* organism interaction data like host-parasite, plant-pollinator, predator-prey
* nomenclatural and formal systematic (rank) information
Some developers of SDD-compliant software treat organism interactions as character data, or use other methods that are not fully interoperable (compare TaxonHierarchyNotReferencedAnywhere).
*4. Goals, Outputs and Outcomes*
<!-- Commented out:
* TEMPLATE QUESTION: Interest Groups should have a few annual goals.
* TEMPLATE QUESTION: Task groups require specific deliverables by nominated dates
* TEMPLATE QUESTION: Deliverables are outputs that can be readily used by people outside the group.
* TEMPLATE QUESTION: Meetings and discussions are not deliverables. Reports, software and standards are deliverables.
* TEMPLATE QUESTION: Each deliverable should be accompanied by a timescale and list of dependencies.
* TEMPLATE QUESTION: Outputs are hosted and supported by the parent Interest Group
-->
The goal of the group is to develop standard computer-based mechanisms for expressing and transferring descriptive information about biological organisms or taxa (as well as similar entities such as diseases), including terminologies, ontologies, descriptions, identification tools and associated resources.
A central goal of the Interest Group is to maintain a repository containing the current schema and its evolution history, sample documents, and selected open-source tools for the support of the use of SDD. This is currently a Subversion repository whose current location is always found on the SDD Wiki.
It will be reviewed semi-annually by the Core Group for its currency and relevance to the goals of the SDD standard.
The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
* The SDD Standard:
* provides a flexible, platform-independent data structure for the capture and storage of taxonomic descriptions
* provides data structures for the support of multi-entry (interactive matrix-based keys) as well as authored polytomous identification keys (traditional keys)
* comprises a superset of data requirements of all known programs managing descriptive data
* provides extension beyond existing programs where data requirements can be predicted
* is readily extensible to account for future developments and data requirements
* is human-readable (although it is assumed that in almost all cases standard descriptions will be machine-generated and processed)
* is XML-based, and provides a schema for validating documents and for use with schema compilers such as XMLBeans to generate schema-based SDD tools
* It facilitates:
* lossless porting of data between standard-aware applications
* achievable progressive markup of legacy descriptions, particularly natural-language descriptions
* comparability and combinability of alternate descriptions of any one taxon
* efficient reusable descriptions serving multiple purposes
* archiving and sharing of raw and processed data
*5. Strategy*
The most effective strategy since 2001 has been found to be face-to-face [[MeetingMinutes][meetings]]. These meetings help to focus the work and to take into account uncertainties about the way forward that cannot be resolved purely by logical argument. As funding allows, the Core Group will meet at least once a year between the annual TDWG meetings, and hold an open meeting at each TDWG meeting. Members of the Core Group will share their experiences implementing SDD-compliant software and promote, as widely as possible, open-source software that complies with the standard.
Wikis have similarly proved to be an effective means of documentation, and source code repositories an effective way to manage the evolution of the formal representation of the standard.
The Core Group manages its development with a source code control system, presently the Subversion repository mentioned above. When a point of stability is reached, a Release Candidate is frozen. Development continues through a series of Release Candidates until a version is deemed acceptable for release. Releases are copied to the wiki for public comment and ultimate submission to TDWG for approval.
*6. Becoming Involved*
<!-- Commented out:
* TEMPLATE QUESTION: Required for Interest Groups and recommended for Task Groups.
* TEMPLATE QUESTION: How could an individual reading this help the group?
* TEMPLATE QUESTION: What skills are currently in need
* TEMPLATE QUESTION: Who should be contacted?
-->
All interested parties are particularly encouraged to comment on the SDD Schema and its supplementary documentation and sample documents, and to contribute further to these collections at the SDD wiki. Membership in the group is open to any interested parties, and anyone implementing SDD-compliant software becomes a de facto member of the core group if they wish to participate actively in the development of the schema and its document sets. SDD development and understanding rest on two skills: experience with the representation technologies involved and experience with taxonomic identification and description tools. It is usually most productive if a participant has substantial experience in one of these areas and at least slight experience in the other.
The core SDD group is considering defining a subset of the current schema, "SDD Lite", with the particular goal of producing an RDF representation of the main concerns of SDD, in furtherance of TDWG's goal to have RDF representations of its major ontologically related standards. Hence, the SDD Interest Group especially seeks people with interest and suitable experience in the use of Semantic Web technologies for describing taxa.
*7. History and context*
Versions:
* Current Standard: Version 1 (endorsed by TDWG October 2005)
* Most Recent Version: [[Version 1dot1][SDD Version 1.1]] (April 28, 2006)
* Working Draft Version: Version 1.2
#SddBackground History and background
TDWG endorsed the DELTA (Description Language for Taxonomy) format as a standard for representation of taxonomic descriptions in the 1980s. The SDD subgroup was established in 1998 as a subgroup of the Taxonomic Databases Working Group (TDWG, www.tdwg.org) of the International Union of Biological Sciences (IUBS), in response to recognition that a program-independent, non-proprietary standard based on current data interchange techniques was needed.
The subgroup has met many times since 1998, and conducted discussions by [[http://www.diversitycampus.net/Projects/TDWG-SDD//SDD-EmailList.html][email list]] and wiki pages. It has considered the needs of a wide variety of existing programs that manage, produce and consume biological descriptions, as well as incorporating new ideas that may be implemented in the future.
The SDD subgroup began discussing issues and scoping the standard through an email discussion group established in November 1999 (see the [[http://www.diversitycampus.net/Projects/TDWG-SDD//SDD-EmailList.html][SDD email list archives]]). This resulted in broad participation, but because of an extremely wide spectrum of expectations and approaches, the discussion did not make substantial progress or converge.
The most effective strategy since 2001 has been found to be face-to-face [[MeetingMinutes][meetings]]. These meetings help to focus the work and to take into account uncertainties about the way forward that cannot be resolved purely by logical argument.
The major meetings so far were: Canberra, Nov. 2001; Sao Paulo, Oct. 2002; Paris, Feb. 2003; Lisbon, October 2003; Berlin, May 2004; Christchurch, Oct. 2004; St. Petersburg, Sept. 2005, and Berlin, April 2006. Over 60 people contributed to these discussions. The help, criticism and energy of Jacob Asiedu, Nicolas Bailly, Damian Barnier, Donald Hobern, Trevor Patterson, Guillaume Rousse, and Steve Shattuck are especially acknowledged.
Descriptive data, unlike specimen databases and name services, usually reside in many dispersed and independent data files. With few exceptions these are not provided by large organisations.
Descriptive data range from unstructured natural language to highly structured (coded) data. Each dataset typically has an independent ontology. SDD has been designed to accommodate the current complexity, but also provide means for further (voluntary) standardizations of ontologies.
The SDD (Structured Descriptive Data) XML schema defines a method to encode descriptive data in biology and other subjects. The primary goal of the design is to increase knowledge, and the availability of knowledge, about the diversity of life on earth. However, it may be used in many other areas (including medicine, pathology, archeology, anthropology) wherever objects or classes of objects are described for later reidentification. It is hoped that this standard will reach general acceptance and become a successor to existing standards like DELTA or NEXUS.
We expect that the development of SDD will make a valuable contribution to the future development of structured online monographs or species pages that include descriptive data as well as other biodiversity data.
*8. Summary*
<!-- Commented out:
* TEMPLATE QUESTION: A concise (<500 word) public summary of the group for use on the TDWG web site and in general literature about TDWG.
-->
The Interest Group for Structured Descriptive Data (SDD) designs, proposes, and maintains standards for the representation of descriptive data about taxa and specimens. SDD is used particularly in support of interoperability and exchange mechanisms for software packages and web services handling descriptive data (e. g., "species banks" and interactive identification). Its principal target audience comprises developers of software, databases and web sites supporting the identification and description of organisms and taxa in the field or the laboratory. In turn, the audience for such software tools might be scientists (including taxonomists and systematists as well as ecologists or people in conservation work), practitioners (including quarantine officers and workers in disease control), educators (primary as well as secondary level), or decision makers concerned with precise descriptions of the biota of the planet.
*9. Resources*
* The primary home address for communications of this group is: *[[http://wiki.tdwg.org/twiki/bin/view/SDD/WebHome]]*
* Current services using SDD are:
* [[http://www.lucidcentral.org/][Lucid]],
* [[http://efg.cs.umb.edu/][UMASS-Boston Electronic Field Guide Project (EFG)]],
* [[http://www.identifylife.org/][IdentifyLife]],
* [[http://iris.biosci.ohio-state.edu/hymenoptera/][Hymenoptera On-Line Database]] and
* [[http://www.isrl.uiuc.edu/~openkey/][OpenKey]]
*10. OUTPUTS and OUTCOMES (and timeframe):*
The principal products of the SDD group are the SDD standard (SDD 1.0 endorsed by TDWG 2005), and the discussion framework captured on the [[SDD.WebHome][SDD Wiki]].
SDD 1.1 is nearing completion at the end of 2006 and will be proposed for adoption in 2007.
Investigation of, and a draft proposal for, an RDF representation of a subset of SDD will be completed in 2007.
d5 5
a9 1
SDD is under investigation by some members of the Core Group as a vehicle for incremental markup of taxonomic treatments in digitized legacy systematics literature.
@
1.18
log
@none
@
text
@d1 2
@
1.17
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1167424823" format="1.1" version="1.17"}%
d45 3
d60 3
a62 1
The Interest Group maintains a Subversion repository whose current location is always found on the SDD Wiki. A central goal of the Interest Group is to keep contributions to the corpus of applications available in this or a similar repository ##I don't understand this - I believe applications will normally NOT be in the repository##. It will be reviewed semi-annually by the Core Group for its currency and relevance to the goals of the SDD standard.
d85 2
a144 2
==NUMBERING?==
a148 1
The SDD Standard is currently relatively immature but stable. It is expected that ongoing experience will result in further developments.
d150 1
a150 1
SDD 1.2 is nearing completion at the end of 2006 and will be proposed for adoption in 2007.
@
1.16
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="BobMorris" date="1167421460" format="1.1" version="1.16"}%
d20 1
d24 1
d46 1
d53 1
d57 1
a57 1
The Interest Group maintains a Subversion repository whose current location is always found on the SDD Wiki. A central goal of the Interest Group is to keep contributions to the corpus of applications available in this or a similar repository. It will be reviewed semi-annually by the Core Group for its currency and relevance to the goals of the SDD standard
d82 1
d87 1
d122 1
d124 1
d126 1
a126 1
The Interest Group for Structured Descriptive Data (SDD) designs, proposes, and maintains standards for the representation of descriptive data about taxa and specimens. Its principal target audience comprises developers of software, databases and web sites supporting the identification and description of organisms and taxa in the field or the laboratory. In turn, the audience for such tools might be scientists, educators, or decision makers concerned with precise descriptions of the biota of the planet.
d138 1
a138 7
*9. AUDIENCE:*
Current and future users of SDD-enabled systems include taxonomists and systematists, ecologists, people in conservation agencies, school teachers, naturalists, quarantine officers, workers in disease control, etc.
In its direct form SDD is used by developers of software addressing these audiences. It is used particularly in support of interoperability and exchange mechanisms for software packages and web services handling descriptive data (e. g., "species banks" and interactive identification).
SDD is under investigation by some members of the Core Group as a vehicle for incremental markup of taxonomic treatments in digitized legacy systematics literature.
d150 1
a150 6
@
1.15
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1164816979" format="1.1" version="1.15"}%
d9 1
a9 1
* (The convener is the principal point of contact for group members or external people interested in collaboration. It is the person responsible for reporting to TDWG and the TDWG Executive Committee on the group's activities.)
d13 1
a13 1
* Bob Morris (UMAss, Boston, USA),
d24 1
a24 1
In taxonomy, descriptive data takes a number of very different forms. Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon. or occasionally of an individual specimen. They may be simple, short and written in plain language, as when used for a popular field guide, or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment. Dichotomous keys are specialised identification tools comprising fragments of descriptive data arranged in couplets forming a branching tree. Each fragment (traditionally called a "lead") comprises a usually small natural-language description. Coded descriptions comprise highly structured data used in computer identification and analysis programs such as Lucid (<a href="http://www.lucidcentral.org" rel="nofollow">www.lucidcentral.org</a>) , DELTA and a suite of phylogenetic analysis programs such as PAUP. Raw data descriptions (Box 1.2.4) usually comprise repeated measurements of parts of individual specimens, and are the basis from which the more abstracted descriptions in natural language and coded descriptions are derived. Few taxonomists consistently record and archive their raw data in a standardised format. The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
d28 3
a30 3
* terminologies (e. g., characters and states, modifiers, character trees with higher concepts)
* character ontologies (currently through char. trees, plans for more fundamental ontologies are planned for the future)
* structured (coded) data
d53 2
d73 1
a73 3
The SDD subgroup began discussing issues and scoping the standard through an email discussion group established in November 1999 (see the [[http://www.diversitycampus.net/Projects/TDWG-SDD//SDD-EmailList.html][SDD email list archives]]). This resulted in broad participation, but as a result of an extremely wide spectrum of expectations and approaches the discussion did not make substantial progress or convergence.
The most effective strategy since 2001 has been found to be face-to-face [[MeetingMinutes][meetings]]. These meetings help to focus and to take uncertainties in the way to go, which can not be purely resolved by logical argument, into account.
d75 1
a75 1
The use of Wikis has been similarly found to be an effective strategy of documentation.
d83 4
d100 4
d118 2
d124 1
a124 1
* [[http://wiki.cs.umb.edu/][EFG]],
a128 2
-----
-----
d130 1
a130 1
*MATERIAL BELOW HAS NOT YET BEEN REWORKED INTO THE TEMPLATE ABOVE:*
d132 1
a132 1
*9. AUDIENCE:* Current and future users of SDD-enabled systems include taxonomists and systematists, ecologists, people in conservation agencies, school teachers, naturalists, quarantine officers, workers in disease control, etc.
d136 16
a152 1
*10. OUTPUTS and OUTCOMES (and timeframe):* The principal products of the SDD group are the SDD standard (SDD 1.0 endorsed by TDWG 2005), and the discussion framework captured on the [[SDD.WebHome][SDD Wiki]].
a153 1
The SDD Standard is currently relatively immature but stable. It is expected that ongoing experience is resulting in further developments.
@
1.14
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="BobMorris" date="1164770843" format="1.1" version="1.14"}%
d3 1
a3 1
---+ SDD CHARTER
d5 1
a5 1
*PROPOSED CHARTER, put up for comment!*
d7 16
a22 1
*1. NAME:* Structured Descriptive Data (SDD, a TDWG subgroup)
d24 1
a24 4
*2. VERSION HISTORY* <br/>
Current Standard: Version 1 (endorsed by TDWG October 2005)<br/>
Most Recent Version: [[Version 1dot1][SDD Version 1.1]] (April 28, 2006)<br/>
Working Draft Version: Version 1.2
d26 1
a26 29
*3. CONVENER:* Gregor Hagedorn, Biologische Bundesanstalt für Land und Forstwirtschaft, Königin-Luise-Str. 19, 14195 Berlin, Germany. Email: name [at] bba.de, replace "name" with "g.hagedorn"
*4. CORE MEMBERS:* Current core members of the group are:
Gregor Hagedorn (BBA, Germany),
Bob Morris (UMAss, Boston, USA),
Kevin Thiele (University of Queensland, Australia),
Bryan Heidorn (University of Illinois, USA)
Further contributors are acknowledged under [[#SddBackground]["Background"]], below.
*5. HOME ADDRESS:* [[http://wiki.tdwg.org/twiki/bin/view/SDD/WebHome]]
*6. PURPOSE:* To develop standard computer-based mechanisms for expressing and transferring descriptive information about biological organisms or taxa (as well as similar entities such as diseases), including terminologies, ontologies, descriptions, identification tools and associated resources.
#SddBackground *7. BACKGROUND:* TDWG endorsed the DELTA (Descriptive Language for Taxonomy) format as a standard for representation of taxonomic descriptions in the 1980's. The SDD subgroup was established 1998 as a subgroup of the Taxonomic Databases Working Group (TDWG, www. tdwg.org) of the International Union of Biological Sciences (IUBS), in response to recognition that a program-independent, non-proprietary standard based on current data interchange techniques was needed.
The subgroup has met many times since 1998, and conducted discussions by [[http://www.diversitycampus.net/Projects/TDWG-SDD//SDD-EmailList.html][email list]] and wiki pages. It has considered the needs of a wide variety of existing programs that manage, produce and consume biological descriptions, as well as incorporating new ideas that may be implemented in the future.
The major meetings so far were: Canberra, Nov. 2001; Sao Paulo, Oct. 2002; Paris, Feb. 2003; Lisbon, October 2003; Berlin, May 2004; Christchurch, Oct. 2004; St. Petersburg, Sept. 2005, and Berlin, April 2006. Over 60 people contributed to these discussions. However, the help, criticism and energy of Jacob Asiedu, Nicolas Bailly, Damian Barnier, Donald Hobern, Trevor Patterson, Guillaume Rousse, and Steve Shattuck is especially acknowledged.
Descriptive data, unlike specimen databases and name services, usually reside in many dispersed and independent data files. With few exceptions these are not provided by large organisations.
Descriptive data range from unstructured natural language to highly structured (coded) data. Each dataset typically has an independent ontology. SDD has been designed to accommodate the current complexity, but also provide means for further (voluntary) standardizations of ontologies.
The SDD (Structured Descriptive Data) xml schema defines a method to encode descriptive data in biology and other subjects. The primary goal of the design is to increase the knowledge and availability of knowledge about the diversity of life on earth. However, it may be used in many other areas (including medicine, pathology, archeology, anthropology) wherever objects or classes of objects are described for later reidentification. It is hoped that this standard will reach general acceptance to become a successor to existing standards like DELTA or NEXUS.
For the future we expect that the development of SDD forms a valuable contribution to future development of structured online monographs or species pages that include descriptive data as well as other biodiversity data.
*8. SCOPE:* SDD documents may be used to express descriptions of biological taxa, specimens, and non-biological objects or classes.
d43 7
a49 1
*9. AUDIENCE:* Current and future users of SDD-enabled systems include taxonomists and systematists, ecologists, people in conservation agencies, school teachers, naturalists, quarantine officers, workers in disease control, etc.
d51 1
a51 1
In its direct form SDD is used by developers of software addressing these audiences. It is used particularly in support of interoperability and exchange mechanisms for software packages and web services handling descriptive data (e. g., "species banks" and interactive identification).
d53 15
a67 3
*10. OUTPUTS and OUTCOMES (and timeframe):* The principal products of the SDD group are the SDD standard (SDD 1.0 endorsed by TDWG 2005), and the discussion framework captured on the [[SDD.WebHome][SDD Wiki]].
The SDD Standard is currently relatively immature but stable. It is expected that ongoing experience is resulting in further developments.
d69 1
a69 3
Current services using SDD are [[http://www.lucidcentral.org/][Lucid]], [[http://wiki.cs.umb.edu/][EFG]], [[http://www.identifylife.org/][IdentifyLife]], [[http://iris.biosci.ohio-state.edu/hymenoptera/][Hymenoptera On-Line Database]] and [[http://www.isrl.uiuc.edu/~openkey/][OpenKey]]
*11. STRATEGY:* (What general approaches, principles or strategies are to be used by the group to achieve the outputs and outcomes)
d76 25
d102 16
d119 2
a120 1
----
d122 1
a122 1
__Material to be perhaps incorporated:__
d124 3
a126 1
The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
a127 1
<b>The SDD Standard:</b>
d129 1
a129 14
* provides a flexible, platform-independent data structure for the capture and storage of taxonomic descriptions
* provides data structures for the support of multi-entry and authored polytomous identification keys
* comprises a superset of data requirements of all known programs managing descriptive data
* provides extension beyond existing programs where data requirements can be predicted
* is readily extensible to account for future developments and data requirements
* is human-readable (although it is assumed that in almost all cases standard descriptions will be machine-generated and processed)
* is XML-based, and provides a schema for validation of documents and for use with schema compilers such as XMLBeans to generate schema-based SDD tools.
<b>It facilitates:</b>
* lossless porting of data between standard-aware applications
* achievable progressive markup of legacy descriptions, particularly natural-language descriptions
* comparability and combinability of alternate descriptions of any one taxon
* efficient reusable descriptions serving multiple purposes
* archiving and sharing of raw and processed data
d131 1
a131 1
Motivation: In taxonomy, descriptive data takes a number of very different forms. Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon, or occasionally of an individual specimen. They may be simple, short and written in plain language, as when used for a popular field guide, or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment. Dichotomous keys are specialised identification tools comprising fragments of descriptive data arranged in couplets forming a branching tree. Each fragment (traditionally called a "lead") comprises a usually short natural-language description. Coded descriptions comprise highly structured data used in computer identification and analysis programs such as Lucid (<a href="http://www.lucidcentral.org" rel="nofollow">www.lucidcentral.org</a>), DELTA and a suite of phylogenetic analysis programs such as PAUP. Raw data descriptions (Box 1.2.4) usually comprise repeated measurements of parts of individual specimens, and are the basis from which the more abstracted descriptions in natural language and coded descriptions are derived. Few taxonomists consistently record and archive their raw data in a standardised format. The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.@
1.13
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1164712915" format="1.1" version="1.13"}%
d24 1
a24 1
*5. HOME ADDRESS:* [[http://wiki.cs.umb.edu/twiki/bin/view/SDD/]]
@
1.12
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="BobMorris" date="1164474397" format="1.1" version="1.12"}%
d101 1
a101 1
Motivation: In taxonomy, descriptive data takes a number of very different forms. Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon, or occasionally of an individual specimen. They may be simple, short and written in plain language, as when used for a popular field guide, or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment. Dichotomous keys are specialised identification tools comprising fragments of descriptive data arranged in couplets forming a branching tree. Each fragment (traditionally called a "lead") comprises a usually short natural-language description. Coded descriptions comprise highly structured data used in computer identification and analysis programs such as Lucid (<a href="http://www.lucidcentral.org" rel="nofollow">www.lucidcentral.org</a>), DELTA and a suite of phylogenetic analysis programs such as PAUP. Raw data descriptions (Box 1.2.4) usually comprise repeated measurements of parts of individual specimens, and are the basis from which the more abstracted descriptions in natural language and coded descriptions are derived. Few taxonomists consistently record and archive their raw data in a standardised format. The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
@
1.11
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1164207928" format="1.1" version="1.11"}%
d22 1
a22 1
Further contributors are acknowledged under "Background", below.
d28 1
a28 1
*7. BACKGROUND:* TDWG endorsed the DELTA (DEscription Language for TAxonomy) format as a standard for representation of taxonomic descriptions in the 1980s. The SDD subgroup was established in 1998 as a subgroup of the Taxonomic Databases Working Group (TDWG, www.tdwg.org) of the International Union of Biological Sciences (IUBS), in response to recognition that a program-independent, non-proprietary standard based on current data interchange techniques was needed.
d67 1
a67 1
Current services using SDD are [[http://www.lucidcentral.org/][Lucid]], [[http://wiki.cs.umb.edu/][EFG]], [[http://www.identifylife.org/][IdentifyLife]]. ## ADD Bryans links, Johnsons (hymenoptera online?),
d87 2
a88 1
* comprises a superset of data requirements of all existing programs
d92 1
a92 1
* is XML-based, and provides a schema for validation of documents.
d97 2
a98 2
* comparability and, if possible, combinability of alternate descriptions of any one taxon
* efficient multi-tasking of descriptions (one description serving alternate purposes)
d101 1
a101 1
Motivation: In taxonomy, descriptive data takes a number of very different forms. Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon (or occasionally of an individual specimen). They may be simple, short and written in plain language (if used for a popular field guide), or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment. Dichotomous keys are specialised identification tools comprising fragments of descriptive data arranged in couplets forming a branching tree. Each fragment (lead) comprises a small (occasionally verbose) natural-language description. Coded descriptions comprise highly structured data used in computer identification and analysis programs such as Lucid (<a href="http://www.lucidcentral.org" rel="nofollow">www.lucidcentral.org</a>) , DELTA and a suite of phylogenetic analysis programs such as PAUP. Raw data descriptions (Box 1.2.4) usually comprise repeated measurements of parts of individual specimens, and are the basis from which the more abstracted descriptions in natural language and coded descriptions are derived. Few taxonomists consistently record and archive their raw data in a standardised format. The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
@
1.10
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1146662668" format="1.0" version="1.10"}%
d43 15
a57 15
* SDD documents may include all or some of the following:
* terminologies (e. g., characters and states, modifiers, character trees with higher concepts)
* character ontologies (currently through character trees; more fundamental ontologies are planned for the future)
* structured (coded) data
* sample data (e. g., measurements)
* unstructured natural language data
* natural language data with markup
* dichotomous or polytomous keys
* resources associated with descriptions (e. g., images, references, links)
* SDD is currently not designed to accommodate:
* molecular sequence and other genetic data (although these may be considered in future versions)
* occurrence and specimen data and representations of these (e. g., distribution maps)
* complex ecological data such as models and ecological observations
* organism interaction data like host-parasite, plant-pollinator, predator-prey
* nomenclatural and formal systematic (rank) information
d80 1
a80 3
Material to be perhaps incorporated
http://www.tdwg.gbif.org/tdwg/standards/id/82
d86 6
a91 6
* provides a flexible, platform-independent data structure for the capture and storage of taxonomic descriptions
* comprises a superset of data requirements of all existing programs
* provides extension beyond existing programs where data requirements can be predicted
* is readily extensible to account for future developments and data requirements
* is human-readable (although it is assumed that in almost all cases standard descriptions will be machine-generated and processed)
* is XML-based, and provides a schema for validation of documents.
d94 5
a98 5
* lossless porting of data between standard-aware applications
* achievable progressive markup of legacy descriptions, particularly natural-language descriptions
* comparability and, if possible, combinability of alternate descriptions of any one taxon
* efficient multi-tasking of descriptions (one description serving alternate purposes)
* archiving and sharing of raw and processed data
a100 1
@
1.9
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1146582352" format="1.0" version="1.9"}%
d77 27
@
1.8
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1145872115" format="1.0" version="1.8"}%
d7 1
a7 1
*1. NAME:* TDWG Structured Descriptive Data (SDD)
d14 1
a14 1
*3. CONVENER:* Gregor Hagedorn, Biologische Bundesanstalt für Land und Forstwirtschaft, Königin-Luise-Str. 19, 14195 Berlin, Germany. Email: g.hagedorn [at] bba.de
@
1.7
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1144744328" format="1.0" version="1.7"}%
d11 2
a12 2
Most Recent Version: [[Version 1.01][SDDVersion101]] (Berlin, April 6, 2006)<br/>
Working Draft Version: Version 1.1
d63 1
a63 1
*10. OUTPUTS and OUTCOMES (and timeframe):* The principal products of the SDD group are the SDD standard (SDD 1.0 endorsed by TDWG 2005), and the discussion framework captured on the [[Main][SDD Wiki]].
@
1.6
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1144605816" format="1.0" version="1.6"}%
d5 1
a5 1
*PROPOSED CHARTER, put up for comment*
d11 1
a11 1
Most Recent Version: [[Version 1.01][SDDVersion101]] (Berlin, April 6, 2006)<br/>
d22 1
a22 1
Further contributors are acknowledged under "Strategy", below.
d32 2
d63 1
a63 1
*10. OUTPUTS and OUTCOMES (and timeframe):* The principal products of the SDD group are the SDD standard (SDD 1.0 endorsed by TDWG 2005), and the discussion framework captured on the [[Main][SDD Wiki]].
d73 1
a73 1
The most effective strategy since 2001 has been found to be face-to-face [[MeetingMinutes][meetings]] (Canberra, Nov. 2001; Sao Paulo, Oct. 2002; Paris, Feb. 2003; Lisbon, Oct. 2003; Berlin, May 2004; Christchurch, Oct. 2004; St. Petersburg, Sept. 2005; and Berlin, April 2006). Over 60 people contributed to these discussions. The help, criticism and energy of Jacob Asiedu, Nicolas Bailly, Damian Barnier, Donald Hobern, Trevor Patterson, Guillaume Rousse, and Steve Shattuck are especially acknowledged.
@
1.5
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1144329618" format="1.0" version="1.5"}%
d43 1
a43 1
* character ontologies
d50 1
a50 1
* SDD is not designed to accommodate:
d53 2
a54 1
* complex ecological (interaction) data such as models and ecological observations
@
1.4
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="KevinThiele" date="1144318599" format="1.0" version="1.4"}%
d5 1
a5 1
*1. NAME:* TDWG Structured Descriptive Data (SDD)
d7 6
a12 3
*2. VERSION HISTORY*
<br>Current Standard: Version 1 (endorsed by TDWG October 2005)
<br>Most Recent Working Version: Version 1.4 (Berlin, April x 2006)
d17 1
a17 1
Gregor Hagedorn,
d22 2
d26 1
a26 1
*6. PURPOSE:* To develop standard computer-based mechanisms for expressing and transferring descriptive information about biological organisms or taxa (and similar entities such as diseases), including terminologies, ontologies, descriptions, identification tools and associated resources.
d28 1
a28 1
%GREEN%{The SDD group is chartered to propose standard mechanisms for the representation and exchange of information about describing taxa and the interactions between them.}%ENDCOLOR%
d30 1
a30 1
*7. BACKGROUND:* TDWG endorsed the DELTA (DEscription Language for TAxonomy) format as a standard for representation of taxonomic descriptions in the 1980s. The SDD subgroup was established in September 1998 in response to recognition that the DELTA data format was no longer adequate as a standard, partly because it is tightly tied to a particular set of applications (the DELTA programs), while a standard should be program-independent and non-proprietary.
d32 1
a32 1
The subgroup has met many times since 1998, and conducted discussions by email list and wiki page. It has considered the needs of a wide variety of existing programs that manage, produce and consume biological descriptions, as well as incorporating new ideas that may be implemented in the future.
d34 1
a34 3
Descriptive data, unlike specimen databases and name services, usually reside in many small, dispersed, independent data files, each with an independent ontology (and often structure), and range across a wide variety of levels of atomization, from unstructured natural language to highly structured (coded) data. SDD has been designed to accommodate this complexity.
*8. SCOPE:* SDD documents may be used to express descriptions of biological taxa, specimens, and non-biological objects.
d36 1
a36 8
SDD documents may include all or some of the following:
* terminologies (e.g. characters and states, modifiers, etc.)
* character ontologies
* structured (coded) data
* sample data (e.g. measurements)
* unstructured (e.g. natural language) data
* dichotomous or polytomous keys
* resources associated with descriptions (e.g. images, references, links)
d38 19
a56 5
SDD is not designed to accommodate:
* molecular sequence and other genetic data (although these may be considered in future versions)
* occurrence and specimen data and representations of these (e.g. distribution maps)
* complex ecological (interaction) data such as models and ecological observations
* nomenclatural and formal systematic (rank) information
d58 1
a58 1
%GREEN%{SDD standards specify, but are not limited to, the expression of classical morphological characters and their states, as well as the relations between those characters and states.}%ENDCOLOR%
d60 1
a60 1
*9. AUDIENCE:* SDD is currently of most use for authors of software used for interactive identification systems and distributed databases of descriptive data (speciesbanks). It is used as an exchange mechanism between existing software packages and web services handling descriptive data and interactive identification (currently [[http://www.lucidcentral.org/][Lucid]], [[http://wiki.cs.umb.edu/][EFG]], [[http://www.identifylife.org/][IdentifyLife]])
d62 1
a62 1
%GREEN%{Uses of SDD standards will include the exchange of data between interactive identification systems as well as applications that may investigate how species differ across geography, time, and phylogeny.}%ENDCOLOR%
d64 1
a64 1
*10. OUTPUTS and OUTCOMES (and timeframe):* The principal products of the SDD group are the SDD standard (SDD 1.0 endorsed by TDWG 2005), and the discussion framework captured on the SDD Wiki (url).
d66 1
a66 3
The SDD Standard is currently relatively immature but stable.
Future planned developments(?)
d68 1
a68 1
*11. STRATEGY:* (What general approaches, principles or strategies are to be used by the group to achieve the outputs and outcomes)
d70 1
a70 1
The SDD subgroup began discussing issues and scoping the standard through an email discussion group in November 1999 (see the SDD email list archives). Considerable progress has been made at face-to-face meetings amongst a small group of core contributors, in Nov. 2001 (Canberra), Oct. 2002 (Sao Paulo), Feb. 2003 (Paris), Oct. 2003 (Lisbon), May 2004 (Berlin), Oct. 2004 (Christchurch), Sept. 2005 (St. Petersburg) and April 2006 (Berlin).
d72 1
a72 1
Discussion has mostly been restricted to a small, dedicated group (the core contributors plus programmers working on SDD implementations). Broader discussion was initially achieved through the email list, but transfer of the discussion to the wiki changed the dynamics to a small group of contributors (and a larger group of watchers).
@
1.3
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="BobMorris" date="1140988443" format="1.0" version="1.3"}%
d3 1
a3 1
SDD CHARTER
d5 1
a5 1
We need to fill in the following information for the TDWG Infrastructure Project!
d7 3
a9 1
1. __NAME:__ TDWG Structured Descriptive Data (SDD)
d11 1
a11 1
2. __VERSION HISTORY__ Version 1 (revision number according to Wiki revisions at the bottom of the page). Note: The group has existed since 1998, but had no formal charter until 2006.
d13 5
a17 1
3. CONVENER: Gregor Hagedorn, Biologische Bundesanstalt für Land und Forstwirtschaft, Königin-Luise-Str. 19, 14195 Berlin, Germany. Email: name@@bba.de (replace name with g.hagedorn)
d19 1
a19 1
4. CORE MEMBERS: The current core members of the group are G. Hagedorn, R. Morris (Boston, USA), K. Thiele (Australia) and B. Heidorn (USA) ###Who else? Please add!###
d21 1
a21 1
5. HOME ADDRESS: [[http://wiki.cs.umb.edu/twiki/bin/view/SDD/]]
d23 1
a23 1
6. PURPOSE: ###TODO### The goals of the group in non-technical language? %GREEN% The SDD group is chartered to propose standard mechanisms for the representation and exchange of information about describing taxa and the interactions between them. %ENDCOLOR%
d25 1
a25 1
7. BACKGROUND: ###TODO### The context or environment in which the group is established; the reason for its establishment %GREEN%%ENDCOLOR%
d27 1
a27 1
8. SCOPE: ###TODO### What is in scope, and what is not in scope if the latter helps to define the role of the group %GREEN% SDD standards specify, but are not limited to, the expression of classical morphological characters and their states, as well as the relations between those characters and states. %ENDCOLOR%
d29 3
a31 1
9. AUDIENCE: ###TODO### What is the audience or who are the clients of group activities or outputs? %GREEN% Uses of SDD standards will include the exchange of data between interactive identification systems as well as applications that may investigate how species differ across geography, time, and phylogeny.%ENDCOLOR%
d33 8
a40 2
10. OUTPUTS and OUTCOMES (and timeframe): What are the anticipated outcomes and/or outputs of the group and
what is their timeframe? ###TODO###
d42 5
a46 1
11. STRATEGY: ###TODO### What general approaches, principles or strategies are to be used by the group to achieve the outputs and outcomes
d48 17
a64 1
-- Main.GregorHagedorn - 23 Feb 2006
@
1.2
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1140693900" format="1.0" version="1.2"}%
d17 1
a17 1
6. PURPOSE: ###TODO### The goals of the group in non-technical language?
d19 1
a19 1
7. BACKGROUND: ###TODO### The context or environment in which the group is established; the reason for its establishment
d21 1
a21 1
8. SCOPE: ###TODO### What is in scope, and what is not in scope if the latter helps to define the role of the group
d23 1
a23 1
9. AUDIENCE: ###TODO### What is the audience or who are the clients of group activities or outputs?
@
1.1
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="GregorHagedorn" date="1140685493" format="1.0" version="1.1"}%
d7 1
a7 1
1. The NAME of the (Interest or Task) Group: A brief name that describes the activity of the group
d9 1
a9 1
2. VERSION HISTORY of the Charter document: Date, author and version of the document
d11 1
a11 1
3. CONVENER: Name and contact details
d13 1
a13 1
4. CORE MEMBERS: The founding/core members proposed for the group
d15 1
a15 1
5. HOME ADDRESS: The base address for information and resources of the group?
d17 1
a17 1
6. PURPOSE: The goals of the group in non-technical language?
d19 1
a19 1
7. BACKGROUND: The context or environment in which the group is established; the reason for its establishment
d21 1
a21 1
8. SCOPE: What is in scope, and what is not in scope if the latter helps to define the role of the group
d23 1
a23 1
9. AUDIENCE: What is the audience or who are the clients of group activities or outputs?
d26 1
a26 1
what is their timeframe?
d28 1
a28 1
11. STRATEGY: What general approaches, principles or strategies are to be used by the group to achieve the outputs and outcomes
@