SDD Part 0: Introduction and Primer to the SDD Standard

2.2 SDD for coded sample descriptions

Sample data descriptions (Box 2.1.1.2) usually comprise repeated measurements of parts of individual specimens, and are the basis from which the more abstracted natural-language and coded descriptions are derived. Few taxonomists consistently record and archive their raw data in a standardised format.

Box 2.2.1 - Example of sample (specimen) descriptive data

Specimen   Spore length         Spore width          Spore colour
           1   2   3   4   5    1   2   3   4   5
TJM45337   12  13  12  15  11   8   8   7   6   6    brown
TLM33466   15  18  17  17  15   10  8   9   9   10   yellow

Coded sample descriptions record the range of characteristics found in an individual, as opposed to a class or taxon (e.g. species, genus etc.). To record summary data for taxonomic levels higher than the individual, see the topic Using SDD for coded summary descriptions.

A coded sample description requires three essential items: the identifiers of the specimens being described, a set of descriptors (characters and states) used to describe the specimens, and the coded descriptions themselves.

A simple SDD instance document representing part of the sample data above has the basic structure shown below and in Example 2.2.1.

[Figure: coded_sample_descriptions.gif]

Example 2.2.1 - A simple coded sample description

<?xml version="1.0" encoding="UTF-8"?>
<Datasets xsi:schemaLocation="http://ns.tdwg.org/UBIF/2006 http://www.lucidcentral.org/2006/SDD/SDD1.1-RC1/SDD.xsd" xmlns="http://ns.tdwg.org/UBIF/2006" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <TechnicalMetadata created="2006-06-22T11:19:56.007+10:00">
      <Generator name="by hand" version="1"/>
   </TechnicalMetadata>
   <Dataset xml:lang="en-us">
      <Representation>
         <Label>sample data example</Label>
      </Representation>
      <Specimens>
         <Specimen id="sp1">
            <Representation>
               <Label>TJM45337</Label>
            </Representation>
         </Specimen>
         <Specimen id="sp2">
            <Representation>
               <Label>TLM33466</Label>
            </Representation>
         </Specimen>
         <!-- ...etc -->
      </Specimens>
      <Characters>
         <QuantitativeCharacter id="ch1">
            <Representation>
               <Label>Spore length</Label>
            </Representation>
         </QuantitativeCharacter>
         <CategoricalCharacter id="ch3">
            <Representation>
               <Label>Spore colour</Label>
            </Representation>
            <States>
               <StateDefinition id="s1">
                  <Representation>
                     <Label>brown</Label>
                  </Representation>
               </StateDefinition>
               <StateDefinition id="s2">
                  <Representation>
                     <Label>yellow</Label>
                  </Representation>
               </StateDefinition>
               <!-- ...etc -->
            </States>
         </CategoricalCharacter>
         <!-- ...etc -->
      </Characters>
      <CodedDescriptions>
         <CodedDescription>
            <Representation>
               <Label>Specimen data for fungal specimens</Label>
            </Representation>
            <SampleData>
               <SamplingEvent id="TJM45337">
                  <SamplingUnit>
                     <Quantitative ref="ch1" value="12"/>
                  </SamplingUnit>
                  <SamplingUnit>
                     <Quantitative ref="ch1" value="13"/>
                  </SamplingUnit>
                  <SamplingUnit>
                     <Quantitative ref="ch1" value="12"/>
                  </SamplingUnit>
                  <SamplingUnit>
                     <Quantitative ref="ch1" value="15"/>
                  </SamplingUnit>
                  <SamplingUnit>
                     <Quantitative ref="ch1" value="11"/>
                  </SamplingUnit>
                  <SamplingUnit>
                     <Categorical ref="ch3">
                        <State ref="s1"/>
                     </Categorical>
                  </SamplingUnit>
               </SamplingEvent>
               <!-- ...etc -->
            </SampleData>
         </CodedDescription>
      </CodedDescriptions>
   </Dataset>
</Datasets>

For more information on defining sampling units using the <Specimens> element, see the topic Defining specimen names. For more information on defining characters and states using the <Characters> element, see the topic Defining characters and states.

Note that characters can also be arranged into hierarchies. See the topic Defining character hierarchies for more information.

The <Representation> element provides a label for the description. This may be useful if the instance document includes multiple descriptions for different purposes, or is intended for publication in multiple languages (see the topic Language support in SDD).

The <SamplingEvent> includes elements specifying the timing and location of the sample data. For more information see the topic Elements within <SamplingEvent>.

Characters used in the description are listed under <SampleData>. SDD distinguishes between different kinds of characters (see the topic Defining characters and states for more information). For categorical characters (characters with states) the states occurring in the specimen being described are listed by reference. In the example given above, the specimen TJM45337 is described as having brown spores (state s1 of character ch3) while specimen TLM33466 is described as having yellow spores (state s2 of character ch3). Note that states that are not listed are inferred not to occur in the specimen being described.
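
To illustrate the paragraph above, the second specimen in Box 2.2.1 could be coded in the same way as the first. The fragment below is a sketch only, not part of the original example. It reuses the element pattern of Example 2.2.1, takes the spore length values and spore colour for TLM33466 from Box 2.2.1, and assumes that the sampling event is identified by the specimen label, as is done for TJM45337. Spore width is omitted because no character for it is defined in the example. The fragment would sit inside <SampleData>, after the first <SamplingEvent>.

```xml
<!-- Sketch: a second sampling event, following the pattern of Example 2.2.1 -->
<SamplingEvent id="TLM33466">
   <!-- Five spore length measurements (character ch1), from Box 2.2.1 -->
   <SamplingUnit>
      <Quantitative ref="ch1" value="15"/>
   </SamplingUnit>
   <SamplingUnit>
      <Quantitative ref="ch1" value="18"/>
   </SamplingUnit>
   <SamplingUnit>
      <Quantitative ref="ch1" value="17"/>
   </SamplingUnit>
   <SamplingUnit>
      <Quantitative ref="ch1" value="17"/>
   </SamplingUnit>
   <SamplingUnit>
      <Quantitative ref="ch1" value="15"/>
   </SamplingUnit>
   <!-- Spore colour (character ch3): yellow is state s2 -->
   <SamplingUnit>
      <Categorical ref="ch3">
         <State ref="s2"/>
      </Categorical>
   </SamplingUnit>
</SamplingEvent>
```

Only state s2 is referenced for ch3, so state s1 (brown) is inferred not to occur in this specimen.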

-- KevinThiele - 06 Jul 2006