%META:TOPICINFO{author="GarryJolleyRogers" date="1259118878" format="1.1" version="1.16"}%
%META:TOPICPARENT{name="ClosedTopicSchemaDiscussionSDD09"}%
---+!! %TOPIC%
This topic is an attempt to find a general solution for TheProblemOfSex. I am still struggling with it and very much hope that you can help by commenting, including your feelings about the options.
I am not sure whether this should be on the agenda for Berlin, but in the longer run I believe we need a solution. It has not been discussed at any meeting so far (although see the related GeographicalRestrictions), perhaps because the inclusion of diagnostic keys only occurred in Lisbon and this really precipitates this issue. Main.GregorHagedorn
---
%GREEN%I believe SecondaryClassifiersProposal addresses many of the issues. Please also look there. -- Main.BobMorris - 30 Apr 2004 %ENDCOLOR%
---
I suggest that you preferably look at the Word document (attached as zip file at the end). The Word document is in tracking mode, so you can write directly into it, and upload the file including the comments. I am pasting the converted and manually edited text of the document here as well if you want to add your comments here. However, I am not sure everything converted OK. I did this as a test, but I believe conversion is way too labor-intensive to be feasible for future WIKI discussions on such documents.
---
<h2>Systematic variation within classes (sex, stages, etc.)</h2>
<p style="background:yellow">[&hellip; we had quite a few discussions about sex and stage on the WIKI (e. g. http://wiki.cs.umb.edu/twiki/bin/view/BDI.SDD_/TheProblemOfSex). I believe this is one of the solvable things and that it should now be solved in BDI.SDD_ &hellip;!</p>
<p style="background:yellow">Currently I have worked through the beginning several times, but it is getting rough towards the end, when we come to conclusions and proposals. I am still undecided what the best strategy is, and I hope that you can help me with some comments, including your feelings on this&hellip;]</p>
<h3>Introduction</h3>
<p>When objects are classified to the most specific level recognized in the class hierarchy (in biology = species, subspecies, or variety), their descriptions are still not necessarily identical. Some differences are due to random effects in the individual history of an object, others are however systematically repeatable (and, in biology, genetically coded). The most important types of genetically coded intra-class variation are polymorphisms and systematic changes occurring during developmental or life-cycle stages of an object (Table 1).</p>
<p><strong>Table</strong> <a id="Tab_fWithinClassVarBioMusic" name="Tab_fWithinClassVarBioMusic"></a><b>1</b><strong>.</strong> Examples for classification systems and sources of intra-class-variation in biology and the study of musical instruments</p>
<table border="1" cellspacing="0" cellpadding="0" width="100%" style='width:100.0%;border-collapse:collapse;border:none'>
<tr>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Classification system<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Biological organisms<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Musical instruments<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'><strong>Phylogenetic / Inherited<br />
</strong> (&rarr; multiple characteristics<br />
are linked)<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Evolutionary history<br />
/taxonomic classification<br />
(e. g., order/family/genus)<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Craftsmanship, technological, or industrial<br />
traditions of instrument creation<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'><strong>Operational</strong> (arbitrarily based<br />
on a single characteristic)<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Tree/shrub/herb,<br />
water vs. land plants<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Sachs-Hornbostel (idiophones,<br />
membranophones, chordophones,<br />
aerophones, electrophones)<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'> <br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'> <br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'> <br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Source of further variation<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Biological organisms<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Musical instruments<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'><strong>Individual history</strong><br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'> <br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'> <br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>
<p style='margin-left:9.95pt;text-indent:-9.95pt'><strong>a) chance effects</strong></p>
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Scarring of skin, mutilations<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Scratching, discoloration<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>
<p style='margin-left:9.95pt;text-indent:-9.95pt'><strong>b) systematic responses<br />
to the environment</strong></p>
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Phenotypic responses like flowering<br />
time or variable shape to maximize<br />
resource utilization,<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Response to humidity or<br />
submerging in water<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>
<p style='margin-left:9.95pt;text-indent:-9.95pt'><strong>c) essential and<br />
repeatable history</strong></p>
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Developmental stages:<br />
e. g., egg/embryo, larva, adult;<br />
Life-cycle stages:<br />
e. g., gametophyte, sporophyte<br />
<br />
</td>
<td valign="top" style='border:none;padding:0mm 5.4pt 0mm 5.4pt'>Phases in the construction of<br />
an instrument<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'><strong>Genetic polymorphism</strong><br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Sexes or blood types (= multiple alleles<br />
for a gene present within populations)<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Perhaps: decorative styles stretching across<br />
multiple instrument types and traditions<br />
<br />
</td>
</tr>
</table>
<p>A characteristic that is variable within a class is not necessarily uninformative for diagnostic purposes. If a plant has both red and white flowers and other plant species have yellow, blue, red, or white colors, specifying the flower color of an object reduces the set of remaining classes in identification. A description "flowers red or white" is a meaningful part of a diagnostic class description.</p>
<p>However, certain kinds of polymorphisms change highly systematically. A description "sex male" is meaningful for an object, but "sex female or male" is not a meaningful part of a class description, since the two sexes by definition occur together. Similarly, recording the presence of life stages may or may not be meaningful, depending on the taxonomic scope and whether all classes have a larval and an adult stage. This problem of character "saturation" (= all potential character states present) can be automatically detected if a character has been recorded either for all classes or for a sufficient sample of objects. It normally does not require the recording of additional information.</p>
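<p>The automatic detection of "saturation" mentioned above can be sketched as follows. This is only a minimal illustration in Python, not part of any BDI.SDD_ proposal: the function name <code>saturated_characters</code> and the dictionary layout are assumptions made for this example, and only the case "recorded for all classes" is handled (not the case of a sufficient sample of objects).</p>

```python
from typing import Dict, Set


def saturated_characters(
    class_states: Dict[str, Dict[str, Set[str]]],
    all_states: Dict[str, Set[str]],
) -> Set[str]:
    """Return the characters for which every class shows every possible state.

    class_states: class name -> character -> states recorded for that class
    all_states:   character  -> all states defined for that character
    (Both layouts are hypothetical, chosen only for this sketch.)
    """
    result = set()
    for character, states in all_states.items():
        recorded = [
            description[character]
            for description in class_states.values()
            if character in description
        ]
        # Saturated only if the character is recorded for all classes
        # and each class description contains the complete state set.
        recorded_everywhere = bool(class_states) and len(recorded) == len(class_states)
        if recorded_everywhere and all(r == states for r in recorded):
            result.add(character)
    return result


# Illustrative data only.
class_states = {
    "Species A": {"sex": {"male", "female"}, "flower color": {"red", "white"}},
    "Species B": {"sex": {"male", "female"}, "flower color": {"yellow"}},
}
all_states = {
    "sex": {"male", "female"},
    "flower color": {"red", "white", "yellow", "blue"},
}

print(saturated_characters(class_states, all_states))  # {'sex'}
```

<p>In this sketch "flower color" stays diagnostic because the classes differ in their state sets, whereas "sex" is flagged because "male or female" holds for every class.</p>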
<p>Another problem specific to intra-class variation is, however, more difficult to solve. Some of the characteristics already mentioned form an operational classification system. In biology these secondary systems are independent of the primary system of taxonomic names. The most frequently encountered examples of such "secondary classifiers" are designations of sex (male/female), generation (e. g., spring/summer), and life cycle or development stage (e. g., larva, adult). The values of such classifiers are not directly observable characters, but rather typify sets of correlated character expressions. Objects with different classifier values will have moderately or strongly different descriptions.</p>
<p>If for a secondary classifier like "sex" the object descriptions differ only in expected characteristics (the sex organs), the values of the classifiers are suppressed in class/taxon descriptions. Other weakly correlated characteristics (e. g., males being a little smaller than females) will be presented as a generalized description (e. g., as a size range including both sexes). However, when many diagnostically relevant characteristics (e. g., wing pattern of butterflies or bird plumage), or characteristics that someone with some experience of the taxonomic group would not expect, differ between sexes, separate descriptions will be prepared. This case, where more than the sexual organs differ between sexes, is called "sexual dimorphism".</p>
<p>Again, however, the values will not be part of the description, but will be used to group or structure the descriptions. Depending on the amount of differences, the grouping may precede the class/taxon name, be a subheading within class/taxon descriptions, or only an annotation at individual descriptive statements (Table 2). Furthermore, if different sexes or life cycle stages are keyed out separately in diagnostic keys, the classifier values are usually added to the name that is keyed out.</p>
<p><strong>Table</strong> <a id="Tab_fOptionalClassifierPresentations" name="Tab_fOptionalClassifierPresentations"></a><b>2</b><strong>.</strong> Examples for different presentations of sex and life cycle stage classifiers.</p>
<table border="1" cellspacing="0" cellpadding="0" width="100%" style='width:100.0%;border-collapse:collapse;border:none'>
<tr>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Stage grouping preceding<br />
class/taxon name<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Stage as subheading<br />
within description<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'>Sex as annotation<br />
within description<br />
<br />
</td>
</tr>
<tr>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'><strong>Larval descriptions</strong><br />
<i>Colias alfacariensis</i> Ribbe 1905<br />
<i>Colias crocea</i> (Geoffroy, 1785)<br />
<strong>Adult butterfly</strong><br />
<i>Colias alfacariensis</i> Ribbe 1905<br />
<i>Colias crocea</i> (Geoffroy, 1785)<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'><i>Colias alfacariensis</i> Ribbe 1905<br />
Distribution: &hellip;<br />
Common characteristics: &hellip;<br />
Larva: &hellip;<br />
Adult (imago): &hellip;<br />
<br />
</td>
<td valign="top" style='border:none;border-bottom:solid windowtext 1.0pt; padding:0mm 5.4pt 0mm 5.4pt'><i>Colias alfacariensis</i> Ribbe 1905<br />
&hellip;<br />
<strong>Larva:</strong> Size &hellip; body green, &hellip;<br />
<strong>Adult (imago)</strong>: &hellip;<br />
Size &hellip;, wings white<br />
(females) or clouded<br />
yellow (males)<br />
<br />
</td>
</tr>
</table>
<p>Storing the information about classifiers as character data is satisfactory for object descriptions, but not for class descriptions. Although sets of correlated characters can be detected algorithmically, it is very difficult or even impossible to detect which of the correlated "characters" are truly observable characters, and which "characters" summarize and generalize sets of character correlations.</p>
<p>Before proposing an information model for secondary classifiers like sex, generation, or stages, it must first be decided whether it is appropriate to generalize these to a single concept. As a first step, definitions of the most important classifier concepts in biology will be discussed.</p>
<h3>Mating type and sex</h3>
<p>Many organisms have a breeding system involving multiple <strong>mating types</strong> as a mechanism to improve outcrossing (= prevent or reduce inbreeding). Mating types may be classified into <strong>sex</strong> and morphological or physiological <strong>self-incompatibility systems</strong>. Note that instead of using "mating type" as a generalized term (i. e. including sex), many authors use it when referring to compatibility types (this statement is based on a personal study using internet search mechanisms). The reason for the latter usage mainly seems to be that authors work on taxonomic groups that do not show a differentiation into sexes (e. g., yeasts).</p>
<p>In biological usage, <strong>sex</strong> is defined as the sum of morphological and behavioral features that distinguish organisms on the basis of their reproductive function (EB 2001, CED 1992). The concept of sex is limited to two different sexes ("male", "female"); however, the combination "hermaphrodite" (a single organism being both male and female) and the absence of sex may also be considered states. In contrast, the number of compatibility types differs strongly among organism groups, as do the names used for individual types (e. g., "+"/"&ndash;", "A"/"a"/"alpha", "b1"/"b2"/"b3"). All mating types are usually genetically determined (an exception is, e. g., the marine worm <i>Bonellia</i> with environmental sex-determination, EB 2001).</p>
<p>In many animals, either sex is the only mating type, or sex and <strong>self-incompatibility system</strong> are always correlated. The difference between the two concepts can, e. g., be seen in plants like <i>Nicotiana</i> that are sexual hermaphrodites in having both anthers and gynoecium in each individual, but have a <strong>physiological self-incompatibility system</strong> to prevent inbreeding. Similarly, fungi may produce differentiated male and female organs on the same thallus but remain self-incompatible (heterothallic) due to a separate physiological self-incompatibility system. Most fungi or algae have no morphologically identifiable sex system and are classified only according to their self-incompatibility system (which is often only called "mating type").</p>
<p>An example of a <strong>morphological self-incompatibility system</strong> (= heteromorphy) is the heterostyly in plants (e. g., in <i>Primula</i> species: distyly or in <i>Lythrum salicaria</i> and <i>Eichhornia:</i> tristyly). This mechanism is independent of the sex system, but closely linked with a physiological incompatibility system where present (Richards 1986).</p>
<h3>Generations, life cycle, and developmental stages</h3>
<p>The term <strong>generation</strong> is used relatively consistently in biology and involves a cycle of reproduction. Although different generations are often genetically different (especially after sexual reproduction), this is not a necessity. Reproduction may be vegetative (e. g., parts of a plant break off, are dispersed, and root again forming the next generation). In single-celled organisms generation and cell division are synonymous. The essential definition of "generation" thus denotes a reduction to a dispersal or persistence stage and the consequential regrowth of the full organism.</p>
<p><strong><br />
Life cycle</strong> or <strong>developmental stages</strong> always denote an aspect of temporal development. Life cycle may be defined as "the series of changes in the life of an organism, including reproduction" (EB 2001: dictionary). Two kinds of life cycles exist (EB 2001: "life cycle"):</p>
<ul class="compact">
<li>All stages occur within the life of an individual organism (single-generational life cycle). The life cycle may be<ul
class="compact">
<li>truly having only a single generation, as in bacteria (haplontic life cycle), or</li>
<li>an alternation of haploid and diploid generations, one of which is so highly reduced that it is no longer considered a separate generation (diploid zygote directly undergoing meiosis, or haploid, undifferentiated gametes).</li>
</ul>
<li>The stages include several generations to complete a full life cycle (multigenerational life cycle).</li>
</ul>
<p><br />
Within a single generation <strong>developmental stages</strong> (or phases) may occur. These may either partition a continuous variation (e. g., embryo, baby, youth, and adult) or may relate to distinct structural changes (e. g., egg, larva, pupa, imago in holometabolic insects). The term "life cycle stage" is often used as a synonym of developmental stage (which conforms to the dictionary definition cited above). This causes no problem in organisms that complete their life cycle in a single generation, but appears unfortunate in organisms having a multigenerational life cycle but also developmental stages.</p>
<p>In the case of multigenerational life cycles, the term "life cycle stage" is dominant over the use of "generation". For example, in the red alga <i>Polysiphonia</i> the haploid generation (gametophyte) is differentiated into male and female individuals, whereas the following two diploid generations (carposporophyte, tetrasporophyte) are not sexually differentiated. All three generations are considered "life cycle stages". The practical use of "generation" as a classifier concept is thus restricted to organisms with a single-generational life cycle. Examples are the spring and summer generations of some butterflies that are markedly differently colored, e. g. "<i>Araschnia levana</i> gen. vern." versus "<i>A. levana</i> gen. aest." (seasonal dimorphism).</p>
<p>A special problem is the dikaryotization of many basidiomycetes. After the sexual partners have fused, the new nucleus divides and propagates itself through an existing cellular structure (the previously monokaryotic hyphae). It is unclear whether this should be considered a generation because of the genetic change, a life cycle stage because of the change in ploidy, or a developmental stage.</p>
<p>This dikaryotization is also involved in the life cycle of the rust fungi, which is a good practical "data challenge" for modeling the classifiers. The entire life cycle of many rust fungi (e. g., <i>Puccinia graminis</i>) includes five different spore types (pycnospores, aeciospores, uredospores, teliospores, basidiospores). Each spore type has to be described separately and thus needs a classifier to distinguish the descriptions. The spore types relate to two full generations (1. pycnospores + aeciospores, 2. uredospores + teliospores) plus one reduced generation (the basidiospore-producing phragmobasidium after germination of teliospore). The first generation is initially monokaryotic, but is later dikaryotized in a sexual process in which the pycnospores function as gametes. It then produces dikaryotic aeciospores that create the second generation on an alternative host plant. In this second generation, the uredospores create new infections that are second generation
individuals indistinguishable from those created by aeciospores. A secondary epidemic life cycle therefore exists in addition to the complete life cycle involving the other spore stages. Thus, in rust fungi the dominant classifier concept involves aspects of developmental stages, generations, and sex.</p>
<h3>Other classifier concepts</h3>
<p style="background:yellow">[@@It is an important point to find further cases to be able to decide on an appropriate generalization term for these "classifier concepts". The best strategy to find additional cases is to imagine what groupings within a taxon might be keyed out separately in keys. I can imagine that keys may also key out morphological variants that have no taxonomic rank. Can anybody provide an example?]</p>
<p>Other concepts that exhibit similar classification or grouping properties in descriptive data are:</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> Social insects such as ants, bees, termites, and wasps have morphologically differentiated individuals belonging to different castes (queen, workers, soldiers, etc.). The castes are a polymorphism between generations which cannot be treated as life cycle stages, because most individuals are sterile and die without progeny. Instead, they may be viewed as polymorphic generations. The individual differences are caused by responses to nutrition during early development, i. e. to environmental factors. In contrast to the seasonal dimorphism, however, the frequencies of individuals in a population are largely under genetical control because the environment itself is controlled by the behavior of the population.</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> Descriptions may be based on living or dead material. Many characteristics can only be observed when living (e. g., in <i>Orbilia,</i> see Baral 1992).</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> Descriptions based on different preservation methods, such as drying or ethanol conservation.</p>
<h3>Generalized term for sex, generation, life cycle stages, etc.</h3>
<p>The various classifier concepts discussed above all describe why multiple classes of descriptions may exist within the most specific class defined in the primary classification system. It seems advisable for a descriptive data information model to provide a generalized mechanism rather than individually treating specific classification systems like sex, life cycle stages, etc. Treating each system individually would have the following drawbacks:</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> The number of secondary classification systems is relatively large.</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> The model would become specific to biological descriptions.</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> The individual classification systems may be interrelated in complex ways, as has been shown in the example of the castes of social insects or the spore stages of rust fungi.</p>
<p>No existing generalized term for such classifier concepts discussed could be found. An internet search for a generalized name for at least sex, generation, and life cycle stages was unsuccessful. The following definition is therefore proposed:</p>
<p style="background:yellow">[@@Request to reviewers: Please inform me, if you know of a discussion of this problem!]</p>
<p><strong>Secondary classifiers</strong> = a classification that may be required in addition to the primary class names (which may in biology be taxon names or non-taxonomic names like disease names). Secondary classifiers provide an opportunity to add further naming dimensions to the descriptions. They are, however, not necessarily nested within the primary class names. Multiple secondary classifiers (and for a single classifier concept, multiple values = states) can be added to each class name reference.</p>
<p><strong><span style="background:yellow">[@@ I am not entirely happy and find "secondary classifiers" not truly intuitive. It is the best I could come up with!</span></strong></p>
<p><strong><span style="background:yellow">Also note:</span></strong> <span style="background:yellow">the last point is debatable. It will occur primarily if descriptions are generalized. For example, the descriptions of the second, third, and fourth instar may be so similar, that they are joined in a single generalized description. This would, however, be a non-persistent report. It is unclear whether such data would also have to be recorded. Please comment!]</span></p>
<p><strong>Annotated collection of other candidate terms <span style="background:yellow">[@@Please comment or add!]</span>:</strong></p>
* "Classifiers" alone is too general.
* "Non-taxonomic classifiers" is inappropriate, the primary class names may already be non-taxonomic, as in disease names (also BDI.SDD_ aims to create a general model applicable without reference to biological terminology).
* "Determinants/classification determinants"?
* "Description classifiers" &ndash; perhaps more intuitive than "secondary"?
   * "Phenotypical classifiers" would be confusing, since phenotypic is usually considered an antonym to genotypic. Classifier concepts may be phenotypic (environmental sex determination), genotypic (genotypic sex determination), or ontogenetic (development stages).
<h3><a id="ch_SecClassifierRelatedChars" name="ch_SecClassifierRelatedChars">Classifier-related characters</a></h3>
<p>A confusing aspect of classifiers is that &ndash; although the values do not contribute to the class descriptions &ndash; the existence of values or their frequency is part of the descriptive knowledge expressed in descriptive databases. The frequency of males and females is a property of classes/taxa (e. g., in social hymenoptera), and different classes/taxa may have different development or life cycle stages (e. g., reduced forms of the full heteroecious rust life cycle, or neoteny in animals). Such information may perhaps be considered separate characters, i. e. distinct from a "secondary classifier" mechanism.</p>
<p>In theory, the frequency of sexes could be calculated from descriptions that have a male/female sex classification. In practice this will not be possible, since the sampling of descriptions in a database (and of specimens in a collection) is highly non-random. Although presence/absence suffers less from sampling bias, complete and systematic bias (e. g., the database contains only adults) is not infrequent. Thus, classifier-related characters will normally have to be recorded independently from the data recorded in some kind of classifier mechanism.</p>
<p>Note that some classifier-related characters are often omitted from descriptions optimized for identification, because they are inconvenient to study (e. g., requiring observation over prolonged periods or population sampling). This is, however, not a property unique to classifier-related characters. In the BDI.SDD_ model, the convenience of a character for identification purposes is separately recorded ("rated", compare section @@). Furthermore, some classifier-related characters are quite convenient, e. g. "sex status" with the states "monoclinous (having male and female organs in the same flower)" and "diclinous (in different flowers)".</p>
<p>On the other hand, classifier-related characters have an influence on classifiers. If a "life cycle type" character of plants has the states "annual, biennial, perennial", a possible life cycle stage "plant in the second year" is inapplicable for "annual". Similarly, if "heterostyly" has the states "monostylous, bistylous, tristylous", and a related heterostyly classifier the values "short, medium, long style", the entire classifier would not be applicable for heterostyly = monostylous, and only the values "short" and "long style" would be applicable for heterostyly = bistylous.</p>
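<p>The heterostyly example above can be written down as a small applicability table. The following Python sketch is only an illustration under assumed names (<code>APPLICABLE_STYLE_VALUES</code>, <code>applicable_values</code>, and <code>is_valid</code> are hypothetical and not part of BDI.SDD_); it merely restates the dependency between the character state and the classifier values described in the text.</p>

```python
# Hypothetical rule table: for each state of the character "heterostyly",
# the values of the related style-length classifier that remain applicable.
APPLICABLE_STYLE_VALUES = {
    "monostylous": set(),                       # classifier not applicable at all
    "bistylous": {"short", "long"},             # "medium" is excluded
    "tristylous": {"short", "medium", "long"},  # all values applicable
}


def applicable_values(heterostyly_state: str) -> set:
    """Classifier values that may be used for a class with the given state."""
    return set(APPLICABLE_STYLE_VALUES[heterostyly_state])


def is_valid(heterostyly_state: str, style_value: str) -> bool:
    """Check one classifier value against the character state of the class."""
    return style_value in APPLICABLE_STYLE_VALUES[heterostyly_state]


print(is_valid("bistylous", "medium"))   # False
print(is_valid("tristylous", "medium"))  # True
```

<p>Whether such rules would be stored with the character, with the classifier, or in a separate dependency structure is left open here.</p>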
<h3>Existing models of handling secondary classifiers</h3>
<p>The special properties of sex, generation, life cycle stages are not discussed in the CSIRO DELTA or <nop>DeltaAccess documentation (Dallwitz &amp; al. 2000a, Dallwitz &amp; al. 2000b, Hagedorn 1997). <span style="background:yellow">[@@ ?? Kevin: does Lucid have any mechanism relevant here?]</span> It is customary to model them by providing separate characters for each life cycle stage (resulting in five sets of spore characters in the case of rust fungi), treating them as a character, or adding them to the item name.</p>
<h5>Secondary classifiers nested within class names</h5>
<p>Secondary classifiers like sex and life cycle stages may be considered part of the class name, i. e. nested within the taxonomic hierarchy. In applications based on the DELTA information model, the item names for larvae and adults of the monarch butterfly may be "<i>Danaus plexippus</i> (larvae)" and "<i>Danaus plexippus</i> (imago)". Some databases may even treat them explicitly as "pseudo-ranks" (see Bob Morris' comment in <a href="http://wiki.cs.umb.edu/twiki/bin/view/BDI.SDD_/TheProblemOfSex">http://wiki.cs.umb.edu/twiki/bin/view/BDI.SDD_/TheProblemOfSex</a>).</p>
<p>If added to the item name, DELTA applications will not be able to distinguish the added classifier information from an infraspecific taxon. An advantage of this method is that it allows using the "variant item" mechanism: In addition to a main item description, additional descriptions containing only those characters that differ from the main item may be added as variant items in DELTA. This can be used to simplify the recording of those parts of the descriptions that differ according to classifier values. However, since the variant item mechanism is limited to a single hierarchical level, it is not possible to treat sexes of infraspecific items using this mechanism (or the mechanism is not available for infraspecific taxa).</p>
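<p>The idea behind the "variant item" mechanism (a main description plus variants that record only the differing characters) can be illustrated with a minimal Python sketch. This is not DELTA syntax and not the way any DELTA program is implemented; the function <code>resolve_variant</code> and the dictionary layout are assumptions made for this example, and the character values are illustrative only.</p>

```python
def resolve_variant(main_item: dict, variant_item: dict) -> dict:
    """Return the full description of a variant item.

    Characters missing from the variant are inherited from the main item;
    characters present in the variant override the main item's values.
    """
    resolved = dict(main_item)      # copy, so the main item stays unchanged
    resolved.update(variant_item)   # variant values take precedence
    return resolved


# Illustrative values only (loosely following Table 2), not real DELTA data.
main_item = {"stage": "adult (imago)", "wing color": "clouded yellow"}
female_variant = {"wing color": "white"}   # records only what differs

print(resolve_variant(main_item, female_variant))
# {'stage': 'adult (imago)', 'wing color': 'white'}
```

<p>The sketch resolves exactly one level (variant against main item), which mirrors the single-level limitation noted above: a variant of a variant would need a further mechanism.</p>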
<p><img border="0" width="562" height="197" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image001.gif" /></p>
<p><strong>Figure</strong> <a id="Fig_fSexAsTaxon" name="Fig_fSexAsTaxon"></a><b>1</b><strong>.</strong> Treating sex as an infraspecific taxon works well on the side of descriptions, but requires adding two new "pseudo-taxa" to each taxon, both in the list of class (= taxon) names (which is referenced by descriptions) and in the class hierarchy.</p>
<p>In a system like DELTA that implements the name of a description as an unconstrained string, adding sex or stage information to the name is a feasible solution. However, if the class names of descriptions are formalized and handled through references to a formal list of class names (which in BDI.SDD_ provides only local proxy objects that in turn reference external nomenclatural databases), this approach soon becomes highly undesirable (Fig. 1). The following major problems can be identified:</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> For each identifiable class additional dependent classes for each sex or life cycle stage must be introduced. Furthermore, it is possible to identify the sex and stage of a butterfly as female imago, but the taxon only to family level. If classifiers are handled as additional ranks of the taxonomic hierarchy, male/female and larva/imago "pseudo-taxa" would have to be added to higher taxa as well as to species or infraspecific taxa to allow such identifications.</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> These additional "pseudo-classes" would also have to be added to the class hierarchy definition. This may be an automatic process, but formally the relationship between each pseudo-class and its parent class has to be expressed. As humans we consider the fact that "<i>Danaus plexippus</i> (larvae)" can be generalized to "<i>Danaus plexippus</i>" automatic, but it involves a parsing of the string and semantic knowledge that allows us to distinguish between a classifier "(larvae)" and a taxonomic author name in the same position.</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> The class name would become language specific. Taxonomic classes would require different names in German and English (this problem is not entirely specific to classifiers, it is generally present if diseases instead of taxa are described).<span style="background:yellow">[@@in fact I believe BDI.SDD_ 0.9 has a problem here, see Wiki topic LanguageSpecificClassNames (http://wiki.cs.umb.edu/twiki/bin/view/BDI.SDD_/LanguageSpecificClassNames)!]</span></p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> The taxonomic hierarchy is naturally nested. Classifiers act as separate dimensions independent of this hierarchy (Figs. 2 and 3). Although in general any single dimension that is independent of a hierarchy may also be viewed as nested within the hierarchy, in the presence of more than one classifier an arbitrary nesting order has to be chosen (Fig. 4).</p>
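<p>The growth in class names described in the list above can be made concrete with a small sketch. The taxa, classifier values, and the function name <code>pseudo_taxa</code> are illustrative assumptions and are not part of BDI.SDD_; the sketch only counts how many extra names appear if classifiers are treated as ranks, compared with keeping them as separate dimensions.</p>

```python
from itertools import product

# Hypothetical example data; none of these names come from the BDI.SDD_ schema.
taxa = ["Nymphalidae", "Danaus", "Danaus plexippus"]
classifiers = {
    "sex": ["male", "female"],
    "stage": ["larva", "imago"],
}


def pseudo_taxa(taxa, classifiers):
    """Enumerate the extra class names needed if classifiers become ranks.

    Each taxon gets one pseudo-taxon per single classifier value and one
    per combination of values across all classifiers.
    """
    names = []
    for taxon in taxa:
        for values in classifiers.values():
            names.extend(f"{taxon} ({value})" for value in values)
        for combo in product(*classifiers.values()):
            names.append(f"{taxon} ({', '.join(combo)})")
    return names


extra = pseudo_taxa(taxa, classifiers)

# With separate dimensions, only the taxa and the classifier values
# themselves need to be defined once.
separate = len(taxa) + sum(len(values) for values in classifiers.values())
```

<p>For this toy data, <code>len(extra)</code> grows with the product of taxa and classifier combinations, while <code>separate</code> grows only with their sum.</p>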
<table border="0" cellspacing="0" cellpadding="0" style='border-collapse:collapse'>
<tr>
<td width="302" valign="top" style='width:204.95pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><img border="0" width="230" height="168" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image002.gif" /></p>
</td>
<td width="343" valign="top" style='width:233.0pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><img border="0" width="265" height="168" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image003.gif" /></p>
</td>
</tr>
<tr>
<td width="302" valign="top" style='width:204.95pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><strong>Figure</strong> <a id="Fig_fSexNestedTax" name="Fig_fSexNestedTax"></a><b>2</b><strong>.</strong> Visualization of the nested nature of the taxonomic hierarchy, with 2 subspecies, 3 species, and 2 genera in a family.</p>
</td>
<td width="343" valign="top" style='width:233.0pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><strong>Figure</strong> <a id="Fig_fSexTaxCrossStage" name="Fig_fSexTaxCrossStage"></a><b>3</b><strong>.</strong> Developmental life cycle stages run across the taxonomic hierarchy; the stage concept does not depend on taxa (although the presence of a stage may depend on the taxon).</p>
</td>
</tr>
</table>
<p>Furthermore, the classifier dimensions may or may not be dependent (Fig. 5):</p>
<ul>
<li>In humans or butterflies, sex and developmental stage are entirely independent.</li>
<li>In the red alga <i>Polysiphonia</i> (see above), only one of the three life cycle stages (generations) is sexually differentiated. The classifier-related character "Sex presence" and the dependent sex classifier are nested within stage.</li>
</ul>
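<p>The difference between independent and dependent classifier dimensions could be sketched as follows. The dictionary layout, the <code>depends_on</code> key, and the function <code>is_valid</code> are assumptions made for illustration only; the stage names for <i>Polysiphonia</i> are given as an example of a three-generation life cycle in which only the gametophyte is sexually differentiated.</p>

```python
# Hypothetical model: each classifier lists its values and, optionally, the
# parent classifier values under which it applies. None means "independent".
BUTTERFLY = {
    "stage": {"values": {"larva", "pupa", "imago"}, "depends_on": None},
    "sex": {"values": {"male", "female"}, "depends_on": None},
}

POLYSIPHONIA = {
    "stage": {
        "values": {"gametophyte", "carposporophyte", "tetrasporophyte"},
        "depends_on": None,
    },
    # Sex is nested within stage: it applies only to the gametophyte.
    "sex": {
        "values": {"male", "female"},
        "depends_on": ("stage", {"gametophyte"}),
    },
}


def is_valid(model, assignment):
    """Check a {classifier: value} assignment against a classifier model."""
    for name, value in assignment.items():
        spec = model.get(name)
        if spec is None or value not in spec["values"]:
            return False
        dependency = spec["depends_on"]
        if dependency is not None:
            parent, allowed_parent_values = dependency
            if assignment.get(parent) not in allowed_parent_values:
                return False
    return True
```

<p>In this sketch a dependent classifier is rejected whenever its parent classifier is missing or has a value under which the dependent classifier does not apply.</p>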
<p>Another problem is that for reporting, the classifiers may have a higher grouping priority than the entire class hierarchy (e. g., for caterpillar and butterfly stages separate descriptions and diagnostic keys are presented, compare Table 2, p. 1). Although it is possible that software may support this, it is an operation unnatural for hierarchical arrangements and is not required for the naturally nested taxonomic hierarchy.</p>
<p>One possible solution would be to handle classifiers in an unconstrained string introduced in addition to the formal class name reference. This would avoid many problems noted above, but would not allow any classifier specific processing like producing generalized descriptions for sex, but not for stage.</p>
<table border="0" cellspacing="0" cellpadding="0" style='border-collapse:collapse'>
<tr>
<td width="322" valign="bottom" style='width:218.95pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><img border="0" width="230" height="168" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image004.gif" /></p>
</td>
<td width="322" valign="bottom" style='width:219.0pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><img border="0" width="270" height="82" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image005.gif" /><br />
<br />
<b><span style='font-size:10.0pt'><img border="0" width="270" height="82" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image006.gif" /></span></b></p>
</td>
</tr>
<tr>
<td width="322" valign="top" style='width:218.95pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><strong>Figure</strong> <b>4</b><strong>.</strong> Sex and stage arbitrarily nested inside the taxonomic hierarchy. Males and females of different taxa or stages are assumed to have no relation or similarity.</p>
</td>
<td width="322" valign="top" style='width:219.0pt;padding:0mm 3.5pt 0mm 3.5pt'>
<p><strong>Figure</strong> <b>5</b><strong>.</strong> The dimensions of sex and life cycle stages may be dependent and nested (top; e. g. red algae) or independent (bottom; e. g. butterflies) of each other.</p>
</td>
</tr>
</table>
<h5>Secondary classifiers as normal characters</h5>
<p>Secondary classifiers may be considered normal characters (as "shape" or "color"). This approach is probably rarely found in DELTA data sets (but compare the section "Classifier-related characters", above). However, Prometheus II (<nop>McDonald &amp; al., submitted)<span style="background:yellow">[@check!]</span> explicitly considers sex and life cycle as "qualitative description elements", i. e. as properties of structures = as UM or OM character in the sense used by DELTA. <span style="background:yellow">[@currently the actual source is from the <nop>PowerPoint talk; I will try to find a printed reference. Or: T. Paterson, pers. comm.//@@Note to Trevor: I could not find this in the publication, is it correct?]</span></p>
<p>Using normal characters to express classifier information has the advantage that applications have no additional implementation tasks because existing mechanisms are used. However, it has serious problems in that:</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> Secondary classifiers are important factors when aggregating specimen data, or generalizing multiple taxon descriptions to higher taxon descriptions. If the aggregation/generalization algorithm can test which observations belong to secondary classifiers like sex or stage, it could make rule-based decisions whether to ignore sex or stage differences, or whether to create separate descriptions for them.</p>
<p>&#9679;<span style='font:7.0pt "Times New Roman"'> </span> The solution does not work for guided keys (e. g., larvae and adults of the monarch butterfly are keyed out in separate places in a single key, or in separate keys).</p>
<p>The first problem could be solved by defining an additional flag for certain state sets, indicating which define secondary classifiers. However, no satisfactory solution seems to exist for the problem of dealing with guided keys.</p>
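<p>A minimal sketch of the flag idea, assuming a hypothetical <code>is_classifier</code> attribute on character definitions (no such attribute is defined in the text above): the aggregation routine can then decide by rule whether to split the descriptions on a flagged character or to ignore it.</p>

```python
from collections import defaultdict

# Hypothetical character definitions; "is_classifier" is the assumed flag.
CHARACTERS = {
    "sex": {"is_classifier": True},
    "stage": {"is_classifier": True},
    "wing colour": {"is_classifier": False},
}


def aggregate(specimens, split_on):
    """Collect the observed states of ordinary characters per group.

    Groups are formed from the flagged classifier characters named in
    split_on; flagged characters not named there are ignored.
    """
    for name in split_on:
        if not CHARACTERS[name]["is_classifier"]:
            raise ValueError(f"{name!r} is not flagged as a classifier")
    groups = defaultdict(lambda: defaultdict(set))
    for specimen in specimens:
        key = tuple(specimen.get(name) for name in split_on)
        for name, state in specimen.items():
            if not CHARACTERS[name]["is_classifier"]:
                groups[key][name].add(state)
    return {key: dict(states) for key, states in groups.items()}


# Invented observations, for illustration only.
specimens = [
    {"sex": "male", "stage": "imago", "wing colour": "orange"},
    {"sex": "female", "stage": "imago", "wing colour": "orange"},
    {"sex": "female", "stage": "imago", "wing colour": "brown"},
]

by_sex = aggregate(specimens, ["sex"])
ignoring_sex = aggregate(specimens, [])
```

<p>Here <code>by_sex</code> keeps separate state sets for males and females, while <code>ignoring_sex</code> merges them into a single description. The sketch does nothing for guided keys, which remain the open problem noted above.</p>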
<h5>Secondary classifiers modeled through character sets</h5>
<p>This solution is usually used in cases where the descriptions of different stages are drastically different, perhaps because the stages are even structurally different (e. g. caterpillar and butterfly). An entirely separate set of characters is prepared for each stage (Fig. 6). Because of the fundamental differences, only a limited number of characters (overall size, DNA) are truly duplicated. For these characters no generalization analysis is possible without adding additional information to the terminology.</p>
<p>An extreme case, where almost all characters are duplicated, is the description of the life cycle and spore stages of rust fungi (Fig. 7). The abstraction of the spore stages is highly desirable here, both for analytical and for identification purposes. During identification, several rust spore stages are difficult or impossible to distinguish based on their morphology alone.</p>
<p><img border="0" width="525" height="115" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image007.gif" /></p>
<p><strong>Figure</strong> <b>6</b><strong>.</strong> Character &times; description matrix where development stages are expressed through separate sets of characters.</p>
<p><img border="0" width="548" height="279" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image008.gif" /></p>
<p><strong>Figure</strong> <b>7</b><strong>.</strong> Character &times; description matrix where spore stages of rust fungi are expressed through separate sets of characters. Each set is assumed to contain similar characters (length, width, shape, septation, wall thickness, surface ornamentation, etc.) that are specialized only through the spore stage they describe. One generalization dimension abstracts from objects to a class description. However, another desirable generalization dimension, shown below the main matrix, would generalize to a "generalized spore". The arrows show the generalization only for the first character in each set. The class description in the lower matrix combines both generalizations.</p>
<h5>Secondary classifiers modeled through extension of class references</h5>
<p>The introduction of a separate "secondary classifier" mechanism which is proposed for BDI.SDD_ is very similar to using normal characters to express secondary classifiers. The classifier characters are analogous to normal characters, but used in a separated context. This allows them to be treated differently when generalizing descriptive information (objects to class, classes to higher classes). Furthermore, the independent mechanism allows them to be added to the diagnostic keys as well.</p>
<p>The introduction of explicit secondary classifiers does not prevent the existence of classifier-specific character groups. However, they will only be necessary where structures or properties apply only to a certain sex or stage. In contrast to the character set model described in the previous section (Figs. 6 and 7), the existence of classifiers does not force the duplication of characters (Fig. 8).</p>
<p><img border="0" width="558" height="115" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image009.gif" /></p>
<p><strong>Figure</strong> <b>8</b><strong>.</strong> Character &times; description matrix where development stages are designated using a secondary classifier mechanism. Some characters are applicable only to certain stages, but other characters are common to different stages. The generalization algorithm providing the class descriptions from object descriptions has detected that the common characters for different stages are strongly different. Thus, separate, stage-specific class descriptions have been prepared.</p>
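<p>The decision described in the caption of Fig. 8 could be sketched as follows. The text does not define what "strongly different" means; this sketch assumes, purely for illustration, that a numeric character is strongly different when the observed ranges for the classifier values do not overlap. The function name <code>generalize</code>, the character <code>length_mm</code>, and all measurements are invented.</p>

```python
def generalize(objects, classifier, character):
    """Generalize a numeric character from objects to a class description.

    Returns {classifier_value: (min, max)} if the per-value ranges are
    pairwise disjoint (the assumed "strongly different" criterion),
    otherwise a single merged range under the key None.
    """
    per_value = {}
    for obj in objects:
        if character not in obj:
            continue  # character not applicable to or not scored for this object
        value = obj[character]
        low, high = per_value.get(obj[classifier], (value, value))
        per_value[obj[classifier]] = (min(low, value), max(high, value))
    if not per_value:
        return {}
    ranges = sorted(per_value.values())
    # After sorting by lower bound, any overlap shows up between neighbours.
    overlapping = any(a[1] >= b[0] for a, b in zip(ranges, ranges[1:]))
    if overlapping or len(ranges) == 1:
        return {None: (ranges[0][0], max(high for _, high in ranges))}
    return per_value


# Invented object descriptions, for illustration only.
objects = [
    {"stage": "larva", "sex": "female", "length_mm": 25},
    {"stage": "larva", "sex": "male", "length_mm": 45},
    {"stage": "imago", "sex": "female", "length_mm": 90},
    {"stage": "imago", "sex": "male", "length_mm": 100},
]

by_stage = generalize(objects, "stage", "length_mm")
by_sex = generalize(objects, "sex", "length_mm")
```

<p>With this data the stages yield separate, stage-specific ranges, whereas the sexes overlap and are merged into one class-level range. A real implementation would need a criterion that also covers categorical characters.</p>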
<h3>Summary of requirements:</h3>
<ul class="compact">
<li>Classifiers are distinct from characters. However, classifier-related characters exist as well: <ul class="compact">
<li>Classifier-related characters can usually not be calculated based on the classifier values</li>
<li>Classifier-related characters require no special handling in identification.</li>
</ul></li>
<li>Dependency relations between classifiers and characters exist: <ul class="compact">
<li>Classifier-related characters may control the valid values for classifiers (heterostyly example)</li>
<li>Classifiers may control characters (e. g., only part of the life cycle stages may have sexual differentiation)</li>
</ul></li>
<li>Classifier concepts are not limited to sex and life cycle or developmental stages.</li>
<li>Classifiers cannot be handled as part of the taxonomic hierarchy.</li>
<li>Classifiers should have representations for multiple audiences/languages.</li>
</ul>
<p style="background:yellow">[@must be further expanded!@]</p>
<p><strong>Problem:</strong> Whether object identification should be with or without classifiers needs discussion! In biology a collected object may have multiple stages (e. g. on a single herbarium sheet). These may or may not be described together. Is it meaningful to have classifiers at all at the object identification? Currently they have been provisionally added there, but I wonder whether they should not be removed!</p>
<p>What is the implicit assumption for a definition of Object? It seems reasonable to define it <em>not</em> as a specimen, but as an individual genetic unit on a preservation unit like a herbarium sheet. If that is so, do we still need multiple values for a single secondary classifier concept?</p>
<p><strong>Problem:</strong> Classifier information in keys may be specific to a class reference, which would be well handled by a classifier mechanism that is added to class references. However, equivalent information may apply to the entire key (which may only deal with larval stages of insects). Although this will be clear to humans from reading the key label, it would be highly desirable to also provide a machine-readable definition. Adding classifier information also to entire keys further complicates the model and evaluation of data. Is this avoidable in some other way of handling classifier information?</p>
<p>Class references are thought to need an additional secondary classifier mechanism in descriptions and diagnostic keys and they may be desirable in the identification of objects (= specimens; Fig. 9). Class references remain without classifiers in the definition of class (= taxonomic) hierarchy and class synonyms (Fig. 10).</p>
<p><img border="0" width="533" height="263" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image010.gif" /></p>
<p><strong>Figure</strong> <b>9</b><strong>.</strong> Visualization of objects with class references that require an additional secondary classifier mechanism (sex, life cycle, or developmental stage).</p>
<p><img border="0" width="469" height="185" src="%ATTACHURLPATH%/d20_tempsecclassdraft3_image011.gif" /></p>
<p><strong>Figure</strong> <b>10</b><strong>.</strong> Only the class references in the class hierarchy (in biology = taxonomic hierarchy) and for the definition of synonyms do not require the secondary classifier mechanism (compare Fig. 9).</p>
<h3>Proposal for the BDI.SDD_ model (version 0.91)</h3>
<p style="background:yellow">[@@ The basic options are probably:</p>
<p style="background:yellow">a) free-form field at the taxon result object in a key (which would have to be manually translated into each language)</p>
<p style="background:yellow">b) specialized generalized "micro-description" facility for classifiers alone, at each keyed out or described taxon object</p>
<p style="background:yellow">c) use normal characters and add some flagging to allow detection of classifiers; plus provide a "micro-description" facility in the keys (where no character data are normally available)</p>
<p style="background:yellow">d) and provide a generalized ontology at the general concept/character state facility to recognize sex and stages (i. e. similar to proposal 2, above)?]</p>
<p>Secondary classifiers like sex and stage are handled by a specialized mechanism that is designed as an extension to standard class name references. The standard class reference type is used to model taxonomic hierarchy and synonymizations (which inside the description model only mirror data from external nomenclature or taxonomy providers). The extended class reference type provides an additional sequence of secondary classifier values. These are references to concept states defined at concept nodes. They do not refer to <em>character states</em>! The extended class reference type is used to (see Fig. 9):</p>
<ul class="compact">
<li>identify specimen objects with a taxon name,</li>
<li>define the taxon name results in guided keys,</li>
<li>define the name of a taxon or class that is described in coded or natural language descriptions.</li>
</ul>
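<p>The two reference types described above might be sketched as the following data structures. All class and field names (<code>ClassReference</code>, <code>ConceptStateRef</code>, <code>ExtendedClassReference</code>, <code>class_name_id</code>, and so on) are assumptions for illustration and do not reproduce the BDI.SDD_ schema.</p>

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ClassReference:
    """Plain reference to a class name proxy.

    Used for the taxonomic hierarchy and for synonymy, where no
    classifier information is wanted.
    """
    class_name_id: str


@dataclass(frozen=True)
class ConceptStateRef:
    """Reference to a concept state defined at a concept node.

    This is deliberately not a reference to a character state.
    """
    concept_id: str
    state_id: str


@dataclass(frozen=True)
class ExtendedClassReference(ClassReference):
    """Class reference plus a sequence of secondary classifier values.

    Intended for object identifications, key results, and description names.
    """
    classifiers: Tuple[ConceptStateRef, ...] = ()

    def generalize(self) -> ClassReference:
        """Drop the classifiers and return the plain class reference."""
        return ClassReference(self.class_name_id)


# Hypothetical identifiers, for illustration only.
larva_ref = ExtendedClassReference(
    "danaus_plexippus",
    (ConceptStateRef("stage", "larva"),),
)
```

<p>Because the classifier values are structured references rather than part of the name string, generalizing "<i>Danaus plexippus</i> (larvae)" to "<i>Danaus plexippus</i>" requires no string parsing in this sketch.</p>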
<p>TODO: Add new concept tree type "secondary classification concepts"</p>
<p><strong>Question:</strong> which modifiers would be necessary at secondary classifiers?</p>
---
(End of document pasted directly. Also a zipped [[%ATTACHURL%/D20_TempSecClassDraft3.zip][RTF version that includes all figures]] is provided.)
-- Gregor Hagedorn - 26. April 2004
Polymorphisms, due to sexual differences, life cycle, developmental stages and other factors, frequently appear to be a problem, because we wish to assign the organisms to the same taxonomic category while the descriptions of the members of the category may vary widely. This conflict can be attributed to the interrelationship between the function or purpose of the taxonomic category and the description. The taxonomic category can lead to a description, or a description may lead to the conclusion that an item belongs to a particular taxonomic category. This relationship, however, does not mean that the taxonomic category and the descriptions are synonymous or in most other aspects equivalent. The description is simply a list of observable attributes. Groups of these attributes may appear only within individual "phases" of an organism while always being associated with the taxonomic category. All characteristics from any phase are "true" of the taxonomic category, but we wish to organize them according to these correlated characters as well as make explicit the existence of the phases. While this may not help at any point in time for an identification, it will help over time.
Advantages of "phase" representation type: Each of the three representations "Stage grouping preceding taxonomic name", "Stage as subheading within a description", and "Sex/stage as annotation within a description" has communicative advantages and disadvantages. I believe that, in the end, for ease of writing, authors of descriptions will continue to view the organization of these attributes in all three ways and more. "Stage grouping preceding taxonomic name" has the advantage of allowing one to focus on the attributes of that one stage. After all, a person encountering one individual of a species will only encounter it in one of its "phases". This allows a reader to pick a "phase" or "stage" first, to pick which key to use. For example, go to the "larval descriptions" part of the key if you have a larva and not an adult. "Stage as subheading within a description" presumes the inclusion of attributes from other "phases" or "stages", providing a more complete picture of a population over time. This representation also allows a reader to more easily compare "stages." "Sex/stage as annotation within a description" is best for cases where most attributes are shared among the "stages" and only a few are dimorphic.
Transformational equivalence: The choice we make for the data structures is in part determined by an evaluation of the transformational equivalence of the representations. All other things being equal, we should choose the representation that can be mechanically transformed into all three frameworks above. It is the job of the application to make the transformations for the user. The current BDI.SDD_ framework and many other TDWG standards are taxonomic-category-centric. They are organized around these concepts. This makes it difficult to represent the first type, "Stage grouping preceding taxonomic name", in BDI.SDD_, as Gregor points out with the introduction of "pseudo-taxa" to address the issue of two sexes under the same taxon. <br />
Species 1<br />
Species 1 (female)<br />
Species 2 (male)<br />
We can easily imagine an application that, for the convenience of the reader, could reorganize a key in the other forms into this form. So we can exclude this representation from consideration for the BDI.SDD_ internal structure. BDI.SDD_ could support both of the other two.
We might support two mechanisms. One mechanism (cf. "character sets" in Gregor's description) might be to use a concept (tree node) type label to support "Stage as subheading within a description". This may be a variant of Gregor's option "d" on the last page of the discussion. This typing system, or maybe just the labels, requires an answer to Gregor's question about the naming of these classifiers.
Option "c" is most appropriate in cases where few characteristics differ between the "phases" or "stages" to be differentiated.
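The kind of mechanical reorganization meant here might look like the following sketch. The record layout (taxon, stage, text) and the function names stage_first and taxon_first are assumptions for illustration only; the point is merely that both views can be derived from the same stored records.

```python
from collections import defaultdict


def stage_first(records):
    """Regroup (taxon, stage, text) records as {stage: {taxon: text}}."""
    grouped = defaultdict(dict)
    for taxon, stage, text in records:
        grouped[stage][taxon] = text
    return dict(grouped)


def taxon_first(records):
    """Regroup (taxon, stage, text) records as {taxon: {stage: text}}."""
    grouped = defaultdict(dict)
    for taxon, stage, text in records:
        grouped[taxon][stage] = text
    return dict(grouped)


# Invented placeholder records, for illustration only.
records = [
    ("Species 1", "larva", "description A"),
    ("Species 1", "adult", "description B"),
    ("Species 2", "larva", "description C"),
]

by_stage = stage_first(records)
by_taxon = taxon_first(records)
```

Both groupings contain the same information, so an application could present either one to the reader regardless of which one is stored.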
-- Bryan Heidorn April 27, 2004
%META:FILEATTACHMENT{name="D20_TempSecClassDraft3.zip" attr="" comment="zipped rtf file, including all figures " date="1083005151" path="C:\Data\Desktop\DESCR\D20_TempSecClassDraft3.zip" size="210309" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image001.gif" attr="h" comment="" date="1083005220" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image001.gif" size="4449" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image002.gif" attr="h" comment="" date="1083005286" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image002.gif" size="1865" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image003.gif" attr="h" comment="" date="1083005322" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image003.gif" size="2176" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image004.gif" attr="h" comment="" date="1083005347" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image004.gif" size="2960" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image005.gif" attr="h" comment="" date="1083005363" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image005.gif" size="1367" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image006.gif" attr="h" comment="" date="1083005379" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image006.gif" size="1382" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image007.gif" attr="h" comment="" date="1083005397" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image007.gif" size="4011" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image008.gif" attr="h" comment="" date="1083005410" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image008.gif" size="7930" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image009.gif" attr="h" comment="" date="1083005430" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image009.gif" size="5050" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image011.gif" attr="h" comment="" date="1083005478" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image011.gif" size="3015" user="GregorHagedorn" version="1.1"}%
%META:FILEATTACHMENT{name="d20_tempsecclassdraft3_image010.gif" attr="h" comment="" date="1083005511" path="C:\WINGH\Temp\d20_tempsecclassdraft3_image010.gif" size="7724" user="GregorHagedorn" version="1.1"}%