wiki-archive/twiki/data/GUID/StateConceptsNeedGuids.txt

---+++ This is a discussion about character states and state concepts, and whether it would be useful to give them GUIDs

*from BobMorris*: There are lots of different ways to talk about 'blue', so giving a GUID to 'blue' instead of to an identified description of 'blue' somewhere seems to make one point of GUIDs, comparability, become odd.

   For color states and categorical states that might correspond to the partitioning of an interval, SDD already embeds concepts for what amounts to state concepts by providing optional mechanisms for mapping into something more explicit. Thus it is easy in SDD to map 'small' 'medium' and 'large' each to a specific numeric ranges within a particular Terminology. Color states, such as 'blue' can be mapped to a polyhedron in a color space. (The current SDD may be a little underspecified in this regard).  As I recall, those are the only mappings, but some things that are imprecisely specified in   human-centric glossaries, e.g. "ovate" could well be specified by measurement-based definitions. Thus providing an ovate concept which could profitably be used by, e.g. machine identification of leaves using image processing techniques, provided the concept had a GUID so that two such processing programs could understand whether they were using the same concept. This notion can probably be adopted for anything which can be observed using a protocol that can be adequately specified.

To the extent that an SDD Terminology (the part of an SDD document that specifies its controlled vocabulary) can be given a GUID, most of the objects within a Terminology inherit GUIDs by virtue of the fact that they have an identifier unique to the Terminology and the (small number of) distinguished Character types which constrain the container in which they live. A separate GUID on the Character would perhaps make  it easier to extract characters for reuse more than they are now, but I don't see that this applies to States. However, as far as the aforesaid mappings go, I suspect they meet many of the requirements for an "Observed Value Concept"
should those requirements ever be identified. As KevinThiele observed in establishing this particular discussion, the likelhood that ecology property values and morphological observation states have a lot in common is very high.

Character concepts with GUIDS also probably fit already in SDD-described characters, but these may be tied to the Terminology more than state concepts. Not sure.

I still think I don't want a GUID on 'blue', certainly not on 'small' and probably not on 'ovate'.

BobMorris 28 January 2006

----

*from KevinThiele*: I think several different issues are raised here. The main one is the necessary (and uncomfortable) imprecision of categorical states as usually used in taxonomy. Ovate as usually used in taxonomy has considerable imprecision. Peter Stevens has often made this point and has called for greater use of actual numerics - in this case, if we could record length/width ratio and distance-to-widest-point/length ratio (plus, perhaps, a few angles), we would have a much better descriptor for many common leaf shapes - ovate, elliptic, obovate etc. We could go better still with elliptic Fourier analysis (see e.g. http://www.botany.utoronto.ca/faculty/dickinson/MorphometricMethods.HTML).

The problem is in the practice. It's a whole lot easier to say that a leaf is ovate than to define its ovateness, and unfortunately I don't think we can or should require our few remaining and elderly taxonomists to do more work than they already are in describing biodiversity. It's that damned imperfect world again.
[Ummm, this argument is often used against taxon concepts... BobMorris Jan 28 2006]
The imperfection becomes even greater when you consider the full suite of morphological descriptors in biology. Plan leaf shapes are a doddle to quantify compared with - virtually everything else.

So, I think imprecise and messy categorical states are here to stay. The question becomes - can we do more with them if they have GUIDs than if they don't?
[I don't at all mean to deny GUIDs on imprecise states, but rather that imprecise states deserve a concept every bit as much as, say, illegally named taxa. the concept would simply be a reference to a place where, e.g. "ovate" is imprecisely defined. If there are two or more such places with different possible interpretations, all the more need for an application or human to find both of them. BobMorris]

Bob's second point is that SDD is actually quite sophisticated (and may well become more so) in handling states and state concepts. His solution seems to be to put a GUID on an SDD document and leave the rest to an interpreter. This seems to me an inconvincing solution. The same could be said with respect to names and name concepts - give a GUID to ITIS, then the ITISGUID+ITISLocalID becomes globally unique. My problem is that I need to explode SDD documents and recombine elements from different documents. It's true that I can't be absolutely sure that ovate in one document is exactly the same as ovate in another - but that's the same for any pair of descriptions in the world now, and we get by reasonably OK.
[I waffled about "GUIDs by inheritance" to get it on the table. I do mention that finer GUIDing in an SDD document could well address the currently weak story in SDD about re-use of characters and states. Technically, I think just GUIDing deep inside SDD isn't quite enough to do that in SDD 1.0, but it has often been suggested that SDD2.0 would specifically address less-than-globally-shared terminologies. BobMorris 29 Jan 2006]

KevinThiele 29 Jan 2006

----
Archive entire twiki setup from owl as of today 2016-02-24 10:46:01 +00:00			`---+++ This is a discussion about character states and state concepts, and whether it would be useful to give them GUIDs`

			`from BobMorris: There are lots of different ways to talk about 'blue', so giving a GUID to 'blue' instead of to an identified description of 'blue' somewhere seems to make one point of GUIDs, comparability, become odd.`

			For color states and categorical states that might correspond to the partitioning of an interval, SDD already embeds concepts for what amounts to state concepts by providing optional mechanisms for mapping into something more explicit. Thus it is easy in SDD to map 'small' 'medium' and 'large' each to a specific numeric ranges within a particular Terminology. Color states, such as 'blue' can be mapped to a polyhedron in a color space. (The current SDD may be a little underspecified in this regard). As I recall, those are the only mappings, but some things that are imprecisely specified in human-centric glossaries, e.g. "ovate" could well be specified by measurement-based definitions. Thus providing an ovate concept which could profitably be used by, e.g. machine identification of leaves using image processing techniques, provided the concept had a GUID so that two such processing programs could understand whether they were using the same concept. This notion can probably be adopted for anything which can be observed using a protocol that can be adequately specified.

			To the extent that an SDD Terminology (the part of an SDD document that specifies its controlled vocabulary) can be given a GUID, most of the objects within a Terminology inherit GUIDs by virtue of the fact that they have an identifier unique to the Terminology and the (small number of) distinguished Character types which constrain the container in which they live. A separate GUID on the Character would perhaps make it easier to extract characters for reuse more than they are now, but I don't see that this applies to States. However, as far as the aforesaid mappings go, I suspect they meet many of the requirements for an "Observed Value Concept"
			`should those requirements ever be identified. As KevinThiele observed in establishing this particular discussion, the likelhood that ecology property values and morphological observation states have a lot in common is very high.`

			`Character concepts with GUIDS also probably fit already in SDD-described characters, but these may be tied to the Terminology more than state concepts. Not sure.`

			`I still think I don't want a GUID on 'blue', certainly not on 'small' and probably not on 'ovate'.`

			`BobMorris 28 January 2006`

			`----`

			from KevinThiele: I think several different issues are raised here. The main one is the necessary (and uncomfortable) imprecision of categorical states as usually used in taxonomy. Ovate as usually used in taxonomy has considerable imprecision. Peter Stevens has often made this point and has called for greater use of actual numerics - in this case, if we could record length/width ratio and distance-to-widest-point/length ratio (plus, perhaps, a few angles), we would have a much better descriptor for many common leaf shapes - ovate, elliptic, obovate etc. We could go better still with elliptic Fourier analysis (see e.g. http://www.botany.utoronto.ca/faculty/dickinson/MorphometricMethods.HTML).

			`The problem is in the practice. It's a whole lot easier to say that a leaf is ovate than to define its ovateness, and unfortunately I don't think we can or should require our few remaining and elderly taxonomists to do more work than they already are in describing biodiversity. It's that damned imperfect world again.`
			`[Ummm, this argument is often used against taxon concepts... BobMorris Jan 28 2006]`
			`The imperfection becomes even greater when you consider the full suite of morphological descriptors in biology. Plan leaf shapes are a doddle to quantify compared with - virtually everything else.`

			`So, I think imprecise and messy categorical states are here to stay. The question becomes - can we do more with them if they have GUIDs than if they don't?`
			`[I don't at all mean to deny GUIDs on imprecise states, but rather that imprecise states deserve a concept every bit as much as, say, illegally named taxa. the concept would simply be a reference to a place where, e.g. "ovate" is imprecisely defined. If there are two or more such places with different possible interpretations, all the more need for an application or human to find both of them. BobMorris]`

			Bob's second point is that SDD is actually quite sophisticated (and may well become more so) in handling states and state concepts. His solution seems to be to put a GUID on an SDD document and leave the rest to an interpreter. This seems to me an inconvincing solution. The same could be said with respect to names and name concepts - give a GUID to ITIS, then the ITISGUID+ITISLocalID becomes globally unique. My problem is that I need to explode SDD documents and recombine elements from different documents. It's true that I can't be absolutely sure that ovate in one document is exactly the same as ovate in another - but that's the same for any pair of descriptions in the world now, and we get by reasonably OK.
			`[I waffled about "GUIDs by inheritance" to get it on the table. I do mention that finer GUIDing in an SDD document could well address the currently weak story in SDD about re-use of characters and states. Technically, I think just GUIDing deep inside SDD isn't quite enough to do that in SDD 1.0, but it has often been suggested that SDD2.0 would specifically address less-than-globally-shared terminologies. BobMorris 29 Jan 2006]`

			`KevinThiele 29 Jan 2006`

			`----`