%META:TOPICINFO{author="FalkHuettmann" date="1222155169" format="1.1" version="1.8"}%
%META:TOPICPARENT{name="WebHome"}%
---++Metadata
This wiki has been established to facilitate discussion of the use of metadata standards within TDWG. It follows from the call issued to the TDWG membership by Eamonn O Tuama on 29 July 2008, as follows:
---++++Initial call
*Metadata Discussion Meeting at TDWG Conference 2008*
The Atlas of Living Australia and Global Biodiversity Information Facility are co-organisers of a special meeting on metadata to be held at the TDWG Conference in Fremantle, Australia, Sunday 19 October 2008, 14.00-17.00.
We propose to open the discussion to any aspect of metadata but with a particular focus on metadata standards, major projects, systems in use, the role of ontologies, and certain key questions surrounding metadata.
*Key Questions*
The key questions needing discussion include, but are not limited to,
the following:
1. What resources do we want to describe (all digital; some classes of digital; digital AND undigitised legacy; human experts, etc., record and dataset level)?
2. Can we make some preliminary core recommendations on metadata elements which all TDWG-related projects should maintain as a minimum?
3. How do we proceed to fuller recommendations on metadata standards and profiles, and web service interfaces, e.g., Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) - including appropriate use of PMH Sets - and Open Archives Initiative Object Reuse and Exchange (OAI-ORE)?
4. How do we handle natural language differences in metadata?
5. How can ontologies be effectively used with metadata?
6. How do we encourage production of good metadata?
7. How can a global, decentralised biodiversity network best share metadata?
8. Are there any other things we can do immediately to facilitate the interchange of metadata between our projects?
9. How should we avoid duplication of metadata records as they are harvested and aggregated in different locations?
10. Are there pilot activities around which we can test some of these ideas?
11. How should we establish ongoing collaboration within TDWG on this topic (wiki, mailing list, convenor, etc.)?
To make effective use of the limited time for such a broad subject, we propose to offer a semi-open format for the discussion. This will revolve around a series of maximum-5-minute, 2-PowerPoint-slides presentations from represented projects on i) current metadata strategy and standards; ii) issues they need to see addressed. We require pre-submission and circulation of these slides to help build a sense of collaboration around a process and to create an "Introductory Statement" that will guide and inform our discussions on the day. A metadata wiki will be established prior to the meeting to aid our discussions.
If you are interested in participating in the meeting, please contact the convenors Donald Hobern (Donald.Hobern@csiro.au) and Eamonn O Tuama (eotuama@gbif.org). If you wish to make a presentation, please submit your slides to the convenors by 1 September 2008.
-- Main.DonaldHobern - 28 Aug 2008
---++++Follow-up discussion
RogerHyam provided the following comments:
_There should be a question number zero. In fact question zero might be the only question worth asking at this stage and it is:_
_What will a metadata system enable us to do that we can't do without it? i.e. What is the primary use case for us gathering metadata? What are the requirements?_
_We can't answer any of the questions 1 to 11 unless we know what the system is supposed to enable us to do._
_I'd like to see half a dozen scenarios sketched out so that we can test (in thought experiments initially) how different proposals might play out. By scenario I am thinking of something a little more detailed than "Discover biodiversity data"._
_I'll fine-tune my comments to say question zero should be "Why should we align our metadata efforts?"_
_My worry is that there is so little generality of purpose between the different projects. I get the feeling that people have some crazy/vague ideas of the questions they will be able to ask of big metadata repositories and expectations will need to be managed or there will be disappointment with the results._
DonaldHobern responded as follows:
_Over the last few years GBIF and others have worked to integrate specimen/observation records and have learned the benefit of national and international projects collaborating not only around agreed data exchange standards but also (belatedly) around tracking the reharvesting and aggregation of these data items via globally unique identifiers. In other words, we have had to address the issue of a complex world of primary sources, primary aggregators, secondary aggregators, and hybrids between these._
_We are now moving into a world in which a number of projects are working to manage a much wider range of information relevant to their taxonomic, regional or thematic subdomain. Examples include EOL, GBIF, ALA, OBIS, NBII, etc. Many of these projects need to manage metadata relating to the same data resources. Similarly we expect there to be much reuse of basic metadata by aggregation projects (e.g. for EOL to reuse metadata for local Australian resources aggregated by the ALA, or for the ALA to reuse metadata gathered for resources of relevance to Australia and aggregated by OBIS, uBio, MorphBank, EOL, etc.)._
_As we do this we will face all the same questions we faced with managing specimen/observation records. We want to avoid noisy feedback loops. We want to ensure that essential elements (e.g. IPR statements) are preserved as these metadata flow around. We want to be able to detect and exploit additional metadata (e.g. user annotations, domain specific indexes) as these appear._
_We also need to bear in mind that better global management of metadata relating to specimen/observation datasets will be another tool to help us improve our management of the individual specimen/observation records, and that a robust system for referencing taxonomic and nomenclatural datasets may help us all to be more explicit about our concept authorities._
-- Main.DonaldHobern - 28 Aug 2008
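The concerns in the response above (records arriving by several aggregation routes, IPR statements that must survive, extra annotations worth keeping) could be sketched as a merge keyed on a globally unique identifier. This is only an illustration under stated assumptions: the record keys =guid=, =rights=, =source= and =annotations=, and the function name =merge_harvested=, are hypothetical and do not come from any TDWG, GBIF or ALA schema.

```python
def merge_harvested(records):
    """Collapse records that share a GUID, however many aggregators relayed them.

    Each record is a dict with hypothetical keys: "guid", "rights" (the IPR
    statement), "source" (who it was harvested from) and "annotations" (a list).
    """
    merged = {}
    for rec in records:
        guid = rec["guid"]
        if guid not in merged:
            # First sighting of this GUID: start a merged record.
            merged[guid] = {
                "guid": guid,
                "rights": rec.get("rights"),
                "sources": [rec["source"]],
                "annotations": list(rec.get("annotations", [])),
            }
            continue
        kept = merged[guid]
        # Track every route by which the record reached us.
        if rec["source"] not in kept["sources"]:
            kept["sources"].append(rec["source"])
        # A copy that lost its IPR statement must not erase one that has it.
        if kept["rights"] is None:
            kept["rights"] = rec.get("rights")
        # Collect annotations added along the way, without repeats.
        for note in rec.get("annotations", []):
            if note not in kept["annotations"]:
                kept["annotations"].append(note)
    return list(merged.values())


harvest = [
    {"guid": "urn:lsid:example.org:dataset:1", "rights": None, "source": "EOL"},
    {"guid": "urn:lsid:example.org:dataset:1", "rights": "CC-BY",
     "source": "ALA", "annotations": ["checked"]},
    {"guid": "urn:lsid:example.org:dataset:2", "rights": "CC0", "source": "OBIS"},
]
catalogue = merge_harvested(harvest)
```

Keeping the list of sources rather than only the first one is a deliberate choice in this sketch: it lets an aggregator notice when a record it receives has already passed through itself, which is one way a feedback loop could be detected.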
Éamonn Ó Tuama responded as follows:
I don't have much to add to what Donald says. Of course metadata has limitations but it serves a very important function in data curation and re-use. The value of metadata obviously depends on its richness/comprehensiveness. Yes, expectations will need to be managed but even having very basic "discovery" level metadata is very useful and if we can begin to build on the NCEAS/KNB approach of incorporating ontologies as well as user-supplied annotations then we might achieve something even better.
-- Main.EamonnOTuama - 2 Sept 2008
Bob Morris remarks:
A common cry of pain I hear (including in the Plazi SPM project) is "What should be minimal best practice for LSID resolution data/metadata?" From the viewpoint of an LSID client application such as our SPM generation, one specific thing that seems to be emerging is something like: given a TaxonConcept LSID, could we programmatically detect to what, if any, best practices it conforms, and exploit that? For example, it would be nice to know without parsing resolution output whether the resolver is respecting the voc/TaxonConcept deprecated elements and offering a Rank in a TaxonName element as recommended. So this might be approached by providing somewhere a meta-metadata mechanism that provides a list of URIs representing best practices to which this resolution asserts it conforms. Client code of LSIDs (or whatever) could be written with designs that offer their highest utility for stuff which conforms to those best practices, with possibly lower utility for those services that only meet the required standards. In principle, this should encourage not only community best-practices development, but also encourage service providers to follow such practices wherever they can. To go a little farther, mechanisms could be developed that allow services to offer conformance to different, conflicting practices.
-- Main.BobMorris - 30 Aug 2008
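As a rough illustration of the meta-metadata idea above, a client might choose its code path from the list of best-practice URIs a resolver declares, without first parsing the resolution output. Everything named here is an assumption made for the sketch: the two example.org URIs, the =ResolverDeclaration= type and the =choose_strategy= function are hypothetical, and no such registry or LSID extension is defined on this page.

```python
from dataclasses import dataclass, field

# Hypothetical URIs naming best practices (assumed for illustration only).
RANK_IN_TAXON_NAME = "http://example.org/best-practice/rank-in-taxon-name"
NO_DEPRECATED_ELEMENTS = (
    "http://example.org/best-practice/no-deprecated-taxonconcept-elements"
)


@dataclass(frozen=True)
class ResolverDeclaration:
    """What a resolver asserts about the metadata it returns for one LSID."""
    lsid: str
    conforms_to: frozenset = field(default_factory=frozenset)


def choose_strategy(declaration, required):
    """Pick a client code path from the declared practices alone.

    Returns "full" when every practice the client relies on is declared,
    "degraded" when only some are, and "baseline" when none are.
    """
    declared = declaration.conforms_to & required
    if declared == required:
        return "full"
    if declared:
        return "degraded"
    return "baseline"


needs = frozenset({RANK_IN_TAXON_NAME, NO_DEPRECATED_ELEMENTS})
partial = ResolverDeclaration(
    "urn:lsid:example.org:taxonconcept:1",
    frozenset({RANK_IN_TAXON_NAME}),
)
silent = ResolverDeclaration("urn:lsid:example.org:taxonconcept:2")
```

Here =choose_strategy(partial, needs)= gives "degraded" and =choose_strategy(silent, needs)= gives "baseline", which matches the suggestion that clients offer their highest utility to conforming services and lower utility to the rest. A declaration is only an assertion, so a real client would still need to cope with resolvers that claim more than they deliver.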
---++++<noautolink>PowerPoint</noautolink> presentations for Perth
* [[%ATTACHURL%/AADC_metadata.ppt][AADC_metadata.ppt]]: Australian Antarctic Data Centre - Metadata issues for TDWG 2008 (Dave Watts)
* [[%ATTACHURL%/ALA_Metadata.ppt][ALA_Metadata.ppt]]: ALA Metadata - Goals and Issues (Donald Hobern)
* [[%ATTACHURL%/gbif-metadata-strategy.ppt][gbif-metadata-strategy.ppt]]: Presentation for Perth: GBIF (Éamonn Ó Tuama)
* [[%ATTACHURL%/Palanisamy_metadata_tdwg2008.ppt][Palanisamy_metadata_tdwg2008.ppt]]: Presentation for Perth: GBIF (Giri Palanisamy)
--------------------
Dear Colleagues,
Below, I add my reasoning from a communication with Eamonn for your information:
-----Original Message-----
From: Falk Huettmann [mailto:fffh@uaf.edu]
Sent: 16 September 2008 22:59
To: 'Eamonn O Tuama'
Subject: FH Metadata RE: [Fwd: [tdwg] Metadata Discussion Meeting at TDWG
2008 Conference]
Hi Eamonn,
thanks for the info. Too bad I cannot make it to that very meeting; it's too far away
for us.
Re. Metadata content and standards, I think the issue is clear (FGDC NBII certainly,
but NOT GCMD or a sub- or national standard such as the Aussie one.
The ISOs would be the goal, as long as they stick to FGDC, or improve it).
Re. technicalities, sure. We need to know all of it, and ought to have leadership with a vision.
Re. how to implement Metadata needs: Via laws and funding. No negotiating, no grandfathering.
Many people mistake this issue with the traditional way of doing business.
What is needed: a) Infrastructure support for people to do Metadata, and b)
Awards and Benefits for people doing so; similar to a citation index for
P&T. These efforts are currently badly missing, e.g. in our gov. institutions and unis.
I enclose you a paper on the Metadata matter etc you might have seen
already. It includes issues I mention above.
Citation goes:
Huettmann, F. (2007). The digital teaching legacy of the International Polar
Year (IPY): Details of a present to the global village for achieving
sustainability. Eds A. M. Tjoa and R.R. Wagner. Proceedings 18th
International Workshop on Database and Expert Systems Applications (DEXA)
3-7 September 2007, Regensburg, Germany. IEEE Computer Society, Los
Alamitos, CA. Pages 673-677.
Please keep me posted.
I enjoy our communication & work; THANKS. Hope I can contribute better over
time.
Very best
Falk
Falk Huettmann PhD, Assistant Professor
-EWHALE lab- Biology and Wildlife Dept., Institute of Arctic Biology
419 IRVING I, University of Alaska Fairbanks AK 99775-7000 USA Email
fffh@uaf.edu Phone 907 474 7882 Fax 907 474 6716
%META:FILEATTACHMENT{name="AADC_metadata.ppt" attachment="AADC_metadata.ppt" attr="" comment="Presentation for Perth: AADC (Dave Watts)" date="1219925705" path="AADC_metadata.ppt" size="1288704" stream="AADC_metadata.ppt" user="Main.DonaldHobern" version="1"}%
%META:FILEATTACHMENT{name="ALA_Metadata.ppt" attachment="ALA_Metadata.ppt" attr="" comment="Presentation for Perth: ALA (Donald Hobern)" date="1219997887" path="ALA_Metadata.ppt" size="978944" stream="ALA_Metadata.ppt" user="Main.DonaldHobern" version="1"}%
%META:FILEATTACHMENT{name="gbif-metadata-strategy.ppt" attachment="gbif-metadata-strategy.ppt" attr="" comment="Presentation for Perth: GBIF (Éamonn Ó Tuama)" date="1220356865" path="gbif-metadata-strategy.ppt" size="2026496" stream="gbif-metadata-strategy.ppt" user="Main.EamonnOTuama" version="1"}%
%META:FILEATTACHMENT{name="Palanisamy_metadata_tdwg2008.ppt" attachment="Palanisamy_metadata_tdwg2008.ppt" attr="" comment="Presentation for Perth: GBIF (Giri Palanisamy)" date="1221665055" path="Palanisamy_metadata_tdwg2008.ppt" size="9829888" stream="Palanisamy_metadata_tdwg2008.ppt" user="Main.EamonnOTuama" version="1"}%