wiki-archive/twiki/data/TIPAdmin/EnsuringLongTermPeservation...

head 1.3;
access;
symbols;
locks; strict;
comment @# @;
1.3
date 2005.11.25.10.28.36; author RogerHyam; state Exp;
branches;
next 1.2;
1.2
date 2005.11.24.09.39.49; author RogerHyam; state Exp;
branches;
next 1.1;
1.1
date 2005.11.23.16.28.49; author RogerHyam; state Exp;
branches;
next ;
desc
@none
@
1.3
log
@none
@
text
@%META:TOPICINFO{author="RogerHyam" date="1132914516" format="1.1" version="1.3"}%
%META:TOPICPARENT{name="MeetingNotes"}%
---+ Ensuring Long-Term Preservation and Adding Value to Scientific and Technical Data (PV2005)
Royal Society of Edinburgh.
Attended by Main.RogerHyam, 21st to 23rd November 2005.
---++ Summary
This three-day conference had some relevant and some not-so-relevant talks. I have a hard copy of the papers presented (which is quite comprehensive) if you need to know more.
I did not attend the final part of day 2 or day 3 as the papers didn't seem worth the trip. The first two days were very useful though. Here is a summary of the bits that look relevant to us from some of the papers.
---++ Papers
---+++ Long-term preservation of chemical information
Peter Murray-Rust and Henry Rzepa.
Quite inspirational. They talked about how little information sharing there is in chemistry and the issue of surrendering copyright to data when it is published. The important points concerned documentation:
* Everything scientific should be released under a Creative Commons license when possible (worth looking into as a TDWG default license option!)
* PDF documentation is 'semantically empty' and should be avoided.
* Tools were demonstrated for checking Chemistry publications
* Similar tools were in place to help people create more semantic publications, written as an extension to Excel. (This may be something worth looking into: can we get people to use familiar tools to create more meaningful documentation?)
* The only way to get people to stop using Word/PDF to do things in a semantically poor way is to require the correct format and have tools for checking it.
* There is a hashing algorithm for generating a unique id for a chemical compound. (The same would be possible for a biological name but I don't think people would accept it...)
---+++ Data information and management system for the DFD multi-mission earth observation data
Kiemle et al.
Very big data management project wanting to re-use as much infrastructure as possible between different satellite missions. Two take-home messages for us:
* A new satellite monitoring mission should just be a run-of-the-mill use case for the system. This equates to: incorporating a new type of biodiversity data should just be a run-of-the-mill use case for the TDWG architecture.
* Stability of the design layers increases as you go up, i.e. the abstract service layer is more stable than the object layer, which is more stable than the technical layer (obvious but worth pointing out).
---+++ Performing a migration in the framework of the OAIS reference model: NSSDC Case Study.
Big problems with preservation of data and software but not documentation. Stressed strongly the need for preservation standards for documentation of data. (OAIS = Reference Model for an Open Archival Information System, ISO 14721 (2003))
---+++ Formalisation of material property data analysis with web ontology
* Mixed XML Schema and RDF/OWL.
* Comments from the floor that doing this was dodgy.
* Comment from the floor that there was already an ISO standard, ISO 10303, which had a dynamic data dictionary for material science.
---+++ Developing and using standards for data and information in science and technology
John Rumble et al.
* Gave a good breakdown of why to have standards. Economics being top of the list and Nomenclature being a very important one.
---+++ The Semantic Planetary Data System
steve.hughes@@jpl.nasa.gov
* They archive data from the NASA missions to planets and moons.
* Lots of data from different places gathered since 1980s.
* Dictionary of terms.
* Want to expose the whole thing as RDF.
* Have a pilot project up and running and are seeking approval from their board of directors for a full roll-out at the beginning of next year.
* Talk actually given by Elaine Dobinson (see below)
---+++ Adaptation and use of OpenGIS web technologies for multi-disciplinary access to planetary data
elaine.dobinson@@jpl.nasa.gov
* I didn't actually make this talk, as it was the only one worth going to on Wednesday, but I have read the associated paper.
* There is obvious overlap between this paper and the one above. Same institute, same data, different approaches. I believe this maps very closely to what we need: ontology stuff plus OGC standards stuff.
* Have written to the two authors asking if they could expand on how the two will fit together - or if they have even considered it. We have a load of overlap with them even though they have nothing living - yet!
---+++ Chat with Jack Smith
Met F.J. Smith from Queen's University Belfast. Editor of [[http://www.datasciencejournal.org/][Data Science Journal]], the CODATA journal. He was keen for me to submit papers before he steps down as editor at the end of the year. Would it be good to submit stuff?
---++ Final thoughts
Although there was a load of unrelated stuff, the conference was actually very informative; it confirmed some ideas and suggested others.
In a boring talk about file names I came up with DIPI as a set of high-level requirements for the TDWG architecture:
* Dynamic: The thing has to change through time incorporating new domains and changing existing ones.
* Integration: All standards must integrate with each other and with other ones outside.
* Provenance: All things need identity, ownership, acknowledgement, history.
* Inference: (In the broadest terms) Where we are all headed. At some point in the future we will need the machines to do some form of reasoning about the data even if it is only in the form of assisted searching.
Data Dictionaries are everywhere. In most fields someone defines a dictionary that other people use as a central reference. This is the basis for both a central ontology and nomenclators moving forward.
-- Main.RogerHyam - 23 Nov 2005@
1.2
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="RogerHyam" date="1132825189" format="1.1" version="1.2"}%
d31 1
a31 1
Big problems with preservation of data and software but not documentation. Stressed strongly the need for preservation standards for documentation of data.
d78 1
a78 1
-- Main.RogerHyam - 23 Nov 2005
@
1.1
log
@none
@
text
@d1 1
a1 1
%META:TOPICINFO{author="RogerHyam" date="1132763329" format="1.1" version="1.1"}%
d54 2
a55 1
*
d57 5
a62 1
---++ Summary of ideas
d69 1
a69 1
Data Dictionaries are everywhere. Some one in most fields defines a dictionary.
@