wiki-archive/twiki/data/Phylogenetics/PhylogeneticsStandardsWorks...

%META:TOPICINFO{author="HilmarLapp" date="1225425380" format="1.1" version="1.1"}%
%META:TOPICPARENT{name="WebHome"}%
---+!! %MAKETEXT{"TDWG Phylogenetics Standards Workshop"}%

The workshop took place on Oct 19, 2008, at the 2008 TDWG Annual Meeting in Fremantle (Perth),

*Table of Contents:*
%TOC%

---++ %MAKETEXT{"Agenda"}%

[[http://docs.google.com/Doc?id=dhdjhbvd_47ccsrndg6][Workshop objectives, agenda, and initial participant list]] were published in advance and disseminated to all TDWG conference attendees 1 week prior to the event.

---++ %MAKETEXT{"Workshop Notes"}%

Attendees: 18 in total

---+++ Introductions
   * Stan Blum: TDWG process
      * Foster sharing of standards, data, and biodiversity informatics developments
      * Process is fairly lightweight
      * Put up (social) interfaces for other people to discover you and find out how to contribute
      * Interest group is a group of people that have a shared set of problems that they would like to address.
      * Key piece for a group to "declare itself" is a charter: what is it that the group aims to do.
      * Then create Task Groups, each of which will also have a charter that says what the Task Group will do.
      * Standards are essentially documents, specifications.
      * Infrastructure in the form of mailing lists and wikis are provided.
      * Ratification is initiated by the convener of the Task Group
         * Submits the specification document to the Executive Committee
         * Review manager is assigned and arranges review of the specification
         * Public review follows after review recommendations are taken care of
         * There is no voting process.
      * Q: TDWG is reported to be very slow.
         * Key to keep pushing on the committees, communication is key
         * Process being in place helps in comparison to previously annual pace
      * Q: how is participating in TDWG better than just putting stuff up in SourceForge?
         * There's a lot of experience among TDWG that is related to many of data and types of data and problems we would be dealing with.
         * Can take advantage of the participants/experts in other groups.
         * Interface and communication between biologists and information technologists, which is what TDWG is well set up for
      * Q: what is a good (or a bad) charter?
         * Charters and process have only been instituted since 2 years. The experience and lessons learned from this are limited at this point.
         * Charters ideally are updated at least once annually.

---+++ Session I
   * David Kidd:
      * proper representation of geographical areas
      * position of observed and inferred nodes, branch paths
      * vicariance-related metadata
      * branches can be segmented, using different methods, representing for example time, or shortest distance (which in projections isn't necessarily a straight line)
      * paleocontinental reconstruction methods and/or parameters and simulation metadata needed
      * data from stratigraphy can have (dating) errors associated with it
      * Pyramid of standards: space (geographic, GML), place (ecological, EML), time (stratigraphic), form (taxonomic): shouldn't phylogenetic standard be in the middle of all of this
   * Rick Ree:
      * where does X (extant, or ancestral taxon) live?
         * where has it been found or collected?
         * expert opinion (monographs, floras)
      * describing the geographic range
         * quantitative: lat/long, grid cell values
         * qualitative: geopolitical units
         * predictive: ecological niche, model?
      * inspiration from use cases: historical biogeography, ancestral range estimation
      * take advantage of analogy from standards and ontologies developed for characters?
         * geographic range as an emergent trait of a taxon
         * take standards that have already been developed (OGS geospatial standards) and look at them through the lens of phylogenetics)
         * TDWG work can consist of recommending certain ways to applies an external standard
         * One of the first questions could be to determine whether we can exchange the biogeographical data and reconstructions (e.g. from DIVA and LAGRANGE) that we already have
   * Greg Riccardi
      * 230,000 images at present, several TBs of space
      * Capture and track the data that support phylogenetic inference based on characters
      * Morphbank objects (images, collections,  etc) have external links: specimen, sequence
         * Chain of evidence
      * Morphbank is being used by external tools, for example MX, as the underlying image store
      * Character state annotation: "sort a bale of plants" metaphor for images
      * Character definitions in trees data files are typically much to short and limited to search databases such as Morphbank
      * Linking images to anatomy ontology terms: ideally have an outline of the part linked, not just the whole image or a pointer within the image
      * Defining characters and states by using ontology terms: capturing, and linking from states
         * would also enable the ability to infer the relatedness of characters
      * Q: Can we have ontologies that are based in phylogeny. It is impossible otherwise to simply combine different morphological ontologies.
      * Q: what is the role of ontologies to informatics standards? -> metadata (property) meaning standardization
      * There is no good way currently to link various annotations in various media about a digital or collection object

---+++ Session II
   * Rutger Vos: <nop>NeXML format
   * Chris Zmasek: <nop>PhyloXML, phylogenomics
      * using phylogenies for functional inference
      * Q: library support? -> Forester, <nop>BioPerl
   * Hilmar Lapp: <nop>PhyloDB, <nop>BioSQL, <nop>PhyloWS
   * Nico Cellinese: Phylogenetic nomenclature
      * Define names not based on organismal traits but based on phylogenetic relationships

---+++ Breakout Groups:
The breakout groups were determined from a 45min group discussion and whiteboarding of suggestions, following by self-assignments to groups.

   1. Phyloinformatic Web services (Bill P., Rutger, Cindy)
      * data services. which data or metadata are needed from providers
      * data demands that ask for portals (such as GBIF, EOL)
      * crosstalk & provenance between providers
      * scope recommendations, workflows, use cases
   1. Metadata standardization & Ontology (David, Chris, Peter, Aaron, Bill, James)
      * metadata uses, properties, semantics
      * reuse possibilities for other TDWG standards
      * expressing the domain model independent of technology
   1. Deposition to repositories
      * incentives and standards to increase deposition rates
      * reporting requirements to enable repurposing

---++++ Report-out from the groups

*Group 1)* Bill P.
   * divided tasks between "tree decoration" and "tree delivery"
   * tree decoration:
      * coevolution (food web, pollinator, host-parasite): for a given set of hosts, give me the tree of parasites
      * computational: calculate divergence times or ages
      * use the tree as input, get it out decorated with certain metadata
   * tree delivery
      * types are parameter query and computational query
      * results in ID list
      * given an ID, the desired view (what elements are to be returned) and format, return the object(s)
      * ability to say, give me all trees that contain human, but only those that are about apes
         * ability to give a scope of the desired trees
   * ability to dump data, be alerted to updates (e.g., RSS feed)
   * different levels of hierarchy of objects: analyses, matrices, trees

*Group 2+3)* David
   * attributes about what makes up a phylogeny, provenance, where data came from and what type, papers possibly associated, parameters used, when and where was it done
   * branch length, support values, need multiple of these
   * tree to tree relationships
   * breaking branches into segments
   * sets of nodes, within and between trees, and sets of trees, relationship between nodes (e.g., homologs, host-parasite)
   * attributes of nodes (type of node, species or gene, taxon concept, area, more than object per node)
   * gene trees are a big application of trees
   * talk to geospatial group to learn about their objects and standards
   * talk to technical architecture group to learn more about ontologies
   * started with an exercise that needs to be followed up with
   * there seems to be a taskgroup developing ontologies for this area
   * formulating the elements and use cases would create momentum

---++++ Interest Group charter
   * Keep it simple
   * encouragement to list core members
   * implementation is the proof, further those efforts
   * intersection, don't over-think, what are the core elements

-- Main.HilmarLapp - 31 Oct 2008