%META:TOPICINFO{author="KarenCranstn" date="1318953178" format="1.1" reprev="1.1" version="1.1"}%
%META:TOPICPARENT{name="MIAPAWorkshop2011Ideas"}%
---++ Checklist for group 2

Members: Bill Piel, Jim Leebens-Mack, Karen Cranston, Teodor Georgiev

Initial list of terms:
   * topology in digital format
   * branch lengths
   * support values
   * method of analysis (ML / MP / Bayes)
   * samples with valid taxon names
   * character data
   * excluded characters
   * character sets / partitions
   * biorepository collection code
   * specimen number
   * locality data
   * !GenBank accession number
   * tissue sequenced
   * molecular or morphological
   * alignment method
   * consensus method
   * software & version
   * character labels and states for morphological data
   * evolutionary model
   * heuristic search parameters / MCMC settings
   * random seed
   * input trees for consensus / composite trees

Agreed-upon minimums:
   * topology
   * support values
   * method of analysis (ML / MP / Bayes)
   * unique and valid OTU label (a valid species name, or a specimen identifier that could be found in a database somewhere, using some informatics magic)
   * alignment used to construct the tree
   * raw data (pre-cleaning and alignment, e.g. !GenBank ID)
   * alignment method
   * data assembly (how we went from databased sequence data to the final alignment); in the short to medium term, this would likely be text, not machine readable

Discussed:
   * branch lengths with units: ideal, but not required; morphological data may not produce meaningful branch lengths
   * metadata about OTU label
   * specimen data: if a study did new sequencing, specimen data should be included

Other interesting discussion:
   * the distinction between what data should be available (the best practices) and how this information should be communicated (in the text of the paper, as a digital object)
   * the issue of something being required that might not be relevant for all studies (e.g. support values for a glommogram, valid species names for environmental samples)
   * best practices for analysis and data management, e.g. it is better to include all characters in the matrix and exclude them using software settings than to remove those characters from the matrix itself

-- Main.KarenCranstn - 18 Oct 2011
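
As an illustration only, the agreed-upon minimums above could be checked mechanically against a study record. The sketch below is an assumption, not something the group specified: the field names, the record layout, and the example values are all hypothetical, and the notes define the concepts rather than a schema.

```python
# Hypothetical field names, one per agreed-upon minimum in the notes above.
REQUIRED_FIELDS = [
    "topology",
    "support_values",
    "analysis_method",   # ML / MP / Bayes
    "otu_labels",
    "alignment",
    "raw_data",          # e.g. sequence database identifiers
    "alignment_method",
    "data_assembly",     # free text in the short to medium term
]


def missing_minimums(record):
    """Return the required fields that are absent or empty in a study record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Made-up record for illustration; none of these values come from a real study.
example = {
    "topology": "((A,B),C);",
    "support_values": {"A+B": 0.98},
    "analysis_method": "Bayes",
    "otu_labels": ["A", "B", "C"],
    "alignment": "alignment.nex",
    "raw_data": [],            # no identifiers recorded yet
    "alignment_method": "",    # left blank
}

print(missing_minimums(example))
```

This sketch treats every minimum as mandatory. Since the discussion notes that some items may not be relevant for every study, a fuller version would probably need an explicit "not applicable" value instead of treating an empty field as missing.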