update build instructions on README.md

Steve Baskauf 2021-08-04 15:25:36 -05:00
parent 12a0b91543
commit 6a595447e4
1 changed files with 6 additions and 6 deletions

@ -9,10 +9,10 @@
5. If a new term is being added, fill in a new row anywhere below the header row.
6. Special care must be taken if columns are added (i.e. metadata properties are added). This is not for the faint of heart! The new columns must be added to every file used as source data for the various scripts and the column header mapping files also need to be edited. See [this page](for more details). This should be a rare event. DO NOT ever delete columns! If you want to eliminate values for a property, just leave empty strings in all of the cells of that property's column.
7. Create a new branch (or fork if you don't have push rights) of the [rs.tdwg.org repo](https://github.com/tdwg/rs.tdwg.org). Save your edited CSV file using some notable name in the [process](https://github.com/tdwg/rs.tdwg.org/tree/master/process) directory.
8. Open the [simplified_process_rs_tdwg_org.ipynb](https://github.com/tdwg/rs.tdwg.org/blob/master/process/simplified_process_rs_tdwg_org.ipynb) Jupyter notebook and follow [these instructions](https://github.com/tdwg/rs.tdwg.org/blob/master/process/process-vocabulary.md#21-setup) to edit the configuration section of the script.
9. Run the script, paying careful attention to whether particular sections are appropriate for what you are trying to accomplish. NOTE: there are still some kinks to be worked out for the borrowed terms (`dc:` and `dcterms:` namespaces), but changes there should be rare. It is useful to monitor the diffs that are generated as sections of the script are run and make sure that the changes are reasonable. This is easily monitored if you are using the GitHub desktop client.
10. If there are changes to more than one namespace, repeat all of the previous steps with the second namespace before continuing on.
11. When you are satisfied that all of the term, term list, vocabulary, and standards metadata changes are sensible, discard the changes made to the Jupyter notebook so that it will remain in its "example" state when the branch (or fork) is merged. Alternatively, you can download the "example" notebook from GitHub to write over the version that you modified, and commit it to the branch.
8. Repeat this process for all namespaces that will be changed. I've been saving copies of the changes in [this directory](https://github.com/tdwg/rs.tdwg.org/tree/master/process/dwc-revisions) so that we can easily see what's been done in the past. Modify the [configuration file](https://github.com/tdwg/rs.tdwg.org/blob/master/process/config.json) so that it points to each CSV for each namespace. Look at the example to see what needs to be there.
9. Run the [processing script](https://github.com/tdwg/rs.tdwg.org/blob/master/process/process.py), which needs to be in the same directory as the configuration file.
10. There are some manual edits that need to be made if there are changes to either of the Dublin Core namespace terms. The versions don't get handled very automatically, so make the same changes to the [dcterms: version CSV](https://github.com/tdwg/rs.tdwg.org/blob/master/dcterms-for-dwc-versions/dcterms-for-dwc-versions.csv) or [dc: version CSV](https://github.com/tdwg/rs.tdwg.org/blob/master/dc-for-dwc-versions/dc-for-dwc-versions.csv) as were made to the main term CSVs.
11. The versions also need to be manually added for the new termlist version in the [termlist versions members CSV](https://github.com/tdwg/rs.tdwg.org/blob/master/term-lists-versions/term-lists-versions-members.csv). In the future, this may get automated.
12. As of 2020-08-20, updating rs.tdwg.org document metadata must be done manually. Steve Baskauf knows how to do it and will try to eventually write a script to automate the process. It's best to ask him to do the updating before merging the branch.
13. Push the branch to GitHub and create a pull request. It is best for someone to review the changes carefully before merging.
14. Once the branch has been merged the data are available via HTTP to the other scripts that use those data.
@ -25,7 +25,7 @@
21. Run the [build.py](https://github.com/tdwg/dwc/blob/master/build/build.py) script to build the Quick Reference Guide.
22. Create a pull request for the new branch.
23. When the branch has been reviewed carefully, merge the branch. The new pages should be live as soon as Jekyll rebuilds them on GitHub.
24. Term dereferencing to human and machine readable representations is handled by a server managed by GBIF. Ask Matt Blissett to reload the data from the `rs.tdwg.org` repo into the server (he has a script to do it). Because dereferencing of current terms to human-readable web pages is handled by a redirect, there won't be any noticeable difference whether the data are reloaded in this step or not. But dereferencing the term versions, or dereferencing to acquire machine readable metadata, will not reflect the new changes until the server is reloaded.
24. Term dereferencing to human and machine readable representations is handled by a server managed by GBIF. The new metadata gets fed into the production version of the server when there is a new release of the `rs.tdwg.org` repo, so when everything is done, make sure a new release has been made. Because dereferencing of current terms to human-readable web pages is handled by a redirect, there won't be any noticeable difference whether the data are reloaded in this step or not. But dereferencing the term versions, or dereferencing to acquire machine readable metadata, will not reflect the new changes until the release process completes.
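
Step 6 above says to blank out a property's values rather than delete its column. A minimal sketch of that edit, assuming the source data is an ordinary comma-separated file with a header row, might look like the following. The function name `clear_column` and the sample column names are hypothetical and are not part of the repository's scripts.

```python
import csv
import io


def clear_column(csv_text, column):
    """Return csv_text with every value in `column` set to an empty string.

    The column and its header are kept, so the column layout that the
    other scripts depend on does not change.
    """
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column not found: {column}")

    out = io.StringIO(newline="")
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = ""  # blank the cell, keep the column
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample data; real source files have many more columns.
sample = "term_localName,label,notes\nrecordedBy,Recorded By,old note\n"
cleared = clear_column(sample, "notes")
```

Reviewing the resulting diff before committing, as the steps above recommend, is still the safest check that only the intended cells changed.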
## Build script
@ -65,4 +65,4 @@ It generates the file `term_versions.csv`, which is used as the input for the `b
The Python script `build-termlist.py` inputs the header information from `termlist-header.md`, then builds the list of terms and their metadata from data in the [rs.tdwg.org](http://github.com/tdwg/rs.tdwg.org) repository. The script also inputs `termlist-footer.md` and appends it to the end of the generated document, but currently it has no content. The constructed Markdown document is saved as `/docs/list/index.md`.
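
The header, generated body, and footer assembly described above can be sketched roughly as follows. This is an illustration of the general pattern only, not the actual code in `build-termlist.py`; the function name `assemble_document` and its parameters are assumptions made for the example.

```python
from typing import List


def assemble_document(header: str, term_sections: List[str], footer: str = "") -> str:
    """Join a header, generated term sections, and an optional footer.

    Parts are separated by one blank line. An empty or whitespace-only
    footer (the current situation for the footer file) adds nothing.
    """
    parts = [header.strip("\n")]
    parts.extend(section.strip("\n") for section in term_sections)
    if footer.strip():
        parts.append(footer.strip("\n"))
    return "\n\n".join(parts) + "\n"


# Hypothetical inputs standing in for the header file and the generated terms.
document = assemble_document(
    "# Term list\n",
    ["## termA\nDefinition of A.", "## termB\nDefinition of B."],
    footer="",
)
```

In a real run the returned string would be written to the output path named above; the sketch stops at building the text so that it has no file system side effects.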
------
Last edited: 2020-08-20
Last edited: 2021-08-04