VFN Database Working Bee
VFN DATABASE WORKING BEE - August 15
(wiki created by Ian Wright, 26 June 2009)
ARC-NZ RESEARCH NETWORK FOR VEGETATION FUNCTION website here
Participants of this Vegetation Function Network "working bee" are welcome to add opinions and suggestions to this page. Please don't alter contributions put in by others, and please do label your contributions with name and date. Alternatively, please feel free to email suggestions to Ian Wright.
Advice on how to work with wikis can be found here.
BACKGROUND
A variety of databases are arising through activities of the Vegetation Function Network. It’s desirable that these be well documented with metadata, explanation and cautionary notes so that when they eventually enter the public domain, they are able to be used safely by other researchers asking new questions. This one-day gathering aims to advance the documentation of VFN-related databases, and to share experiences and tools for structuring, documenting, and checking databases.
Further information about the meeting is given below. First, though, I thought I'd insert some general notes about what a metadata document might ideally include:
Metadata: background information & links to further info
From DAAC website on metadata: Metadata is information about data that answers the following questions:
- What does the data set describe?
- Why was the data set created?
- Who produced the data set?
- How was each parameter measured?
- How was the data set created?
- When and how frequently were the data collected?
- Where were the data collected and with what spatial resolution?
- How reliable are the data; what problems remain in the data set?
- How can someone get a copy of the data set?
- Who prepared the metadata?
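As a rough illustration of how the questions above could be used as a checklist, here is a minimal Python sketch. The short keys (`what`, `why`, and so on) and the function name are invented for this example and are not part of any DAAC or VFN standard; the idea is simply to flag which questions a draft metadata document has not yet answered.

```python
# Hypothetical short keys, one per question in the list above.
METADATA_QUESTIONS = {
    "what": "What does the data set describe?",
    "why": "Why was the data set created?",
    "who": "Who produced the data set?",
    "how_measured": "How was each parameter measured?",
    "how_created": "How was the data set created?",
    "when": "When and how frequently were the data collected?",
    "where": "Where were the data collected and with what spatial resolution?",
    "reliability": "How reliable are the data; what problems remain in the data set?",
    "access": "How can someone get a copy of the data set?",
    "metadata_author": "Who prepared the metadata?",
}


def unanswered_questions(record):
    """Return the questions whose answer is missing or blank in `record`."""
    return [
        question
        for key, question in METADATA_QUESTIONS.items()
        if not str(record.get(key) or "").strip()
    ]


# Example draft with only two questions answered (placeholder text).
draft = {"what": "Leaf size measurements", "who": "Working group members", "why": "  "}
for question in unanswered_questions(draft):
    print("Still to answer:", question)
```

A checklist like this says nothing about the quality of the answers; it only shows where a draft is still blank.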
Also from DAAC, a webpage on Best Practice on putting together metadata docs:
- Assign Descriptive File Names
- Use Consistent and Stable File Formats
- Define the Parameters
- Use Consistent Data Organization
- Perform Basic Quality Assurance
- Assign Descriptive Data Set Titles
- Provide Documentation
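The "Perform Basic Quality Assurance" step can be partly automated. The sketch below is one possible approach, not a DAAC-recommended procedure: it scans a CSV table for blank cells, non-numeric entries, and values outside a plausible range. The column names, the range, and the sample rows are all invented for illustration.

```python
import csv
import io


def basic_qa(csv_text, numeric_ranges):
    """Return a list of (row_number, column, problem) tuples.

    `numeric_ranges` maps a column name to an inclusive (low, high) range.
    Row numbers count the header as row 1, as a spreadsheet would.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):
        # Flag blank or absent cells in every named column.
        for column in reader.fieldnames:
            value = row.get(column)
            if value is None or not value.strip():
                problems.append((row_number, column, "missing value"))
        # Check numeric columns for parse errors and implausible values.
        for column, (low, high) in numeric_ranges.items():
            value = (row.get(column) or "").strip()
            if not value:
                continue  # already reported as missing
            try:
                number = float(value)
            except ValueError:
                problems.append((row_number, column, "not a number"))
                continue
            if not low <= number <= high:
                problems.append((row_number, column, "out of range"))
    return problems


# Invented sample data: hypothetical columns and values.
sample = "species,leaf_area_cm2\nAcacia aneura,12.5\n,abc\nBanksia serrata,-3\n"
for problem in basic_qa(sample, {"leaf_area_cm2": (0, 10000)}):
    print(problem)
```

Sensible ranges differ for every trait and unit, so they are best recorded in the metadata document alongside the parameter definitions.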
Metadata in plain language: link to webpage from USGS describing basic information that is often found in metadata documents, and suggestions for how to structure these documents.
Jerry's Comments ... It's desirable that metadata should include at least the mandatory fields from the Dublin Core. The DC metadata element set is described in brief at [1], and in more detail in the user guide. A quick mapping between the 15 Dublin Core (DC) fields and the VFN metadata entities is here: File:VFN-DC-mapping.xls. Another metadata standard of particular relevance in New Zealand and Australia for environmental and spatial datasets is the ANZLIC Metadata Profile [2]. And then there is EML [3]. In the linked document I have done a quick extraction of the EML Dataset metadata fields, which I believe are relevant to this effort: File:QuickMetadataElements.doc
Useful links to different online databases
Name scrubbing
- TaxonScrubber: http://www.salvias.net/pages/taxonscrubber.html
- Phylomatic: http://www.phylodiversity.net/phylomatic/
- Angiosperm Phylogeny Website: http://www.mobot.org/MOBOT/research/APweb/welcome.html
- UBIO - see Tools/FindIT : http://www.ubio.org
- Taxamatch: http://www.cmar.csiro.au/datacentre/irmng/
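For a quick first pass before using the services above, names in a dataset can be compared against a reference list with generic fuzzy string matching. The sketch below uses Python's standard `difflib` module. It is not how TaxonScrubber, Taxamatch, or any of the listed tools work internally, and the function name, cutoff, and reference list are assumptions made for this example.

```python
import difflib


def suggest_names(raw_names, accepted_names, cutoff=0.85):
    """Map each raw name to its closest accepted name, or None if no match.

    `cutoff` is the minimum similarity (0 to 1) for a match to be suggested.
    """
    accepted = list(accepted_names)
    suggestions = {}
    for raw in raw_names:
        cleaned = " ".join(raw.split())  # collapse stray whitespace
        if cleaned in accepted:
            suggestions[raw] = cleaned
            continue
        matches = difflib.get_close_matches(cleaned, accepted, n=1, cutoff=cutoff)
        suggestions[raw] = matches[0] if matches else None
    return suggestions


# Hypothetical reference list; in practice it would come from a checklist
# such as those under "Global lists" or "Regional lists".
reference = ["Eucalyptus globulus", "Acacia aneura"]
entered = ["Eucalyptus globulis", "  Acacia  aneura ", "Quercus robur"]
for raw, suggestion in suggest_names(entered, reference).items():
    print(repr(raw), "->", suggestion)
```

Suggestions from string similarity alone should be reviewed by hand, since a close spelling can still be a different, valid taxon, and synonymy is not handled at all.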
Global lists
- Electronic Plant Name Index/International Plant Name Index: http://epic.kew.org/
- Tropicos: http://www.tropicos.org/
- Gymnosperm database: http://www.biologie.uni-hamburg.de/b-online/earle/
- Catalogue of Life: http://www.catalogueoflife.org
- Global Biodiversity Information Facility: http://data.gbif.org/species/
Regional lists
- Australia Plant Name Index: http://www.anbg.gov.au/apni/index.html
- Africa Flowering Plant Database: http://www.ville-ge.ch/musinfo/bd/cjb/africa/index.php
VFN Database Working Bee: information
To be held at University of Queensland (St Lucia campus) on Saturday 15th August 2009 (Museum Conference Room, Goddard Building, School of Biological Sciences).
This is the weekend before Intecol10 in Brisbane, and the day before Intecol Workshop 14 on Plant Ecological Traits. Note that the venue for the working bee is different to that of the conference proper.
Location & mechanics
- Time & Location: 9.00am – 5.30pm, Museum Conference Room (Goddard Building, School of Biological Sciences at UQ). The room is booked from 8am to 6pm.
- Directions to St Lucia campus. Link to maps of U Queensland.
- What to bring: you, your laptop, your metadata document, notes & materials useful for improving it!
- Facilities: Alice Yeates from UQ will be kindly providing support on the day. We should have access to a data projector, a whiteboard & a printer.
- Note: internet access is somewhat limited.
- We will provide morning & afternoon tea, and lunch. For lunch we will (most probably) go en masse to Wordsmith’s cafe, having pre-ordered lunch during the morning.
Participants
- Mark Westoby (possibly)
- Ian Wright (leader, WG 25 leaf size, WG21 13C)
- Amy Zanne (leader, WG27 wood anatomy and wood density)
- Drew Allen (WG60 DNA evolution rates)
- Will Cornwell (WG17 and WG35 leaf and wood decomposition, WG21 13C)
- Michael Crisp (WG18 assembly of southern floras)
- Richard Duncan (WG22 urbanization and plant functional traits)
- Satu Ramula (WG30 plant population syndromes)
- Fred Gurgel (WG47 and WG58 marine biogeography)
- Caroline Lehmann (WG49 Savanna structure)
- Jessie Wells (WG31 human-influenced countrysides)
- Nick Williams (WG22 urbanization and plant functional traits)
- Jerry Cooper (Landcare)
- No longer attending: Damien Fordham (WG54 future species distributions); Susan Wiser (Jerry Cooper attending instead); Johan Ehrlen (Satu Ramula attending instead); Sandy Harrison (WG24 fire and Aust veg history).
- Not attending, but included on email list: Peter Linder (WG18 Assembly of southern floras)
In preparation for the Working Bee we ask that you have a look at the Table of VFN-related datasets represented by Working Bee participants (link is to Google docs version). There are two reasons for this. First, we ask that you check the entries relating to your dataset, and make any necessary changes to the spreadsheet. Second, from looking through the entries for other datasets we should get a sense of what other types of datasets and dataset structures we are dealing with, as a group.
An example of a basic metadata doc, provided by Fred Gurgel for WG47, can be found here.
Possible TOPICS FOR DISCUSSION include:
- What kinds of documentation or metadata are most important to provide?
- When do you need a data structure more complicated than spreadsheets?
- Procedures and tools for data-checking. What would be useful? What is available?
- What is an appropriate time-course for making databases available beyond the immediate group of people involved with compiling them? (A wiki at http://bluwiki.com/go/Vegfunction_data_sharing describes discussion about this within the Network up to the present.)
But besides discussion and sharing of tools, our main aim during the day is to actually move along the process of getting each database and its documentation improved, as much as possible.
-- That’s why we have called it a WORKING BEE rather than a workshop!
Tentative Schedule
- 0900 (Ian or Mark & Amy) What we hope to achieve:
  - improved description and annotation for each dataset,
  - data-checking tools,
  - exchange of experience and skills and views about data access
- 0915 Discussion of two topics:
  - What kinds of documentation or metadata are most important to provide? (and how far along are different databases in providing them?)
  - What can we best do to motivate people to supply useful documentation, or how can we support them to do so less time-expensively?
- 1000 (Amy) Procedures & tools for taxonomic name checking
- 1015 (Ian or Mark) Procedures and tools for data-checking. What would be useful? What is available?
- 1030 MORNING TEA
- 1100 Sorting out people’s priorities for the remainder of the day, so that they can get together in appropriate groups. Some options include:
  - improving description in our compilation of databases
  - writing metadata for their own dataset
  - writing enquiries and requests to data contributors
  - writing guidelines and advice
  - writing software for data-checking
- 1230 LUNCH
- 1330 Continuing work
- 1540 AFTERNOON TEA
- 1600 Discussion about the appropriate time-course for making databases available beyond the immediate group of people involved with compiling them (Discussion so far is at a wiki at http://bluwiki.com/go/Vegfunction_data_sharing.)
PLEASE FEEL FREE TO SUGGEST MODIFICATIONS FOR DISCUSSION TOPICS & ACTIVITIES FOR THE DAY!! Suggestions can be emailed to Ian Wright, or added directly to this page.
Looking forward to seeing you all on August 15th!!



