Guidance for submitting images and datasets derived from images.
You may be wondering if you need to share images or video produced through the course of your research. These types of media, when associated with hypothesis-driven scientific research, are considered project output and valuable to the scientific community through their potential reuse. As such, many funding agencies will expect these data types to be shared through a repository. When in doubt you should reach out to your funding manager for confirmation.
BCO-DMO can publish your videos and images as well as data derived from those videos or images (e.g. coral reef quadrat photos and derived percent cover calculations). The information we ask you to include when you submit data will vary depending upon the instrumentation, methods, and software that produced your images and data.
Note that you can upload entire folders of files to our Submission Tool's files section. However, we request that you limit the size of any individual file to 10GB.
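Before uploading a folder, it can help to check whether any single file exceeds the requested size limit. The following is a minimal sketch, not a BCO-DMO tool: the function name is hypothetical, and it assumes "10GB" means 10 × 1000³ bytes.

```python
from pathlib import Path

# Assumption: "10GB" is read as 10 * 1000**3 bytes; adjust if advised otherwise.
SIZE_LIMIT_BYTES = 10 * 1000**3


def find_oversized_files(root, limit=SIZE_LIMIT_BYTES):
    """Return (relative_path, size_in_bytes) for every file larger than limit."""
    root = Path(root)
    oversized = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                # Report paths relative to the folder being submitted.
                oversized.append((path.relative_to(root).as_posix(), size))
    return oversized
```

Calling `find_oversized_files("my_image_dataset")` (a hypothetical folder name) returns an empty list when every file is within the limit.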
We realize imaging datasets can be quite large. If your data are too large to upload using a web browser, we can coordinate a file transfer with you using a Dropbox file request from our account. Or, if you already have your files in an online fileshare (Google Drive, Globus, Box, etc.), you can share the link with us and we will retrieve the files. If submitting files from an online fileshare, proceed with filling in the appropriate metadata by creating a dataset submission in the Submission Tool. You may skip the "Files" section of the form, and instead include a description of the data you have to send us in the comment box on the last page of the form ("Submit" page). We will email you to coordinate the file transfer.
Once we retrieve your files, we will send you a file inventory so you can confirm we have everything you intended to submit. Alternatively, we can send you a link to upload files to BCO-DMO's Dropbox account if you prefer this transfer method.
Typically, BCO-DMO publishes your media (e.g. images/videos) by bundling them using zip or tar protocols (.zip with zip64 support, or .tar.gz). We can preserve any folder hierarchy your data type requires.
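BCO-DMO performs this bundling, so you do not need to do it yourself. Purely to illustrate what a zip64-capable bundle that preserves folder hierarchy looks like, here is a minimal sketch using Python's standard `zipfile` module. The function name is hypothetical, and this is not BCO-DMO's actual process.

```python
import zipfile
from pathlib import Path


def bundle_folder(source_dir, zip_path):
    """Write every file under source_dir into a .zip, keeping relative paths.

    zip_path should live outside source_dir so the archive does not include itself.
    """
    source_dir = Path(source_dir)
    # allowZip64=True lets the archive and its members exceed the 4 GiB zip limit.
    with zipfile.ZipFile(
        zip_path, "w", compression=zipfile.ZIP_DEFLATED, allowZip64=True
    ) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # arcname stores the path relative to source_dir (folder hierarchy kept).
                archive.write(path, arcname=path.relative_to(source_dir).as_posix())
    return zip_path
```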
Note that any critical metadata encoded in folder or file names must also be in an accompanying file inventory table or in the main data table. For example, images stored in folders with names by year and site should also have an accompanying file inventory table with data columns for "year" and "site", or this information must be present in the main data table (more detail below).
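As a sketch of the year/site example above, the snippet below turns relative paths into inventory rows with explicit "year" and "site" columns. It assumes a hypothetical layout of exactly `year/site/filename`; the function names and layout are illustrative, so adapt them to your own folder structure.

```python
import csv
from pathlib import PurePosixPath


def inventory_rows(relative_paths):
    """Turn paths like '2019/reef_a/img001.jpg' into rows with year and site columns."""
    rows = []
    for rel in relative_paths:
        parts = PurePosixPath(rel).parts
        # Assumed layout: exactly three levels (year, site, filename).
        if len(parts) != 3:
            raise ValueError(f"expected year/site/filename, got: {rel}")
        year, site, filename = parts
        rows.append({"filename": filename, "year": year, "site": site})
    return rows


def write_inventory(rows, csv_path):
    """Write the rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["filename", "year", "site"])
        writer.writeheader()
        writer.writerows(rows)
```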
The format of the images/video you submit to BCO-DMO should be the one with the most reuse potential. If your community would benefit from raw images (often in TIFF format), you can submit them in that form; in these cases, however, we recommend submitting the raw images in addition to any processed formats.
For NSF OCE-funded projects, the Data Sharing Policy states: The Division of Ocean Sciences requires that metadata files, full data sets, derived data products and physical collections must be made publicly accessible within two (2) years of collection. https://www.nsf.gov/pubs/2017/nsf17037/nsf17037.jsp
Make sure you include the following parameters in a table. They can be included in a file inventory table or incorporated into your main data table if you have one.
- Filename, the full name of your file including the file extension (e.g. myimage.jpg, myvideo.mp4)
- [if applicable] folder name(s) if your files are stored in subfolders that you would like preserved when we publish your files.
If you need help making a file inventory table, contact us at [email protected]. We can get you started with a basic file inventory table to which you can add other data columns (described in the following sections) to provide important context for your data.
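If you would like to draft a basic file inventory yourself, the following minimal sketch walks a folder and writes one row per file with the two columns listed above. The function name and the column names "filename" and "folder" are assumptions for illustration, not a required BCO-DMO format.

```python
import csv
from pathlib import Path


def build_file_inventory(root, csv_path):
    """Write a CSV with one row per file under root: filename and containing folder."""
    root = Path(root)
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            folder = path.parent.relative_to(root).as_posix()
            # Files directly under root get an empty folder value.
            rows.append(
                {"filename": path.name, "folder": "" if folder == "." else folder}
            )
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["filename", "folder"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

You can then open the resulting CSV in a spreadsheet program and add further columns (collection date, site, treatment, and so on).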
If your images or samples were collected in the field, provide collection information for the location, date, and time the samples/images were acquired:
- date, time (or DateTime), depth, latitude, longitude. Don't forget the time zone! You can include it in your datetime values directly or describe your time zone in the description of the date/time column(s).
- [if applicable] cruise_id, station, sampling metadata (e.g. cast, sample_id, mocness net_id)
- [if applicable] any other ancillary measurements taken concurrently with your samples (e.g. salinity, temperature, PAR)
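One common way to include the time zone directly in datetime values is ISO 8601 with an explicit UTC offset, or a conversion to UTC. The sketch below assumes timestamps were recorded at a fixed, known UTC offset. It does not handle daylight saving changes, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def with_utc_offset(local_text, utc_offset_hours, fmt="%Y-%m-%d %H:%M:%S"):
    """Attach a fixed UTC offset to a local timestamp and return ISO 8601 text."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_text, fmt).replace(tzinfo=local_zone)
    return local.isoformat()


def to_iso_utc(local_text, utc_offset_hours, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a local timestamp at a fixed UTC offset to ISO 8601 in UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_text, fmt).replace(tzinfo=local_zone)
    return local.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(with_utc_offset("2018-06-14 09:30:00", -4))  # 2018-06-14T09:30:00-04:00
print(to_iso_utc("2018-06-14 09:30:00", -4))       # 2018-06-14T13:30:00Z
```

Either form makes the time zone unambiguous; if you keep local times without an offset, describe the time zone in the column description instead.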
If your images or samples have no field locations and are purely laboratory-based, we still recommend including date and time information in your dataset to provide context. However, if you only have the elapsed time since the start of an experiment, you should indicate in the metadata when the experiments took place.
If your experimental design included treatments and controls, do not forget to add data column(s) to indicate which treatment the image or video was from.
Check any taxonomic names before submitting your dataset to make sure they are correct.
[optional] If including taxonomic names, we recommend including the Life Science Identifier (LSID) or a taxonomic identifier familiar to your community (e.g. AphiaID, TSN, etc.). This can be included either directly as a column in your main data table or in a supplementary species list for your dataset. If you used codes in your dataset instead of taxonomic names, please provide a supplementary species list with the codes and the taxonomic names. We recommend including taxonomic identifiers in your species list.
One way to check your taxonomic names for typos is to run your data table through the World Register of Marine Species "Taxa Match" tool (which accepts csv, tsv, Excel). The match tool will also provide information about whether your name exactly matches a known name or not. Correct any typos in taxonomic names before submitting to BCO-DMO. You can also add a column to your dataset with the matched LSID, and/or AphiaID using this tool.
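If your data table is large, you may prefer to upload only the distinct names to the match tool. The following minimal sketch extracts a de-duplicated name list from a CSV file. The column name "scientific_name" and the function names are assumptions, so substitute the column used in your own table.

```python
import csv


def unique_taxon_names(data_csv, column="scientific_name"):
    """Return the distinct, non-empty names from one column, in first-seen order."""
    seen = set()
    names = []
    with open(data_csv, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # Strip stray whitespace so "Acropora palmata " is not a separate name.
            name = (row.get(column) or "").strip()
            if name and name not in seen:
                seen.add(name)
                names.append(name)
    return names


def write_name_list(names, out_csv, column="scientific_name"):
    """Write the names as a one-column CSV suitable for uploading to a match tool."""
    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([column])
        writer.writerows([name] for name in names)
```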
These recommendations are an effort to consolidate (meta)data requirements from taxonomic, morphological, and ancillary information acquired from imagery of zooplankton, phytoplankton, and other particles. These data are often collected by imaging instruments such as Imaging FlowCytobot (IFCB), FlowCam, ZooScan, UVP, and LISST-Holo.
Include whatever data are needed for reuse by your community. BCO-DMO data managers are available to discuss your submission with you to decide together what would be best to publish (email [email protected]). BCO-DMO aligns with the best practices for reporting plankton and other particle observations from images as presented in Neeley et al. (2021).
- Neeley, A., Beaulieu, S., Proctor, C., Cetinić, I., Futrelle, J., Soto Ramos, I., Sosik, H., Devred, E., Karp-Boss, L., Picheral, M., Poulton, N., Roesler, C., and Shepherd, A. 2021: Standards and practices for reporting plankton and other particle observations from images. 38pp. DOI: 10.1575/1912/27377.
Your images and related data may become one or more "Datasets" at BCO-DMO depending upon the type of data and their structure. In some cases, it makes sense to have a dedicated dataset metadata page for the images themselves and a file inventory or collection information, which will have its own DOI. Associated data would be published from separate dataset metadata pages, each getting a DOI. A BCO-DMO Data Manager can help decide how to organize your data into metadata landing pages when you submit your data or when you reach out in advance of your submission.
ZooSCAN images and related data:
Particles from sediment traps:
- Images and associated metadata of individually classified particles imaged and quantified in sediment trap gel layers collected on four research cruises conducted between 2015 and 2018 https://www.bco-dmo.org/dataset/860725