Package 'vprr' reference manual

Title:	Processing and Visualization of Video Plankton Recorder Data
Description:	An oceanographic data processing package for analyzing and visualizing Video Plankton Recorder data. This package was developed at 'Bedford Institute of Oceanography'. Functions are designed to process automated image classification output and create organized and easily portable data products.
Authors:	Emily O'Grady [aut, cre], Kevin Sorochan [aut], Catherine Johnson [aut]
Maintainer:	Emily O'Grady
License:	MIT + file LICENSE
Version:	0.3.0
Built:	2025-03-07 06:37:06 UTC
Source:	https://github.com/eogrady21/vprr

Get bin averages for VPR and CTD data

Description

Bins CTD data for an individual cast to avoid depth averaging across tow-yos.

Usage

bin_calculate(data, binSize = 1, imageVolume, rev = FALSE)

Arguments

`data`	ctd data frame object
`binSize`	the height of bins over which to average, default is 1 metre
`imageVolume`	the volume of VPR images used for calculating concentrations (mm^3)
`rev`	logical value; if TRUE, binning begins at the bottom of each cast. This controls where data is lost due to uneven binning over depth: if bins begin at the bottom, small amounts of data may be lost at the surface of each cast; if binning begins at the surface (rev = FALSE), small amounts of data may be lost at the bottom of each cast

Details

Image volume calculations can change based on optical setting of VPR as well as autodeck setting used to process images. For IML2018051 (S2) image volume was calculated as 108155 mm^3 by seascan (6.6 cubic inches). For COR2019002 S2 image volume was calculated as 83663 mm^3 and S3 image volume was calculated as 366082 mm^3. Used internally (bin_cast) after ctd_cast on a single ascending or descending section of VPR cast.

Note

binSize should be carefully considered for best results.

Depth is used for calculations! Please ensure depth is included in data frame using swDepth.

Author(s)

E. Chisholm, K. Sorochan
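
The binning and concentration arithmetic described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the function `bin_sketch`, the toy `cast` data frame, and the assumption of one data row per VPR frame are all hypothetical. It shows how an incomplete final bin is dropped at the bottom (rev = FALSE) or at the surface (rev = TRUE), and how a concentration per cubic metre could follow from an image volume given in mm^3.

```r
# Hypothetical sketch of depth binning (not the vprr implementation)
bin_sketch <- function(data, binSize = 1, imageVolume, rev = FALSE) {
  d_min <- min(data$depth)
  d_max <- max(data$depth)
  # Bin edges start at the surface (rev = FALSE) or at the bottom (rev = TRUE)
  edges <- if (rev) {
    sort(seq(d_max, d_min, by = -binSize))
  } else {
    seq(d_min, d_max, by = binSize)
  }
  # Rows beyond the last complete bin get NA and are dropped
  bin <- cut(data$depth, breaks = edges, include.lowest = TRUE)
  keep <- !is.na(bin)
  data <- data[keep, ]
  bin <- droplevels(bin[keep])
  # Assumption: one row per frame, so the row count is the frame count
  n_frames <- as.vector(tapply(data$depth, bin, length))
  n_roi_bin <- as.vector(tapply(data$n_roi, bin, sum))
  # imageVolume is in mm^3; 1 m^3 = 1e9 mm^3
  vol_sampled_bin_m3 <- (imageVolume / 1e9) * n_frames
  data.frame(
    depth = as.vector(tapply(data$depth, bin, mean)),
    n_frames = n_frames,
    n_roi_bin = n_roi_bin,
    vol_sampled_bin_m3 = vol_sampled_bin_m3,
    conc_m3 = n_roi_bin / vol_sampled_bin_m3
  )
}

# Toy cast: 12 frames between 0 and 5.5 m
cast <- data.frame(depth = seq(0, 5.5, by = 0.5), n_roi = rep(c(1, 2), 6))
from_surface <- bin_sketch(cast, binSize = 1, imageVolume = 83663)
from_bottom <- bin_sketch(cast, binSize = 1, imageVolume = 83663, rev = TRUE)
```

In this toy cast, the frame at 5.5 m is dropped when binning from the surface, and the frame at 0 m is dropped when binning from the bottom.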

Bin vpr data

Description

Formats oce style VPR data into depth averaged bins using ctd_cast and bin_calculate. This function is used inside concentration_category.

Usage

bin_cast(
  ctd_roi_oce,
  imageVolume,
  binSize,
  rev = FALSE,
  breaks = NULL,
  cutoff = 0.1
)

Arguments

`ctd_roi_oce`	`oce` ctd format VPR data from `vpr_oce_create`
`imageVolume`	the volume of VPR images used for calculating concentrations (mm^3)
`binSize`	passed to `bin_calculate`, determines size of depth bins over which data is averaged
`rev`	logical value, passed to `bin_calculate`; if TRUE, binning begins at the bottom of each cast. This controls where data is lost due to uneven binning over depth: if bins begin at the bottom, small amounts of data may be lost at the surface of each cast; if binning begins at the surface (rev = FALSE), small amounts of data may be lost at the bottom of each cast
`breaks`	Argument passed to ctdFindProfiles
`cutoff`	Argument passed to ctdFindProfiles

Details

Image volume calculations can change based on optical setting of VPR as well as autodeck setting used to process images. For IML2018051 (S2) image volume was calculated as 108155 mm^3 by seascan (6.6 cubic inches). For COR2019002 S2 image volume was calculated as 83663 mm^3 and S3 image volume was calculated as 366082 mm^3.

Value

A dataframe of depth averaged bins of VPR data over an entire cast with calculated concentration values

A binned data frame of concentration data per category

Description

A 'binned' dataframe from sample VPR data, including concentrations of each category, where each data point represents a 5 metre bin of averaged VPR data. Produced using vpr_roi_concentration

Usage

category_conc_n

Format

A dataframe with 21 variables

depth: Depth calculated from pressure in metres
min_depth: The minimum depth of the bin in metres
max_depth: The maximum depth of the bin in metres
depth_diff: The difference between minimum and maximum bin depth in metres
min_time_s: The minimum time in seconds of the bin
max_time_s: The maximum time in seconds of the bin
time_diff_s: The difference between minimum and maximum time in a bin, in seconds
n_roi_bin: The number of ROI observations in a bin
conc_m3: The concentration of ROIs in a bin, calculated based on image volume and number of frames per bin
temperature: Temperature measured from the VPR CTD in celsius (averaged within the bin)
salinity: Salinity measured from the VPR CTD (averaged within the bin)
density: sigma T density calculated from temperature, salinity and pressure (averaged within the bin)
fluorescence: Fluorescence measured by the VPR CTD in millivolts (uncalibrated) (averaged within the bin)
turbidity: Turbidity measured by the VPR CTD in millivolts (uncalibrated) (averaged within the bin)
avg_hr: The mean time in which bin data was collected, in hours
n_frames: The number of frames captured within a bin
vol_sampled_bin_m3: The volume of the bin sampled in metres cubed
toyo: Identifier of the tow-yo section which bin is a part of, either ascending or descending, appended by a number
max_cast_depth: The maximum depth of the entire VPR cast
category: The category in which ROIs in bin have been classified by Visual Plankton
station: Station identifier provided during processing

Binned concentrations

Description

This function produces depth binned concentrations for a specified category. Similar to bin_cast but calculates concentrations for only one category. Used inside vpr_roi_concentration

Usage

concentration_category(
  data,
  category,
  binSize,
  imageVolume,
  rev = FALSE,
  breaks = NULL,
  cutoff = 0.1
)

Arguments

`data`	dataframe produced by processing internal to vpr_roi_concentration
`category`	name of category isolated
`binSize`	passed to `bin_calculate`, determines size of depth bins over which data is averaged
`imageVolume`	the volume of VPR images used for calculating concentrations (mm^3)
`rev`	Logical value defining direction of binning, FALSE - bins will be calculated from surface to bottom, TRUE- bins will be calculated bottom to surface
`breaks`	Argument passed to ctdFindProfiles
`cutoff`	Argument passed to ctdFindProfiles

Author(s)

E. Chisholm

Isolate ascending or descending section of ctd cast

Description

This is an internal step required to bin data

Usage

ctd_cast(
  data,
  cast_direction = "ascending",
  data_type,
  cutoff = 0.1,
  breaks = NULL
)

Arguments

`data`	an `oce` ctd object
`cast_direction`	'ascending' or 'descending' depending on desired section
`data_type`	specify 'oce' or 'df' depending on class of desired output
`cutoff`	Argument passed to ctdFindProfiles
`breaks`	Argument passed to ctdFindProfiles

Value

Outputs either data frame or oce ctd object

Note

ctdFindProfiles arguments for minLength and cutOff were updated to prevent losing data (EC 2019/07/23)

Author(s)

K Sorochan, E Chisholm
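
The idea of isolating ascending and descending sections of a tow-yo can be sketched in base R from the sign of successive depth differences. This is a simplified illustration only: the documented function relies on ctdFindProfiles, and the helper `label_sections` and the toy depth vector are hypothetical. Samples with no depth change are treated as ascending here, which is an arbitrary choice for the sketch.

```r
# Hypothetical sketch: label tow-yo sections by direction of travel
label_sections <- function(depth) {
  step <- diff(depth)
  # Reuse the first step so that every sample gets a direction
  going_down <- c(step[1], step) > 0
  direction <- ifelse(going_down, "descending", "ascending")
  # Number consecutive runs of the same direction
  runs <- rle(direction)
  id <- rep(seq_along(runs$lengths), runs$lengths)
  paste0(direction, "_", id)
}

depth <- c(1, 3, 5, 7, 5, 3, 1, 3, 5)
sections <- label_sections(depth)

# Keep only the first ascending section
ascending <- depth[sections == "ascending_2"]
```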

VPR CTD data

Description

A dataframe including all CTD parameters from the VPR CTD, produced by vpr_ctd_read

Usage

ctd_dat_combine

Format

A dataframe with 15 variables

time_ms: Time stamp when ROI was collected (milliseconds)
conductivity: Conductivity collected by the VPR CTD
pressure: Pressure measured from the VPR CTD in decibars
temperature: Temperature measured from the VPR CTD in celsius
salinity: Salinity measured from the VPR CTD
fluor_ref: A reference fluorescence baseline provided in millivolts by the VPR CTD for calibrating fluorescence_mv data
fluorescence_mv: Fluorescence in millivolts from the VPR CTD (uncalibrated)
turbidity_ref: A reference turbidity baseline provided in millivolts for calibrating turbidity_mv
turbidity_mv: Turbidity in millivolts from the VPR CTD (uncalibrated)
altitude_NA: Altitude data from the VPR CTD
day: Day on which VPR data was collected (from AutoDeck)
hour: Hour during which VPR data was collected (from AutoDeck)
station: Station identifier provided during processing
sigmaT: Density calculated from temperature, pressure and salinity data
depth: Depth in metres calculated from pressure

Read CTD data (SBE49) from CTD- VPR package

Description

Used internally by vpr_ctd_read

Usage

ctd_df_cols(x, col_list)

Arguments

`x`	full filename (ctd .dat file)
`col_list`	list of CTD data column names

Details

WARNING This is hard coded to accept a specific order of CTD data columns. The names and values in these columns can change based on the specific instrument and should be updated before processing data from a new VPR.

Input is a text file in .dat format. Outputs a CTD data frame with variables time_ms, conductivity, temperature, pressure, salinity.

Author(s)

K. Sorochan, E. Chisholm
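
Because the reader depends on a fixed column order, it can help to confirm that a column list matches the file before reading it. The following base R sketch illustrates that check. The helper `read_ctd_sketch`, the comma delimiter, and the sample values are assumptions made for illustration and do not describe the actual .dat layout of any instrument.

```r
# Hypothetical sketch: read delimited CTD text with an explicit column list
read_ctd_sketch <- function(lines, col_list) {
  con <- textConnection(lines)
  on.exit(close(con))
  # Count the fields on every line before assigning names
  n_fields <- count.fields(con, sep = ",")
  if (any(n_fields != length(col_list))) {
    stop("col_list does not match the number of columns in the file")
  }
  read.table(text = lines, sep = ",", col.names = col_list)
}

col_list <- c("time_ms", "conductivity", "temperature", "pressure", "salinity")
lines <- c("1000,3.21,5.10,10.2,31.5",
           "1040,3.20,5.08,10.6,31.5")  # made-up values
ctd <- read_ctd_sketch(lines, col_list)
```

Note that a check on the number of columns cannot detect columns that are present but in a different order, so the names should still be confirmed against the instrument configuration.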

VPR CTD data combined with tabulated ROIs

Description

A dataframe representing CTD data which has been merged with tabulated ROIs in each category, produced by vpr_ctdroi_merge

Usage

ctd_roi_merge

Format

A dataframe with 28 variables

time_ms: Time stamp when ROI was collected (milliseconds)
conductivity: Conductivity collected by the VPR CTD
pressure: Pressure measured from the VPR CTD in decibars
temperature: Temperature measured from the VPR CTD in celsius
salinity: Salinity measured from the VPR CTD
fluor_ref: A reference fluorescence baseline provided in millivolts by the VPR CTD for calibrating fluorescence_mv data
fluorescence_mv: Fluorescence in millivolts from the VPR CTD (uncalibrated)
turbidity_ref: A reference turbidity baseline provided in millivolts for calibrating turbidity_mv
turbidity_mv: Turbidity in millivolts from the VPR CTD (uncalibrated)
altitude_NA: Altitude data from the VPR CTD
day: Day on which VPR data was collected (from AutoDeck)
hour: Hour during which VPR data was collected (from AutoDeck)
station: Station identifier provided during processing
sigmaT: Density calculated from temperature, pressure and salinity data
depth: Depth in metres calculated from pressure
roi: ROI identification number
categories: For each category name (eg. bad_image_blurry, Calanus, krill), there is a line in the dataframe representing the number of ROIs identified in this category
n_roi_total: Total number of ROIs in all categories for each CTD data point

VPR data including CTD and ROI information

Description

An oce formatted CTD object with VPR CTD and ROI data from package example data set.

Usage

ctd_roi_oce

Format

An oce package format, a 'CTD' object with VPR CTD and ROI data (1000 data rows)

INTERNAL USE ONLY quick data frame function from github to insert a row inside a data frame

Description

INTERNAL USE ONLY. Quick data frame function from github to insert a row inside a data frame.

Usage

insertRow(existingDF, newrow, r)

Arguments

`existingDF`	data frame
`newrow`	new row of data
`r`	index of new row

Get vector to draw isopycnal lines on TS plot

Description

Get vector to draw isopycnal lines on TS plot. Used internally to create TS plots.

Usage

isopycnal_calculate(sal, pot.temp, reference.p = 0)

Arguments

`sal`	salinity vector
`pot.temp`	temperature vector in deg C
`reference.p`	reference pressure for calculation, set to 0

Note

Modified from source: https://github.com/Davidatlarge/ggTS/blob/master/ggTS_DK.R

Author(s)

E. Chisholm

Normalize a matrix

Description

Takes each element of the matrix divided by its column total

Usage

normalize_matrix(mat)

Arguments

`mat`	a matrix to normalize

Details

Make sure to remove total rows before using with VP data

Note

used internally for visualization of confusion matrices
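
Column normalization as described above can be written in one line of base R. This is a sketch of the operation, not necessarily how the package implements it; `normalize_sketch` and the toy confusion matrix are hypothetical.

```r
# Hypothetical sketch: divide each element by its column total
normalize_sketch <- function(mat) {
  sweep(mat, 2, colSums(mat), "/")
}

# Toy 2 x 2 confusion matrix with no total row or column
conf_mat <- matrix(
  c(8, 2, 1, 9),
  nrow = 2,
  dimnames = list(c("Calanus", "krill"), c("Calanus", "krill"))
)
norm_mat <- normalize_sketch(conf_mat)
```

After normalization every column sums to 1, which is why total rows need to be removed first: a total row would be counted in the column sum and halve every proportion.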

Packages

Description

Packages

Get conversion factor for pixels to mm for roi measurements

Description

Used internally

Usage

px_to_mm(x, opticalSetting)

Arguments

`x`	an aidmea data frame (standard) to be converted into mm from pixels
`opticalSetting`	the VPR setting determining the field of view and conversion factor between mm and pixels

Details

converts pixels to mm using conversion factor specific to optical setting

Options for opticalSetting are 'S0', 'S1', 'S2', or 'S3'

Read aid files produced by automated classification

Description

Read aid files produced by automated classification

Usage

read_aid_cnn(aid_file)

Arguments

`aid_file`	a file path to an aid file produced by automated classification (with ROI path and probability value)

Value

ROI path and probability values in a table

VPR ROI data

Description

A dataframe including VPR ROI data from the sample dataset, produced by vpr_autoid_read

Usage

roi_dat_combine

Format

A dataframe with 13 variables

roi: Unique ROI identifier - 8 digit
categories: For each category name (eg. bad_image_blurry, Calanus, krill), there is a line in the dataframe representing the number of ROIs identified in this category
time_ms: Time stamp when ROI was collected (milliseconds)

VPR measurement data calculated by Visual Plankton

Description

A data frame of measurement information for each ROI in the sample data set including long axis length, perimeter and area, produced by vpr_autoid_read

Usage

roimeas_dat_combine

Format

A data frame with 12 variables

roi: Unique ROI identifier - 10 digit
category: Category in which ROI has been classified
day_hour: day and hour in which data was collected (from Autodeck)
Perimeter: The perimeter of the ROI in millimeters
Area: The area of the ROI in millimeters
width1: Width at a first point of the ROI in millimetres (defined in more detail in VPR manual)
width2: Width at a second point of the ROI in millimetres (defined in more detail in VPR manual)
width3: Width at a third point of the ROI in millimetres (defined in more detail in VPR manual)
short_axis_length: The length in millimeters of the ROI along the shorter axis
long_axis_length: The length in millimeters of the ROI along the longer axis
station: Station identifier provided in processing
time_ms: Time stamp when ROI was collected in milliseconds

VPR size information dataframe

Description

A sample data frame of size information from Visual Plankton outputs, processed using vpr_ctdroisize_merge

Usage

size_df_f

Format

An object of class data.frame with 14 rows and 14 columns.

Details

A dataframe with 14 variables including:

frame_ID: Unique identifier for each VPR frame
pressure: Pressure measured from the VPR CTD in decibars
temperature: Temperature measured from the VPR CTD in celsius
salinity: Salinity measured from the VPR CTD
sigmaT: Density calculated from temperature, salinity and pressure
fluorescence_mv: Fluorescence measured by the VPR CTD in millivolts (uncalibrated)
turbidity_mv: Turbidity measured by the VPR CTD in millivolts (uncalibrated)
roi: Unique ROI identification number - 10 digits, 8 digit millisecond time stamp and two unique digits to denote multiple ROIs within a millisecond
category: Category in which ROI has been classified by Visual Plankton
day_hour: Day and hour in which data was collected, from AutoDeck processing
long_axis_length: The length of the longest axis of the ROI image, measured by Visual Plankton
station: Station identifier provided during processing
time_ms: Time stamp when ROI was collected (milliseconds)
roi_ID: ROI identification number- 8 digit time stamp, without unique 2 digit ending

Checks manually created aid files for errors

Description

Checks for empty files, with an option to delete them. Then checks all the data for duplicated or missing ROIs which would indicate a problem with vpr_autoid_create()

Usage

vpr_autoid_check(new_autoid, original_autoid, cruise, dayhours)

Arguments

`new_autoid`	file path to autoid folder eg. C:/data/CRUISENAME/autoid/ (produced by `vpr_autoid_create()`)
`original_autoid`	file path to original autoid folder (produced by automated classification)
`cruise`	name of cruise which is being checked
`dayhours`	chr vector, of unique day and hour values to check through (format d123.h12)

Value

text file (saved in working directory) named CRUISENAME_aid_file_check.txt

Author(s)

E Chisholm
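
The duplicate and missing ROI checks described above amount to comparing two sets of identifiers. A minimal base R sketch, assuming ROI identifiers have already been read into character vectors (the vectors and their values here are made up, and the real function works on aid files rather than vectors):

```r
# Hypothetical ROI identifiers from the original and the new autoid files
original_rois <- c("1000000001", "1000000002", "1000004001", "1000008001")
new_rois <- c("1000000001", "1000004001", "1000004001")

# ROIs that appear more than once in the new files
duplicated_rois <- unique(new_rois[duplicated(new_rois)])

# ROIs present in the original files but absent from the new files
missing_rois <- setdiff(original_rois, new_rois)
```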

Copy VPR images into folders

Description

Organize VPR images into folders based on classifications provided by visual plankton

Usage

vpr_autoid_copy(
  new_autoid,
  roi_path,
  day,
  hour,
  threshold = NULL,
  org = "dayhour",
  cast = NULL,
  station = NULL
)

Arguments

`new_autoid`	A file path to your autoid folder where data is stored eg. "C:/data/cruise_X/autoid/"
`roi_path`	(optional) provide if ROI data has been moved since autoid files were created (if path strings in aid files do not match where data currently exists), a file path where ROI data is stored (up to "rois" folder)
`day`	character string representing numeric day of interest (3 chr)
`hour`	character string representing hour of interest (2 chr)
`threshold`	(optional) a numeric value, supplied only if you are copying images based on automated classifications, only images below this threshold of confidence will be copied for manual classification. Default is set to NULL.
`org`	chr value, if 'station', images will be output in folders labelled by station, if 'dayhour', images will be output in folders labelled by day and hour
`cast`	(optional) character string, VPR cast number of interest (3 chr), required if org is 'station'
`station`	(optional) character string, station name of interest (eg. "Shediac"), required if org is 'station'

Value

organized file directory where VPR images are contained with folders, organized by day, hour and classification, inside your autoid folder

Note

this function uses tidy paths, see fs::path_tidy() for more info

Modifies aid and aid mea files based on manual reclassification

Description

Modifies aid and aid mea files based on manual reclassification

Usage

vpr_autoid_create(
  reclassify,
  misclassified,
  basepath,
  day,
  hour,
  mea = TRUE,
  categories
)

Arguments

`reclassify`	list of reclassify files (output from vpr_manual_classification())
`misclassified`	list of misclassify files (output from vpr_manual_classification())
`basepath`	path to folder containing autoid files (e.g., 'extdata/COR2019002/autoid')
`day`	day identifier for relevant aid & aidmeas files
`hour`	hour identifier for relevant aid & aidmeas files
`mea`	logical indicating whether or not there are accompanying measurement files to be created
`categories`	A list object with all the potential classification categories

Author(s)

E. Chisholm

Examples

## Not run: 
basepath <- 'E:/autoID_EC_07032019/'
day <- '289'
hr <- '08'
categories <-
c("bad_image_blurry", "bad_image_malfunction", "bad_image_strobe", "Calanus", "chaetognaths",
"ctenophores", "krill", "marine_snow", "Other", "small_copepod", "stick")
day_hour_files <-  paste0('d', day, '.h', hr)
misclassified <- list.files(day_hour_files, pattern = 'misclassified_', full.names = TRUE)
reclassify <- list.files(day_hour_files, pattern = 'reclassify_', full.names = TRUE)
vpr_autoid_create(reclassify, misclassified, basepath, day = day, hour = hr, categories = categories)

## End(Not run)

Read VPR aid files

Description

Read aid text files containing ROI string information or measurement data and output as a dataframe

Usage

vpr_autoid_read(
  file_list_aid,
  file_list_aidmeas,
  export,
  station_of_interest,
  opticalSetting,
  warn = TRUE,
  categories
)

Arguments

`file_list_aid`	a list object of aid text files, containing ROI strings.
`file_list_aidmeas`	a list object of aidmea text files, containing ROI measurements.
`export`	a character string specifying which type of data to output, either 'aid' (roi strings) or 'aidmeas' (measurement data)
`station_of_interest`	Station information to be added to ROI data output, use NA if irrelevant
`opticalSetting`	Optional argument specifying VPR optical setting. If provided will be used to convert size data into mm from pixels, if missing size data will be output in pixels
`warn`	Logical, FALSE silences size data unit warnings
`categories`	A list object (of chr strings) with all the potential classification categories

Details

Only outputs either ROI string information OR measurement data

Note

Full paths to each file should be specified

Author(s)

E. Chisholm & K. Sorochan

Examples


station_of_interest <- 'test'
dayhour <- c('d222.h03', 'd222.h04')
categories <- c("bad_image_blurry", "bad_image_malfunction",
"bad_image_strobe", "Calanus", "chaetognaths","ctenophores","krill",
"marine_snow","Other","small_copepod", "stick")

# VPR OPTICAL SETTING (S0, S1, S2 OR S3)
opticalSetting <- "S2"
imageVolume <- 83663 #mm^3

auto_id_folder <- system.file('extdata/COR2019002/autoid/', package = 'vprr', mustWork = TRUE)
auto_id_path <- list.files(paste0(auto_id_folder, "/"), full.names = TRUE)

# Path to aid for each category
aid_path <- paste0(auto_id_path, '/aid/')
# Path to mea for each category
aidmea_path <- paste0(auto_id_path, '/aidmea/')

# AUTO ID FILES
aid_file_list <- list()
aidmea_file_list <- list()
for (i in seq_along(dayhour)) {
  aid_file_list[[i]] <-
    list.files(aid_path, pattern = dayhour[[i]], full.names = TRUE)
  # SIZE DATA FILES
  aidmea_file_list[[i]] <-
    list.files(aidmea_path, pattern = dayhour[[i]], full.names = TRUE)
}

aid_file_list_all <- unlist(aid_file_list)
aidmea_file_list_all <- unlist(aidmea_file_list)

# ROIs
roi_dat_combine <-
  vpr_autoid_read(
    file_list_aid = aid_file_list_all,
    file_list_aidmeas = aidmea_file_list_all,
    export = 'aid',
    station_of_interest = station_of_interest,
    opticalSetting = opticalSetting,
    warn = FALSE,
    categories = categories
  )

# MEASUREMENTS
roimeas_dat_combine <-
  vpr_autoid_read(
    file_list_aid = aid_file_list_all,
    file_list_aidmeas = aidmea_file_list_all,
    export = 'aidmeas',
    station_of_interest = station_of_interest,
    opticalSetting = opticalSetting,
    warn = FALSE,
    categories = categories
  )

Get category ids from string

Description

Get category ids from string

Usage

vpr_category(x, categories)

Arguments

`x`	A chr string which represents file paths from which category should be extracted
`categories`	A list object with all the potential classification categories

Value

A chr string of only the category id

Note

This function searches for exact matches to categories within '/' file separators. You may encounter errors if

Author(s)

K Sorochan

Examples

category_string <- 'C:/data/cruise/autoid/Calanus/d000/h00'
categories <- list("Calanus", "marine_snow", "blurry", "other_copepod")
vpr_category(category_string, categories)
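
The exact-match behaviour mentioned in the Note can be illustrated with a base R sketch that splits a path on '/' and compares whole path components against the category list. The helper `category_sketch` is hypothetical and is not the package's implementation.

```r
# Hypothetical sketch: find the category as a whole path component
category_sketch <- function(x, categories) {
  parts <- strsplit(x, "/", fixed = TRUE)[[1]]
  hit <- intersect(parts, unlist(categories))
  if (length(hit) != 1) {
    stop("expected exactly one category in path, found ", length(hit))
  }
  hit
}

category_string <- 'C:/data/cruise/autoid/Calanus/d000/h00'
categories <- list("Calanus", "marine_snow", "blurry", "other_copepod")
found <- category_sketch(category_string, categories)
```

Matching whole components means a folder such as 'Calanus_old' would not be mistaken for 'Calanus'.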

Create a new category to be considered for classification after processing with VP

Description

creates empty directory structure to allow consideration of new category during vpr_manual_classification()

Usage

vpr_category_create(category, basepath)

Arguments

`category`	new category name to be added (can be a list of multiple category names)
`basepath`	path to folder containing autoid files (e.g., 'extdata/COR2019002/autoid')

Value

empty directory structure using new category name inside basepath

Create a list of ctd files to be read

Description

Searches through typical VP directory structure

Usage

vpr_ctd_files(castdir, cruise, day_hour)

Arguments

`castdir`	root directory for ctd cast files
`cruise`	cruise name (exactly as in directory structure)
`day_hour`	vector of day-hour combinations (e.g, dXXX.hXX)

Details

Use with caution

Value

vector of ctd file paths matching days-hour combinations provided

Author(s)

E. Chisholm and K. Sorochan

Read and format CTD VPR data

Description

Acts as a wrapper for ctd_df_cols

Usage

vpr_ctd_read(ctd_files, station_of_interest, day, hour, col_list)

Arguments

`ctd_files`	full file paths to vpr ctd `.dat` files
`station_of_interest`	VPR station name
`day`	Day of interest, if not provided will be pulled from file path
`hour`	Hour of interest, if not provided will be pulled from file path
`col_list`	Optional chr vector of CTD data column names

Details

Reads CTD data and adds day, hour, and station information. Calculates sigma T and depth variables from existing CTD data to supplement raw data. If there are multiple hours of CTD data, combines them into single dataframe.

WARNING ctd_df_cols is hard coded to accept a specific order of CTD data columns. The names and values in these columns can change based on the specific instrument and should be updated/confirmed before processing data from a new VPR.

Author(s)

E. Chisholm & K. Sorochan

Examples


station_of_interest <- 'test'

ctd_files <- system.file("extdata/COR2019002/rois/vpr5/d222", "h03ctd.dat.gz",
package = "vprr", mustWork = TRUE)

ctd_dat_combine <- vpr_ctd_read(ctd_files, station_of_interest)

Add Year/ month/ day hour:minute:second information

Description

Obtain columns for date and time (i.e., column "ymdhms") and time in hours (i.e., column time_hr) for each row in VPR data frame by utilizing day-of-year, hour, and millisecond outputs from VPR data output.

Usage

vpr_ctd_ymd(data, year, offset)

Arguments

`data`	VPR data frame from `vpr_ctdroi_merge`
`year`	Year of data collection
`offset`	time offset in hours between VPR CPU and processed data times (optional)

Value

A VPR data frame with columns for date and time (i.e., column 'ymdhms') and hour (i.e., column time_hr)

Examples

year <- 2019
data('ctd_roi_merge')
dat <- vpr_ctd_ymd(ctd_roi_merge, year)
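
The date arithmetic behind this kind of conversion can be sketched in base R. The sketch assumes that `day` is a day-of-year string such as 'd222' and that `time_ms` counts milliseconds from the start of that day. Both are assumptions for illustration, the values are made up, and the package's own calculation (including its handling of `offset`) may differ.

```r
# Hypothetical inputs
year <- 2019
day <- "d222"
time_ms <- 11345678

# Day of year to calendar date (day 1 is January 1)
day_num <- as.integer(sub("^d", "", day))
date <- as.Date(day_num - 1, origin = paste0(year, "-01-01"))

# Add the elapsed time within the day, in seconds
ymdhms <- as.POSIXct(format(date), tz = "UTC") + time_ms / 1000

# Time in hours (3.6e6 milliseconds per hour)
time_hr <- time_ms / 3.6e6
```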


Merge CTD and ROI data from VPR

Description

Combines CTD data (time, hydrographic parameters), with ROI information (identification number) into single dataframe, aligning ROI identification numbers and category classifications with time and hydrographic parameters

Usage

vpr_ctdroi_merge(ctd_dat_combine, roi_dat_combine)

Arguments

`ctd_dat_combine`	a CTD dataframe from VPR processing from `vpr_ctd_read`
`roi_dat_combine`	a data frame of roi aid data from `vpr_autoid_read`

Author(s)

E. Chisholm & K. Sorochan

Examples

data('ctd_dat_combine')
data('roi_dat_combine')

ctd_roi_merge <- vpr_ctdroi_merge(ctd_dat_combine, roi_dat_combine)
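
Conceptually, the merge aligns ROI counts with CTD rows through a shared time stamp. The base R sketch below shows one way such an alignment could work on toy data. The data frames, the two category columns, and the zero fill for CTD rows without ROIs are assumptions for illustration, not a description of the package internals.

```r
# Hypothetical toy inputs
ctd <- data.frame(time_ms = c(1000, 1040, 1080),
                  temperature = c(5.1, 5.0, 4.9))
roi <- data.frame(time_ms = c(1040, 1040, 1080),
                  Calanus = c(1, 0, 1),
                  krill = c(0, 1, 0))
cats <- c("Calanus", "krill")

# Tabulate ROIs per category for each time stamp
roi_counts <- aggregate(cbind(Calanus, krill) ~ time_ms, data = roi, FUN = sum)

# Left join onto the CTD record so that every CTD row is kept
merged <- merge(ctd, roi_counts, by = "time_ms", all.x = TRUE)

# CTD rows without any ROI get a count of zero
merged[cats][is.na(merged[cats])] <- 0
merged$n_roi_total <- rowSums(merged[cats])
```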

Format CTD and Size data from VPR

Description

Format CTD and Meas data frames into combined data frame for analysis and plotting of size data

Usage

vpr_ctdroisize_merge(data, data_mea, category_of_interest)

Arguments

`data`	VPR dataframe from `vpr_ctdroi_merge`, with calculated variable sigmaT
`data_mea`	VPR size data frame from `vpr_autoid_read`
`category_of_interest`	a list of category of interest to be included in output dataframe

Value

A dataframe containing VPR CTD and size data

Examples

## Not run: 
data("ctd_roi_merge")
data("roimeas_dat_combine")
category_of_interest = 'Calanus'

ctd_roi_merge$time_hr <- ctd_roi_merge$time_ms /3.6e+06

size_df_f <- vpr_ctdroisize_merge(ctd_roi_merge, data_mea = roimeas_dat_combine,
 category_of_interest = category_of_interest)

## End(Not run)


Get day identifier

Description

Get day identifier

Usage

vpr_day(x)

Arguments

`x`	A string specifying the directory and file name of the size file

Value

A string of only the day identifier (i.e., "dXXX")

Author(s)

K Sorochan

Examples

day_string <- 'C:/data/cruise/autoid/Calanus/d000/h00'
vpr_day(day_string)
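
Extracting an identifier of the form "dXXX" from a path can be sketched with a regular expression in base R. The helpers `day_sketch` and `hour_sketch` are hypothetical, and the pattern assumes exactly three digits after 'd' and two after 'h'.

```r
# Hypothetical sketch: pull the first dXXX / hXX token out of a path
day_sketch <- function(x) regmatches(x, regexpr("d[0-9]{3}", x))
hour_sketch <- function(x) regmatches(x, regexpr("h[0-9]{2}", x))

day_string <- 'C:/data/cruise/autoid/Calanus/d000/h00'
day_id <- day_sketch(day_string)
hour_id <- hour_sketch(day_string)
```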

Find day & hour info to match each station of interest for processing

Description

Finds the day and hour values that match each station of interest.

Usage

vpr_dayhour(stations, file)

Arguments

`stations`	a vector of character values naming stations of interest
`file`	CSV file containing 'day', 'hour', 'station', and 'day_hour' columns

Value

Vector of day-hour combinations corresponding to stations of interest

Author(s)

E. Chisholm and K. Sorochan

Format and export VPR data for publication (IN DEVELOPMENT) Exports a csv file with standard column names based on British Oceanographic Data Centre, BODC::P01 and DarwinCore (DwC) naming conventions, and a JSON metadata file for station level metadata

Description

Format and export VPR data for publication (IN DEVELOPMENT). Exports a CSV file with standard column names based on British Oceanographic Data Centre (BODC::P01) and Darwin Core (DwC) naming conventions, and a JSON file containing station-level metadata.

Usage

vpr_export(data, metadata, columnNames, file)

Arguments

`data`	a VPR data frame
`metadata`	(optional) a named list of character values giving metadata to be included in JSON file
`columnNames`	(optional) a named list of character values giving relationships between existing names of data columns and standard names
`file`	a file name for the data.csv
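
As a rough illustration of the `columnNames` argument, the base R sketch below renames columns using a named list in the `new_name = old_name` form. The toy data frame and its values are made up, and the renaming logic is an assumption about how such a mapping could be applied, not the code `vpr_export` runs.

```r
# Toy data frame with package-style column names (values are made up)
vpr_df <- data.frame(depth = c(1.5, 2.5),
                     category = c("Calanus", "krill"),
                     conc_m3 = c(120, 45),
                     stringsAsFactors = FALSE)

# Mapping in the `new_name = old_name` form
column_names <- list("DEPHPRST" = "depth",
                     "verbatimIdentification" = "category",
                     "SDBIOL01" = "conc_m3")

# Rename only the columns that are actually present in the data
old_names <- unlist(column_names)
present <- old_names %in% names(vpr_df)
names(vpr_df)[match(old_names[present], names(vpr_df))] <- names(column_names)[present]
names(vpr_df)
```

Skipping names that are absent from the data means one mapping can be reused across data frames that carry different subsets of columns.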

Examples



## Not run: 
data(category_conc_n)
metadata <- list(
  "station_level" = list(
    "title" = list("en" = "VPR data from the Scotian Shelf",
                   "fr" = "Données VPR de l'étagère néo-écossaise"),
    "dataset_ID" = 1,
    "decimalLatitudeStart" = 44.5,
    "decimalLongitudeStart" = -64.5,
    "decimalLatitudeEnd" = 45.5,
    "decimalLongitudeEnd" = -65.5,
    "maximumDepthInMeters" = 1000,
    "eventDate" = "2019-08-11",
    "eventTime" = "00:00:00",
    "basisOfRecord" = "MachineObservation",
   "associatedMedia" = "https://ecotaxa.obs-vlfr.fr/ipt/archive.do?r=iml2018051",
   "identificationReferences" = "Iv3 model v3.3",
   "instrument" = list("opticalSetting" = "S2",
                       "imageVolume" = 83663),
   "resources" = list(
      "data" = list("name" = "vpr123_station25.csv",
                    "creationDate" = "2023-01-01"),
      "metadata" = list("name" = "vpr123_station25-metadata.json",
                        "creationDate" = "2023-01-01")
    ),
    "dataAttributes" = list(
      "eventID" = list(
        "dataType" = "chr",
        "definition" = "An identifier for the set of information associated
        with a dwc:Event (something that occurs at a place and time). May be
        a global unique identifier or an identifier specific to the data set.",
        "vocabulary" = "dwc"
      ),
      "minimumDepthInMeters" = list(
        "dataType" = "float",
        "definition" = "The lesser depth of a range of depth below the local",
        "vocabulary" = "dwc"
      ),
      "maximumDepthInMeters" = list(
        "dataType" = "float",
        "definition" = "The greater depth of a range of depth below the local",
        "vocabulary" = "dwc"
      ),
      "DEPHPRST" = list(
        "dataType" = "float",
        "definition" = "Depth (spatial coordinate) of sampling event start
        relative to water surface in the water body by profiling pressure
         sensor and conversion to depth using unspecified algorithm",
        "vocabulary" = "BODC::P01"
      ),
      "individualCount" = list(
        "dataType" = "float",
        "definition" = "The number of individuals present at the time of the
         dwc:Occurrence.",
        "vocabulary" = "dwc"
      ),
      "verbatimIdentification" = list(
        "dataType" = "chr",
       "definition" = "A string representing the taxonomic identification as
       it appeared in the original record.",
        "vocabulary" = "dwc"
      ),
      "SDBIOL01" = list(
        "dataType" = "float",
        "definition" = "Abundance of biological entity specified elsewhere
        per unit volume of the water body",
        "vocabulary" = "BODC::P01"
      ),
      "TEMPST01" = list(
        "dataType" = "float",
        "definition" = "Temperature of the water body by CTD or STD",
        "vocabulary" = "BODC::P01"
      ),
      "PSALST01" = list(
        "dataType" = "float",
        "definition" = "Practical salinity of the water body by CTD and
        computation using UNESCO 1983 algorithm",
        "vocabulary" = "BODC::P01"
      ),
      "POTDENS0" = list(
        "dataType" = "float",
        "definition" = "Density (potential) of the water body by computation
         from salinity and potential temperature using UNESCO algorithm with
          0 decibar reference pressure",
        "vocabulary" = "BODC::P01"
      ),
      "FLUOZZZZ" = list(
        "dataType" = "float",
        "definition" = "Fluorescence of the water body",
        "vocabulary" = "BODC::P01"
      ),
      "TURBXXXX" = list(
        "dataType" = "float",
        "definition" = "Turbidity of water in the water body",
       "vocabulary" = "BODC::P01"
     ),
      "sampleSizeValue" = list(
        "dataType" = "float",
        "definition" = "A numeric value for a measurement of the size (time
        duration, length, area, or volume) of a sample in a sampling
        dwc:Event.",
        "vocabulary" = "dwc"
      ),
      "sampleSizeUnit" = list(
        "dataType" = "chr",
        "definition" = "The unit of measurement of the size (time duration,
        length, area, or volume) of a sample in a sampling dwc:Event.",
       "vocabulary" = "dwc"
      ),
      "scientificName" = list(
        "dataType" = "chr",
        "definition" = "The full scientific name, with authorship and date
        information if known. When forming part of a dwc:Identification, this
         should be the name in lowest level taxonomic rank that can be
         determined. This term should not contain identification
         qualifications, which should instead be supplied in the
         dwc:identificationQualifier term.",
        "vocabulary" = "dwc"
      ),
      "identifiedBy" = list(
        "dataType" = "chr",
        "definition" = "A list (concatenated and separated) of names of
        people, groups, or organisations who assigned the Taxon to the subject.",
        "vocabulary" = "dwc"
      ),
      "identificationVerificationStatus" = list(
        "dataType" = "chr",
        "definition" = "A categorical indicator of the extent to which the
        taxonomic identification has been verified to be correct.",
        "vocabulary" = "dwc"
      ),
      "depthDifferenceMeters" = list(
       "dataType" = "float",
       "definition" = "Difference between maximumDepthInMeters and
       minimumDepthInMeters of an individual data bin, in meters",
        "vocabulary" = "BIO"
      ),
      "minimumTimeSeconds" = list(
        "dataType" = "float",
        "definition" = "minimum time value in a data bin, measured in seconds
         from the start of the day of sampling",
        "vocabulary" = "BIO"
      ),
      "maximumTimeSeconds" = list(
        "dataType" = "float",
        "definition" = "maximum time value in a data bin, measured in seconds
         from the start of the day of sampling",
        "vocabulary" = "BIO"
      ),
      "timeDifferenceSeconds" = list(
        "dataType" = "float",
        "definition" = "Difference between maximumTimeSeconds and
        minimumTimeSeconds of an individual data bin, in seconds",
        "vocabulary" = "BIO"
      ),
      "numberOfFrames" = list(
        "dataType" = "float",
        "definition" = "number of VPR frames captured within an individual data bin",
        "vocabulary" = "BIO"
      ),
      "timeMilliseconds" = list(
        "dataType" = "float",
        "definition" = "Time measured in milliseconds since the start of the sampling day",
        "vocabulary" = "BIO"
      ),
      "towyoID" = list(
        "dataType" = "chr",
        "definition" = "A string identifying the section of the cast to which
         the data point belongs",
        "vocabulary" = "BIO"
      ),
      "maximumCastDepthInMeters" = list(
        "dataType" = "float",
        "definition" = "Maximum depth in Meters of the cast dataset",
        "vocabulary" = "BIO"
      )
    )
  )
)

# new_name = old_name
columnNames <- list("DEPHPRST" = "depth",
                    "verbatimIdentification" = "category",
                    "eventID" = "station",
                    "minimumDepthInMeters" = "min_depth",
                    "maximumDepthInMeters" = "max_depth",
                    "individualCount" = "n_roi_bin",
                    "SDBIOL01" = "conc_m3",
                    "TEMPST01" = "temperature",
                    "PSALST01" = "salinity",
                    "POTDENS0" = "density",
                    "FLUOZZZZ" = "fluorescence",
                    "TURBXXXX" = "turbidity",
                    "sampleSizeValue" = "vol_sampled_bin_m3",
                    "depthDifferenceMeters" = "depth_diff",
                    "minimumTimeSeconds" = "min_time_s",
                    "maximumTimeSeconds" = "max_time_s",
                    "timeDifferenceSeconds" = "time_diff_s",
                    "numberOfFrames" = "n_frames",
                    "timeMilliseconds" = "time_ms",
                    "towyoID" = "towyo",
                    "maximumCastDepthInMeters" = "max_cast_depth"
)

# add any new data columns required
# (eg. sampleSizeUnit, scientificName, identifiedBy, identificationVerificationStatus)
sampleSizeUnit <- "cubic metre"
identifiedBy <- "K. Sorochan"
identificationVerificationStatus <- "ValidatedByHuman"

library(dplyr) # provides %>%, mutate() and case_when()

data <- category_conc_n %>%
  mutate(identifiedBy = identifiedBy,
         sampleSizeUnit = sampleSizeUnit,
         identificationVerificationStatus = identificationVerificationStatus)

# Define the mapping between category and scientific name
# scientific names based on the EcoTaxa taxonomic system
scientificName <- list("blurry" = "bad_image_blurry",
                       "artefact" = c("bad_image_malfunction", "bad_image_strobe"),
                       "Calanus" = "Calanus")

# Create a new column of data called scientificName based on matches to category
data <- data %>%
  dplyr::mutate(scientificName = dplyr::case_when(
    category %in% scientificName[["blurry"]] ~ "blurry",
    category %in% scientificName[["artefact"]] ~ "artefact",
    category == scientificName[["Calanus"]] ~ "Calanus",
    TRUE ~ NA_character_ # typed NA keeps the output column character
  ))

vpr_export(data, metadata, columnNames, file = "vpr123_station25")

## End(Not run)

Get hour identifier

Description

Get hour identifier

Usage

vpr_hour(x)

Arguments

`x`	A string specifying the directory and file name of the size file

Value

A string of only the hour identifier (i.e., "hXX")
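
A base R sketch of one way to pull the day and hour identifiers out of a path like the one used in the examples. It assumes the identifiers always appear as `d` followed by three digits and `h` followed by two digits. `extract_id` is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper: return the first substring of `x` matching `pattern`
extract_id <- function(x, pattern) {
  regmatches(x, regexpr(pattern, x))
}

path <- "C:/data/cruise/autoid/Calanus/d000/h00"
day_id  <- extract_id(path, "d[0-9]{3}")  # day identifier, "dXXX"
hour_id <- extract_id(path, "h[0-9]{2}")  # hour identifier, "hXX"
```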

Author(s)

K Sorochan

Examples

hour_string <- 'C:/data/cruise/autoid/Calanus/d000/h00'
vpr_hour(hour_string)


Explore images by depth and classification

Description

Pulls images from specific depth ranges in specific classification group

Usage

vpr_img_category(
  data,
  min.depth,
  max.depth,
  roiFolder,
  format = "list",
  category_of_interest
)

Arguments

`data`	data frame containing CTD and ROI data from `vpr_ctdroi_merge`, which also contains calculated variables sigmaT and time_hr
`min.depth`	minimum depth of ROIs you are interested in looking at
`max.depth`	maximum depth of ROIs you are interested in exploring
`roiFolder`	directory containing the ROIs (can be very general, e.g. C:/data, but a more specific file path will be quicker to process)
`format`	how images will be output: 'list' returns a list of file names, 'image' displays the images
`category_of_interest`	character string of classification group from which to pull images
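
The selection these arguments describe can be sketched in base R as a depth-range and category filter. The toy table, the column names `roi`, `depth` and `category`, and the inclusive depth bounds are all assumptions made for illustration. The function itself may select and locate images differently.

```r
# Toy ROI table; the column names `roi`, `depth` and `category` are assumptions
roi_df <- data.frame(
  roi      = c("0100000000", "0100000001", "0100000002", "0100000003"),
  depth    = c(4.2, 12.8, 15.1, 33.0),
  category = c("Calanus", "Calanus", "krill", "Calanus"),
  stringsAsFactors = FALSE
)

min.depth <- 10
max.depth <- 20
category_of_interest <- "Calanus"

# Keep ROIs of the chosen category that fall inside the depth range (inclusive)
in_range <- roi_df$depth >= min.depth & roi_df$depth <= max.depth
selected <- roi_df[in_range & roi_df$category %in% category_of_interest, ]

# Build file names in the 'roi.XXXXXXXXXX.tif' style used elsewhere in this manual
roi_files <- paste0("roi.", selected$roi, ".tif")
roi_files
```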

Remove ROI strings from aid and aidmeas files based on a manually organized folder of images

Description

Use after running `vpr_img_copy` and manually removing unwanted images from the folders it created.

Usage

vpr_img_check(folder_dir, basepath)

Arguments

`folder_dir`	directory path to the day/hour folders containing manually reorganized images of a specific category, e.g. 'C:/data/cruise_IML2018051/krill/images/', where that folder contains '......d123.h01/', which in turn contains manually sorted images of krill
`basepath`	directory path to the original Visual Plankton files, specified down to the classification group, e.g. 'C:/data/cruise_IML2018051/autoid/krill'

Image copying function for specific category of interest

Description

This function can be used to copy images from a particular category, day and hour into distinct folders within the auto id directory. This is useful for visualizing the ROIs of a particular classification group, or for performing manual tertiary checks to remove images that do not match the classification group description.

Usage

vpr_img_copy(auto_id_folder, categories.of.interest, day, hour)

Arguments

`auto_id_folder`	path to the autoid folder, e.g. "D:/VP_data/IML2018051/autoid"
`categories.of.interest`	character vector of categories to copy, e.g. c('Calanus')
`day`	character, day of interest
`hour`	character, hour of interest

Explore VPR images by depth bin

Description

Allows user to pull VPR images from specific depth ranges, to investigate trends before classification of images into category groups

Usage

vpr_img_depth(data, min.depth, max.depth, roiFolder, format = "list")

Arguments

`data`	data frame containing CTD and ROI data from `vpr_ctdroi_merge`, which also contains calculated variables sigmaT and time_hr
`min.depth`	minimum depth of ROIs you are interested in looking at
`max.depth`	maximum depth of ROIs you are interested in exploring
`roiFolder`	directory containing the ROIs (can be very general, e.g. C:/data, but a more specific file path will be quicker to process)
`format`	how images will be output: 'list' returns a list of file names, 'image' displays the images

Explore reclassified images

Description

Pull image from reclassified or misclassified files produced during vpr_manual_classification

Usage

vpr_img_reclassified(day, hour, base_dir, category_of_interest, image_dir)

Arguments

`day`	Character string, 3 digit day of interest of VPR data
`hour`	Character string, 2 digit hour of interest of VPR data
`base_dir`	directory path to the folder containing day/hour folders in which misclassified and reclassified files are organized (e.g. 'C:/VPR_PROJECT/r_project_data_vis/classification files/', which would contain 'd123.h01/reclassified_krill.txt')
`category_of_interest`	Classification group from which to pull images
`image_dir`	directory path to ROI images, eg. "E:\\data\\cruise_IML2018051\\", file separator MUST BE "\\" in order to be recognized

Value

folders of misclassified or reclassified images inside image_dir

Function to check results of classification manually

Description

Displays each image in the specified day and hour, and prompts the user to confirm or deny its classification. If the classification is denied, asks for a reclassification value based on the available categories.

Usage

vpr_manual_classification(
  day,
  hour,
  basepath,
  category_of_interest,
  gr = TRUE,
  scale = "x300",
  opticalSetting = "S2",
  img_bright = TRUE,
  threshold_score,
  path_score
)

Arguments

`day`	day of interest in autoid (3 chr)
`hour`	hour of interest in autoid (2 chr)
`basepath`	path to folder containing autoid files (e.g., 'extdata/COR2019002/autoid')
`category_of_interest`	list of category folders you wish to sort through
`gr`	logical indicating whether pop up graphic menus are used (user preference - defaults to TRUE)
`scale`	argument passed to `image_scale`, default = 'x300'
`opticalSetting`	specifies optical setting of VPR, defining image frame size, current options are 'S0', 'S1', 'S2' (default), 'S3', see further info in details
`img_bright`	logical value indicating whether or not to include a blown out high brightness version of image (can be helpful for viewing dark field fine appendages)
`threshold_score`	(optional) a numeric value defining the minimum confidence value, under which automatic classifications will be passed through manual reclassification. This argument should match the threshold provided in `vpr_autoid_copy()`
`path_score`	(optional) file path to the autoid_cnn_scr folder (autoid files with confidence values produced by automated classification)
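
To make the role of `threshold_score` concrete, here is a minimal base R sketch of splitting ROIs by confidence. The `scores` table and its column names are hypothetical, and the real confidence files in the autoid_cnn_scr folder may be laid out differently.

```r
# Hypothetical classifier output: one row per ROI with a confidence score
scores <- data.frame(
  roi   = c("0100000000", "0100000001", "0100000002"),
  score = c(0.97, 0.42, 0.80),
  stringsAsFactors = FALSE
)

threshold_score <- 0.9

# ROIs scoring below the threshold would go to manual review;
# the rest keep their automatic classification
needs_review <- scores$roi[scores$score < threshold_score]
needs_review
```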

Details

Optical setting frame sizes: S0 = 7x7 mm, S1 = 14x14 mm, S2 = 24x24 mm, S3 = 48x48 mm. These settings define the conversion factor from pixels to millimetres, which is used to calculate image size for classification reference.
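
The pixel-to-millimetre conversion can be sketched as follows. The frame sizes come from the text above, but the frame width of 1024 pixels is an assumption chosen only for illustration. Substitute the actual pixel dimensions of your camera.

```r
# Frame edge length in millimetres for each optical setting (from the text above)
frame_mm <- c(S0 = 7, S1 = 14, S2 = 24, S3 = 48)

# ASSUMPTION: a square frame 1024 pixels wide; substitute your camera's value
frame_px <- 1024

# Conversion factor from pixels to millimetres for each setting
mm_per_px <- frame_mm / frame_px

# Example: approximate length of an object spanning 120 pixels at S2
object_px <- 120
object_mm <- object_px * mm_per_px[["S2"]]
object_mm
```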

Development

Add "undo" functionality to go back on a typing mistake
Fix scaling/ size issue so images are consistently sized

Create ctd oce object with vpr data

Description

Formats VPR data frame into oce format CTD object

Usage

vpr_oce_create(data)

Arguments

`data`	data frame of VPR data

Author(s)

E. Chisholm

Examples

data('ctd_roi_merge')
oce_dat <- vpr_oce_create(ctd_roi_merge)


Interpolated contour plot of particular variable

Description

Creates an interpolated contour plot, which can be used as a background for ROI or tow-yo information

Usage

vpr_plot_contour(
  data,
  var,
  dup = "mean",
  method = "interp",
  labels = TRUE,
  bw = 1,
  cmo
)

Arguments

`data`	data frame needs to include time_hr, depth, and variable of choice (var)
`var`	variable in dataframe which will be interpolated and plotted
`dup`	used only if method == 'interp'. Method of handling duplicates in interpolation, passed to the interp function (options: 'mean', 'strip', 'error')
`method`	interpolation method, either 'interp' or 'oce'. 'oce' uses a slightly different method and is the less error prone of the two
`labels`	logical value indicating whether or not to plot contour labels
`bw`	bin width defining interval at which contours are labelled
`cmo`	name of a `cmocean` plotting theme, see `?cmocean` for more information

Author(s)

E. Chisholm & Kevin Sorochan

Plots VPR profiles of temperature, salinity, density, fluorescence and concentration (by classification group)

Description

This plot allows a good overview of vertical distribution of individual classification groups along with reference to hydrographic parameters. Facet wrap is used to create distinct panels for each category provided

Usage

vpr_plot_profile(category_conc_n, category_to_plot, plot_conc)

Arguments

`category_conc_n`	A VPR data frame with hydrographic and concentration data separated by category (from `vpr_roi_concentration`)
`category_to_plot`	The specific classification groups which will be plotted. If NULL, all categories are plotted combined
`plot_conc`	Logical value whether or not to include a concentration plot (FALSE just shows CTD data)

Value

A gridded object of at least 3 ggplot objects

Make a balloon plot against a TS plot

Description

TS balloon plot of ROI concentration, sorted by category. Includes isopycnal line calculations

Usage

vpr_plot_TS(x, reference.p = 0, var)

Arguments

`x`	dataframe with temperature, salinity, number of rois (n_roi_bin)
`reference.p`	reference pressure (default at 0 for surface)- used to calculate isopycnals
`var`	variable on which size of points will be based, eg conc_m3 or n_roi_bin

Note

modified from source: https://github.com/Davidatlarge/ggTS/blob/master/ggTS_DK.R

Author(s)

E. Chisholm

Make a balloon plot

Description

Balloon plot of ROI concentration on a TS plot, sorted by category, including isopycnal line calculations. A version of `vpr_plot_TS` restricted to the categories relevant to the current analysis and research objectives (see Note).

Usage

vpr_plot_TScat(x, reference.p = 0)

Arguments

`x`	dataframe with temperature, salinity, number of rois named by category
`reference.p`	reference pressure (default at 0 for surface)- used to calculate isopycnals

Note

WARNING: hard coded for 5 categories: Calanus, krill, echinoderm larvae, small copepod, chaetognaths. Uses an isopycnal labelling method which does not label every contour.

modified from source: https://github.com/Davidatlarge/ggTS/blob/master/ggTS_DK.R

Read prediction output from a CNN model

Description

Read prediction output from a CNN model

Usage

vpr_pred_read(filename)

Arguments

`filename`	model prediction output file (.txt) from vpr_transferlearn::save_output()

Value

a dataframe

Get roi ids from string

Description

Get roi ids from string

Usage

vpr_roi(x)

Arguments

`x`	A string specifying directory and file name of roi

Value

A string of only the 10 digit roi identifier
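
For reference, a one-line base R sketch of extracting the identifier with a capture group. It assumes file names follow the `roi.XXXXXXXXXX.tif` pattern shown in the examples, and it is not necessarily how `vpr_roi` is implemented.

```r
# Capture the run of exactly 10 digits that follows "roi." (assumed naming pattern)
roi_string <- "roi.0100000000.tif"
roi_id <- sub(".*roi\\.([0-9]{10}).*", "\\1", roi_string)
roi_id
```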

Author(s)

K Sorochan

Examples


roi_string <- 'roi.0100000000.tif'
vpr_roi(roi_string)


Calculate VPR concentrations

Description

Calculates concentrations for each named category in the data frame

Usage

vpr_roi_concentration(
  data,
  category_list,
  station_of_interest,
  binSize,
  imageVolume,
  rev = FALSE
)

Arguments

`data`	a VPR dataframe as produced by `vpr_ctdroi_merge`
`category_list`	a vector of character strings naming the categories present in the station being processed
`station_of_interest`	The station being processed
`binSize`	passed to `bin_calculate`, determines size of depth bins over which data is averaged
`imageVolume`	the volume of VPR images used for calculating concentrations (mm^3)
`rev`	Logical value defining direction of binning, FALSE (default) - bins will be calculated from surface to bottom, TRUE- bins will be calculated bottom to surface
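
The arithmetic behind a concentration value can be sketched in base R. This assumes concentration is the ROI count in a bin divided by the volume of water imaged in that bin, which is what the column names `n_roi_bin`, `n_frames`, `vol_sampled_bin_m3` and `conc_m3` used in this manual suggest. The bin values are made up and the package's own calculation may differ in detail.

```r
# Toy binned data: ROI counts and frame counts per depth bin (made-up values)
bins <- data.frame(
  depth     = c(2.5, 7.5, 12.5),
  n_roi_bin = c(5, 12, 0),
  n_frames  = c(100, 150, 120)
)

imageVolume <- 83663  # mm^3 per image

# Volume sampled per bin: image volume times frames, converted mm^3 -> m^3
bins$vol_sampled_bin_m3 <- (imageVolume / 1e9) * bins$n_frames

# Concentration: ROIs per cubic metre of water imaged
bins$conc_m3 <- bins$n_roi_bin / bins$vol_sampled_bin_m3
bins
```

Because `imageVolume` is given in mm^3, the division by 1e9 is what puts the result in individuals per cubic metre.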

Examples


data('ctd_roi_merge')
ctd_roi_merge$time_hr <- ctd_roi_merge$time_ms /3.6e+06

category_list <- c('Calanus', 'krill')
binSize <- 5
station_of_interest <- 'test'
imageVolume <- 83663

category_conc_n <- vpr_roi_concentration(ctd_roi_merge, category_list,
station_of_interest, binSize, imageVolume)


Save VPR data as an as.oce object

Description

Save VPR data as an as.oce object

Usage

vpr_save(data, metadata)

Arguments

`data`	a VPR data frame
`metadata`	(optional) a named list of character values giving metadata values. If this argument is not provided user will be prompted for a few generic metadata requirements.

Details

This function converts a VPR data frame to an oce object. Using an oce object as the default export format for VPR data allows metadata and data to be kept in the same space-efficient file and avoids redundancy in the data frame. The function checks for data parameters that may actually be metadata parameters (columns in which the same value is repeated for every observation). These parameters are automatically copied into the metadata slot of the oce object. The function also prompts for a variety of required metadata fields. Depending on specific research or archiving requirements, these metadata parameters can be updated by providing the argument metadata.

Default metadata parameters include 'deploymentType', 'waterDepth', 'serialNumber', 'latitudeStart', 'longitudeStart', 'castDate', 'castStartTime', 'castEndTime', 'processedBy', 'opticalSetting', 'imageVolume', 'comment'.
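
The check for data parameters that are really metadata can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by `vpr_save`: the data frame and its column names are hypothetical, and the sketch separates constant columns from the data only to make the idea visible.

```r
# Hypothetical data frame; column names are illustrative only
df <- data.frame(
  depth = c(1, 2, 3),
  salinity = c(30.1, 30.4, 30.9),
  station = c("test", "test", "test"),
  stringsAsFactors = FALSE
)

# A column is a metadata candidate if it holds a single unique value
is_constant <- vapply(df, function(x) length(unique(x)) == 1, logical(1))

# Keep one copy of each constant value as a named metadata list
metadata <- lapply(df[is_constant], function(x) x[1])

# The remaining columns vary per observation and stay as data
data_only <- df[!is_constant]
```

Here `metadata` holds a single `station` entry, while `depth` and `salinity` remain in `data_only`.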

Value

an oce CTD object with all VPR data as well as metadata

Examples

data("category_conc_n")
metadata <- list('deploymentType' = 'towyo', 'waterDepth' =
max(ctd_roi_merge$pressure), 'serialNumber' = NA, 'latitudeStart' = 47,
'longitudeStart' = -65, 'castDate' = '2019-08-11', 'castStartTime'= '00:00',
'castEndTime' = '01:00', 'processedBy' = 'E. Chisholm', 'opticalSetting' =
'S2', 'imageVolume' = 83663, 'comment' = 'test data')

oce_dat <- vpr_save(category_conc_n, metadata)
# save(oce_dat, file = 'vpr_save.RData') # save data

Bin VPR size data

Description

Calculates statistics for VPR measurement data in depth averaged bins for analysis and visualization

Usage

vpr_size_bin(data_all, bin_mea)

Arguments

`data_all`	a VPR CTD and measurement dataframe from `vpr_ctdroisize_merge`
`bin_mea`	Numerical value giving the size of the depth bins over which data will be combined, in metres. Typical values range from 1 to 5

Value

a dataframe of binned VPR size data statistics including number of observations, median, interquartile ranges, salinity and pressure, useful for making boxplots
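
As a rough illustration of the statistics described above, the following base R sketch computes the number of observations, median, and quartiles per depth bin. It is not the implementation of `vpr_size_bin`: the data values and the column names `depth` and `long_axis_length` are assumptions made for the example.

```r
# Hypothetical measurements; values and column names are assumptions
size_df <- data.frame(
  depth = c(0.5, 1.5, 2.5, 6.0, 7.5, 9.0),
  long_axis_length = c(1.0, 2.0, 3.0, 4.0, 6.0, 8.0)
)
bin_mea <- 5

# Assign each observation to the top of its depth bin: [0, 5), [5, 10), ...
size_df$bin <- floor(size_df$depth / bin_mea) * bin_mea

# Per-bin statistics of the kind a boxplot needs
stats <- do.call(rbind, lapply(split(size_df, size_df$bin), function(d) {
  data.frame(
    bin_top = d$bin[1],
    n_obs = nrow(d),
    median = median(d$long_axis_length),
    q25 = unname(quantile(d$long_axis_length, 0.25)),
    q75 = unname(quantile(d$long_axis_length, 0.75))
  )
}))
```

Each row of `stats` summarizes one depth bin, which is the shape of input that boxplot-style displays typically expect.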

Examples

## Not run: 
data('size_df_f')
vpr_size_bin(size_df_f, bin_mea = 5)

## End(Not run)

Get size data from idsize files

Description

Useful for getting the size distribution of known ROIs from each category. Gathers size information from idsize text files produced when training a new classifier in VP (Visual Plankton).

Usage

vpr_trrois_size(directory, category, opticalSetting)

Arguments

`directory`	cruise directory, e.g. 'C:/data/IML2018051/'
`category`	list of character elements containing category of interest
`opticalSetting`	VPR optical setting determining conversion between pixels and millimetres (options are 'S0', 'S1', 'S2', or 'S3')

Package 'vprr'

Help Index

Get bin averages for VPR and CTD data

Description

Usage

Arguments

Details

Note

Author(s)

Bin vpr data

Description

Usage

Arguments

Details

Value

A binned data frame of concentration data per category

Description

Usage

Format

Binned concentrations

Description

Usage

Arguments

Details

Author(s)

Isolate ascending or descending section of ctd cast

Description

Usage

Arguments

Value

Note

Author(s)

VPR CTD data

Description

Usage

Format

Read CTD data (SBE49) from CTD-VPR package

Description

Usage

Arguments

Details

Author(s)

VPR CTD data combined with tabulated ROIs

Description

Usage

Format

VPR data including CTD and ROI information

Description

Usage

Format

INTERNAL USE ONLY quick data frame function from github to insert row inside data frame

Description

Usage

Arguments

Get vector to draw isopycnal lines on TS plot. Used internally to create TS plots

Description

Usage

Arguments

Note

Author(s)

Normalize a matrix

Description

Usage

Arguments

Details

Note

Packages

Description

Get conversion factor for pixels to mm for roi measurements

Description

Usage

Arguments

Details

Read aid files produced by automated classification

Description

Usage

Arguments

Value

VPR ROI data

Description