Introduction

What is wildRtrax?

wildRtrax, pronounced ‘wilder tracks’, is an R package for ecologists and advanced users who work with environmental sensor data, mainly from autonomous recordings units (ARUs). It contains functions designed to meet most needs in order to organize, analyze and standardize data to the WildTrax infrastructure. wildRtrax is self-contained and must be run under an R statistical environment, and it also depends on many other R packages. wildRtrax is free software and distributed under MIT License (c) 2020.

What is WildTrax?

WildTrax is a web-enabled portal designed to manage, store, process, and share this environmental sensor data and transform it into biological data. It was developed by the Alberta Biodiversity Monitoring Institute and the Bioacoustic Unit. wildRtrax serves as a parallel design and indicator to analytics and functions that can be developed in WildTrax.

What R packages does wildRtrax depend on?

wildRtrax depends on a multitude of packages to provide flexible routines and work flows for data management. tidyverse for piping functions, standard grammar and tidy data manipulation, furrr and doParallel for parallel computing, and acoustic analysis packages: bioacoustics, tuneR, seewave. Certain functions are indebted to the QUT Ecoacoustics Analysis Software software package as well.

How do I report a bug in wildRtrax?

If you think you have found a bug in wildRtrax, you should report it to developers or maintainers. Please do not send bug reports to R mailing lists, since wildRtrax is not a standard R package. The preferred forum to report bugs is GitHub. Here is what is required in order to report a bug - issues are welcome and encouraged and are the only way to make wildRtrax non-buggy:

  • The bug report should be so detailed that the bug can be replicated and corrected
  • Send an example that causes a bug
  • Send a minimal data set as well if it is not available in R
  • Paste the output or error message in your message
  • Specify which version of wildRtrax you used

Can I contribute to wildRtrax?

Yes! wildRtrax is dependent on user contribution and all feedback is welcome. If you have problems with wildRtrax, it may be as simple as incomplete documentation. Feature requests also are welcome, but they are not necessarily fulfilled. A new feature will be added if it is easy to do and it looks useful to the user base of the package, or if you submit fully annotated code.

See here for more information.

Disclaimers

The pronoun “you” throughout these articles refers to the reader. “We” refers to the wildRtrax team in general.

Audio data management

Autonomous recording units (ARUs) and remote camera traps collect data on the environment by means of capturing acoustic or visual data, respectively. ARUs are used to survey a variety of species such as birds, amphibians, and bats. But really anything that gives a vocalizing cue. Camera traps are best used to detect and monitor mammals. See abmi.camera.extras if you’re interested in getting estimates of animal density collected from remote camera traps. Both environmental sensors are designed to record sound or images autonomously for long periods of time. This is turn can accrue a large amount of data.

Audio file formats

There are two major audio file types used within the wildRtrax framework: wac and wav

  • wac, w4v are proprietary, lossless compressed file formats developed by Wildlife Acoustics
  • wav is the standard, ubiquitous uncompressed audio file format

There are also a couple other file types you might be working with:

  • mp3 a lossy compressed audio file format; works by reducing the accuracy of certain sound components, and eliminating others
  • flac a lossless compressed format useful for efficiently packing audio data
file <- "/volumes/GoogleDrive/Shared drives/wildRtrax/data/example/ABMI-0754-SW_20170301_085900.wav"

wave_t <- tuneR::readWave(file, header = T) #True header format

wave_f <- tuneR::readWave(file, header = F)

list(wave_t, wave_f)
#> [[1]]
#> [[1]]$sample.rate
#> [1] 44100
#> 
#> [[1]]$channels
#> [1] 2
#> 
#> [[1]]$bits
#> [1] 16
#> 
#> [[1]]$samples
#> [1] 26459904
#> 
#> 
#> [[2]]
#> 
#> Wave Object
#>  Number of Samples:      26459904
#>  Duration (seconds):     600
#>  Samplingrate (Hertz):   44100
#>  Channels (Mono/Stereo): Stereo
#>  PCM (integer format):   TRUE
#>  Bit (8/16/24/32/64):    16

You can access the objects in wave_t through a $ as you would a normal list. When header = FALSE and you are reading in the entire wav file, you can access slots of the S4 object using @.

sound_length_S4 <- round((wave_f@left / wave_f@samp.rate), 2)

#Is equivalent to:
sound_length_list <- wave_t$samples / wave_t$sample.rate

Spectrograms

A spectrogram is a visual representation of the spectrum of frequencies of an audio signal as it varies with time (Wikipedia). Spectrograms can be used to identify animal vocalizations by their unique spectral signature. Generally speaking, there are three pieces of information you can extract from a spectrogram:

  • Length of the vocalization via the x-axis in seconds
  • Frequency range of the vocalization via the y-axis in Hz
  • Relative amplitude of the vocalization via the z-axis in dB

Let’s create a spectrogram to get a better look at some of those audio files. Here’s one way to do it using ggspectro in seewave.

#Plot a spectrogram
v <- seewave::ggspectro(tuneR::readWave(file, from = 0, to = 60, units = "seconds"), ovlp = 50) + ggplot2::geom_tile(aes(fill=amplitude)) + theme_bw()

SoX is also a very powerful command line tool that can build spectrograms as well. Processing time here is much faster given R doesn’t have to read in the file as an S4 wave object. More on this later.

#Or try a bash command using SoX

#cd /path/to/file && for file in *.wav; do outfile="${file%.*}.png"; title_in_pic="${file%.*}"; sox "$file" -n spectrogram -l -m -t "$title_in_pic" -o "$outfile"; done

From the field to the office

Familiarity with processes, protocols, equipment and data is an important first step in understanding how to manage environmental sensor data. Check with your study design or monitoring plan to ensure that you are correctly managing your data prior to heading into the field. wildRtrax doesn’t focus on the field components of the data flows but is heavily dependent on it. Acoustic data has certain metadata dependencies that can be extracted from raw data. But robust field or visit metadata as it’s called in wildRtrax and WildTrax is important to support the quality control process of the incoming sensor data.

Metadata dependencies for acoustic data

The wildRtrax prefers that the file name string from which the data is deriving is composed of two parts: a spatial component and a temporal component. We call these fields the location and the recording_date_time of the audio respectively. The location, date and time should be critical pieces of information that should be collected and checked when you are visiting environmental sensors in the field.

Data volume, storage and computing power

Collecting lots of data with environmental sensors is easy. Are there ways you can reduce what you collect and have to process?

  • Are you interested in one or many species?
    • A community analysis would require a broad spectrum range to record in or analyze or with more data being collected to account for imperfect detection. Whereas with a single or multi-species approach, you may only need to look at a narrow frequency range in order to detect the species.
  • What frequency range does your species vocalize in?
    • If you know the frequency range your species vocalizes in, you may be able to change the sampling rate, apply a band-pass filter or compress the data
  • Data quality or data volume?
    • All methods above will inherently reduce data quality in favour of also reducing data volume. With more complex acoustic information comes larger files.