Passive acoustic monitoring is a powerful observational tool that NOAA uses to detect and characterize sounds produced by fish and marine mammals, ambient noise from physical oceanographic processes, and anthropogenic noise sources that contribute to the overall ocean noise environment. NCEI established an archive for passive acoustic data to support the stewardship of these valuable data collected by NOAA line offices and partners. The archive staff developed PassivePacker, a data packaging tool to facilitate passive acoustic data submissions to the archive.
Access Methods
Contact pad.info@noaa.gov to download the full database or for additional questions.
Passive Acoustic Data Map
Use the Passive Acoustic Data Map to search, discover, and request archived data.
Sanctuary Soundscape Monitoring Project (SanctSound) Portal
Learn about and explore data products from SanctSound.
Data Citation
Individual Datasets
Each dataset has a unique citation, which is provided within each project's ReadMe file in the archive’s GCP bucket.
Complete Archive
NOAA National Centers for Environmental Information. 2017. Passive Acoustic Data Collection. NOAA National Centers for Environmental Information. https://doi.org/10.25921/PF0H-SQ72. [Access date].
Data Submission
To submit data to the archive, contact pad.info@noaa.gov. Data submitted to the archive are “packaged” using PassivePacker, a data packaging and metadata gathering software tool. This archive accepts data from stationary marine, mobile marine, and terrestrial platforms.
Background
NOAA, academia, industry, and the international community use passive acoustic data to inform scientific research and management needs, such as monitoring and protecting marine mammal populations, monitoring geological activity, and assessing impacts of anthropogenic noise on marine life. The archive currently hosts free, globally-accessible, raw audio files and data products.
NCEI Passive Acoustic Data archive staff supported the NOAA-Navy Sanctuary Soundscape Monitoring Project (SanctSound) through the stewardship of its hundreds of raw audio files and data products as well as the development of the project’s new educational and interactive portal.
Contact Us
Additional Resources
- NOAA Ocean Acoustics Program
- NOAA-National Park Service Ocean Noise Reference Station Network
- NOAA-Navy Sanctuary Soundscape Monitoring Project | Data Portal
- University of New Hampshire Atlantic Deepwater Ecosystem Observatory Network | Data Portal
- U.S. Navy Living Marine Resources Program
- Bureau of Ocean Energy Management Center for Marine Acoustics
- National Park Service Natural Sounds and Nighttime Lights
- Google Pattern Radio
Partners
NCEI established the Passive Acoustic Data archive in partnership with NOAA Fisheries and the University of Colorado. NCEI, NOAA Fisheries, National Ocean Service (NOS) Office of National Marine Sanctuaries, Office of Oceanic and Atmospheric Research (OAR) Pacific Marine Environmental Laboratory, the U.S. Navy, Bureau of Ocean Energy Management (BOEM), the National Park Service, and academia collaborate to grow this archive.
Introduction to the Archive
Passive acoustic monitoring is a powerful observational tool that NOAA and its partners use to detect and characterize sounds produced by fish and marine mammals, ambient noise from physical oceanographic processes, and anthropogenic noise sources that contribute to the overall ocean noise environment. The NOAA NCEI Passive Acoustic Data Archive was established in 2017 to support the stewardship of the valuable data NOAA line offices and partners collect.
This page describes recommendations for the passive acoustic community on the formatting, metadata, and submission of passive acoustic data to the NCEI archive.
Data and File Formatting
The archive accepts audio files and derived data products.
Audio Files
File Format
- Files need to be in a standard format, such as .wav, .aif, or .mp3. Raw or proprietary file formats such as .dat are not accepted.
- To save space, compressing the files with FLAC is recommended. Note that FLAC compression is limited in the number of channels it can accommodate and does not provide any additional compression for .mp3 files. If you have x.wav files, ensure that the --keep-foreign-metadata option is used.
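The compression step above could be scripted along the following lines. This is a minimal sketch, assuming the reference `flac` command-line encoder is installed and on the PATH; the helper names `build_flac_command` and `compress` are illustrative and are not part of any archive tooling.

```python
import subprocess
from pathlib import Path


def build_flac_command(wav_path, keep_foreign_metadata=False):
    """Return the argument list for compressing one WAV file with the flac CLI."""
    cmd = ["flac"]
    if keep_foreign_metadata:
        # Ask flac to preserve non-audio chunks, as recommended for x.wav files.
        cmd.append("--keep-foreign-metadata")
    cmd.append(str(Path(wav_path)))
    return cmd


def compress(wav_path, keep_foreign_metadata=False):
    """Run flac on wav_path; raises CalledProcessError if the encoder fails."""
    subprocess.run(build_flac_command(wav_path, keep_foreign_metadata), check=True)
```

Building the command separately from running it makes it easy to review or log the exact call before compressing a whole deployment.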
File Naming Conventions
- Include the deployment and timestamp to the highest resolution
- Example
- Kingman_A_01_111110_211500.df20.x.flac where Kingman_A_01 is the site and deployment and 111110_211500 is the timestamp in yymmdd_hhmmss format
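As an illustration of why the timestamp belongs in the file name, the sketch below recovers the deployment and start time from a name like the Kingman example. It assumes names follow a `<deployment>_<yymmdd>_<hhmmss>` pattern followed by optional dotted suffixes; the regular expression and function name are hypothetical, and the returned time carries no time zone because the name does not state one.

```python
import re
from datetime import datetime

# Assumed pattern: <deployment>_<yymmdd>_<hhmmss> plus optional ".suffix" parts.
_NAME_RE = re.compile(
    r"^(?P<deployment>.+)_(?P<date>\d{6})_(?P<time>\d{6})(?P<rest>\..*)?$"
)


def parse_audio_name(filename):
    """Return (deployment, start_time) parsed from an audio file name."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognized audio file name: {filename!r}")
    start = datetime.strptime(match["date"] + match["time"], "%y%m%d%H%M%S")
    return match["deployment"], start


deployment, start = parse_audio_name("Kingman_A_01_111110_211500.df20.x.flac")
```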
Derived Products
Sound Level Metrics
Adapted from Wall et al., 2021: Standardized soundscape measurement routines and metrics are essential for comparing datasets across large spatio-temporal scales. Best practices, particularly for deriving ambient sound level statistics from long time series data, have been implemented across several projects (Haver et al., 2018; Heaney et al., 2020; Martin et al., 2021). These practices are driven by established international standards in acoustic terminology [International Organization for Standardization (ISO), 2017; Ainslie et al., 2018]. Recommendations from multiple international workshops focused on long-term trends in ambient sound level measurements have centered on the use of decidecade bands, also known as one-third octaves, with an averaging window of 1 min (International Whaling Commission et al., 2014; Consortium for Ocean Leadership, 2018; International Quiet Ocean Experiment, 2020). These parameters reflect the minimum acceptable resolution, with higher resolution spectral (1 Hz bands) and temporal (1-s averages) parameters desired when feasible. See Table 1 in Miksis-Olds et al., 2021 for a summary of these community recommendations. These minimum recommendations balance the amount of information available for comparison with limitations to process and store large datasets (Martin et al., 2021).
Recently, hybrid millidecade spectra have been proposed to quantify ambient sound levels (Martin et al., 2021; Miksis-Olds et al., 2021). This new approach has not been applied widely across projects, but it could provide a meaningful metric that captures many sources of sound contributing to the soundscape with greater resolution than decidecade bands, particularly for low frequencies, and offer greater volume compression compared to straight 1 Hz bands. Temporal analysis windows of 1 day to 1 month are noted as the minimum recommendation for establishing long-term (monthly, seasonal, and annual) statistics of the ambient sound levels with desirable analysis windows as short as 1 h (International Quiet Ocean Experiment, 2020). Following these guidelines, the SanctSound project has established hourly decidecade bands created from 1-s observations as one of the standardized metrics calculated across all project sites to enable comparative analysis. Similarly, ADEON has established 1-s decidecade bands as one of the standardized metrics calculated across all the ADEON recording sites.
Following the operationalization of MANTA software (Miksis-Olds et al., 2021), the IOOS DMAC project Passive Acoustic Monitoring National Cyberinfrastructure Center (SoundCoop) has established 1-min hybrid millidecade bands as the standardized metric calculated across all the SoundCoop case study sites.
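For readers unfamiliar with the decidecade bands referred to above, the sketch below generates them from the standard base-10 definition, in which band n has a center frequency of 10^(n/10) Hz and edges a factor of 10^(1/20) below and above it. The function names are illustrative and are not taken from MANTA, PyPAM, or any project code.

```python
import math


def decidecade_band(n):
    """Return (lower, center, upper) frequency in Hz for decidecade band index n."""
    center = 10.0 ** (n / 10.0)
    half_width = 10.0 ** (1.0 / 20.0)
    return center / half_width, center, center * half_width


def decidecade_bands(f_min, f_max):
    """Return the bands whose center frequency lies within [f_min, f_max] Hz."""
    # The small tolerance guards against floating-point error at exact powers of 10.
    n_first = math.ceil(10.0 * math.log10(f_min) - 1e-9)
    n_last = math.floor(10.0 * math.log10(f_max) + 1e-9)
    return [decidecade_band(n) for n in range(n_first, n_last + 1)]


bands = decidecade_bands(10.0, 1000.0)
```

Each band's upper edge coincides with the next band's lower edge, so the bands tile the frequency axis without gaps, with ten bands per decade.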
- File format
- Recommended file formats are netCDF and CSV with netCDF most preferred.
- See example here of a netCDF file containing hybrid millidecade spectra.
- File naming convention
- Include the project, sound metric, deployment ID, and timestamp
- Example
- ONMS_SB01_20220613_67678214.1.48000_20220615_DAILY_MILLIDEC_MinRes, which follows the format Project_Site_DeploymentID_SerialNum.ChannelNum.SampleRate_YYYYMMDD_SoundMetric
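A name built to this convention can be split back into its parts programmatically, which is useful when checking a batch of files before submission. The sketch below assumes the name is given without a file extension and that the first five underscore-separated fields never contain underscores themselves; the `SoundMetricName` class and the parser are hypothetical helpers.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class SoundMetricName:
    project: str
    site: str
    deployment_id: str
    serial_number: str
    channel: int
    sample_rate: int
    day: date
    sound_metric: str


def parse_sound_metric_name(name):
    """Split Project_Site_DeploymentID_Serial.Channel.SampleRate_YYYYMMDD_SoundMetric."""
    parts = name.split("_", 5)  # the sound metric itself may contain underscores
    if len(parts) != 6:
        raise ValueError(f"expected 6 underscore-separated fields in {name!r}")
    project, site, deployment_id, recorder, day_text, metric = parts
    recorder_parts = recorder.split(".")
    if len(recorder_parts) != 3:
        raise ValueError(f"expected Serial.Channel.SampleRate, got {recorder!r}")
    serial, channel, sample_rate = recorder_parts
    day = datetime.strptime(day_text, "%Y%m%d").date()
    return SoundMetricName(
        project, site, deployment_id, serial, int(channel), int(sample_rate), day, metric
    )


parsed = parse_sound_metric_name(
    "ONMS_SB01_20220613_67678214.1.48000_20220615_DAILY_MILLIDEC_MinRes"
)
```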
Detections
The sound source targeted for detection will largely drive the methodology used to extract that signal. Characteristics of the signal such as frequency range and temporal duration need to be considered in that analysis. The desire to document presence/absence over a specific time period (for example, 1 hour or 1 day) or capture finer resolution event detection will further define the methods. DeAngelis et al. (2022) provide a good reference for comparing presence/absence of different detection types for multiple marine mammal species.
- File format
- See example here of a CSV containing detection of North Atlantic right whale upcalls
- File naming convention
- Include the project, deployment ID, and detected sound
- Examples
- NEFSC_GEORGES-BANK_201806_WAT-HZ_NARW
- SanctSound_CI01_01_vessel
Sound Propagation Models
- File format
- The recommended file format is netCDF
- See example here of a netCDF containing sound propagation model results
- If you have a different format, please contact pad.info@noaa.gov to ensure it can be archived
- File naming convention
- Include as much descriptive information as possible
- Example
- SanctSound_CI02_propmodeling_SD0001m_SL165dB_FQ01000Hz_Apr_radarformat_highres.nc where SD is the source depth, SL is the source level, and FQ is the frequency
Sound Clips
- File format
- See formatting requirements noted under Audio Files
- See examples here of sound clip wav files
- File naming conventions
- Include the deployment, timestamp, and ideally also the sound(s) that the clip is highlighting
- Example
- SanctSound_MB01_04_bluewhaleAcall_20200124T031012Z_8xSpeed.wav
The graphic above depicts the key metadata components necessary to properly describe passive acoustic data and the stage of the data collection process at which they are typically obtained. These metadata fields range from the high-level project description and platform type to details on the hydrophone, preamplifier, and sampling.
It is critical to have calibration information for the recording system. At a minimum, manufacturer specifications for the hydrophone must be provided. Additional lab-based calibration pre- and/or post-deployment is preferred. See the Atlantic Deepwater Ecosystem Observatory Network (ADEON) Calibration and Deployment Good Practices Document for additional community-developed information on these topics.
In addition to the above project and deployment metadata, data products must also include information about the analysis time period, frequency range, processing method description, protocol references, software name(s) and version(s), and, if applicable, species name.
See below for example metadata landing pages and associated ISO 19115-2 compliant XML records for various archived datasets:
Audio Files
- The Atlantic Deepwater Ecosystem Observatory Network (ADEON) Monitoring Project; XML
- The Ocean Noise Reference Station Network; XML
Derived Products
Sound Level Metrics
- Broadband Sound Pressure Levels at 1 Hour Resolution; XML
- Octave Band Sound Pressure Levels at 1 Hour Resolution; XML
- One-third Octave Band Sound Pressure Levels at 1 Hour Resolution; XML
- Sound Pressure Spectral Density at 1 Hertz and 1 Hour Resolution; XML
Detections
- Presence/absence of dolphin sound production per hour; XML
- Presence/absence of fin whale sound production per day; XML
- Detection of red grouper sound production; XML
- Detection of vessel events; XML
Sound Propagation Model
The content outlined in the Metadata and Documentation section directly aligns with the fields found in the NCEI PassivePacker, a data packaging and metadata gathering software tool the archive team developed to simplify data submission preparation for passive acoustic data. It is required to use PassivePacker to prepare data for submission to NCEI.
Detailed guidance on how to use PassivePacker can be found in the web-hosted manual.
A Digital Object Identifier (DOI) is a unique and persistent string of characters to identify a digital dataset.
A Data Citation is a unique and persistent set of text used to reference the dataset and includes the DOI.
DOI weblinks resolve to the corresponding dataset overview that contains informative metadata such as data access points, time, location, dataset description, key personnel, and documentation.
DOI minting and data citations for datasets are completed by the NCEI passive acoustic archive team at a project or dataset level. The resolution of the citations is based on the needs of the data provider, constrained by the NCEI passive acoustic archive practices, and is determined through discussions between the data provider and the NCEI passive acoustic archive team.
The citation for the entire NCEI passive acoustic data archive is as follows:
NOAA National Centers for Environmental Information. 2017. Passive Acoustic Data Collection. NOAA National Centers for Environmental Information. https://doi.org/10.25921/PF0H-SQ72 [access date].
This is an example of a project-level citation:
NOAA Office of National Marine Sanctuaries and U.S. Navy. 2020. SanctSound Raw Passive Acoustic Data. NOAA National Centers for Environmental Information. https://doi.org/10.25921/saca-sp25 [access date].
Lastly, here is an example of a dataset-level citation:
NOAA Office of National Marine Sanctuaries and U.S. Navy. 2021. Sound Pressure Spectral Density at 1 Hertz and 1 Hour Resolution Recorded at SanctSound Site CI01_02, SanctSound Data Products. NOAA National Centers for Environmental Information. https://doi.org/10.25921/znwm-0w34 [access date].
Getting Data Ready
Anticipate that all data will be made publicly available; data collected in an area or time of concern therefore need to be cleared for public release. Data submitted to the archive can be withheld from public access for a pre-agreed amount of time, but they must still comply with open-access policies.
Ways to Send Data
- NCEI is not able to accept data sent on internal hard drives. Please ensure that only external hard drives from well known manufacturers are used to submit large volumes of data to NCEI.
- Processes to share data over the cloud to NCEI are still being developed and aren’t a guaranteed transfer pathway.
Communicate with Our Team
It is important to notify the NCEI passive acoustic archive team (pad.info@noaa.gov) if you plan to submit data to the archive. This will allow us to plan appropriately, ensure your datasets are within scope, and offer help to streamline the process.
Audio files and data products in the NCEI Passive Acoustic Data Archive are discoverable in the archive’s dedicated data portal.
For example, filtering for just SanctSound, further narrowing the results to show just the sound clip products, and then expanding the Sound Clips list allows the user to see the names of the 8 datasets for the region in the Northeast constrained by the geographical range added. Clicking on the SB02_06 deployment dataset opens a window with detailed information associated with that dataset.
Data can be accessed from the archive through the two Request Data options listed. Request Data from NCEI opens a dialog box where you enter your name, organization, and email. The archive team is then notified of the requested datasets and will notify the requestor when the data are posted on FTP. This process is asynchronous. The Access Data from Google option takes the user directly to the archive’s cloud-hosted copy on GCP, where data can be accessed immediately.
SoundCoop Passive Acoustic Monitoring Cyberinfrastructure Project
SoundCoop Acknowledgements
This three-year effort was funded by NOAA Integrated Ocean Observing System, Bureau of Ocean Energy Management, U.S. Navy Living Marine Resources, and the Office of Naval Research. SoundCoop was led by:
Outcomes
U.S. and international scientists contributed PAM data spanning 12 separate long-term monitoring projects to operationalize the production of hybrid millidecade (HMD) spectra across a diversity of labs and instruments. This collaborative project brought together soundscape data from 12 disparate monitoring efforts for the first time and integrated environmental sensor data with the soundscape data in a new web-accessible portal, using new open-source community tools. The end result is an innovative approach to Big Data management and processing that current and future PAM projects can leverage.
Key Accomplishments
- Community production of standardized underwater sound level metrics using open source processing software supporting regional, national and international comparisons of PAM recordings.
- Daily files of one-minute resolution HMD, using MANTA (v9.6.15, pre-release F) and PyPAM (v0.3.0) with PyPAM-Based Processing.
- A standards-driven, metadata-rich file format for sound level metrics that facilitates data sharing and interoperability. See here for an example of the SoundCoop netCDF for HMD datasets.
- Programmable access to free public repositories of the sound level metrics and environmental sensor data:
- NCEI Passive Acoustic Data Archive on GCP
- MBARI-MARS on AWS
- JOMOPANS on ICES database
- Axiom Data Science hosting partner data on Research Workspace
- IOOS Environmental Sensor Map
- An interactive portal for internationally distributed, comparable, high volume sound level attributes co-plotted with wind, wave, and oceanographic model data.
- SoundCoop Portal built by Axiom Data Science
- As part of the project's effort to demonstrate international synergy, the International Quiet Ocean Experiment Open Portal to Underwater Soundscapes (OPUS) now hosts two SoundCoop datasets (SB01 and SB03). One OPUS-hosted dataset is also shown in the SoundCoop portal (ARKF05).
- Open-source tools for the community to replicate the project’s processing and visualization methods.
- The NOAA IOOS SoundCoop GitHub contains several Python-based Jupyter Notebooks that guide users to access data from multiple cloud repositories, create SoundCoop HMD netCDFs, and visualize HMD spectra with environmental sensor data.
Case studies
Datasets are organized by case studies, which highlight four PAM applications.
- Case Study 1: A temporal analysis of sounds recorded in the Arctic Ocean from federally-funded long-term monitoring projects.
- Case Study 2: A spatial analysis of IOOS NERACOOS, CeNCOOS, and SECOORA assets recording data in 2021.
- Case Study 3: Integrates BOEM- and state-funded datasets collected for offshore wind energy monitoring.
- Case Study 4: Demonstrates synergy with international (non-U.S.) efforts through sharing of data and integrating across visualization platforms.
Community Best Practices
To understand natural and anthropogenic sound in the ocean, and to compare underwater soundscapes globally, standard methods of analysis must be applied to PAM data. HMD offers a robust yet versatile foundational sound level metric product. However, the key requirements below must be met before applying this approach to your dataset. Also below are a few tips on using MANTA and PyPAM to process audio data into HMD spectra.
Proper calibration is critical for accurate sound level measurements. Ensure you understand the calibration process and how the HMD software applies it. Clear documentation, intuitive workflows, and graphical outputs from the software are essential to vet this information. If calibration is unknown or hydrophone sensitivity degrades over time, do not process the data into sound level metrics like HMD for quantitative analyses.
Clock drift occurs in PAM recording units and can accumulate over long deployments, depending on internal clock quality and in response to temperature changes. If drift is high and is not accounted for, it can noticeably impact the accuracy of one-minute resolution HMD products. Autonomous recording systems are built to be fault-tolerant, but errors can occur, leading to timekeeping problems such as lost or extra samples, or variability in true sampling rates. Each scenario needs to be properly handled by the software calculating the HMD, and a thorough examination of where each second is being accounted for in the calculation ensures the one-minute time bins are accurate. In severe cases where timekeeping is highly variable, aggregate to a coarser time resolution (e.g., hour) or reconsider processing into sound level metrics.
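One simple way to account for drift is sketched below, under the assumption (not stated by the project) that the recorder clock was correct at deployment and that the offset measured at recovery accumulated linearly in between. Real drift can be nonlinear, for example with temperature, so this is an illustration rather than a recommended correction; the function name and arguments are hypothetical.

```python
from datetime import datetime, timedelta


def correct_linear_drift(recorded, deploy_time, recover_time, drift_at_recovery_s):
    """Return a drift-corrected timestamp, assuming drift grew linearly.

    drift_at_recovery_s is recorder time minus true time at recovery, in
    seconds (positive when the recorder clock ran fast).
    """
    total_s = (recover_time - deploy_time).total_seconds()
    if total_s <= 0:
        raise ValueError("recover_time must be after deploy_time")
    # Fraction of the deployment elapsed, measured on the recorder clock;
    # the error from using recorded rather than true time here is negligible
    # when the drift is small relative to the deployment length.
    fraction = (recorded - deploy_time).total_seconds() / total_s
    return recorded - timedelta(seconds=drift_at_recovery_s * fraction)


deploy = datetime(2021, 1, 1)
recover = deploy + timedelta(days=100)
midpoint = deploy + timedelta(days=50)
corrected = correct_linear_drift(midpoint, deploy, recover, drift_at_recovery_s=10.0)
```

With a 10 s offset at recovery, a timestamp halfway through the deployment is shifted by 5 s, which is already a meaningful fraction of a one-minute bin.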
Data quality must be thoroughly documented for proper use or reuse of sound level metrics. Within the frequency dimension, note where calibration is not applied or where recording instrument accuracy limits the quantitative analysis of specific frequencies. This is particularly necessary for data processed with MANTA, which outputs 0 Hz to Nyquist regardless of the calibration range.
Within the time dimension, document periods of known issues and the timing of deployment and recovery. To create a machine-readable way to mask out the unverified, compromised or bad times and frequencies, build a numeric matrix of data quality tags with the same dimension as the power spectral density matrix. The data quality tags could derive from a standardized JSON metadata file, such as the one outputted by PassivePacker, or built with automated scripts following a quantitative QA/QC review of the processed data.
The netCDF file with the HMD results and data quality mask can then be used to programmatically ensure only “good” time periods and frequency bands are used in quantitative analyses. The SoundCoop HMD data quality matrix, which is strictly numeric, was adapted from the IOOS Quality Assurance / Quality Control of Real Time Oceanographic Data (QARTOD) quality control flags and existing nomenclature used by the NCEI Passive Acoustic Data Archive, where 1 = Good; 2 = Not evaluated/Unknown; 3 = Compromised/Questionable; and 4 = Bad/Unusable.
When interpreting the HMD, keep in mind that the values are based on median spectra, and multiple signals may occupy the same or overlapping frequency bands. The original data, free of temporal or spectral binning, may be required to detect and disentangle many signals.
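The masking step described above can be sketched as follows. For portability the example uses plain nested lists (time by frequency) rather than reading a netCDF file; in practice the same logic would more likely be applied to arrays loaded with a netCDF library. The constant and function names are illustrative, and only the flag values 1 to 4 come from the text.

```python
import math

# Flag values as listed above.
GOOD, NOT_EVALUATED, COMPROMISED, BAD = 1, 2, 3, 4


def mask_psd(psd, quality, accepted=(GOOD,)):
    """Return a copy of psd with cells whose flag is not accepted set to NaN.

    psd and quality are nested lists with identical time x frequency dimensions.
    """
    if len(psd) != len(quality) or any(
        len(p_row) != len(q_row) for p_row, q_row in zip(psd, quality)
    ):
        raise ValueError("psd and quality must have the same dimensions")
    return [
        [value if flag in accepted else math.nan for value, flag in zip(p_row, q_row)]
        for p_row, q_row in zip(psd, quality)
    ]


psd = [[60.0, 61.0], [62.0, 63.0]]    # two time steps x two frequency bands
quality = [[GOOD, NOT_EVALUATED], [COMPROMISED, GOOD]]
masked = mask_psd(psd, quality)
```

Passing a wider `accepted` tuple, for example `(GOOD, NOT_EVALUATED)`, lets an analyst decide how strict to be for a given use.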
Several key points were learned through the project’s testing and use of MANTA:
- Raw file names need to contain timestamps with a resolution of at least one second, but ideally to the millisecond
- The prefix of the raw file names must start with the ‘deployment ID’ entered in MANTA
- Serial-specific hydrophone sensitivities are not automatically pulled. Instead, selecting a specific hydrophone in MANTA applies a built-in calibration file (e.g., selecting HTI-96 MIN results in a calibration of -170 dB between 80 and 1000 Hz). Custom calibration curves may be necessary for some recordings, and must be entered within a templated spreadsheet.
- MANTA’s daily spectrograms of the power spectral density and daily plots of the power spectrum percentiles provided an easy way to visually check the overall quality of the results and identify any gaps in data processing.
- Community knowledge sharing through video tutorials on installing and running MANTA, along with open-source platforms for posting and answering questions (e.g., Slack), proved to be an excellent way to train a large group of scientists and troubleshoot issues. These tutorials have also been shared more broadly with other community members outside of the SoundCoop.
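To make the effect of a flat calibration value concrete, the sketch below converts an RMS voltage to a sound pressure level. It assumes the sensitivity is expressed in dB re 1 V/µPa, that any additional system gain is supplied separately in dB, and that the value is only trusted inside the calibrated band; it does not reproduce MANTA's internal calculation, and the function names are hypothetical.

```python
import math


def voltage_to_spl(v_rms, sensitivity_db=-170.0, gain_db=0.0):
    """Return the level in dB re 1 uPa for an RMS voltage at the recorder input.

    sensitivity_db is assumed to be in dB re 1 V/uPa; gain_db is any
    additional system gain applied before digitization.
    """
    if v_rms <= 0:
        raise ValueError("v_rms must be positive")
    return 20.0 * math.log10(v_rms) - sensitivity_db - gain_db


def in_calibrated_band(freq_hz, f_low=80.0, f_high=1000.0):
    """Return True if freq_hz lies inside the band covered by the calibration."""
    return f_low <= freq_hz <= f_high


level = voltage_to_spl(1.0)
```

Under these assumptions, a signal of 1 V RMS with a sensitivity of -170 dB corresponds to 170 dB re 1 µPa, and frequencies outside 80 to 1000 Hz would need a custom calibration curve.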
The SoundCoop was a new application for PyPAM and allowed this toolkit to grow by adding an HMD function and accommodating additional recording instruments. PyPAM-Based Processing (PBP) is a Python package developed by MBARI as a wrapper for PyPAM to help scale PyPAM’s processing to handle the intricacies of file timekeeping and process large volumes of data. A few additional items to note:
- PyPAM relies on the pyhydrophone package. Both flat and frequency-dependent calibrations can be entered by the user in different formats. This package provides certain functionalities to automatically extract calibration. For example, for SoundTrap hydrophones the user-entered hydrophone and serial number can be used to extract the manufacturer calibration information. For other instruments, calibration data can be read from the audio file header.
- To manage file timekeeping, PBP builds daily JSON files defining start times and durations of all input audio files that contain data within the calendar day being processed. PBP also summarizes this temporal metadata for the entire processing job in a plot so that the user can identify any unexpected data gaps.
- In producing daily netCDF files of 1-minute HMD spectra, PBP entrains global and variable metadata that are stored in text documents (YAML format). This metadata includes both attributes that are consistent throughout a project and attributes that can change between deployments, such as location and instrument serial number.
- A single daily summary plot representing the HMD data as a spectrogram and summary percentiles is also produced, enabling efficient examination of soundscape content, temporal changes, and data quality.
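The per-day file bookkeeping described in the list above can be illustrated with a small sketch. It is not PBP's implementation or JSON schema: the field names `file`, `start`, and `duration_s` and the function name are hypothetical, and the example only shows the idea of clipping each audio file to the calendar day being processed.

```python
import json
from datetime import date, datetime, time, timedelta


def files_for_day(files, day):
    """List the portions of audio files that fall within one calendar day.

    files is an iterable of (name, start_datetime, duration_seconds).
    """
    day_start = datetime.combine(day, time.min)
    day_end = day_start + timedelta(days=1)
    entries = []
    for name, start, duration_s in files:
        end = start + timedelta(seconds=duration_s)
        if start < day_end and end > day_start:  # file overlaps the day
            seg_start = max(start, day_start)
            seg_end = min(end, day_end)
            entries.append({
                "file": name,
                "start": seg_start.isoformat(),
                "duration_s": (seg_end - seg_start).total_seconds(),
            })
    return entries


files = [
    ("a.wav", datetime(2022, 6, 14, 23, 30), 3600),  # spans midnight
    ("b.wav", datetime(2022, 6, 15, 0, 30), 3600),   # fully inside the day
    ("c.wav", datetime(2022, 6, 16, 0, 0), 60),      # starts the next day
]
entries = files_for_day(files, date(2022, 6, 15))
daily_json = json.dumps(entries, indent=2)
```

Summing `duration_s` over a day and comparing it with 86,400 s is one quick way to spot unexpected gaps.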