Mining Seismic Wavefields (Stanford)

During my Ph.D., I worked with Greg Beroza in the Department of Geophysics on applications of data mining and machine learning techniques in earthquake seismology.

Ph.D. thesis: Big data for small earthquakes: Detecting earthquakes over a seismic network with waveform similarity search.

FAST Earthquake Detector

Fingerprint and Similarity Thresholding (FAST), introduced in Yoon et al. (2015), is a novel method for large-scale earthquake detection. FAST draws on techniques used by content-based audio recognition systems (like the Shazam app, or Google's Waveprint algorithm), and adapts them to the unique characteristics of seismic waveform data.

FAST is an unsupervised detector: it does not require any examples of known event waveforms or waveform characteristics for detection. This allows FAST to discover new earthquake sources, even when template waveforms (training data) are not available.
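To give a rough feel for template-free detection by waveform similarity, here is a toy sketch in Python. It is not the FAST implementation: the fingerprint (the strongest FFT bins of each window), the Jaccard comparison, the brute-force all-pairs loop, and every name and parameter (fingerprint, jaccard, most_similar_pair, win, step, min_gap, keep) are assumptions made for illustration only. The "event" is a made-up waveform hidden twice in synthetic noise.

```python
import numpy as np

def fingerprint(window, keep=16):
    """Toy binary fingerprint: flag the `keep` strongest FFT bins of a window."""
    spectrum = np.abs(np.fft.rfft(window * np.hanning(len(window))))
    bits = np.zeros(spectrum.size, dtype=bool)
    bits[np.argsort(spectrum)[-keep:]] = True
    return bits

def jaccard(a, b):
    """Jaccard similarity of two boolean fingerprints."""
    union = np.count_nonzero(a | b)
    return np.count_nonzero(a & b) / union if union else 0.0

def most_similar_pair(trace, win=200, step=50, min_gap=400):
    """Return (start_i, start_j, similarity) for the best-matching window pair."""
    starts = list(range(0, len(trace) - win + 1, step))
    prints = [fingerprint(trace[s:s + win]) for s in starts]
    best = (None, None, -1.0)
    for i in range(len(starts)):
        for j in range(i + 1, len(starts)):
            # Skip pairs that are close in time, so one event cannot match itself.
            if starts[j] - starts[i] < min_gap:
                continue
            sim = jaccard(prints[i], prints[j])
            if sim > best[2]:
                best = (starts[i], starts[j], sim)
    return best

# Synthetic data (hypothetical): weak noise plus one waveform repeated twice.
rng = np.random.default_rng(0)
trace = 0.05 * rng.standard_normal(4000)
event = rng.standard_normal(200) * np.hanning(200)
trace[1000:1200] += event
trace[3000:3200] += event

best = most_similar_pair(trace)
print(best)  # start samples of the two best-matching windows, and their similarity
```

No template is supplied here: the repeated waveform is found only because two stretches of the record resemble each other. The all-pairs loop is workable only at toy scale, so a large-scale detector would need an indexing scheme instead of comparing every pair of windows.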

FAST was developed as part of a multidisciplinary collaboration at Stanford, involving researchers from the Department of Geophysics, the Institute for Computational and Mathematical Engineering (ICME), and the Department of Computer Science.

The FAST code is available on GitHub!




Seismic data sets for Machine Learning

  • STanford EArthquake Dataset (STEAD): A Global Data Set of Seismic Signals for AI. Compiled by Mousavi et al. (2019) [ref].

  • SCEDC Deep Learning Datasets. Compiled by Ross et al. (2018) [ref] [ref].

  • INSTANCE: The Italian Seismic Dataset for Machine Learning. Compiled by Michelini et al. (2021) [ref].

  • LEN-DB: Local earthquakes detection: A benchmark dataset of 3-component seismograms built on a global scale. Compiled by Magrini et al. (2020) [ref].

  • LANL Earthquake Prediction Dataset (hosted by Kaggle). Reference: Johnson et al. (2021) [ref].