Mining Seismic Wavefields (Stanford)
During my Ph.D., I worked with Greg Beroza in the Department of Geophysics on applications of data mining and machine learning techniques in earthquake seismology.
FAST Earthquake Detector
Fingerprint and Similarity Thresholding (FAST), introduced in Yoon et al. (2015), is a novel method for large-scale earthquake detection. FAST draws on techniques used by content-based audio recognition systems (like the Shazam app, or Google's Waveprint algorithm) and adapts them to the unique characteristics of seismic waveform data.
FAST is an unsupervised detector -- it does not require any examples of known event waveforms or waveform characteristics for detection. This allows FAST to discover new earthquake sources, even if template waveforms (training data) are not available.
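As a rough illustration of the idea (not the FAST implementation itself), the Python sketch below turns each window of a synthetic trace into a compact binary fingerprint and then reports window pairs whose fingerprints are similar, without using any template waveform. Everything in it is an assumption made for illustration: the function names `fingerprints` and `similar_pairs`, the spectral-peak fingerprint, the window length, the `keep` and `threshold` values, and the synthetic data. It also compares all pairs by brute force, which is simpler than locality-sensitive hashing but does not scale to long records.

```python
import numpy as np

def fingerprints(signal, win=64, step=32, keep=8):
    """Binary fingerprint per window: mark the `keep` strongest spectral bins."""
    starts = np.arange(0, len(signal) - win + 1, step)
    taper = np.hanning(win)
    prints = np.zeros((len(starts), win // 2 + 1), dtype=bool)
    for i, s in enumerate(starts):
        spectrum = np.abs(np.fft.rfft(signal[s:s + win] * taper))
        prints[i, np.argsort(spectrum)[-keep:]] = True
    return starts, prints

def similar_pairs(starts, prints, threshold=0.9, min_gap=64):
    """Return (start_i, start_j, jaccard) for window pairs at or above threshold."""
    p = prints.astype(np.int64)
    inter = p @ p.T                                  # shared "on" bits per pair
    counts = p.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    jaccard = inter / union
    pairs = []
    for i in range(len(starts)):
        for j in range(i + 1, len(starts)):
            # skip overlapping windows, which are trivially similar
            if starts[j] - starts[i] >= min_gap and jaccard[i, j] >= threshold:
                pairs.append((int(starts[i]), int(starts[j]), float(jaccard[i, j])))
    return pairs

# Hypothetical data: Gaussian noise with the same wavelet added at two onsets.
rng = np.random.default_rng(0)
trace = rng.normal(0.0, 1.0, 1024)
wavelet = 20.0 * np.sin(2 * np.pi * 0.1 * np.arange(64)) * np.hanning(64)
for onset in (128, 512):
    trace[onset:onset + 64] += wavelet

starts, prints = fingerprints(trace)
for a, b, score in similar_pairs(starts, prints, threshold=0.6):
    print(f"windows at samples {a} and {b}: Jaccard similarity {score:.2f}")
```

The similarity threshold trades missed detections against false matches between noise windows, so in practice it would need tuning for the data at hand.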
FAST was developed as part of a multidisciplinary collaboration at Stanford, involving researchers from the Department of Geophysics, the Institute for Computational and Mathematical Engineering (ICME), and the Department of Computer Science.
Implementation and Performance: Locality-Sensitive Hashing for Earthquake Detection: A Case Study of Scaling Data-Driven Science (also see Stanford DAWN blog)
FAST at scale (Diablo Canyon case study): Unsupervised Large-Scale Search for Similar Earthquake Signals
Other FAST applications:
Broader context: Machine learning for data-driven discovery in solid Earth geoscience
Stanford Scientists develop "Shazam for Earthquakes" (Dec 4, 2015)
Earthquake monitoring in the age of "big data:" challenges and opportunities (video), UTIG Seminar, Jackson School of Geosciences, UT Austin (Sept 2019)
Big data for small earthquakes: a data mining approach to earthquake detection (video), FISH Seminar, Earth Resources Laboratory, MIT (October 2018)
Earthquake Detection Through Computationally Efficient Similarity Search (video), USGS Earthquake Science Center Seminar (August 2015, with Clara Yoon)
Editorial Board of Geophysical Journal International: 2018 GJI Student Author Award (November 2018)
Seismic data sets for Machine Learning
STanford EArthquake Dataset (STEAD): A Global Data Set of Seismic Signals for AI. Compiled by Mousavi et al. (2019) [ref].