Multi-band detection and location code documentation

Contents:

Method

Overview

1. Signal Processing - Broadband Characteristic Function Calculation

Exctracting non-stationary frequency-dependent statistical features of the recorded seismic signals.


Higher-Order Statistics (HOS) CFs

Estimating CF of the raw record by extracting information on the statistical variations of the signal. Suitable for enhansing and extracting short transient signals e.g., local earthquakes, phase arrivals from distant events, and other short transients assosiated to low frequency earthquakes and vulcanic earthquakes. HOS perform well in the conditions of low signal-to-noise ratio.

HOS of a probability distribution for a random variable \(X\) are defined by the standardized moment:

\[S_n = \frac {E \left[ \left( X - \mu \right)^n \right]} {\left( E \left[ \left( X - \mu \right)^2 \right] \right)^ {n/2}} = \frac{m_n}{m_2^{n/2}}\]

where \(E\left[\cdot\right]\) is the expectation operator, \(\mu = E\left[X\right]\) is the mean, \(m_2 = \sigma^2\) is the variance (second moment), and \(m_n\) is the \(n^{th}\) central moment.

Kurtosis, is the HOS of the \(4^{th}\) order. It is defined as a \(4^{th}\) moment about the mean normalized by the square of the \(2^{nd}\) moment (variance) - \(S_4 = m_4/m_2^2\) . Although, HOS of higher orders can be equally used, in his version of the program we limit the calculation of HOS CFs to kurtosis.

Calculation of kurtosis for a discrete signal \(u\left(t\right)\) (seismic record) is clasically performed in a sliding window. This can be computationally expensive. To improve the computational performance we implemented a recursive scheme based on formulation of the exponential moving average. Such scheme provides an efficient way for accumulating time-averaged statistics. The recursive formula for exponential moving average \(\hat{\mu}\left(t\right)\) of signal \(u\left(t\right)\) is defined as:

\[\hat{\mu}\left(t_i\right)= C\times u\left(t_i\right)+\left(1-C\right) \times \hat{\mu}\left(t_{i-1}\right)\]

where \(u\left(t_i\right)\) is the \(i\)-th sample of the signal, \(\hat{\mu}\left(t_{i-1}\right)\) previous estimate of the average, and \(C\) is the decay constant (\(0\leq 1-C \leq 1\)). Directly extending the above definition to the estimates of the \(2^{nd}\) and \(4^{th}\) cantral moments, and using the definition of HOS we obtain the following recursive expression for the kurtosis CF:

\[CF_{HOS} \left(t_i\right) \equiv\ \hat{S}_4\left(t_i\right) = \frac {C \times \left[u\left(t_i\right) - \hat{\mu}\left(t_{i-1} \right)\right]^4 +\left( 1-C \right) \times \hat{m}_4\left(t_{i-1}\right)} {\left( C \times \left[u\left(t_i\right) - \hat{\mu}\left(t_{i-1} \right)\right]^2 +\left( 1-C \right) \times \hat{m}_2\left(t_{i-1}\right)\right)^2}\]

For a more general case of the \(n^{th}\) order HOS replace \(4\) with \(n\) in the above formula. Currently code is tested for orders \(n = 4, 6, 8\)

An example of the recursive kurtosis CF calculated on a seismogram of a regional earthquake is shown below.

_images/HOS.png

The decay constant \(C\) can be as seen as a smoothing factor, defining how fast the memory of older data-points will be discarded. It is expressed as \(C = \frac{\Delta T}{T_{decay}}\) where \(\Delta T\) is sampling rate of the data and \(T_{decay}\) represents the time-length of the averaging scale. Effect of different values of \(C\) constant on the final CF is shown in the following figure.

_images/HOS2.png

Decay constant \(C\) should be set separately for each particular case study, through a preliminary testing step. The actual value of this parameter is defined by selecting \(T_{decay}\) (in seconds) in the program’s control file. The parameter is further transformed into the decay constant, using the sampling rate of the data.


Envelope CFs

Characterization of signal based on the amplitude or energy modulation - energy envelope that can be applied to signals associated to sources characterized by slow energy transients, lasting several tens of seconds e.g., tectonic tremors. The relative timing of localized energy transients can be used to detect and locate a tremor source.

The root-mean-square (RMS) envelope CF function is defined as:

\[CF_{env} \left( t_i \right) \equiv RMS(t_i) = \sqrt {\frac{\sideset{}{_{i=1}^M}\sum u(t_i)^2}{M-1}}\]

Following the same idea of recursive implementation, we define the recursive envelope CF as a single component recursive RMS envelope of the recorded signal \(u\left(t_i\right)\) using following expression:

\[CF_{env}( t_i ) \equiv \widehat{RMS}(t_i) = \sqrt{C \times u^2(t_i) + (1-C) \times\left[\widehat{RMS}(t_{i-1}) \right]^2 }\]

_images/env.png

Similarly to the recursive kurtosis CF, parameter to be set for the calculation is \(T_{decay}\) in seconds, from which the decay constant is calculated as \(C = \frac{\Delta T}{T_{decay}}\).


Time-Frequency Analyasis: Multi-Band Filter

The Multi-Band Filter (MBF) algorithm performs a time-frequency decomposition of the signal by producing a set of band-pass filtered time series with different central periods. Implemented MBF scheme follows the recursive multi-band filtering scheme of FilterPicker algorithm by Lomax et al. (2012). MBF analysis is performed by running the original record through a predefined bank of narrow band-pass filters covering the interval \(\left[ f_{min},f_{max} \right]\), and having \(n = \left\{0,...,N_{band}\right\}\) central frequencies (either linearly or logarithmically spaced). On practice, filtering is implemented through a cascade of two simple low-pass and high-pass, one-pole recursive filters. This is equivalent to a two-pole band-pass filter. Below is an illustration of such a filter-bank:

_images/MBF_filter.png

A frequency-dependent CFs (\(CF(t,f_n)\)) calculated by running the kurtosis/envelope recursive estimations for each of the filtered signals. This yields a time-frequency dependent representation of the signal’s characteristics.

\[CF^{TF}(t)=\max\limits_f CF(t,f), \quad f\in [f_{min},f_{max}]\]
_images/MBF_example2.png

Generalized Characteristic Function

\[\begin{split}CF(t) = \begin {cases} \sideset{}{_{HOS}^+}{\dot{CF}}(t) \ast e^{t^2/4\sigma^2} & \quad \text{if } CF\text{_}type \text{ is } kurtosis \\ \\ CF_{env}(t) & \quad \text{if } CF\text{_}type \text{ is } envelope \\ \end{cases}\end{split}\]

_images/generic_CF.png



2. Detection and Location Scheme

Detection and location is based on stacking Spatial Likelihood Functions (SLFs) according to the theoretical 3-D travel-time grids. SLF are corresponding to the station-pair time-delay estimate functions. The procedure of stacking the projected SLF functions can be viewed as space-time imaging procedure.


Stations-pair Time Delay Estimate (TDE) Functions

Corresponds to maximum local cross-correlation function (see Poiata et al., 2016 for details) calculated in a given time window.


_images/LCC.png

Space-Time Imaging

Performs projection (mapping) and stacking of staion-pairs TDE according to the theoretical 3-D travel-times calculated for a given velocity model:


_images/SpaceTimeImaging.png

Detection and Location

Performed for each position of a sliding time window on a final imaging function (summary SLF) if its maximum value exceeds the selected threshold value.


_images/DetectionLocation.png

Analysing continuous data

How to

This is a quick guide for installing and running BackTrackBB.


System requirements

  • python 2.7.x or >3.5
  • NumPy
  • SciPy
  • Matplotlib
  • ObsPy, available at: http://obspy.org
  • NonLinLoc - optional for time grid calculation (available here: http://alomax.free.fr/nlloc/). Otherwise can use any other way for calculating theoretical travel times.

Accepted data formats: formats supported by ObsPy


Installation

Decompress the package and run:

pip install .

Optionally, if you plan to edit the source code, you can install in “editable” mode:

pip install -e .

This will install the following scripts in your path:

  • btbb ? the main code
  • mbf_plot ? plotting of multiband filtering and signal processing
  • group_triggers ? post-processing tool for removing duplicate detections
  • bt2eventdata ? cut event waveforms from continuous streams

Using virtual environment:

We recommend you to use virtual environment (also called virtualenv) for creating an isolated local Python setup that will contain all the requirements listed above before installing BackTrackBB. This will help keeping your project-based environment independent from the main system setup. Any changes (upgrades) you make to the project will not affect any others you are also working on.

If you do not already have virtualenv on you machine you can install it via pip:

pip install virtualenv

Once installed, you can create a virtual environment for a project:

$ virtualenv venv_project

This will create a folder venv_project which will contain the Python executable files, and a copy of the pip library which can be used to install other packages.

Activate the virtual environment using following command:

$ source my_project/bin/activate

You can now use your virtual environment and install packages using pip install. Now, any packages installed will be placed in venv_project folder, isolated from the global Python installations.

To deactivate the virtual environment use:

$ deactivate

For more details and information on how to install and use virtual environment refer to these links:

https://virtualenv.pypa.io/en/latest/

http://docs.python-guide.org/en/latest/dev/virtualenvs/


Optional external packages:

In case you want to use NonLinLoc to calculate the 3D time-grids, download it from Anthony Lomax’s website http://alomax.free.fr/nlloc/. You only need to be able to call Vel2Grid and Grid2Time correctly.



Running detection and location code

To run the main detection and location code execute:

$ btbb examples/congig_file.conf

You need to provide a configuration file containing the description of parameters required by the code.



Config file and parameters

Below is a description of the basic parameters you should specify:

  • ncpu - is a number of CPUs that will be used for running the code:
    • ncpu = 1: corresponds to serial version (useful for debugging)
    • ncpu > 1: will run parallel version of the code in which detection and location for each position of sliding window will be performed as different process
  • stations - list of station names to be used for location. Should be separated by , and be the same as in the header of the data file
  • channel - name of the channel component (EW,NS,UD). Should use the same as in the header of the data file
  • data_dir - path to the directory containing the data
  • wave_type - type of the phase assume for location (P/S or P & S)
  • data_type - data format (‘SAC’, ‘mseed’,’gse2’,...)
  • data_dir - path to the directory containing the data
  • grid_dir - path to the directory containing the theoretical travel-time grids for corresponding phase assumption
  • out_dir - path to the directory where the outputs of the run will be saved (will be created by the code if does not exist)
  • sample_rate_data - sampling rate of the analyzed data. All the data will be resampled to this sampling rate before the analysis starts
  • sample_rate_cf - sampling rate of the CFs. All the CFs will be resampled to this sampling rate (using lower sampling rate for CF can be useful for reducing the calculation times)
  • decay_const - value in seconds of decay constant \(T_{decay}\) that will be used for calculatin CF (kurtosis/envelope)
  • ch_function - type of CF. Options available are kurtosis (HOS \(2^{nd}\) order) or envelope
  • win_type:
    • if True frequency-dependent \(T_{decay}\) will be used
    • if False a constant \(T_{decay}\) = decay_const will be used for time-frequency CF calculation
  • recursive_memory:
    • if True uses recursive memory mechanism for the multi-band filter (MBF) and CF calculation on a window-by-window basis. This setting can be useful for increasing the computational speed and optimizing the usage of the memory.
    • if False (default value) the MBF and CF calculations will be done on the entire data-set read to the memory.
  • f_min - lower frequency \(f_{min}\) of the MBF
  • f_max - lower frequency \(f_{min}\) of the MBF
  • n_freq_bands - number of frequency bands \(N_{band}\) between f_min and f_max for the filter-bank
  • band_spacing - choose between liniarly (lin) and logarithmically (log) spaces frequency bands for MBF filter
  • time_lag - (in seconds) defines at the same time value of maximum lag in local cross-correlation, and size of sliding time-window. Should be set to the maximum possible travel-time between the two furtherst stations
  • maxSTA_distance - maximum distance (in km) between the pair of stations that will be used for detection and location
  • t_overlap - overlap (in seconds) between the two consecutive time windows
  • start_t - position of the first time-window (in seconds) from the begining of the record
  • end_t - position of the last time-window (in seconds) from the begining of the record
  • dt_min - maximum time difference (in seconds) between the picked and theoretical travel-time used for calculating the origin time
  • do_smooth_lcc - will run a 2D Gaussian smoothing over the LCC function
  • smooth_lcc - value (in seconds) for Gaussian time-findow smoothing for LCC
  • cut_data - set True if want to cut a smaller length of data for analysis
  • cut_start - cutting start time (in seconds) from the begining of the record
  • cut_delta - length (in seconds) of data to cut
  • trigger - set value of detection threshold
  • lat_orig - latitude of the origin of XYZ grid used for detection and location
  • lon_orig - latitude of the origin of XYZ grid used for detection and location
  • plot_waveforms - option for plotting the seismograms on the final plot (True/False)
  • plot_format - select output format for the plotting (png/pdf)

An example configuration file BT_ChileExample.conf:

#--number of CPU's-------------------------------------------------------
ncpu = 1
#--station names---------------------------------------------------------
stations = QF17,G07S,QF05,G02S,QC12,QF16,G03S,QF12,G11S,G06S,U67B,U51B,U45B,G08S,U66B,U65B,G09S,U46B,QF14,G12S,U56B,G13S
channel = HHZ
wave_type = 'P'
#--settings for input data files-----------------------------------------
data_type = 'mseed'
data_dir = 'examples/data/data_Chile'
grid_dir = 'examples/grids/grids_Chile'
#
out_dir = 'out_example_Chile/'
#-----------------------------------------------------------------------
sampl_rate_data = 100.
sampl_rate_cf = 50.
#------------------------------------------------------------------------
decay_const = 1.00
ch_function = 'kurtosis'
win_type = False
#--Parameters for the MBF analysis--------------------------------------
f_min = 0.02
f_max = 49.
n_freq_bands = 20
band_spacing = log
#------------------------------------------------------------------------
time_lag = 20.
maxSTA_distance = 100.
#--Parameters for the sliding window------------------------------------
t_overlap = 17.
start_t = 33.
end_t = 35.
#
dt_min = 1.0
#---LCC calculatioin settings-------------------------------------------
do_smooth_lcc = False
smooth_lcc = 0.1
#------------------------------------------------------------------------
cut_data = False
cut_start = 0.
cut_delta = 180.
#------------------------------------------------------------------------
trigger = 0.7
#--geogr. coordinates for the origin of the coordinate system-----------
lat_orig = -35.404
lon_orig = -73.000
#--parameters for plotting----------------------------------------------
plot_waveforms = True
plot_format = png


Outputs

Run the example provided with the code:

$ btbb examples/BT_ChileExample.conf

use of var time window for location: False
Number of traces in stream =  22
Number of time windows =  1
frequencies for filtering in (Hz): [  2.00000000e-02   3.01583209e-02   4.54762160e-02   6.85743157e-02
1.03404311e-01   1.55925020e-01   2.35121839e-01   3.54543993e-01
5.34622576e-01   8.06165961e-01   1.21563059e+00   1.83306887e+00
2.76411396e+00   4.16805178e+00   6.28507216e+00   9.47736116e+00
1.42910650e+01   2.15497261e+01   3.24951778e+01   4.90000000e+01]
starting BPmodule
Running on 1 thread
22
20100401_1253A X 5.0 Y 33.0 Z 16.0 MaxStack 0.858 Ntraces 22 BEG 33.0 END 53.0  LAT -34.70242 LON -72.04541 T_ORIG 2010-04-01T12:53:05.343609Z

Since in the configuration file BT_ChileExample.conf ouput directory is set to 'out_example_Chile/' all the outputs will be redirected there

$ ls out_example_Chile/

10040112_20fq0.0_49.0hz_1.050.00.117.0_kurtosis_HHZ_P_trig0.7_FIG2.png
0040112_20fq0.0_49.0hz_1.050.00.117.0_kurtosis_HHZ_P_trig0.7_OUT2.dat
10040112_t0033.0s_0.0_49.0_fig.png

where

  • *_FIG2.png is a summary plot of triggered locations
  • *_fig.png format for the plots for each position of the sliding time-window
  • *_OUT2.dat summary data file providing the information on the triggered locations and the picked and theoretical arrival times from this location for each station

Tests and Examples

The code is provided with examples, for testing that all the things are installed and running ok, and also allowing to play around with the parameters. This may help to understand better how the things a working.

Download file examples.zip. After uncompressing it you will get folder examples/ containing all the data (seismograms and theoretical travel-time grids) that you will need to run the examples.

Detection and location

Use config file BT_ChileExample.conf. Check that folder with examples is in the same folder with the main code and configuration file, otherwise change the path to data_dir and grid_dir .

Run the detection and location code:

$ btbb examples/BT_ChileExample.conf

use of var time window for location: False
Number of traces in stream =  22
Number of time windows =  1
frequencies for filtering in (Hz): [  2.00000000e-02   3.01583209e-02   4.54762160e-02   6.85743157e-02
1.03404311e-01   1.55925020e-01   2.35121839e-01   3.54543993e-01
5.34622576e-01   8.06165961e-01   1.21563059e+00   1.83306887e+00
2.76411396e+00   4.16805178e+00   6.28507216e+00   9.47736116e+00
1.42910650e+01   2.15497261e+01   3.24951778e+01   4.90000000e+01]
starting BPmodule
Running on 1 thread
22
20100401_1253A X 5.0 Y 33.0 Z 16.0 MaxStack 0.858 Ntraces 22 BEG 33.0 END 53.0  LAT -34.70242 LON -72.04541 T_ORIG 2010-04-01T12:53:05.343609Z

You should get following outputs written in the directory defined in out_dir:

  • 10040112_20fq0.0_49.0hz_1.050.00.117.0_kurtosis_HHZ_P_trig0.7_FIG2.png
  • 0040112_20fq0.0_49.0hz_1.050.00.117.0_kurtosis_HHZ_P_trig0.7_OUT2.dat
  • 10040112_t0033.0s_0.0_49.0_fig.png

The image files should look like:

_images/Chile_location.png

0040112_20fq0.0_49.0hz_1.050.00.117.0_kurtosis_HHZ_P_trig0.7_OUT2.dat - summary text file. It will have full information on the location of the triggered event and arrival times of the phase to the stations:

20100401_1253A X 5.0 Y 33.0 Z 16.0 MaxStack 0.858 Ntraces 22 BEG 33.0 END 53.0  LAT -34.70242 LON -72.04541 T_ORIG 2010-04-01T12:53:05.343609Z
 sta G02S  Ph P  TT 12.08  PT 12.17
 sta G03S  Ph P  TT 7.95  PT 8.14
 sta G06S  Ph P  TT 5.09  PT 5.07
 sta G07S  Ph P  TT 3.23  PT 2.92
 sta G08S  Ph P  TT 6.49  PT 6.86
 sta G09S  Ph P  TT 11.47  PT 11.53
 sta G11S  Ph P  TT 7.02  PT 7.29
 sta G12S  Ph P  TT 11.33  PT 11.60
 sta G13S  Ph P  TT 13.53  PT 13.40
 sta QC12  Ph P  TT 17.12  PT 16.16
 sta QF05  Ph P  TT 15.63  PT 15.06
 sta QF12  Ph P  TT 9.20  PT 9.08
 sta QF14  Ph P  TT 10.38  PT 10.48
 sta QF16  Ph P  TT 9.81  PT 9.92
 sta QF17  Ph P  TT 3.85  PT 3.81
 sta U45B  Ph P  TT 7.14  PT 7.48
 sta U46B  Ph P  TT 16.21  PT 16.28
 sta U51B  Ph P  TT 13.76  PT 13.59
 sta U56B  Ph P  TT 17.35  PT 17.13
 sta U65B  Ph P  TT 7.10  PT 7.52
 sta U66B  Ph P  TT 5.32  PT 5.59
 sta U67B  Ph P  TT 9.42  PT 9.40

Signal processing: MBF characteristic functions

Run the MBF signal processing example for a defined filter bank and Kurtosis CF:

$ mbf_plot examples/MBF_ChileExample.conf

This code can be useful to understand how the MBF filtering and CF calculation works, and for adjusting the signal processing parameters for the data.

Output should look like:

_images/MBF_example_Chile.png

mbf_plot is useful for deciding most appropriate signal-processing parameters for your data.

Indices and tables