Signal quality assessment

In this tutorial we will assess the quality of MIMIC waveform signals.

Our objectives are to:

  • Understand a template-matching approach to assess signal quality of cardiovascular signals.

  • Apply the template-matching approach to ECG and PPG signals.

  • Understand how to interpret the results.

Context: Physiological signals can be subject to noise from multiple sources. Signal quality assessment algorithms assess the quality of signals to determine whether they are of sufficient quality for a particular purpose (such as heart rate estimation). In this tutorial we will use the template-matching signal quality assessment algorithm described in this publication.

Extension: If you want to find out more about photoplethysmography (PPG) signal quality assessment then I'd recommend this publication.

Setup

Import packages

The following steps have been covered in previous tutorials. We’ll just re-use the previous code here.

# Packages
from scipy import signal
import numpy as np
import matplotlib.pyplot as plt
!pip install wfdb==4.0.0
import wfdb

# import sys
# from pathlib import Path

Import ECG beat detectors

!pip install py-ecg-detectors

Details of MIMIC record to use

Specify details of a MIMIC database record to use in this tutorial

# The name of the MIMIC IV Waveform Database on Physionet
database_name = 'mimic4wdb/0.1.0'

# Segment for analysis
segment_names = ['83404654_0005', '82924339_0007', '84248019_0005', '82439920_0004', '82800131_0002', '84304393_0001', '89464742_0001', '88958796_0004', '88995377_0001', '85230771_0004', '86643930_0004', '81250824_0005', '87706224_0003', '83058614_0005', '82803505_0017', '88574629_0001', '87867111_0012', '84560969_0001', '87562386_0001', '88685937_0001', '86120311_0001', '89866183_0014', '89068160_0002', '86380383_0001', '85078610_0008', '87702634_0007', '84686667_0002', '84802706_0002', '81811182_0004', '84421559_0005', '88221516_0007', '80057524_0005', '84209926_0018', '83959636_0010', '89989722_0016', '89225487_0007', '84391267_0001', '80889556_0002', '85250558_0011', '84567505_0005', '85814172_0007', '88884866_0005', '80497954_0012', '80666640_0014', '84939605_0004', '82141753_0018', '86874920_0014', '84505262_0010', '86288257_0001', '89699401_0001', '88537698_0013', '83958172_0001']
segment_dirs = ['mimic4wdb/0.1.0/waves/p100/p10020306/83404654', 'mimic4wdb/0.1.0/waves/p101/p10126957/82924339', 'mimic4wdb/0.1.0/waves/p102/p10209410/84248019', 'mimic4wdb/0.1.0/waves/p109/p10952189/82439920', 'mimic4wdb/0.1.0/waves/p111/p11109975/82800131', 'mimic4wdb/0.1.0/waves/p113/p11392990/84304393', 'mimic4wdb/0.1.0/waves/p121/p12168037/89464742', 'mimic4wdb/0.1.0/waves/p121/p12173569/88958796', 'mimic4wdb/0.1.0/waves/p121/p12188288/88995377', 'mimic4wdb/0.1.0/waves/p128/p12872596/85230771', 'mimic4wdb/0.1.0/waves/p129/p12933208/86643930', 'mimic4wdb/0.1.0/waves/p130/p13016481/81250824', 'mimic4wdb/0.1.0/waves/p132/p13240081/87706224', 'mimic4wdb/0.1.0/waves/p136/p13624686/83058614', 'mimic4wdb/0.1.0/waves/p137/p13791821/82803505', 'mimic4wdb/0.1.0/waves/p141/p14191565/88574629', 'mimic4wdb/0.1.0/waves/p142/p14285792/87867111', 'mimic4wdb/0.1.0/waves/p143/p14356077/84560969', 'mimic4wdb/0.1.0/waves/p143/p14363499/87562386', 'mimic4wdb/0.1.0/waves/p146/p14695840/88685937', 'mimic4wdb/0.1.0/waves/p149/p14931547/86120311', 'mimic4wdb/0.1.0/waves/p151/p15174162/89866183', 'mimic4wdb/0.1.0/waves/p153/p15312343/89068160', 'mimic4wdb/0.1.0/waves/p153/p15342703/86380383', 'mimic4wdb/0.1.0/waves/p155/p15552902/85078610', 'mimic4wdb/0.1.0/waves/p156/p15649186/87702634', 'mimic4wdb/0.1.0/waves/p158/p15857793/84686667', 'mimic4wdb/0.1.0/waves/p158/p15865327/84802706', 'mimic4wdb/0.1.0/waves/p158/p15896656/81811182', 'mimic4wdb/0.1.0/waves/p159/p15920699/84421559', 'mimic4wdb/0.1.0/waves/p160/p16034243/88221516', 'mimic4wdb/0.1.0/waves/p165/p16566444/80057524', 'mimic4wdb/0.1.0/waves/p166/p16644640/84209926', 'mimic4wdb/0.1.0/waves/p167/p16709726/83959636', 'mimic4wdb/0.1.0/waves/p167/p16715341/89989722', 'mimic4wdb/0.1.0/waves/p168/p16818396/89225487', 'mimic4wdb/0.1.0/waves/p170/p17032851/84391267', 'mimic4wdb/0.1.0/waves/p172/p17229504/80889556', 'mimic4wdb/0.1.0/waves/p173/p17301721/85250558', 'mimic4wdb/0.1.0/waves/p173/p17325001/84567505', 'mimic4wdb/0.1.0/waves/p174/p17490822/85814172', 'mimic4wdb/0.1.0/waves/p177/p17738824/88884866', 'mimic4wdb/0.1.0/waves/p177/p17744715/80497954', 'mimic4wdb/0.1.0/waves/p179/p17957832/80666640', 'mimic4wdb/0.1.0/waves/p180/p18080257/84939605', 'mimic4wdb/0.1.0/waves/p181/p18109577/82141753', 'mimic4wdb/0.1.0/waves/p183/p18324626/86874920', 'mimic4wdb/0.1.0/waves/p187/p18742074/84505262', 'mimic4wdb/0.1.0/waves/p188/p18824975/86288257', 'mimic4wdb/0.1.0/waves/p191/p19126489/89699401', 'mimic4wdb/0.1.0/waves/p193/p19313794/88537698', 'mimic4wdb/0.1.0/waves/p196/p19619764/83958172']

# Segment 0 is helpful for filtering, and 3 and 8 are helpful for differentiation
rel_segment_n = 0
rel_segment_name = segment_names[rel_segment_n]
rel_segment_dir = segment_dirs[rel_segment_n]

rel_segment_n = 8 
rel_segment_name = segment_names[rel_segment_n]
rel_segment_dir = segment_dirs[rel_segment_n]

Signal quality assessment algorithm settings

Threshold to distinguish between high and low quality data.

thresh = 0.66 # For ECG, from: Orphanidou C et al., Signal-quality indices for the electrocardiogram and photoplethysmogram: derivation and applications to wireless monitoring. IEEE J Biomed Heal Informatics 2015;19:832–838. https://doi.org/10.1109/JBHI.2014.2338351

Define template matching functions

  • filter_ecg - normalises and applies a ~1-15 Hz Butterworth band-pass filter

def filter_ecg(x, fs):
    sig = x
    # sig = sig[:,0]
    order = 3
    low_cutoff = 1  # in Hz
    high_cutoff = 15  # in Hz
    cutoff_frequency = (low_cutoff, high_cutoff)
    b, a = signal.butter(order, cutoff_frequency, btype='band', fs=fs)
    # b, a = signal.butter(3, [0.004, 0.06], 'band')    # original 
    sig = signal.filtfilt(b, a, sig, padlen=150)
    sig = (sig - min(sig)) / (max(sig) - min(sig))
    return sig
  • detect_beats - detects beats in the ECG signal

from ecgdetectors import Detectors

def detect_beats(sig, fs):
    
    # detect beats
    detectors = Detectors(fs)
    #beats = detectors.swt_detector(sig)
    #beats = detectors.wqrs_detector(sig) 
    beats = detectors.hamilton_detector(sig)    
    
    # find R-peaks
    tol_secs = 0.15
    tol_samps = np.floor(fs*tol_secs)
    for beat_no in range(0,len(beats)-1):
            min_el = int(max([0, beats[beat_no]-tol_samps]))
            max_el = int(min([len(sig), beats[beat_no]+tol_samps]))
            curr_samps = sig[min_el:max_el+1]
            beats[beat_no] = int(beats[beat_no]-tol_samps+np.argmax(curr_samps))
    return beats
  • find_rr_ints - finds RR intervals from beat indices

def find_rr_ints(beats,fs):
    
    rr_int = []
    for beat_no in range(0,len(beats)-1):
        rr_int.append((1/fs)*(beats[beat_no+1]-beats[beat_no]))   # in secs
    
    return rr_int
  • assess_feasibility - assesses feasibility of beat detections

def assess_feasibility(beats):
    
    feas = 1
    
    # find HR
    hr = 60*len(beats)/((beats[-1]-beats[0])/fs)  # in bpm

    # check HR
    if hr < 40 or hr > 180:
        feas = 0
        
    # find RR intervals
    rr_int = find_rr_ints(beats,fs)   # in secs
        
    # check max RR interval
    if max(rr_int) > 3:
        feas = 0
    
    # check max to min RR interval
    rr_int_ratio = max(rr_int)/min(rr_int)
    if rr_int_ratio >= 2.2:
        feas = 0
    
    return feas
  • calculate_med_rr_int - calculates the median RR interval (in samples)

def calculate_med_rr_int(beats):
    
    # find RR intervals
    rr_int = find_rr_ints(beats,1)   # in samples
    
    # find median RR interval
    med_rr_int = np.median(rr_int)
    
    return med_rr_int
  • calculate_template - calculates a template beat shape

def calculate_template(sig, beats):
    
    # find median rr interval
    med_rr_int = calculate_med_rr_int(beats)
    
    # find no. samples either side of beat
    tol = int(np.floor(med_rr_int/2))
    sum_waves = np.zeros(1+2*tol)
    no_beats_used = 0
    for beat_no in range(0,len(beats)-1):
            min_el = beats[beat_no]-tol
            max_el = beats[beat_no]+tol
            if min_el < 0 or max_el > beats[-1]:
                continue
            curr_samps = sig[min_el:max_el+1]
            for i in range(0,len(sum_waves)):
                sum_waves[i] += curr_samps[i]
            no_beats_used +=1
    templ = sum_waves/no_beats_used
    return templ
  • calculate_cc - calculates average correlation coefficient between template beat shape and individual beat shapes

def calculate_cc(sig, beats, templ):
    
    # find median rr interval
    med_rr_int = calculate_med_rr_int(beats)
    
    # find no. samples either side of beat
    tol = int(np.floor(med_rr_int/2))
    
    # calculate correlation coefficients for each beat
    sum_cc = 0
    no_beats_used = 0
    for beat_no in range(0,len(beats)-1):
            min_el = beats[beat_no]-tol
            max_el = beats[beat_no]+tol
            if min_el < 0 or max_el > beats[-1]:
                continue
            curr_samps = np.zeros(1+2*tol)
            for i in range(0, 1+2*tol):
                curr_samps[i] += sig[min_el+i]
            temp = np.corrcoef(curr_samps, templ)
            curr_cc = temp[0,1]
            sum_cc = np.add(sum_cc, curr_cc)
            no_beats_used +=1
            
    # find average correlation coefficient
    cc = sum_cc/no_beats_used
    return cc
  • compare_cc_to_thresh - compares the average correlation coefficient to a threshold to determine whether the signal is of high or low quality

def compare_cc_to_thresh(cc, thresh):
    
    if cc >= thresh:
        qual = 1
    else:
        qual = 0
        
    return qual
  • assess_qual - assesses the quality of a 10 second window of ECG signal

def assess_qual(x, fs, thresh):
    
    # filter ECG
    sig = filter_ecg(x, fs)
    
    # detect beats
    beats = detect_beats(sig, fs)
    
    # assess feasibility of beat detections
    feas = assess_feasibility(beats)
    if feas == 0:
        qual = 0
        return qual
    
    # create template beat shape
    templ = calculate_template(x, beats)
    
    # calculate correlation coefficient
    cc = calculate_cc(x, beats, templ)
    
    # compare correlation coefficient to threshold
    qual = compare_cc_to_thresh(cc, thresh)

    return qual

Extract one minute of ECG and PPG signals from the MIMIC database

Extract data

These steps have been covered in previous tutorials, so we’ll just re-use the code here.

# Specify the segment of data to be loaded
start_seconds = 20 # time since the start of the segment at which to begin extracting data
n_seconds_to_load = 60

# Load metadata for this record
segment_metadata = wfdb.rdheader(record_name=rel_segment_name, pn_dir=rel_segment_dir) 
fs = round(segment_metadata.fs)
print(f"Metadata loaded from segment: {rel_segment_name}")

# Load data from this record
sampfrom = fs*start_seconds
sampto = fs*(start_seconds + n_seconds_to_load)
segment_data = wfdb.rdrecord(record_name=rel_segment_name,
                             sampfrom=sampfrom,
                             sampto=sampto,
                             pn_dir=rel_segment_dir)
print(f"{n_seconds_to_load} seconds of data extracted from: {rel_segment_name}")

# Extract the PPG signal
sig_no = segment_data.sig_name.index('Pleth')
ppg = segment_data.p_signal[:,sig_no]
sig_no2 = segment_data.sig_name.index('II')
ecg = segment_data.p_signal[:,sig_no2]
fs = segment_data.fs
print(f"Extracted the PPG and ECG signals from columns {sig_no} and {sig_no2} of the matrix of waveform data.")
Metadata loaded from segment: 88995377_0001
60 seconds of data extracted from: 88995377_0001
Extracted the PPG and ECG signals from columns 4 and 2 of the matrix of waveform data.

Plot data

Full duration of signal

t = np.arange(0, (len(ecg) / fs), 1.0 / fs)
plt.plot(t, ecg, color = 'blue', label='ECG')
plt.xlabel('time (s)')
plt.ylabel('ECG')
plt.show()
../../_images/signal-quality-assessment_40_0.png

Just a short segment

plt.plot(t, ecg, color = 'blue', label='ECG')
plt.xlabel('time (s)')
plt.ylabel('ECG')
plt.xlim(0,5)
plt.show()
../../_images/signal-quality-assessment_42_0.png

Assess signal quality of this sample data

Using the functions defined above:

filter the ECG

sig = filter_ecg(ecg, fs)

Plot the result

t = np.arange(0, (len(ecg) / fs), 1.0 / fs)
plt.plot(t, ecg, color = 'blue', label='raw ECG')
plt.plot(t, sig, color = 'red', label='filtered ECG')
plt.xlim([0, 5])
plt.xlabel('time (s)')
plt.ylabel('ECG')
plt.legend()
plt.show()
../../_images/signal-quality-assessment_48_0.png

detect beats

beats = detect_beats(sig, fs)

Plot the result

t = np.arange(0, (len(ecg) / fs), 1.0 / fs)
plt.plot(t, sig, color = 'blue', label='ECG')
plt.scatter(t[beats], sig[beats], color = 'red', marker = 'o', label='beats')
plt.xlim([0, 15])
plt.xlabel('time (s)')
plt.ylabel('ECG')
plt.show()
../../_images/signal-quality-assessment_52_0.png

assess feasibility of beat detections

feas = assess_feasibility(beats)
print(f'feas (1 indicates feasible): {feas}')
feas (1 indicates feasible): 1

create template beat shape

templ = calculate_template(ecg, beats)
durn = (len(templ)-1)/fs
durn = durn+0.01
t = np.arange(0, durn, 1.0 / fs)
plt.plot(t, templ, color = 'blue', label='Template ECG beat')
plt.xlabel('time (s)')
plt.ylabel('Template ECG beat')
plt.show()
../../_images/signal-quality-assessment_57_0.png

calculate correlation coefficient

cc = calculate_cc(ecg, beats, templ)
print(f'Corr coeff: {cc}')
Corr coeff: 0.9606167664715456

compare correlation coefficient to threshold

qual = compare_cc_to_thresh(cc, thresh)

if feas == 0:
        qual = 0
print(f'quality (1 indicates high quality): {qual}')
quality (1 indicates high quality): 1

Question: What value of correlation coefficient would result in this being low quality?

Re-do quality assessment using single function

qual = assess_qual(ecg, fs, thresh)
print(f'quality (1 indicates high quality): {qual}')
quality (1 indicates high quality): 1

Extension 1: How could we extend this to assess the quality of PPG signals? Consider what threshold would be required (see the original publication) and how the code would need to be adjusted.

Further reading: this book chapter provides further information on PPG signal quality assessment.