Physionet CinC 2017 Challenge Dataset

Table of contents

  1. Overview
  2. Importing the data into MATLAB
  3. Resulting dataset

Overview

Item Details
Links Dataset, Publication
Signals ECG recordings lasting 9-61 seconds
No. Subjs 8,528
Protocol Handheld, single-lead recordings provided by AliveCor, alongside heart rhythm labels.

Importing the data into MATLAB

I took the following steps to import a subset of the data into MATLAB:

  1. Download the physionet_cinc_2017_challenge_data_collator MATLAB script.
  2. Download the zip folder of training ECG data from here.
  3. Unzip the folder to extract the individual files.
  4. Specify the location of the folder containing the individual files in the setup_up function in the script (see the up.paths.training_data_folder variable).
  5. Download the REFERENCE-v3.csv file containing rhythm labels from here.
  6. Specify the location of this file in the setup_up function in the script (see the up.paths.training_data_labels variable).
  7. Run the MATLAB script to collate all the individual ECG data files and the reference labels into a single MATLAB data file, ready for analysis.

Resulting dataset

The resulting dataset is a subset of the training data. Specifically, it only includes recordings of 30-second duration (5,977 out of 8,528 recordings). A duration of 30 seconds was chosen to ensure the data are in keeping with those collected in the STROKESTOP and SAFER studies (see here for details of these studies). The resulting subset contains the following recordings:

Label Abbreviation Number of recordings
Atrial fibrillation A 504
Normal sinus rhythm N 3,695
Other rhythm O 1,655
Noisy ~ 123

Copyright © 2021 Peter Charlton.