Physionet CinC 2017 Challenge Dataset
Overview
Item | Details |
---|---|
Links | Dataset, Publication |
Signals | ECG recordings lasting 9-61 seconds |
No. Subjs | 8,528 |
Protocol | Handheld, single-lead recordings provided by AliveCor, alongside heart rhythm labels. |
Importing the data into MATLAB
I took the following steps to import a subset of the data into MATLAB:
- Download the `physionet_cinc_2017_challenge_data_collator` MATLAB script.
- Download the zip folder of training ECG data from here.
- Unzip the folder to extract the individual files.
- Specify the location of the folder containing the individual files in the `setup_up` function in the script (see the `up.paths.training_data_folder` variable).
- Download the `REFERENCE-v3.csv` file containing rhythm labels from here.
- Specify the location of this file in the `setup_up` function in the script (see the `up.paths.training_data_labels` variable).
- Run the MATLAB script to collate all the individual ECG data files and the reference labels into a single MATLAB data file, ready for analysis.
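
As a rough illustration of the steps above, the sketch below sets the two path variables and collates the recordings into one file. The folder and file paths are hypothetical placeholders, as is the `output_file` name. The details of the data format (one `.mat` file per recording holding the signal in a variable named `val`, a header-less CSV of record names and labels, and a sampling frequency of 300 Hz) are assumptions rather than something taken from the collator script, whose own implementation may differ.

```matlab
% Hypothetical locations: replace with the paths on your own machine.
up.paths.training_data_folder = '/path/to/training2017';
up.paths.training_data_labels = '/path/to/REFERENCE-v3.csv';
output_file = 'cinc2017_collated.mat';   % hypothetical output file name
fs = 300;                                % assumed sampling frequency (Hz)

% Empty struct array that will hold one entry per recording.
data = struct('record', {}, 'label', {}, 'ecg', {});

if ~isfolder(up.paths.training_data_folder) || ~isfile(up.paths.training_data_labels)
    warning('Edit the paths above to point at the downloaded data before running.');
else
    % Assumed label file layout: no header row, columns = record name, label.
    labels = readtable(up.paths.training_data_labels, ...
        'ReadVariableNames', false, 'Delimiter', ',');
    labels.Properties.VariableNames = {'record', 'label'};

    for k = 1:height(labels)
        file_path = fullfile(up.paths.training_data_folder, ...
            [labels.record{k}, '.mat']);
        if ~isfile(file_path)
            continue   % skip records whose signal file is missing
        end
        s = load(file_path);   % assumed to store the signal in variable 'val'
        data(end+1) = struct('record', labels.record{k}, ...
            'label', labels.label{k}, 'ecg', double(s.val(:))); %#ok<SAGROW>
    end

    % Store everything in a single MATLAB data file.
    save(output_file, 'data', 'fs');
end
```

With the placeholder paths left unchanged, the sketch only issues a warning and leaves `data` empty, so it is safe to run before the data have been downloaded.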
Resulting dataset
The resulting dataset is a subset of the training data. Specifically, it only includes recordings of 30-second duration (5,977 out of 8,528 recordings, about 70%). A duration of 30 seconds was chosen so that the data are consistent with those collected in the STROKESTOP and SAFER studies (see here for details of these studies). The resulting subset contains the following recordings:
Label | Abbreviation | Number of recordings |
---|---|---|
Atrial fibrillation | A | 504 |
Normal sinus rhythm | N | 3,695 |
Other rhythm | O | 1,655 |
Noisy | ~ | 123 |
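
To make the selection criterion concrete, here is a minimal sketch of keeping only 30-second recordings and counting them by label. It uses toy data rather than the real recordings, and it assumes a sampling frequency of 300 Hz and an exact-length criterion (9,000 samples). The collator script may apply the criterion differently. The final lines check that the counts in the table sum to the 5,977 recordings quoted above.

```matlab
fs = 300;               % assumed sampling frequency (Hz)
target_duration = 30;   % required recording duration (s)

% Toy stand-in for the collated data: sample counts and labels of five recordings.
n_samples = [9000, 2714, 9000, 18000, 9000];
labels    = {'N', 'A', 'O', 'N', '~'};

% Keep only recordings that are exactly 30 s long (compared in samples).
keep = n_samples == target_duration * fs;
kept_labels = labels(keep);

% Count the retained recordings per label.
[classes, ~, idx] = unique(kept_labels);
counts = accumarray(idx(:), 1);
for k = 1:numel(classes)
    fprintf('%s: %d\n', classes{k}, counts(k));
end

% Consistency check on the table: A, N, O and ~ counts.
table_counts = [504, 3695, 1655, 123];
assert(sum(table_counts) == 5977);
```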