January 26, 2022: Both the 2020 Challenge and the 2021 Challenge, which extended the 2020 Challenge, are now complete.
To refer to the 2020 Challenge, please cite the following article:
Perez Alday EA, Gu A, J Shah A, Robichaux C, Ian Wong AK, Liu C, Liu F, Bahrami Rad A, Elola A, Seyedi S, Li Q, Sharma A, Clifford GD*, Reyna MA*. Classification of 12-lead ECGs: The PhysioNet/Computing in Cardiology Challenge 2020. Physiol. Meas. 2021 Jan 1;41(12):124003, http://doi.org/10.1088/1361-6579/abc960.
October 25, 2021: We are currently evaluating entries on the 2021 Challenge test data in support of the Physiological Measurement focus issue on multilead ECG classification. The deadline to submit your code and a preprint is 1 December 2021, and the deadline to submit your article is 11 January 2022. See this forum announcement for details.
September 20, 2021: The winners of the 2021 Challenge were announced on 15 September 2021 at CinC in Brno, Czech Republic. Congratulations, teams! See this page for the results and the full announcement for the final steps in this year’s Challenge, including details about the focus issue (deadline: 11 January 2022).
September 15, 2021: In honor of the contributions of George Moody to PhysioNet and Computing in Cardiology, the Board of CinC voted to rename the Challenges to the George B. Moody PhysioNet Challenge.
July 21, 2021: As you prepare your CinC papers, please follow the CinC preparation and submission instructions and use either our LaTeX (Overleaf or download) or Word templates, which include important instructions, advice, and references. Please see here for more information, including our draft paper and important citation information.
June 23, 2021: CinC has released its abstract decisions for the Challenge track of the conference. Congratulations to those with accepted abstracts. Those without an accepted abstract can still compete for a wildcard entry.
May 1, 2021: The official phase of the Challenge reopens today. Thanks to your engagement, we have greatly expanded the training data, modified the lead combinations, and updated the example code and scoring function. Please see our announcement on the Challenge forum for more details. We will update and clarify these changes in response to your questions in the coming days.
April 19, 2021: CinC has extended its abstract submission deadline to April 24, 2021. Please submit your abstract if you have not done so already. Like last year, CinC will host a hybrid conference with both in-person and remote attendance. Please see our announcement on the Challenge forum for more details.
April 13, 2021: Only two days left to submit an abstract to CinC! Please find the abstract submission announcement and the instructions announcement on the Challenge forum. Please see the leaderboard for the final scores of the unofficial phase, and please submit your abstract today!
February 24, 2021: The leaderboard is now live! Please see the announcement on the Challenge forum. Please also see the updated rules about the number of submissions allowed per day, and submit early!
January 30, 2021: We are now accepting submissions for the 2021 Challenge! See below for details. Please register your team (even if you registered last year), check the submission instructions, and submit your code when ready. As always, please join the Challenge forum to discuss this year’s Challenge.
December 24, 2020: The NIH-funded 2021 Challenge is now open! Please read this website for details and share questions and comments on the Challenge forum. This year’s Challenge is generously co-sponsored by Google, MathWorks, and the Gordon and Betty Moore Foundation.
The electrocardiogram (ECG) is a non-invasive representation of the electrical activity of the heart. Although the twelve-lead ECG is the standard diagnostic screening system for many cardiological issues, the limited accessibility of twelve-lead ECG devices provides a rationale for smaller, lower-cost, and easier-to-use devices. While single-lead ECGs are limiting [1], reduced-lead ECG systems hold promise, with evidence that subsets of the standard twelve leads can capture useful information [2], [3], [4] and even be comparable to twelve-lead ECGs in some limited contexts. In 2017 we challenged the public to classify atrial fibrillation (AF) from a single-lead ECG, and in 2020 we challenged the public to diagnose a much larger number of cardiac problems using twelve-lead recordings. However, there is limited evidence to demonstrate the utility of reduced-lead ECGs for capturing a wide range of diagnostic information.
In this year’s Challenge, we ask the following question: ‘Will two do?’ This year’s Challenge builds on last year’s Challenge [5], which asked participants to classify cardiac abnormalities from twelve-lead ECGs. We are asking you to build an algorithm that can classify cardiac abnormalities from twelve-lead, six-lead, four-lead, three-lead, and two-lead ECGs. We will test each algorithm on databases of these reduced-lead ECGs, and the differences in the performance of the algorithms on these databases will reveal the utility of reduced-lead ECGs in comparison to standard twelve-lead ECGs.
The goal of the 2021 Challenge is to identify clinical diagnoses from twelve-lead, six-lead (I, II, III, aVR, aVL, aVF), four-lead (I, II, III, V2), three-lead (I, II, V2), and two-lead (I and II) ECG recordings.
We ask participants to design and implement a working, open-source algorithm that, based only on the provided ECG recordings and routine demographic data, can automatically identify any cardiac abnormalities present in the recording. We will award prizes for the top-performing twelve-lead, six-lead, four-lead, three-lead, and two-lead algorithms.
The training data contain twelve-lead ECGs. The validation and test data contain twelve-lead, six-lead, four-lead, three-lead, and two-lead ECGs.
Each ECG recording has one or more labels that describe cardiac abnormalities (and/or a normal sinus rhythm). We mapped the labels for each recording to SNOMED-CT codes. The lists of scored labels and unscored labels are given with the evaluation code; see the scoring section for details.
The Challenge data include recordings from last year’s Challenge and many new recordings for this year’s Challenge:
The Challenge data include annotated twelve-lead ECG recordings from seven sources in four countries across three continents. These databases include over 100,000 twelve-lead ECG recordings, with over 88,000 ECGs shared publicly as training data, 6,630 ECGs retained privately as validation data, and 36,266 ECGs retained privately as test data.
The first source is the China Physiological Signal Challenge in 2018 (CPSC 2018), which was held during the 7th International Conference on Biomedical Engineering and Biotechnology in Nanjing, China. This source contains two databases: the data from CPSC 2018 (the CPSC Database) and unused data from CPSC 2018 (the CPSC-Extra Database). Together, these databases contain 13,256 ECGs (10,330 ECGs shared as training data, 1,463 retained as validation data, and 1,463 retained as test data). We shared the training set and an unused dataset from CPSC 2018 as training data, and we split the test set from CPSC 2018 into validation and test sets. Each recording is between 6 and 144 seconds long with a sampling frequency of 500 Hz.
The second source is the St Petersburg INCART 12-lead Arrhythmia Database. This source contains 74 annotated ECGs (all shared as training data) extracted from 32 Holter monitor recordings. Each recording is 30 minutes long with a sampling frequency of 257 Hz.
The third source is the Physikalisch-Technische Bundesanstalt (PTB) and includes two public datasets: the PTB and the PTB-XL databases. The source contains 22,353 ECGs (all shared as training data). Each recording is between 10 and 120 seconds long with a sampling frequency of either 500 or 1,000 Hz.
The fourth source is a Georgia database, which represents a unique demographic of the Southeastern United States. This source contains 20,672 ECGs (10,344 ECGs shared as training data, 5,167 retained as validation data, and 5,161 retained as test data). Each recording is between 5 and 10 seconds long with a sampling frequency of 500 Hz.
The fifth source is an undisclosed American database that is geographically distinct from the Georgia database. This source contains 10,000 ECGs (all retained as test data).
The sixth source is the Chapman University, Shaoxing People’s Hospital (Chapman-Shaoxing) and Ningbo First Hospital (Ningbo) database [5], [6]. This source contains 45,152 ECGs (all shared as training data). Each recording is 10 seconds long with a sampling frequency of 500 Hz.
The seventh source is the UMich Database from the University of Michigan. This source contains 19,642 ECGs (all retained as test data). Each recording is 10 seconds long with a sampling frequency of either 250 Hz or 500 Hz.
As with other real-world datasets, the different databases may have different proportions of cardiac abnormalities, but all of the labels in the validation and test data are represented in the training data. Moreover, while this is a curated dataset, some of the data and labels are likely to have errors, and an important part of the Challenge is to work out these issues. In particular, some of the databases have human-overread machine labels with single or multiple human readers, so the quality of the labels varies between databases. You can find more information about the label mappings of the Challenge training data in this table.
The six-lead, four-lead, three-lead, and two-lead validation data are reduced-lead versions of the twelve-lead validation data: the same recordings with the same header data but only with signal data for the relevant leads.
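To make the lead sets concrete, here is a minimal Python sketch (ours, not part of the official example code; the helper name is hypothetical) of selecting the relevant leads from a twelve-lead signal array, assuming the leads appear in the order given in the header file:

```python
import numpy as np

# Reduced-lead sets used in the 2021 Challenge.
LEAD_SETS = {
    12: ['I', 'II', 'III', 'aVR', 'aVL', 'aVF', 'V1', 'V2', 'V3', 'V4', 'V5', 'V6'],
    6:  ['I', 'II', 'III', 'aVR', 'aVL', 'aVF'],
    4:  ['I', 'II', 'III', 'V2'],
    3:  ['I', 'II', 'V2'],
    2:  ['I', 'II'],
}

def reduce_leads(signal, recording_leads, num_leads):
    """Select the rows of a (leads x samples) signal array for a reduced lead set.

    signal: array with one row per lead; recording_leads: lead names in row order.
    """
    indices = [recording_leads.index(lead) for lead in LEAD_SETS[num_leads]]
    return np.asarray(signal)[indices, :]

# Example (hypothetical): two_lead_signal = reduce_leads(signal, LEAD_SETS[12], 2)
```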
We are not planning to release the test data at any point, including after the end of the Challenge. Requests for the test data will not receive a response. We do not release test data to prevent overfitting on the test data and claims or publications of inflated performances. We will entertain requests to run code on the test data after the Challenge on a limited basis based on publication necessity and capacity. (The Challenge is largely staged by volunteers.)
All data are provided in WFDB format. Each ECG recording uses a binary MATLAB v4 file (see page 27) for the ECG signal data and a plain text file in WFDB header format for the recording and patient attributes, including the diagnosis, i.e., the labels for the recording. The binary files can be read using the load function in MATLAB and the scipy.io.loadmat function in Python; see our MATLAB and Python example code for working examples. The first line of the header provides information about the total number of leads and the total number of samples or time points per lead, the following lines describe how each lead was encoded, and the last lines provide information on the demographics and diagnosis of the patient.
For example, a header file A0001.hea may have the following contents:
A0001 12 500 7500 05-Feb-2020 11:39:16
A0001.mat 16+24 1000/mV 16 0 28 -1716 0 I
A0001.mat 16+24 1000/mV 16 0 7 2029 0 II
A0001.mat 16+24 1000/mV 16 0 -21 3745 0 III
A0001.mat 16+24 1000/mV 16 0 -17 3680 0 aVR
A0001.mat 16+24 1000/mV 16 0 24 -2664 0 aVL
A0001.mat 16+24 1000/mV 16 0 -7 -1499 0 aVF
A0001.mat 16+24 1000/mV 16 0 -290 390 0 V1
A0001.mat 16+24 1000/mV 16 0 -204 157 0 V2
A0001.mat 16+24 1000/mV 16 0 -96 -2555 0 V3
A0001.mat 16+24 1000/mV 16 0 -112 49 0 V4
A0001.mat 16+24 1000/mV 16 0 -596 -321 0 V5
A0001.mat 16+24 1000/mV 16 0 -16 -3112 0 V6
#Age: 74
#Sex: Male
#Dx: 426783006
#Rx: Unknown
#Hx: Unknown
#Sx: Unknown
From the first line of the file, we see that the recording number is A0001 and the signal data are in the file A0001.mat. The recording has 12 leads, each sampled at 500 Hz, and contains 7,500 samples per lead. From the next 12 lines of the file (one for each lead), we see that each signal was written in 16-bit format with a byte offset of 24 bytes, the gain (analog-to-digital converter (ADC) units per physical unit) is 1000/mV, the resolution of the ADC used to digitize the signal is 16 bits, and the baseline value corresponding to 0 physical units is 0. The first value of the signal (-1716, etc.), the checksum (0, etc.), and the lead name (I, etc.) are the last three entries of each of these lines. From the final 6 lines, we see that the patient is a 74-year-old male with a diagnosis (Dx) of 426783006, which is the SNOMED-CT code for sinus rhythm. The medical prescription (Rx), history (Hx), and symptom or surgery (Sx) are unknown. Please visit the WFDB header format documentation for more information on the header file and variables.
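As a concrete illustration, the following Python sketch loads one recording and parses these header fields. It is our own example, not the official example code; we assume the signal matrix is stored under the key 'val' in the MATLAB v4 file, and the helper name is hypothetical.

```python
import scipy.io

def load_recording(record_name):
    """Load the signal matrix, basic header fields, and labels for one recording, e.g. 'A0001'."""
    # The MATLAB v4 file stores a (leads x samples) matrix of digitized samples.
    mat = scipy.io.loadmat(record_name + '.mat')
    signal = mat['val']  # assumed key, matching the example code

    with open(record_name + '.hea', 'r') as f:
        header = f.read().splitlines()

    # First line: record name, number of leads, sampling frequency (Hz), samples per lead.
    _, num_leads, frequency, num_samples = header[0].split()[:4]

    # The '#Dx:' comment line lists comma-separated SNOMED-CT codes for the labels.
    labels = []
    for line in header:
        if line.startswith('#Dx:'):
            labels = [code.strip() for code in line[len('#Dx:'):].split(',')]

    return signal, int(num_leads), float(frequency), int(num_samples), labels

# Example: signal, num_leads, frequency, num_samples, labels = load_recording('A0001')
```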
The training data can be downloaded from this page:
Please move the files to a single folder to use the Challenge algorithms with the data.
To participate in the Challenge, register your team by providing the full names, affiliations, and official email addresses of your entire team before you submit your algorithm. The details of all authors must be exactly the same as the details in your abstract submission to Computing in Cardiology. You may update your author list by completing this form again (read the form for details), but changes to your authors must not contravene the rules of the Challenge.
For each ECG recording, your algorithm must identify a set of one or more classes as well as a probability or confidence score for each class. As an example, suppose that your classifier identifies atrial fibrillation (164889003) and a first-degree atrioventricular block (270492004) with probabilities of 90% and 60%, respectively, for a particular recording, but it does not identify any other rhythm types. Your code might produce the following output for the recording:
#Record ID
164889003, 270492004, 164909002, 426783006, 59118001, 284470004, 164884008, 429622005, 164931005
1, 1, 0, 0, 0, 0, 0, 0, 0
0.9, 0.6, 0.2, 0.05, 0.2, 0.35, 0.35, 0.1, 0.1
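As an illustration, a minimal Python sketch (ours, not the official example code, which provides its own output routines) for writing this format could look like the following; the file name and helper name are hypothetical:

```python
def save_outputs(filename, record_id, classes, labels, probabilities):
    """Write one recording's outputs: record ID line, class codes, binary labels, and probabilities."""
    with open(filename, 'w') as f:
        f.write('#' + record_id + '\n')
        f.write(', '.join(classes) + '\n')
        f.write(', '.join(str(int(label)) for label in labels) + '\n')
        f.write(', '.join(str(probability) for probability in probabilities) + '\n')

# Example matching the output above (truncated to three classes for brevity):
# save_outputs('A0001.csv', 'A0001',
#              ['164889003', '270492004', '164909002'],
#              [1, 1, 0],
#              [0.9, 0.6, 0.2])
```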
We have implemented two example algorithms in MATLAB and Python as templates for successful submissions.
Both the MATLAB and Python example code extract the labels for each recording from its header (.hea) file. Please use the above example code as templates for your submissions.
Please see the submission instructions for detailed information about how to submit a successful Challenge entry. We will open scoring in January. We will provide feedback on your entry as soon as possible, so please wait at least 72 hours before contacting us about the status of your entry.
Like last year’s Challenge, we will continue to require code both for your trained model and for testing your model. If we cannot reproduce your model from the training code, then you will not be eligible for ranking or a prize.
We will run your training code on Google Cloud using 10 vCPUs, 65 GB RAM, 100 GB disk space, and an optional NVIDIA T4 Tensor Core GPU with 16 GB VRAM. Your training code has a 72 hour time limit without a GPU and a 48 hour time limit with a GPU.
We will run your trained model on Google Cloud using 6 vCPUs, 39 GB RAM, 100 GB disk space, and an optional NVIDIA T4 Tensor Core GPU with 16 GB VRAM. Your trained model has a 24 hour time limit on each of the validation and test sets.
We are using an N1 custom machine type to run submissions on GCP. If you would like to use a predefined machine type, then n1-highmem-8 is the closest predefined machine type, but it has 2 fewer vCPUs and 13 GB less RAM than our custom configuration. For GPU submissions, we use NVIDIA driver version 418.40.04.
For last year’s Challenge, we developed a new scoring metric that awards partial credit to misdiagnoses that result in similar treatments or outcomes as the true diagnosis as judged by our cardiologists. This scoring metric reflects the clinical reality that some misdiagnoses are more harmful than others and should be scored accordingly. Moreover, it reflects the fact that confusing some classes is less harmful than confusing others.
We are starting this year’s Challenge with this scoring metric, but we welcome feedback. It is defined as follows:
Let $C = [c_i]$ be a collection of diagnoses. We compute a multi-class confusion matrix $A = [a_{ij}]$, where $a_{ij}$ is the number of recordings in a database that were classified as belonging to class $c_i$ but actually belong to class $c_j$. We assign different weights $W = [w_{ij}]$ to different entries in this matrix based on the similarity of treatments or differences in risks. The score $s$ is given by $s = \sum_{i,j} w_{ij} a_{ij}$, which is a generalized version of the traditional accuracy metric. The score $s$ is then normalized so that a classifier that always outputs the true class(es) receives a score of 1 and an inactive classifier that always outputs the normal class receives a score of 0.
The scoring metric is designed to award full credit to correct diagnoses and partial credit to misdiagnoses with similar risks or outcomes as the true diagnosis. A classifier that returns only positive outputs typically receives a negative score, i.e., a lower score than a classifier that returns only negative outputs.
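As a sketch of the normalization described above (our illustration, not the official evaluation code), assume we already have the weight matrix, the confusion matrix for the submitted classifier, and the scores of the always-correct and inactive classifiers:

```python
import numpy as np

def normalized_score(weights, confusion, correct_score, inactive_score):
    """Compute s = sum_ij w_ij * a_ij and normalize it to the [inactive, correct] range."""
    observed_score = float(np.sum(np.asarray(weights) * np.asarray(confusion)))

    # A classifier that always outputs the true class(es) scores 1;
    # an inactive classifier that always outputs the normal class scores 0.
    return (observed_score - inactive_score) / (correct_score - inactive_score)
```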
The leaderboard provides the scores of successful submissions on the hidden data.
There are two phases for the Challenge: an unofficial phase and an official phase. The unofficial phase of the Challenge allows us to introduce and ‘beta test’ the data, scores, and submission system before the official phase of the Challenge. Participation in the unofficial phase is required for participation in the official phase of the Challenge because it helps us to improve the official phase.
Entrants may have an overall total of up to 15 scored entries over both the unofficial and official phases of the competition (see the below table). All deadlines occur at 11:59pm GMT on the dates mentioned below, and all dates are during 2021 unless indicated otherwise. If you do not know the difference between GMT and your local time, then find it out before the deadline!
Please submit your entries early to ensure that you have the most chances for success. If you wait until the last few days to submit your entries, then you may not receive feedback before the submission deadline, and you may be unable to resubmit your entries if there are unexpected errors or issues with your submissions. Every year, several teams wait until the last few days to submit their first entry and are unable to debug their work before the deadline.
Although we score on a first-come-first-serve basis, please note that if you submit more than one entry in a 24-hour period, your second entry may be deprioritized compared to other teams’ first entries. If you submit more than one entry in the final 24 hours before the Challenge deadline, then we may be unable to provide feedback or a score for more than one of your entries. It is unlikely that we will be able to debug any code in the final days of the Challenge.
For these reasons, we strongly suggest that you start submitting entries at least 5 days before the unofficial deadline and 10 days before the official deadline. We have found that the earlier teams enter the Challenge, the better they do because they have time to digest feedback and performance. We therefore suggest entering your submissions many weeks before the deadline to give yourself the best chance for success.
| | Start | End | Submissions |
| --- | --- | --- | --- |
| Unofficial phase | 24 December 2020 | 8 April 2021 | 1-5 scored entries (*) |
| Hiatus | 9 April 2021 | 30 April 2021 | N/A |
| Abstract deadline | 24 April 2021 | 24 April 2021 | 1 abstract |
| Official phase | 1 May 2021 | 15 August 2021 | 1-10 scored entries (*) |
| Abstract decisions released | 21 June 2021 | 21 June 2021 | N/A |
| Wild card entry date | 31 July 2021 | 31 July 2021 | N/A |
| Hiatus | 16 August 2021 | 11 September 2021 | N/A |
| Preprint deadline | 1 September 2021 | 1 September 2021 | One 4-page paper (**) |
| Conference | 12 September 2021 | 15 September 2021 | 1 presentation (***) |
| Final scores released | 16 September 2021 | 16 September 2021 | N/A |
| Final paper deadline | 23 September 2021 | 30 September 2021 | One 4-page paper (***) |
(* Entries that fail to score do not count against limits.)
(** Must include preliminary scores.)
(*** Must include final scores, your ranking in the Challenge, and any updates to your work as a result of feedback after presenting at CinC. This final paper deadline is earlier than the deadline given by CinC so that we can check these details.)
To be eligible for the open-source award, you must do all the following:
You must not submit an analysis of this year’s Challenge data to other conferences or journals until after CinC 2021 so that we can discuss the Challenge in a single forum. If we discover evidence that you have submitted elsewhere before the end of CinC 2021, then you will be disqualified and de-ranked on the website, banned from future Challenges, and the journal/conference will be contacted to request your article be withdrawn for contravention of the terms of use.
There are many reasons for this policy: 1) we do not release results on the test data before the end of CinC, and reporting results only on the training data increases the likelihood of overfitting and is not comparable to the official results on the test data, and 2) attempting to publish on the Challenge data before the Challengers present their results is unprofessional and comes across as a territorial grab. This requirement stands even if your abstract is rejected, but you may continue to enter the competition and receive scores. (However, unless you are accepted into the conference at a later date as a ‘wild card’ entry, you will not be eligible to win a prize.) Of course, any publicly available data that was available before the Challenge is exempted from this condition, but the novel aspects of the Challenge (the Challenge design, the Challenge data that you downloaded from this page because it was processed for the Challenge, the scoring function, etc.) are not exempted.
After the Challenge is over and the final scores have been posted (in late September), everyone may then submit their work to a journal or another conference. In particular, we encourage all entrants (including those who missed the opportunity to compete or attend CinC 2021) to submit extended analysis and articles to the special issue, taking into account the publications and discussions at CinC 2021.
If your abstract is rejected or if you otherwise failed to qualify during the unofficial phase, then there is still a chance to present at CinC and win the Challenge. A ‘wild card’ entry has been reserved for a high-scoring entry from a team that was unable to submit an accepted abstract to CinC by the original abstract submission deadline. A successful entry must be submitted by the wild card entry deadline. We will contact eligible teams and ask them to submit an abstract. The abstract will still be reviewed as thoroughly as any other abstract accepted for the conference. See Advice on Writing an Abstract.
To improve your chances of having your abstract accepted, we offer the following advice:
You will be notified by email from CinC in June whether your abstract has been accepted. You may not enter more than one abstract describing your work in the Challenge. We know you may have multiple ideas, and the actual abstract will evolve over the course of the Challenge. More information, particularly on discounts and scholarships, can be found here. We are sorry, but the Challenge Organizers do not have extra funds to enable discounts or funding to attend the conference.
Again, we cannot guarantee that your code will be run in time for the CinC abstract deadline, especially if you submit your code immediately before the deadline. It is much more important to focus on writing a high-quality abstract describing your work and to submit it to the conference by the abstract deadline. Please follow the instructions here carefully.
Please make sure that all of your team members are authors on your abstract. If you need to add or subtract authors, do this at least a week before the abstract deadline. Asking us to alter your team membership near or after the deadline is going to lead to confusion that could affect your score during review. It is better to be more inclusive on the abstract in terms of authorship, though, and if we find authors have moved between abstracts/teams without permission, then this is likely to lead to disqualification. As noted above, you may change the authors/team members later in the Challenge.
Please make sure that you include your team name, your official score as it appears on the leaderboard, and cross-validation results in your abstract, using the scoring metrics for this year’s Challenge (especially if you are unable to receive a score or are scoring poorly). The novelty of your approach and the rigor of your research are much more important during the unofficial phase. Please make sure you describe your technique and any novelty very specifically. General statements such as ‘a 1D CNN was used’ are uninformative and will score poorly in review.
The Organizers of the Challenge have no ability to help with any problems with the abstract submission system. We do not operate it. Please do not email us with issues related to the abstract submission system.
We encourage the use of open-source licenses for your entries.
Entries with non-open-source licenses will be scored but not ranked in the official competition. All scores will be made public. At the end of the competition, all entries will be posted publicly, and therefore automatically mirrored on several sites around the world. We have no control over these sites, so we cannot remove your code even on request. Code that the organizers deem to be functional will be made publicly available after the end of the Challenge. You can request to withdraw from the Challenge, so that your entry’s performance is no longer listed on the official leaderboard, up until a week before the end of the official phase. However, the Organizers reserve the right to publish any submitted open-source code after the official phase is over. The Organizers also retain the right to use a copy of submitted code for non-commercial use. This allows us to re-score if definitions change and to validate any claims made by competitors.
If no license is specified in your submission, then the license given in the example code will be added to your entry, i.e., we will assume that you have released your code under the BSD 3-Clause license.
To maintain the scientific impact of the Challenges, it is important that all Challengers contribute truly independent ideas. For this reason, we impose the following rules on team composition/collaboration:
If we discover evidence of the contravention of these rules, then you will be ineligible for a prize and your entry will be publicly marked as possibly associated with another entry. Although we will contact the team(s) in question, time and resources are limited, and the Organizers must use their best judgement on the matter in a short period of time. The Organizers’ decision on rule violations will be final.
CinC 2021 will take place from 12-15 September 2021 in Brno, Czech Republic. You must attend the whole conference to be eligible for prizes. If you send someone in your place who is not a team member or co-author, then you will be disqualified and your abstract will be removed from the proceedings. In particular, it is vital that the presenter (oral or poster) can defend your work and have in-depth knowledge of all decisions made during the development of your algorithm. Due to this year’s circumstances, both in-person and remote attendance are allowed. If you require a visa to attend the conference, we strongly suggest that you apply as soon as possible. Please contact the local conference organizing committee (not the Challenge Organizers) for visa sponsorship letters and for answers to any questions concerning the conference.
Due to the uncertainties around travel, we have unfortunately decided not to run the Hackathon again this year.
This year’s Challenge is generously co-sponsored by Google, MathWorks, and the Gordon and Betty Moore Foundation.
MathWorks has generously decided to sponsor this Challenge by providing complimentary licenses to all teams that wish to use MATLAB. Users can apply for a license and learn more about MATLAB support by visiting the PhysioNet Challenge page from MathWorks. If you have questions or need technical support, then please contact MathWorks at studentcompetitions@mathworks.com.
Google has generously agreed to provide Google Cloud Platform (GCP) credits for a limited number of teams for this Challenge.
At the time of launching this Challenge, Google Cloud offers multiple services for free on a one-year trial basis and $300 in cloud credits. Additionally, if teams are based at an educational institution in selected countries, then they can access free GCP training online.
Google Cloud credits will be made available to teams that requested credits when registering for the Challenge. Only one credit will be provided to one email address associated with each team, and teams must have a successful entry to the official phase of the Challenge and an accepted abstract to CinC.
The Challenge Organizers, their employers, PhysioNet and Computing in Cardiology accept no responsibility for the loss of credits, or failure to issue credits for any reason. Please note, by requesting credits, you are granting us permission to forward your details to Google for the distribution of credits. You can register for these credits during the Challenge registration process.
Supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number R01EB030362.
© PhysioNet Challenges. Website content licensed under the Creative Commons Attribution 4.0 International Public License.