Data Description

This page provides a short description of the variables extracted from a physical examination and a heart sound auscultation. Please see this page for more details.

Note that information about murmurs is provided in the subject description file (.txt file) only when a murmur is detected in at least one of the patient’s recordings. Thus, no information about murmurs is provided for healthy subjects. For such subjects, a nan (not a number) symbol is provided for the murmur-associated variables.
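As a minimal sketch of how the nan placeholder might be handled when reading these variables, the helper below converts it to `None`. The dictionary layout and the helper name `is_missing` are assumptions made for illustration and are not part of the dataset tools; only the tag names come from this page.

```python
import math


def is_missing(value):
    """Return True if a murmur-associated value is the nan placeholder."""
    if value is None:
        return True
    if isinstance(value, float):
        return math.isnan(value)
    return str(value).strip().lower() == "nan"


# Hypothetical record for a subject without a detected murmur.
record = {
    "Most audible location": "nan",
    "Systolic murmur timing": "nan",
    "Systolic murmur shape": "nan",
}

# Replace the placeholder with None so later code can test for missing values.
murmur_fields = {tag: (None if is_missing(v) else v) for tag, v in record.items()}
```

Treating the text "nan" and a floating-point NaN the same way keeps the check robust whether the file is read as raw text or through a numeric parser.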

The tag Outcome provides the pediatric cardiologist’s overall assessment of the patient’s cardiac condition:

This is a holistic diagnosis based on clinical history, physical examination, analog auscultation, echocardiogram, etc. The cardiologists did not have access to the digital auscultations and the murmur gradings when making these diagnoses, and the indicated outcomes do not necessarily imply that the expert had identified a murmur during analog auscultation. The pediatric cardiologists were not the annotators of the digital auscultations.

The tag Murmur provides one of the following outcomes:

The tag Murmurs locations provides the auscultation location(s) where at least one murmur wave has been observed. If more than one location needs to be reported, then the locations are separated by plus (+) signs. The auscultation locations are:
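The plus-separated format can be split into a list of locations as sketched below. The function name and the location codes in the example are illustrative assumptions; only the plus (+) separator and the nan placeholder come from this page.

```python
def parse_murmur_locations(value):
    """Split a plus-separated location string into a list of location codes.

    Returns an empty list when the value is empty or the nan placeholder.
    """
    text = str(value).strip()
    if not text or text.lower() == "nan":
        return []
    return [loc.strip() for loc in text.split("+") if loc.strip()]


# "AV" and "MV" are placeholder codes used only for this example.
locations = parse_murmur_locations("AV+MV")
```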

The tag Most audible location provides the auscultation location where murmur waves were most audible:

The tag Systolic murmur timing describes the timing of the murmur wave within the systolic period. The possible values are one of the following:

The tag Systolic murmur shape describes the shape of the murmur wave that has been observed in the systolic period. The shape of a murmur can be viewed as a function of murmur intensity over time. The possible outcomes for this variable are:

The tag Systolic murmur pitch describes the pitch of the systolic murmur, which reflects the pressure gradient in the heart chambers. In general, the higher the pitch, the higher the pressure gradient in the corresponding heart chamber. For example, in aortic stenosis, a large pressure gradient exists between the left ventricle and the aorta. As a result, murmurs generated by aortic stenosis generally have a high pitch. The possible values are:

The tag Systolic murmur grading describes the murmur’s intensity grade in the systolic period according to the Levine scale [12]. This feature is highly correlated with the severity of the murmur: the higher the grade, the worse the patient’s prognosis and outcome. Since not all subjects have auscultation sounds recorded from all four prominent auscultation locations, the approach adopted by the expert annotator to provide grading annotations is as follows:

Accordingly, the grade annotations can diverge from the original definition of murmur grading, when applied to cases for which not all the auscultation locations are available. In such cases, murmurs were classified by default as grade I/VI. Moreover, the cases classified as grade III/VI actually include murmurs that could potentially be of grade III/VI or higher, since discrimination among grades III/VI, IV/VI, V/VI, and VI/VI is associated with palpable murmurs, also known as thrills [13], which can only be assessed via in-person physical examination.
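Because a systolic label of III/VI may stand for any grade from III/VI to VI/VI, it can be useful to treat the grade as a range rather than a single number. The sketch below assumes that grades are stored as strings such as "III/VI", as they are written on this page; the function names are hypothetical.

```python
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}


def parse_grade(value):
    """Parse a label such as 'III/VI' into (grade, scale_max); None for nan."""
    text = str(value).strip()
    if not text or text.lower() == "nan":
        return None
    grade, scale = (part.strip().upper() for part in text.split("/"))
    return ROMAN[grade], ROMAN[scale]


def systolic_grade_bounds(value):
    """Return the (lowest, highest) Levine grade consistent with a systolic label."""
    parsed = parse_grade(value)
    if parsed is None:
        return None
    grade, scale_max = parsed
    # III/VI also covers IV/VI to VI/VI, which would require assessing thrills in person.
    if grade == 3 and scale_max == 6:
        return (3, 6)
    return (grade, grade)
```

Note that a label of I/VI assigned by default, when not all auscultation locations are available, cannot be told apart from a true grade I/VI by this parsing alone.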

The tag Systolic murmur quality describes the murmur’s quality feature from waves observed in the systolic period. It relates to the presence of harmonics and the overtones. The possible values are:

The tag Diastolic murmur timing describes the timing of the murmur wave within the diastolic period. The possible values are:

The tag Diastolic murmur shape describes the shape of the murmur wave that has been observed in the diastolic period. The possible values observed in this database are:

The tag Diastolic murmur grading describes the murmur’s grade feature from waves heard during the diastolic period. In contrast to systolic murmurs, diastolic murmurs do not follow the Levine grading scale. Instead, murmurs are graded from I to IV (instead of I to VI):

On the other hand, grade IV/IV murmurs are associated with palpable murmurs, also known as thrills [12], which can only be assessed via in-person physical examination. In this database, grades III/IV and IV/IV are merged.

The tag Diastolic murmur pitch describes the murmur’s pitch feature from waves observed in the diastolic period. The possible values are:

The tag Diastolic murmur quality describes the murmur’s quality feature from waves heard in the diastolic period. The possible values observed in this database are:

The tag Age provides the age category of the subject according to the National Institute of Child Health and Human Development (NICHD) pediatric terminology [14]. The possible values of this variable are:

The tag Sex provides the subject’s reported gender. The possible values for this variable are Male or Female, according to the reported gender of the subject at the time of data acquisition. Subjects participating in both screening campaigns (CC2014 and CC2015) reported consistent genders across both campaigns.

The tag Height is a number and corresponds to the subject’s height in centimeters (cm).

The tag Weight is a number and corresponds to the subject’s weight in kilograms (kg).

The tag Pregnancy status is a boolean variable (True/False) that identifies the subjects that were pregnant at the time of the screening campaign.

The tag Campaign identifies the screening campaign that the subject attended:

Note that the two screening campaigns took place one year apart. Given the subjects’ ages, whether any long-term dependencies exist between the data acquired from the same subjects in the two campaigns is an open question that requires further study. It is left for further evaluation by the scientific community.

The tag Additional ID provides the other identifier used by subjects who attended both screening campaigns.

Note that for subjects who attended both screening campaigns, all data from the same subject (regardless of the screening campaign) are provided in only one of the training, validation, or test sets.
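Users who create their own splits of the data may want to preserve the same property, so that two identifiers belonging to one subject never end up on different sides of a split. The following is a minimal sketch, assuming the Additional ID of each subject has already been read into a dictionary; the function names, the identifiers, and the split fraction are all hypothetical.

```python
import random
from collections import defaultdict


def group_subjects(additional_ids):
    """Group subject IDs so that linked IDs share one group key.

    additional_ids maps each subject ID to its Additional ID, or to None
    when the subject attended only one campaign.
    """
    groups = defaultdict(list)
    for subject_id, other_id in additional_ids.items():
        # The smaller of the two linked IDs serves as the shared key.
        key = subject_id if other_id is None else min(subject_id, other_id)
        groups[key].append(subject_id)
    return dict(groups)


def split_by_group(groups, held_out_fraction=0.2, seed=0):
    """Split whole groups, never individual IDs, into kept and held-out lists."""
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)
    n_held_out = int(round(len(keys) * held_out_fraction))
    held_out = [s for k in keys[:n_held_out] for s in groups[k]]
    kept = [s for k in keys[n_held_out:] for s in groups[k]]
    return kept, held_out


# Illustrative IDs only: "1001" and "2001" stand for the same subject.
links = {"1001": "2001", "2001": "1001", "1002": None, "1003": None}
groups = group_subjects(links)
kept, held_out = split_by_group(groups, held_out_fraction=0.5)
```

Splitting on group keys rather than on individual identifiers is what prevents recordings of the same subject from leaking across sets.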

The acquired audio samples were automatically segmented using the three algorithms proposed in [15], [16], and [17]. These algorithms were only used to detect and identify the fundamental heart sounds (S1 and S2 sounds) and their corresponding boundaries. Two cardiac physiologists inspected the algorithms’ outputs on mutually exclusive data (as each expert screened only one of the two campaigns). Accordingly, each expert analyzed the automated annotations and whenever the annotator disagreed with the suggested automatic annotations, a manual annotation was required. In such cases, the annotator was instructed to annotate at least five complete heart cycles. Segmentation labels were retained for sections of heart sound recordings that were considered of high quality and representative by the cardiac physiologists. The remainder of the signal may include both low and high quality data. In this way, the users of the dataset may choose to use (or not to use) the suggested time windows, where the signal quality was manually inspected, and the automated labels were validated.
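One way to restrict an analysis to the manually inspected time windows is sketched below. It assumes the segmentation labels have already been loaded as (start, end, label) tuples with times in seconds; that layout, the label names, and the function names are assumptions for illustration, since the file format is not described on this page.

```python
def annotated_span(segments):
    """Return (start, end) in seconds covering all annotated segments, or None."""
    if not segments:
        return None
    start = min(seg_start for seg_start, _, _ in segments)
    end = max(seg_end for _, seg_end, _ in segments)
    return start, end


def crop_to_annotations(samples, sample_rate, segments):
    """Keep only the samples inside the annotated span.

    When no segments are available, the samples are returned unchanged.
    """
    span = annotated_span(segments)
    if span is None:
        return samples
    start, end = span
    first = int(round(start * sample_rate))
    last = int(round(end * sample_rate))
    return samples[first:last]


# Toy signal: 4 seconds at 10 samples per second, annotated from 1.0 s to 2.5 s.
samples = list(range(40))
segments = [(1.0, 1.5, "S1"), (1.5, 2.0, "systole"), (2.0, 2.5, "S2")]
cropped = crop_to_annotations(samples, 10, segments)
```

Users who prefer to work with the full recording can skip the cropping step, keeping in mind that the remainder of the signal was not quality-checked.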


Supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number R01EB030362.

© PhysioNet Challenges. Website content licensed under the Creative Commons Attribution 4.0 International Public License.
