Data Description

This page provides a short description of the variables extracted from a physical examination and a heart sound auscultation. Please see this page for more details.

Note that information about murmurs is provided in the subject description file (.txt file) only when a murmur is detected in at least one of the patient’s recordings. Thus, no information about murmurs is provided for healthy subjects. For such subjects, a nan (not a number) symbol is provided for the murmur-associated variables.
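As a minimal sketch of how the nan placeholder might be handled when reading a subject's tags, the helper below drops every tag whose value is nan. The dictionary layout, the function names (is_missing, drop_missing), and the example record are assumptions made for illustration; they are not part of the dataset or its tooling.

```python
import math


def is_missing(value):
    """Return True if a tag value is the nan placeholder (text or float)."""
    if value is None:
        return True
    if isinstance(value, float):
        return math.isnan(value)
    return str(value).strip().lower() == "nan"


def drop_missing(record):
    """Keep only the tags that carry an actual value."""
    return {tag: value for tag, value in record.items() if not is_missing(value)}


# Hypothetical record for a subject without a detected murmur.
example = {
    "Murmur": "Absent",
    "Systolic murmur timing": "nan",
    "Systolic murmur shape": float("nan"),
}
cleaned = drop_missing(example)
```

Checking for both the text "nan" and a floating-point NaN covers the two forms the placeholder is likely to take, depending on how the file was loaded.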

The tag Outcome provides the pediatric cardiologist’s overall assessment of the patient’s cardiac condition:

Normal: no referral or indication for treatment.
Abnormal: referral or indication for treatment.

This is a holistic diagnosis based on clinical history, physical examination, analog auscultation, echocardiogram, etc. The cardiologists did not have access to the digital auscultations and the murmur gradings when making these diagnoses, and the indicated outcomes do not necessarily imply that the expert had identified a murmur during analog auscultation. The pediatric cardiologists were not the annotators of the digital auscultations.

The tag Murmur provides one of the following outcomes:

Present: murmur waves were detected in at least one heart sound recording.
Absent: murmur waves were not detected in any heart sound recording.
Unknown: the presence or absence of murmurs is unclear.

The tag Murmurs locations provides the auscultation location(s) where at least one murmur wave has been observed. If more than one location needs to be reported, then the locations are separated by plus (+) signs. The auscultation locations are:

PV: pulmonary valve
TV: tricuspid valve
AV: aortic valve
MV: mitral valve
Phc: any other location
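The plus-separated location format described above can be split into individual codes with a few lines of Python. This is an illustrative sketch only: the function name split_locations and the decision to return an empty list for a nan value are assumptions, not part of the dataset's own code.

```python
# Location codes and their meanings, as listed in this description.
KNOWN_LOCATIONS = {
    "PV": "pulmonary valve",
    "TV": "tricuspid valve",
    "AV": "aortic valve",
    "MV": "mitral valve",
    "Phc": "any other location",
}


def split_locations(value):
    """Split a value such as 'AV+MV' into a list of location codes."""
    text = "" if value is None else str(value).strip()
    if text == "" or text.lower() == "nan":
        return []  # assumption: no murmur information for this subject
    codes = [code.strip() for code in text.split("+")]
    unknown = [code for code in codes if code not in KNOWN_LOCATIONS]
    if unknown:
        raise ValueError(f"unrecognised location code(s): {unknown}")
    return codes
```

For instance, split_locations("AV+MV") returns the two codes AV and MV in the order they were written, while an unexpected code raises an error rather than passing silently.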

The tag Most audible location provides the auscultation location where murmur waves were most audible:

PV: pulmonary valve
TV: tricuspid valve
AV: aortic valve
MV: mitral valve
Phc: any other auscultation location

The tag Systolic murmur timing describes the timing of the murmur wave within the systolic period. The possible values are one of the following:

Early-systolic: a murmur has been observed at the beginning of the systolic period.
Mid-systolic: a murmur has been observed at the middle of the systolic period.
Late-systolic: a murmur has been observed at the ending of the systolic period.
Holosystolic: a murmur has been observed over the whole systolic period.

The tag Systolic murmur shape describes the shape of the murmur wave that has been observed in the systolic period. The shape of a murmur can be viewed as a function of murmur intensity over time. The possible outcomes for this variable are:

Crescendo: the amplitude of the murmur wave increases over time.
Decrescendo: the amplitude of the murmur wave decreases over time.
Diamond: the amplitude of the murmur wave first increases for some time but then decreases for the rest of the time period.
Plateau: the amplitude of the murmur wave stays approximately constant over the whole period.

The tag Systolic murmur pitch describes the pitch of the murmur, which reflects the pressure gradient in the heart chambers. In general, the higher the pitch, the higher the pressure gradient in the corresponding heart chamber. For example, in aortic stenosis there is a large pressure gradient between the left ventricle and the aorta. As a result, murmurs generated by aortic stenosis generally have a high pitch. The possible values are:

High
Medium
Low

The tag Systolic murmur grading describes the murmur’s intensity grade in the systolic period according to the Levine scale [12]. This feature is highly correlated with the severity of the murmur: the higher the grade, the worse the patient’s prognosis and outcome. Since not all subjects have auscultation sounds recorded from all four prominent auscultation locations, the expert annotator adopted the following approach to provide grading annotations:

Grade I/VI: if barely audible and not heard/present or not recorded in all auscultation locations.
Grade II/VI: if soft, but easily heard in all auscultation locations.
Grade III/VI: if moderately loud or loud. In this dataset, grade III/VI denotes all grade III/VI and above (IV/VI, V/VI, and VI/VI).

Accordingly, the grade annotations can diverge from the original definition of murmur grading, when applied to cases for which not all the auscultation locations are available. In such cases, murmurs were classified by default as grade I/VI. Moreover, the cases classified as grade III/VI actually include murmurs that could potentially be of grade III/VI or higher, since discrimination among grades III/VI, IV/VI, V/VI, and VI/VI is associated with palpable murmurs, also known as thrills [13], which can only be assessed via in-person physical examination.
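Because grade III in this dataset stands for "grade III or above", it can help to carry that fact along when converting grades to numbers. The sketch below is one possible way to do this. It assumes the value is written as a Roman-numeral fraction such as II/VI, optionally preceded by the word Grade; the function name parse_grade and the returned tuple layout are hypothetical. It also accepts the I to IV scale that this description uses for diastolic murmurs.

```python
ROMAN = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}


def parse_grade(value):
    """Convert e.g. 'II/VI' to (level, scale_max, is_lower_bound).

    is_lower_bound is True for grade III, which in this dataset
    denotes grade III or any higher grade. Returns None for nan.
    """
    text = str(value).strip()
    if text.lower() == "nan":
        return None
    if text.lower().startswith("grade"):
        text = text[len("grade"):].strip()
    level, separator, scale = text.partition("/")
    if not separator or level not in ROMAN or scale not in ROMAN:
        raise ValueError(f"unrecognised grade: {value!r}")
    level_number = ROMAN[level]
    return level_number, ROMAN[scale], level_number >= 3
```

Keeping the lower-bound flag separate from the numeric level avoids treating a merged grade III as if it were an exact measurement.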

The tag Systolic murmur quality describes the murmur’s quality feature from waves observed in the systolic period. It relates to the presence of harmonics and the overtones. The possible values are:

Blowing
Harsh
Musical

The tag Diastolic murmur timing describes the timing of the murmur wave within the diastolic period. The possible values are:

Early-diastolic: a murmur has been observed at the beginning of the diastolic period.
Mid-diastolic: a murmur has been observed at the middle of the diastolic period.
Holodiastolic: a murmur has been observed over the whole diastolic period.

The tag Diastolic murmur shape describes the shape of the murmur wave that has been observed in the diastolic period. The possible values observed in this database are:

Decrescendo: the amplitude of the murmur wave decreases over time.
Plateau: the amplitude of the murmur wave stays approximately constant over the whole period.

The tag Diastolic murmur grading describes the murmur’s grade feature from waves heard during the diastolic period. In contrast to systolic murmurs, diastolic murmurs do not follow the Levine grading scale. Instead, murmurs are graded from I to IV (instead of I to VI):

Grade I/IV: if barely audible and not heard/present or not recorded in all auscultation locations
Grade II/IV: if soft, but easily heard in all auscultation locations
Grade III/IV: if moderately loud or loud

Grade IV/IV, on the other hand, is associated with palpable murmurs, also known as thrills [12], which can only be assessed via in-person physical examination. In this database, grades III/IV and IV/IV are merged.

The tag Diastolic murmur pitch describes the murmur’s pitch feature from waves observed in the diastolic period. The possible values are:

High
Medium
Low

The tag Diastolic murmur quality describes the murmur’s quality feature from waves heard in the diastolic period. The possible values observed in this database are:

Blowing
Harsh

The tag Age provides the age category of the subject according to the National Institute of Child Health and Human Development (NICHD) pediatric terminology [14]. The possible values of this variable are:

Neonate: birth to 27 days old
Infant: 28 days old to 1 year old
Child: 1 to 11 years old
Adolescent: 12 to 18 years old
Young Adult: 19 to 21 years old

The tag Sex provides the subject’s reported gender. The possible values for this variable are Male or Female, according to the reported gender of the subject at the time of data acquisition. Subjects participating in both screening campaigns (CC2014 and CC2015) reported consistent genders across both campaigns.

The tag Height is a number and corresponds to the subject’s height in centimeters (cm).

The tag Weight is a number and corresponds to the subject’s weight in kilograms (kg).

The tag Pregnancy status is a boolean variable (True/False) that identifies the subjects that were pregnant at the time of the screening campaign.

The tag Campaign identifies the screening campaign that the subject attended:

CC2014: the 2014 screening campaign
CC2015: the 2015 screening campaign

Note that the two screening campaigns are one year apart. Given the subjects’ ages, it remains an open question whether any long-term dependencies can be found in the data acquired from the same subjects across the two campaigns. This is left for further evaluation by the scientific community.

The tag Additional ID provides the other identifier used by subjects who attended both screening campaigns.

Note that for subjects who attended both screening campaigns, all data from the same subject (regardless of the screening campaign) is placed in only one of the training, validation, or test sets.
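Users who build their own splits may want to preserve the same property, so that one person's recordings never land in two different sets. The sketch below groups identifiers that refer to the same person using the Additional ID tag. The input layout (a dictionary from subject identifier to Additional ID, with None or nan when there is no second identifier), the function name group_by_person, and the example identifiers are all hypothetical.

```python
from collections import defaultdict


def group_by_person(additional_ids):
    """Group subject identifiers that belong to the same person.

    additional_ids maps each subject identifier to its Additional ID,
    or to None / 'nan' if the subject attended only one campaign.
    Returns a sorted list of sorted identifier lists.
    """
    groups = defaultdict(set)
    for subject_id, other_id in additional_ids.items():
        members = {subject_id}
        if other_id is not None and str(other_id).strip().lower() != "nan":
            members.add(str(other_id).strip())
        # Both identifiers of a pair share the same smallest member,
        # so they end up under the same key.
        groups[min(members)].update(members)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical identifiers, for illustration only.
example = {"1001": "2002", "2002": "1001", "3003": None}
person_groups = group_by_person(example)
```

Each resulting group can then be assigned as a whole to a single set. This sketch only handles pairs of identifiers, which matches the two campaigns described here.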

The acquired audio samples were automatically segmented using the three algorithms proposed in [15], [16], and [17]. These algorithms were only used to detect and identify the fundamental heart sounds (S1 and S2 sounds) and their corresponding boundaries. Two cardiac physiologists inspected the algorithms’ outputs on mutually exclusive data (as each expert screened only one of the two campaigns). Accordingly, each expert analyzed the automated annotations and whenever the annotator disagreed with the suggested automatic annotations, a manual annotation was required. In such cases, the annotator was instructed to annotate at least five complete heart cycles. Segmentation labels were retained for sections of heart sound recordings that were considered of high quality and representative by the cardiac physiologists. The remainder of the signal may include both low and high quality data. In this way, the users of the dataset may choose to use (or not to use) the suggested time windows, where the signal quality was manually inspected, and the automated labels were validated.

Supported by the National Institute of Biomedical Imaging and Bioengineering (NIBIB) under NIH grant number R01EB030362.
