Validation of the Systematic COronary Risk Evaluation - Older Persons (SCORE-OP) in the EPIC-Norfolk prospective population study

Background: The Systematic COronary Risk Evaluation – Older Persons (SCORE-OP) algorithm is developed to assess 10-year risk of death due to cardiovascular disease (CVD) in individuals aged ≥65 years. We studied the performance of SCORE-OP in the European Prospective Investigation of Cancer Norfolk (EPIC-Norfolk) prospective population cohort.
Methods: 10-year CVD mortality as predicted by SCORE-OP was compared with observed CVD mortality among individuals in the EPIC-Norfolk cohort. Persons aged 65–79 years without known CVD were included in the analysis. CVD mortality was defined as death due to ischemic heart disease, cardiac failure, cerebrovascular disease, peripheral-artery disease or aortic aneurysm. Predicted 10-year CVD mortality was calculated by the SCORE-OP algorithm, and compared to observed mortality rates. The area under the receiver operator characteristics curve (AUROC) was calculated to evaluate discriminative power. Calibration was evaluated by calculating ratios of predicted vs observed mortality and by Hosmer-Lemeshow tests.
Results: A total of 6590 individuals (45.8% men), mean age 70.2 years (standard deviation 3.3), were included. The predicted mortality by SCORE-OP was 9.84% (95% confidence interval (CI) 9.76–9.92) and observed mortality was 10.2% (95% CI 9.52–11.04), ratio 0.96. AUROC was 0.63 (95% CI 0.60–0.65), and χ² was 3.3 (p = 0.92).
Conclusion: SCORE-OP overall accurately estimates the rate of CVD mortality in a general population aged 65–79 years. However, while calibration is excellent, the discriminative power of the SCORE-OP is limited, and as such cannot be readily implemented in clinical practice for this population.


Introduction
In the coming decades, the population of individuals aged 65 years and older will grow to 17% of the world's total population [1]. It is predicted that the global burden of cardiovascular diseases (CVD) will increase proportionally in this group [2]. While the effect of primary prevention is well documented in the younger population, there is increasing evidence that older individuals also benefit from primary prevention of CVD [3].
The European guideline on CVD prevention recommends using SCORE (Systematic COronary Risk Evaluation) as a decision-making tool in primary prevention [4]. However, the original SCORE charts were developed and validated only in individuals up to 65 years of age. Recently, Cooney et al. derived and validated a risk assessment function, SCORE-OP (Older Persons), for individuals over 65 years of age [5]. This risk assessment function has been externally evaluated only in a limited analysis of a small sample of individuals aged 65-69 years [6]. We therefore studied the performance of the SCORE-OP in the European Prospective Investigation of Cancer Norfolk (EPIC-Norfolk) prospective population study, a large population-based United Kingdom (UK) cohort with individuals aged up to 79 years [7]. In brief, 25,639 adults provided written informed consent for study participation. They attended a baseline health assessment and completed questionnaires about personal and family history of lifestyle including smoking status. Participants were asked whether they had any of the following conditions: diabetes mellitus, myocardial infarction or stroke (self-reported). Participants were followed up for cause-specific mortality.

Study design
In accordance with the selection criteria of the SCORE-OP algorithm, we included all participants aged 65-79 years of the EPIC-Norfolk cohort. We excluded those with a history of CVD (myocardial infarction and stroke) at baseline, and participants with missing data on SCORE-OP variables. CVD mortality was defined as death where CVD was coded as the underlying or contributing cause. CVD was defined as ischaemic heart disease (ICD-10 codes I20-I25), cardiac failure (ICD-10 codes I11, I13 and I50), cerebrovascular disease (ICD-10 codes I60-I69), peripheral artery disease (ICD-10 codes I70-I79) and aortic aneurysm (ICD-10 code I71).

Statistical methods
Baseline characteristics are summarized separately for men, women and excluded individuals, using numbers and percentages for categorical data, means and standard deviations (SD) for continuous data with a normal distribution, and medians and interquartile ranges for continuous variables with a non-normal distribution. Our main parameter of interest was predicted 10-year CVD mortality as calculated with the SCORE-OP algorithm compared to observed 10-year CVD mortality [5]. Variables included in the SCORE-OP algorithm are age, sex, systolic blood pressure, smoking status, total cholesterol, HDL cholesterol and diabetes. Correspondingly, we limited the observed mortality rates in our cohort to the first 10 years with Kaplan-Meier (KM) estimates. We evaluated SCORE-OP by using ratios of predicted and observed CVD mortality. Discriminative power of SCORE-OP was evaluated by calculating the area under the receiver operator characteristic curve (AUROC). The Hosmer-Lemeshow goodness-of-fit test based on chi-square statistics was performed to assess calibration of the SCORE-OP algorithm. In accordance with the SCORE-OP charts we stratified by sex and by age subgroups of 65-69, 70-74 and 75-79 years. In addition, we stratified the study population by groups of 2% increments in SCORE-OP risk, and analyzed differences of the SCORE-OP performance in these risk groups by calculating ratios of predicted and observed 10-year CVD mortality. A sensitivity analysis of the SCORE-OP was performed on normotensive (systolic blood pressure ≤ 140 mmHg) and hypertensive (systolic blood pressure > 140 mmHg) individuals. SCORE-OP also provides coefficients for 5-year CVD mortality prediction [5]. We therefore compared the performance of 5-year SCORE-OP with observed 5-year CVD mortality (KM estimate), and evaluated with ratios of predicted and observed mortality, in addition to evaluating its discriminative power (AUROC) and calibration (Hosmer-Lemeshow).
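The three core quantities described above (KM-based observed mortality, the predicted/observed ratio and the AUROC) can be illustrated with a minimal Python sketch. This is not the code used in the study: the function names and the eight-person toy dataset are hypothetical, anything other than a CVD death is treated as censoring, and the AUROC is computed on a simple died/did-not-die indicator that ignores censoring.

```python
from typing import Sequence


def km_cumulative_mortality(times: Sequence[float], events: Sequence[int],
                            horizon: float) -> float:
    """Kaplan-Meier estimate of cumulative mortality (1 - survival) at `horizon`.

    times  : follow-up time per person, in years
    events : 1 if the person died of CVD at that time, 0 if censored
    """
    survival = 1.0
    # Distinct event times up to the horizon, in ascending order.
    event_times = sorted({t for t, e in zip(times, events) if e == 1 and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - deaths / at_risk
    return 1.0 - survival


def auroc(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random case has a higher predicted risk than a
    random non-case; tied pairs count as one half."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy cohort (not EPIC-Norfolk data).
times = [2, 4, 4, 6, 10, 10, 10, 10]
events = [1, 0, 1, 0, 0, 0, 0, 0]
predicted = [0.30, 0.10, 0.35, 0.20, 0.15, 0.25, 0.40, 0.25]

observed_10y = km_cumulative_mortality(times, events, horizon=10)
mean_predicted = sum(predicted) / len(predicted)
ratio = mean_predicted / observed_10y   # > 1 overestimation, < 1 underestimation
discrimination = auroc(predicted, events)
```

A ratio close to 1 with an AUROC close to 0.5 is possible: the first summarizes agreement at group level, the second the ranking of individuals.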
We also compared the predicted 10-year CVD mortality as calculated using SCORE low-risk with SCORE-OP. Although the SCORE low-risk algorithm has not been developed and validated for individuals older than 65 years, we evaluated the performance in the same manner as SCORE-OP in the different age-sex groups (predicted/observed ratios, discrimination, and calibration) to compare the performance of both algorithms. Differences in discriminative power between SCORE low-risk and SCORE-OP were compared using the C-statistic.
To assess the clinical impact of SCORE-OP on the initiation of preventive therapies, we calculated the percentage of individuals above the 5% and 10% 10-year CVD mortality risk threshold for both the SCORE-OP and SCORE low-risk algorithms [4,5].

Results
The study population consisted of 8145 participants aged 65-79 years. A total of 1555 participants were excluded due to a history of CVD (n = 665), missing data on baseline CVD (n = 13) or missing data for the SCORE-OP variables (n = 877), leaving 6590 participants eligible for analysis (Fig. 1). Mean age was 70.2 years (SD 3.3), 45.8% were men and 8.3% were current smokers. Mean body mass index was 26.5 kg/m² (SD 3.7), mean total cholesterol was 6.4 mmol/l (SD 1.2), and mean LDL cholesterol was 4.2 mmol/l (SD 1.1). Excluded cases showed a 4.3% higher incidence of diabetes mellitus (Table 1). Table 2 presents predicted 10-year CVD mortality according to the SCORE-OP algorithm and observed 10-year CVD mortality. In the total population the predicted CVD mortality was 9.84% (95% CI 9.76-9.92) whereas observed CVD mortality (KM estimate) was 10.2% (95% CI 9.52-11.04), yielding a ratio of 0.96. Goodness-of-fit for the SCORE-OP algorithm was excellent with a χ² of 3.26 (p = 0.92). Discriminative performance was limited, with an AUROC of 0.63 (95% CI 0.60-0.65).
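The headline figures in this paragraph can be re-derived with a few lines of Python. The sketch below checks the participant flow, the predicted/observed ratio and the p value of the reported χ². The 8 degrees of freedom are an assumption (the usual Hosmer-Lemeshow convention of 10 risk groups minus 2), since the text does not state the number of groups; the helper name is hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with an even number
    of degrees of freedom, using the closed-form finite sum."""
    if df <= 0 or df % 2:
        raise ValueError("closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# Participant flow as reported in the text.
excluded = 665 + 13 + 877
included = 8145 - excluded

# Predicted versus observed 10-year CVD mortality (both in percent).
ratio = 9.84 / 10.2

# ASSUMPTION: df = 8, i.e. 10 risk groups minus 2 (not stated in the text).
p_value = chi2_sf_even_df(3.26, df=8)

print(excluded, included, round(ratio, 2), round(p_value, 2))
```

Under the assumed 8 degrees of freedom, the computed p value rounds to the reported 0.92.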

Performance of SCORE-OP 10-year predicted cardiovascular mortality
In men and women, the predicted 10-year CVD mortality versus observed CVD mortality ratio was 0.92 and 1.004, respectively. Goodness-of-fit for the SCORE-OP algorithm was excellent in both men and women with a χ² of 13.27 (p = 0.10) and 10.03 (p = 0.26), respectively. Discriminative performance was limited in both groups with an AUROC of 0.60 (95% CI 0.57-0.63) in men and 0.58 (95% CI 0.54-0.62) in women.
When analyzed according to age-sex groups, SCORE-OP underestimated CVD mortality in all groups, with the exception of men and women aged 65-69 years (Fig. 2). In men and women aged 65-69 years, predicted 10-year CVD mortality versus observed CVD mortality yielded a ratio of 1.29 and 1.46, respectively. Goodness-of-fit for the SCORE-OP algorithm in men and women aged 65-69 was excellent; however, discriminative performance was severely limited with an AUROC of 0.54 (95% CI 0.49-0.60) in men and 0.49 (95% CI 0.41-0.56) in women (Table 2). In both men and women aged 70-74 and 75-79 years, SCORE-OP showed a similar magnitude of underestimation. In men and women aged 70-74 years, predicted 10-year CVD mortality versus observed CVD mortality yielded a ratio of 0.78 and 0.85, respectively. Goodness-of-fit for the algorithm showed excellent calibration with a χ² of 7.93 (p = 0.44) in men and 6.52 (p = 0.59) in women. Discriminative performance in men and women of the same age was severely limited with an AUROC of 0.52 (95% CI 0.47-0.56) and 0.46 (95% CI 0.41-0.52), respectively. In men and women aged 75-79 years, predicted 10-year CVD mortality versus observed CVD mortality yielded a ratio of 0.73 and 0.66, respectively. Goodness-of-fit for the algorithm remained excellent, with a χ² of 2.66 (p = 0.95) in men and 6.28 (p = 0.62) in women, while discriminative performance was severely limited with an AUROC of 0.47 (95% CI 0.39-0.55) in men and 0.55 (95% CI 0.46-0.65) in women. e-Fig. 1 presents the ratios of predicted 10-year CVD mortality by SCORE-OP and observed CVD mortality in SCORE-OP risk groups of 2% increments. Prediction was most accurate in men and women with a risk score between 8 and 10%, yielding a ratio of 1.02. In the risk group of 6 to 8% SCORE-OP overestimated risk by 14%, whereas in the other risk groups it underestimated CVD mortality. In the risk groups of 2 to 4% and 18 to 20%, underestimation was nearly 50%. However, these groups consisted of a very limited number of individuals (2-4% n = 15, 18-20% n = 29).
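The stratification by 2% risk increments can be sketched as follows. This is a hypothetical illustration, not the study's code: it uses crude death proportions rather than KM estimates, and the function name and toy data are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence, Tuple


def ratio_by_risk_band(
    risks_pct: Sequence[float], died: Sequence[int], width_pct: float = 2.0
) -> Dict[Tuple[float, float], Tuple[int, Optional[float]]]:
    """Group people into bands of predicted risk (in percent) and return,
    per band, the group size and the predicted/observed mortality ratio.
    The ratio is None when no deaths were observed in the band."""
    bands = defaultdict(lambda: {"n": 0, "predicted": 0.0, "deaths": 0})
    for risk, death in zip(risks_pct, died):
        lower = int(risk // width_pct) * width_pct   # e.g. 8.5 -> 8.0
        band = bands[lower]
        band["n"] += 1
        band["predicted"] += risk
        band["deaths"] += death

    result = {}
    for lower in sorted(bands):
        band = bands[lower]
        mean_predicted = band["predicted"] / band["n"]
        observed = 100.0 * band["deaths"] / band["n"]   # crude proportion, not KM
        ratio = mean_predicted / observed if observed > 0 else None
        result[(lower, lower + width_pct)] = (band["n"], ratio)
    return result


# Hypothetical toy data: ten people in the 8-10% band, two in the 2-4% band.
risks = [8.5, 9.5] * 5 + [2.5, 3.5]
deaths = [1] + [0] * 9 + [0, 1]
by_band = ratio_by_risk_band(risks, deaths)
```

In the toy data a single death in the two-person band drives its ratio far below 1, which mirrors the caution above about bands with very few individuals.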

Performance of the SCORE-OP versus SCORE low-risk in predicting 10-year cardiovascular mortality
When calculated for the total population aged 65-79 years, SCORE low-risk performed worse than SCORE-OP. Predicted CVD mortality was 7.61% (95% CI 7.49-7.73) whereas observed CVD mortality was 10.20% (95% CI 9.52-11.04), yielding a ratio of 0.75 compared to a ratio of 0.96 with SCORE-OP. The AUROC was 0.66 (95% CI 0.64-0.69) vs 0.63 (95% CI 0.60-0.65) and the χ² was 13.65 (p = 0.09) vs 3.26 (p = 0.92) in SCORE low-risk and SCORE-OP, respectively. There was a significant difference between the AUROCs of both algorithms (χ² 9.97, p ≤ 0.01). SCORE low-risk also performed worse than SCORE-OP in men and women separately, with SCORE low-risk ratios of 0.67 and 0.83 and SCORE-OP ratios of 0.92 and 1.00, respectively.
With a cut-off point of ≥10% risk of 10-year mortality, 41% (2708/6590) of all older individuals were above this level according to SCORE-OP in contrast to 22% (1466/6590) according to SCORE low-risk.
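As a quick arithmetic check, the SCORE low-risk ratio and the two threshold percentages can be recomputed from the counts and rates reported in this section. The helper name `pct` is hypothetical.

```python
def pct(numerator: int, denominator: int) -> float:
    """Share of `numerator` in `denominator`, expressed in percent."""
    return 100.0 * numerator / denominator


# Individuals at or above the 10% threshold, out of 6590 included.
score_op_share = pct(2708, 6590)
score_low_share = pct(1466, 6590)

# SCORE low-risk: predicted versus observed 10-year CVD mortality (percent).
score_low_ratio = 7.61 / 10.20

print(round(score_op_share), round(score_low_share), round(score_low_ratio, 2))
```

The recomputed values agree with the reported 41%, 22% and 0.75.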

Discussion
In this validation study of the SCORE-OP algorithm in the EPIC-Norfolk cohort, we found that in a general population aged 65-79 years, SCORE-OP overall accurately estimates the rate of CVD mortality. While calibration was excellent, discriminative power was limited, both for the prediction of 5- and 10-year CVD mortality. When looking at sexes separately, point estimates of predicted and observed 10-year CVD mortality were accurate and the algorithm was well calibrated, but discriminative power was markedly limited. SCORE-OP overestimated CVD mortality in the younger (65-69) age-sex groups and underestimated it in the older (70-79) age-sex groups. These aspects should be addressed before widespread use of SCORE-OP in clinical practice is recommended. SCORE-OP is developed for individual CVD risk prediction [5]. Therefore, the limited discrimination in our external validation study warrants attention. In the original paper by Cooney et al., discriminative performance showed an AUROC of 0.74 in the overall population of 20,825 European individuals aged 65 years and over, and was comparable with their simulated external validation, which also reported an AUROC of 0.74 [5]. This is in contrast to our findings showing an AUROC of 0.63 in the overall population aged 65-79 years. Several factors could have influenced our contrasting findings. First, when analyzing our data according to age-sex subgroups, we found a complex interplay between predicted and observed CVD mortality. In individuals aged 65-69 years, SCORE-OP overestimated 10-year CVD mortality, whereas in individuals aged 70-79 years a considerable underestimation was observed. Second, a well fitted model can have poor discrimination [8]. This is due to the influence of population disease prevalence on which the model is developed and the individual risk estimation which is leading in the discriminative performance. However, limited discrimination does not translate into low accuracy per se.
Third, in our sensitivity analysis on systolic blood pressure, we found that SCORE-OP overestimates CVD mortality in normotensive individuals (≤140 mmHg), and underestimates it in hypertensive individuals (>140 mmHg). In addition, the discriminative performance was limited in both groups. This implies that also when taking an additional contributing risk factor into account, the model does not gain discriminative accuracy. This was also confirmed when SCORE-OP was analyzed according to separate risk groups. We found that a higher SCORE-OP risk score does not necessarily lead to more accurate estimation. However, the ratios of predicted and observed 10-year CVD mortality in the lower (2-4%) and higher risk groups (16-18% and 18-20%) could have been influenced by the low number of included individuals. The current European CVD prevention guideline suggests preventive treatment in case of ≥5% risk of 10-year CVD mortality [4]. When calculated by the SCORE-OP algorithm, virtually all older individuals (98%) exceeded the 5% treatment threshold; above 70 years every individual had a risk ≥5%. Using an arbitrary threshold of 10% risk, this number was reduced to 41% of the total population [5]. With such exceedingly high numbers of individuals at high risk, using a risk assessment tool to determine whether preventive therapies should be initiated is of limited added value in clinical practice, and potentially leads to a significant overtreatment of older adults.
The majority of studies referred to in the European CVD prevention guideline on preventive treatment were performed in adults up to 65 years [4]. The treatment recommendations are therefore not directly transferable to individuals above 65 years. The guideline describes the potential benefits of cholesterol lowering therapy in primary prevention in the older population, but extensive evidence-based recommendations are lacking. Nevertheless, in secondary prevention, treatment benefits of statins have been shown to be similar in elderly (>65 years) as compared to middle-aged individuals [9,10]. In addition, blood pressure treatment in the very old (>80 years) has been found to be beneficial in reducing the risk of CVD [11]. In a recent study of a nurse-led multicomponent primary CVD prevention program in older adults aged 70 to 78 years, positive results on systolic blood pressure (2.39 mmHg (95% CI 0.87-3.90)) and cigarette smoking (−1.85 (95% CI −3.36-0.35)) were found in the intervention group [12]. Nevertheless, the intervention did not affect the SCORE-OP risk profile at six years follow-up. With the increasing possibilities to predict CVD risk in older adults, there is an increasing need for thorough evidence on CVD risk factor management to guide clinicians in clinical decision making.
In contrast to our findings, Brotons et al. found that in a Spanish population (N = 974) aged 65-69 years, SCORE-OP estimated lower rates of 10-year CVD mortality as compared with SCORE low-risk [6]. Our findings show a higher risk estimation by SCORE-OP. The contrast in findings could be explained by the different statistical approaches, where Brotons et al. performed an analysis chiefly consisting of Kappa values between both algorithms, whereas we rigorously evaluated the overall population and relevant subgroups, calculating and comparing both calibration and discriminative performance.
Although CVD mortality is a hard and currently a leading outcome in risk estimation models, morbidity is at least as important due to the individual and societal impact [4,13]. In the current study we focused on the validation of the SCORE-OP tool for risk estimation of 10-year CVD mortality, and we did not assess CVD morbidity. We have previously demonstrated that a complex relationship exists between CVD mortality and morbidity when analyzed according to age and sex beyond the scope of the SCORE charts [14]. Ratios of morbidity to mortality are especially high in younger individuals and in women, but decrease with increasing age. Therefore, we also do not recommend applying the fixed multiplier (3×) as suggested by the European CVD prevention guideline in older individuals to calculate total CVD morbidity and mortality rates from calculated mortality rates alone [4].

Strengths and limitations
There are several strengths to our study. First, we used the EPIC-Norfolk cohort as a representative cohort for low-risk countries according to the European Society of Cardiology [15]. Of this cohort, 6590 adults aged 65-79 years were eligible for our study and more than half of the included individuals were women. Second, we were able to compare the performance of the SCORE-OP with the current risk algorithm (SCORE low-risk), which has been previously validated in this population [16]. Finally, we were able to provide insight into the nuanced differences in performance of the SCORE-OP algorithm in the overall population and in different subgroups, using a thorough statistical approach.
When interpreting the results of our study, some aspects should be taken into account. First, we excluded approximately 10% of cases from the dataset due to missing SCORE-OP variables. We compared demographics in the missing cases with the baseline demographics of included cases and except for 4.3% more cases with diabetes mellitus among excluded cases, we did not find significant differences. Second, the prevalence of diabetes mellitus was low in our overall study population (3.1%). This can be partly explained by the excluded cases with missing values on the SCORE-OP algorithm and the exclusion of individuals with a history of CVD. Nevertheless, compared to the prevalence of diabetes mellitus in the population of the original validation cohort (7%), our prevalence was lower, which could have influenced the CVD risk estimation by the SCORE-OP [5]. Third, the EPIC-Norfolk cohort is limited to individuals aged up to 79 years, and therefore we were not able to study the performance of the SCORE-OP in the very old population (≥80 years) as was performed in the internal validation study of Cooney et al. [5]. Fourth, we did not compare our results with the performance of other well known risk algorithms, such as the Framingham and QRISK2 risk scores, algorithms that have incorporated interaction terms for age and other risk factors to adjust the risk scores for use in older adults [13]. This could provide further insight on alternative instruments with a more accurate performance in older individuals. Neither were we able to evaluate the effect of therapeutic strategies (initiation of lifestyle interventions and drug therapy) on cardiovascular mortality in our population due to a lack of data on these interventions after baseline data collection. 
Finally, although the ICD-10 codes of the outcomes in our study were mainly similar to the ICD-9 codes that were included in the original SCORE study, there are a few differences which could have contributed to a potential lower number of outcome events in our study [16].

Conclusion
The SCORE-OP algorithm overall accurately estimates the rate of CVD mortality in a general population aged 65-79 years. However, while calibration was excellent, discriminative power was limited, both for the 5-year and the 10-year predictions. Therefore, SCORE-OP cannot readily be implemented in clinical practice in this population. Further development and testing of the SCORE-OP to improve CVD risk stratification in older individuals is warranted.

Grant support
The EPIC-Norfolk study is funded by Cancer Research UK (14136) and the Medical Research Council (G1000143). This study is financed by the Netherlands Organisation for Scientific Research (NWO) (023.008.024).

Conflicts of interest
The funder had no role in the development and publication of the manuscript.