Identifying patients at high risk of death makes individual decision making more efficient, optimizing resources and improving the quality of medical care. The prognostic utility of APACHE II, SOFA, and CURB-65 in critical COVID-19 has not yet been determined.
Objective
The present work aims to validate these scoring systems for the prediction of death within 60 days in patients with COVID-19 hospitalized in intensive care.
Methods
A prospective cohort study was conducted that included adults with confirmed COVID-19 hospitalized in the ICU. The scores were evaluated by building ROC curves, calculating the areas under the curve, and performing decision curve analysis.
Measurements
The operating characteristics and Kaplan–Meier curves were calculated.
Results
A total of 320 patients were included between July and December 2020; mortality within 60 days was 49.7%. CURB-65 had an AUC of 0.68 (CI 0.62–0.74), sensitivity 73.6%, and specificity 55.9%; APACHE II had an AUC of 0.65 (CI 0.60–0.71), sensitivity 51.6%, and specificity 70.2%; and SOFA had an AUC of 0.70 (CI 0.64–0.75), sensitivity 83.6%, and specificity 52.2%. The three scoring systems obtained p<0.001 in the log-rank test of the survival curves and offered moderate increments in net benefit.
Conclusion
The clinical prediction scores CURB-65, APACHE II, and SOFA exhibited moderate discriminatory ability for death within 60 days in patients with COVID-19 hospitalized in intensive care; at the optimal cut-off, their discriminatory power was adequate.
Since the COVID-19 pandemic was declared on 11 March 2020, intensive care units (ICUs) worldwide have experienced sudden surges in cases of acute respiratory failure caused by the virus, which exceeded the response capacity of hospital systems.1 Between 5% and 32% of patients hospitalized with COVID-19 worldwide have required ICU care.2 Mechanical ventilation is a support measure for critical patients in intensive care units; however, patients who require it have a mortality rate of 27%, which varies with comorbidities and other clinical conditions, such as hypoxic respiratory failure, in which mortality reaches up to 50%.3
The APACHE II4 and SOFA5 scoring systems have been adapted and used to assess the risk of death in patients hospitalized in intensive care, while CURB-65 was designed to assess the risk of death in patients with community-acquired pneumonia.6 A 2010 meta-analysis of 23 studies with 22753 participants found that CURB-65≥2 had an odds ratio (OR) of 6.4 for mortality in patients with community-acquired pneumonia; the pooled data showed a sensitivity of 62%, specificity of 79%, positive predictive value (PPV) of 24%, and negative predictive value (NPV) of 95%.7 Another meta-analysis of 40 studies reported a pooled AUC of 0.80 for CURB-65 for mortality within 30 days.8 The PROWESS study, an analysis of 275 patients with pneumonia hospitalized in the ICU, reported a C-statistic of 0.66 for CURB-65≥2 and of 0.64 for APACHE II≥25 for death within 28 days.9 A study of 406 Japanese patients with pneumonia found that SOFA had an AUC of 0.769 in patients with community-acquired pneumonia.10
Identifying patients at high risk of death in the ICU allows support to be matched to individual requirements, optimizing resources and improving the quality of medical care. Currently, there is insufficient information on the utility of APACHE II, SOFA, and CURB-65 for the prognosis of death in critical COVID-19 patients hospitalized in the ICU in the Colombian population. The purpose of the present work was to determine the utility of these scoring systems for the prognosis of death, including adjustments for other clinically important variables, in patients hospitalized in the ICU of a hospital in Bogotá.
Materials and methods
An analytic prospective cohort study was conducted. It included patients aged 18 years or older hospitalized in intensive care for severe COVID-19, confirmed by real-time polymerase chain reaction testing for SARS-CoV-2, between July and December 2020 in one of the nine intensive care units of the Integrated Subnet of Health Services of the South – Hospital El Tunal of Bogotá. Patients who had stayed more than 72 h in an ICU at another institution, pregnant women, patients with conditions that determined a limited life expectancy, and patients who died within the first 24 h after admission were excluded.
Admitted patients were identified by screening the census of each unit. The laboratory tests and imaging required during the first 72 h after admission were verified, and the medical record data were entered into an electronic form covering demographic data, clinical presentation, medical history, physical examination, laboratory tests, and imaging. Subsequently, we computed the APACHE II, SOFA, and CURB-65 scores of each patient at admission.
Statistical analysis
A convenience sample was used, consisting of the patients admitted during the study period. A descriptive analysis and frequency tables were prepared to characterize and summarize the clinical data of the target population. ROC (Receiver Operating Characteristic) curves were built to assess the performance of each scoring system, and the areas under the curve (AUC) were analyzed. The cut-off points were chosen to jointly maximize sensitivity and specificity, yielding 1.5 for CURB-65, 5.5 for SOFA, and 14.5 for APACHE II. In addition, the Youden index, the positive and negative predictive values, and the positive and negative likelihood ratios (LR) were calculated. A Kaplan–Meier curve was built for each scale using the cut-off point obtained with the Youden index, and the p value was calculated with the log-rank test.
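The cut-off selection described above can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R packages), and the function name `youden_cutoff` and the toy data are hypothetical. It assumes that a patient is classified as high risk when the score exceeds the cut-off and that candidate cut-offs are midpoints between consecutive observed scores, which would explain half-point values such as 1.5 or 14.5.

```python
def youden_cutoff(scores, died):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    Candidate cut-offs are midpoints between consecutive distinct scores;
    a patient is classified as high risk when score > cutoff.
    On ties in J, the lowest cut-off is kept.
    """
    n_pos = sum(died)                 # number of deaths
    n_neg = len(died) - n_pos         # number of survivors
    values = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for c in candidates:
        tp = sum(1 for s, d in zip(scores, died) if d and s > c)
        tn = sum(1 for s, d in zip(scores, died) if not d and s <= c)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1           # Youden's J statistic
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best


# Hypothetical toy data, not taken from the study.
scores = [0, 1, 1, 2, 2, 2, 3, 4]
died = [0, 0, 1, 0, 1, 1, 1, 1]
cutoff, j, sens, spec = youden_cutoff(scores, died)
print(f"cutoff={cutoff}, J={j:.3f}, sens={sens:.3f}, spec={spec:.3f}")
```

In R, the `pROC` package used in the study offers comparable functionality, so a hand-written loop like this is only useful for seeing what the index measures.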
A decision curve analysis framework was used to compare the models with the strategies of treating all or none of the patients. The net benefit of each strategy was calculated as the proportion of true positives minus the proportion of false positives, the latter weighted by the relative harm of a false-positive result compared with a false-negative result. The analyses were performed in R version 4.0.2 (R Foundation, Vienna, Austria) using the packages “pROC”, “ROCit”, “rmda”, “survival” and “survminer”.
The present study was approved by the ethics committee of the Integrated Subnet of Health Services of the South (register 138 of 2020) and by the Universidad CES (act 167 of July 2021); informed consent was deemed unnecessary. No funding was received.
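As a rough illustration of the net benefit calculation, the sketch below uses the standard decision-curve formula, net benefit = TP/n − FP/n × pt/(1 − pt), where pt is the threshold probability. This is an assumption about the formula behind the description, not the authors' code. The counts are those reported in the Results for CURB-65 > 1.5 (117 deaths among 188 high-risk patients, 159 deaths among 320 patients). The function names are hypothetical, and treating a dichotomized score as a fixed rule across thresholds is a simplification.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a decision rule at a given threshold probability."""
    weight = threshold / (1 - threshold)  # relative harm of a false positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(events, n, threshold):
    """'Treat all': every event is a true positive, every non-event a false positive."""
    return net_benefit(events, n - events, n, threshold)


n, events = 320, 159          # cohort size and deaths within 60 days
tp, fp = 117, 188 - 117       # CURB-65 > 1.5: deaths and survivors flagged high risk

for pt in (0.2, 0.3, 0.4, 0.5):
    rule = net_benefit(tp, fp, n, pt)
    treat_all = net_benefit_treat_all(events, n, pt)
    # 'Treat none' has a net benefit of 0 by definition.
    print(f"pt={pt:.1f}  rule={rule:.4f}  treat_all={treat_all:.4f}  treat_none=0")
```

A rule is worth using at a given threshold only when its net benefit exceeds both the treat-all value and zero.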
Results
Fig. 1 shows the patient flowchart and the screening process. A total of 320 patients with COVID-19 admitted to the ICU at Hospital El Tunal between July and December 2020 were included. Of these, 206 (64.4%) were male, 52.8% were between 60 and 79 years old, and 7.8% were older than 80. A history of obesity was recorded in 134 (46.2%), smoking in 79 (24.7%), type 2 diabetes mellitus in 70 (21.9%), arterial hypertension in 128 (40.0%), and a chronic pulmonary condition in 66 (20.6%). Mortality within 60 days was 49.7% (159 patients). Non-survivors were older, had higher levels of ferritin, lactate dehydrogenase, and troponin and a lower PaO2/FiO2 ratio, and more frequently developed shock, acute kidney injury, liver disease, and coagulopathy (Table 1). Fig. 2 shows the ROC curve of each clinical prediction rule.
Table 1. Clinical characteristics of the patients included in the study.
Characteristic | All patients (n=320) | Survivors (n=161) | Non-survivors (n=159) | p value |
---|---|---|---|---|
Females, n (%) | 114 (35.6%) | 64 (39.8%) | 50 (31.4%) | 0.152 |
Age (years), average (SD) | 59.4 (14.9) | 55.0 (14.4) | 64.0 (14.1) | <0.001 |
Obesity, number of patients with data (%) | 134/290 (46.2%) | 74/149 (49.7%) | 60/141 (42.6%) | 0.273 |
Comorbidities, n (%) | ||||
Hypertension | 128 (40.0%) | 63 (39.1%) | 65 (40.9%) | 0.908 |
Diabetes | 70 (21.9%) | 34 (21.1%) | 36 (22.6%) | 0.846 |
Chronic heart disease (except hypertension) | 41 (12.8%) | 14 (8.7%) | 27 (17.0%) | 0.040 |
Chronic renal disease | 15 (4.7%) | 6 (3.7%) | 9 (5.7%) | 0.580 |
Smoking | 79 (24.7%) | 44 (27.3%) | 35 (22.0%) | 0.330 |
Chronic pulmonary disease | 66 (20.6%) | 27 (16.8%) | 39 (24.5%) | 0.115 |
Chronic neurological disease | 14 (4.4%) | 5 (3.1%) | 9 (5.7%) | 0.400 |
Duration of the disease before hospitalization (days), median (SD) | 8.4 (4.2) | 8.4 (4.0) | 8.4 (4.3) | 0.892 |
Laboratories | ||||
Creatinine (mg/dL), average (SD) | 2.4 (4.0) | 2.1 (3.5) | 2.7 (4.4) | 0.203 |
High sensitivity C-reactive protein (mg/L), average (SD) | 17.8 (14.7) | 16.3 (15.8) | 19.3 (13.4) | 0.072 |
Ferritin (ng/mL), average (SD) | 1167 (639) | 1053 (605) | 1294 (654) | 0.002 |
D-dimer (μg/mL), average (SD) | 4.4 (6.6) | 3.8 (6.2) | 5.1 (6.9) | 0.084 |
PaO2/FiO2 ratio, average (SD) | 103 (65) | 114 (75) | 92 (51) | 0.002 |
Lactate dehydrogenase (U/L), average (SD) | 1083 (1084) | 897 (374) | 1272 (1472) | 0.002 |
Positive high-sensitivity cardiac troponin I, number of patients with data (%) | 127/304 (41.8%) | 45/137 (32.8%) | 72/139 (51.8%) | <0.001 |
Complications and supports, (n, %) | ||||
Shock | 214 (66.9%) | 77 (47.8%) | 137 (86.2%) | <0.001 |
Invasive mechanical ventilation requirement | 254 (79.4%) | 110 (68.3%) | 144 (90.6%) | <0.001 |
Acute kidney injury | 162 (50.6%) | 49 (30.4%) | 113 (71.1%) | <0.001 |
Renal replacement therapy | 81 (25.3%) | 16 (9.9%) | 65 (40.9%) | <0.001 |
Liver disease | 53 (16.6%) | 14 (8.7%) | 39 (24.5%) | <0.001 |
Coagulopathy | 43 (13.4%) | 9 (5.6%) | 34 (21.4%) | <0.001 |
Central nervous system involvement | 29 (9.1%) | 13 (8.1%) | 16 (10.1%) | 0.67 |
Venous thrombosis | 27 (8.4%) | 10 (6.2%) | 17 (10.7%) | 0.215 |
Duration of hospital stay (days), median (SD) | 23.4 (15.9) | 28.2 (17.9) | 18.6 (11.8) | <0.001 |
Scoring systems for risk prediction upon admission | ||||
APACHE II, average (SD) | 13.9 (6.7) | 12.2 (5.9) | 15.6 (7.1) | <0.001 |
SOFA, average (SD) | 6.8 (3.3) | 5.9 (3.4) | 7.8 (2.9) | <0.001 |
CURB-65, average (SD) | 1.8 (1.0) | 1.4 (0.9) | 2.1 (1.0) | <0.001 |
SD: standard deviation; APACHE II: Acute Physiology and Chronic Health Evaluation II; SOFA: Sequential Organ Failure Assessment; CURB-65: Confusion, Blood urea nitrogen, Respiratory rate, Systolic blood pressure, and Age ≥ 65.
CURB-65 had a mean of 1.76 and a standard deviation of 1.03, with an AUC of 0.68 (CI 0.62–0.74). Of the 188 patients at high risk of death (score>1.5 by the Youden index), 117 died (62.2%), while 42 of the 132 low-risk patients died (31.8%). The difference in proportions was 30.4% (CI 19.2–41.6). High-risk patients had an RR of 1.95 (CI 1.48–2.57; p<0.001). At a cut-off of 1.5, CURB-65 had a sensitivity of 73.6% (95% CI 68.8–78.4%) and a specificity of 55.9% (CI 50.5–61.3%). The positive and negative predictive values were 62.2% and 68.2%, respectively.
APACHE II scores ranged from 2 to 44, with a mean of 13.9 and a standard deviation of 6.7. The AUC was 0.65 (CI 0.60–0.71). Of the 130 patients classified as high risk of death (score>14.5 by the Youden index), 82 died (63.1%), compared with 77 of the 190 low-risk patients (40.5%), a difference in proportions of 22.55% (CI 11.5–34). High-risk subjects had an RR of 1.55 (CI 1.25–1.93; p<0.001), with a sensitivity of 51.6% (CI 46.1–57%) and a specificity of 70.2% (CI 65.2–75.2%).
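The operating characteristics reported for CURB-65 can be checked against the 2×2 table implied by the counts above (117 deaths among 188 high-risk patients and 42 deaths among 132 low-risk patients). The following Python sketch is only an arithmetic check under those counts; the function name `two_by_two_metrics` is hypothetical and it does not reproduce the confidence intervals.

```python
def two_by_two_metrics(tp, fp, fn, tn):
    """Operating characteristics of a dichotomized score from a 2x2 table."""
    sens = tp / (tp + fn)                 # deaths correctly flagged high risk
    spec = tn / (tn + fp)                 # survivors correctly flagged low risk
    ppv = tp / (tp + fp)                  # risk of death in the high-risk group
    npv = tn / (tn + fn)                  # survival in the low-risk group
    risk_low = fn / (fn + tn)             # risk of death in the low-risk group
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": ppv,
        "npv": npv,
        "risk_difference": ppv - risk_low,
        "relative_risk": ppv / risk_low,
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
    }


# Counts for CURB-65 > 1.5 taken from the text.
metrics = two_by_two_metrics(tp=117, fp=188 - 117, fn=42, tn=132 - 42)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sketch gives a sensitivity of 73.6%, a specificity of 55.9%, predictive values of 62.2% and 68.2%, and a risk difference of 30.4%, matching the reported figures; the relative risk comes out at about 1.96, close to the reported 1.95.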
SOFA had a minimum value of 2 and a maximum of 17 with an average of 6.8 and a standard deviation of 3.3. The AUC was 0.70 (CI 0.64–0.75) (Fig. 2c). Out of the 210 patients classified as high risk of death, 133 died (63.3%) compared with 26 (23.6%) out of 110 in the low risk of death group, with a difference of ratios of 39.7% (CI 28.7–50.7). The subjects with high risk of death (score>5.5 by Youden) had a RR of 2.67 (CI 2.67–1.88) and p<0.001, a sensitivity of 83.6% (CI 79.6–87.7%), and a specificity of 52.2% (CI 46.7–57.6%).
Fig. 3 shows the Kaplan–Meier survival curves for the three clinical prediction scores, stratified by the cut-off values obtained with the Youden index (1.5 for CURB-65, 14.5 for APACHE II, and 5.5 for SOFA). Survival within 60 days differed significantly between risk groups, with p<0.001 in all three cases.
For threshold probabilities between 1% and 50% in the decision curve analysis (Fig. 4), CURB-65 might prevent a larger number of unnecessary interventions; however, all three scoring systems offer moderate increments in net benefit compared with the ‘treat all’ and ‘treat none’ strategies.
Discussion
Almost three years after the COVID-19 pandemic was declared, more data on prognostic markers to guide decision making in critically ill patients are still needed. Even with the proliferation of tools designed specifically for COVID-19, it is remarkable that the traditional intensive care tools remain valid as comparators and in many cases have proved superior.
In the present work, the mean of each of the three scores was significantly higher in non-survivors than in survivors (APACHE II 15.6 vs 12.2; SOFA 7.8 vs 5.9; and CURB-65 2.1 vs 1.4) (Table 1). Other studies show similar findings. In a multi-center ICU cohort in Spain and Andorra that included 663 patients with critical COVID-19, the mean APACHE II was 17 vs 11 (p<0.001) and the mean SOFA 7 vs 4 (p<0.001) in non-survivors and survivors, respectively.1 The SATICOVID study in Argentina, a cohort of 1909 patients on invasive ventilatory support, found a difference of 16 vs 13 for APACHE II (p<0.0001) and of 6 vs 4 for SOFA (p<0.0001).11
A moderate discriminatory ability for death within 60 days was found for each of the systems, with an AUC of 0.65 (95% CI 0.60–0.71) for APACHE II, 0.70 (CI 0.64–0.75) for SOFA, and 0.68 (CI 0.62–0.74) for CURB-65. A cross-sectional study of 1990 patients hospitalized in ICUs in India over a two-year period showed that APACHE II had a sensitivity of 89.9%, specificity of 97.6%, and an AUC of 0.983 for mortality, while the results for SOFA were 90.1%, 96.6%, and 0.986, respectively. For the need for mechanical ventilation, APACHE II had a sensitivity of 93.4%, specificity of 89.7%, and an AUC of 0.966; SOFA had a sensitivity of 90.5%, specificity of 95.8%, and an AUC of 0.976.12 A cohort of 204 patients hospitalized for COVID-19 in the ICU of the Imam Khomeini hospital complex in Iran found that the mean APACHE II and SOFA scores were significantly higher in non-survivors than in survivors (14.4 vs 9.5, p≤0.001, and 7.3 vs 3.1, p≤0.001, respectively); the AUC was 0.895 for SOFA and 0.730 for APACHE II. An APACHE II>13 had an adjusted OR of 10.7 (CI 5.0–22.9; p≤0.001), while a SOFA>5 had an adjusted OR of 32.4 (CI 14.4–72.9; p≤0.001).13
A study of 249 patients hospitalized in the ICU in Lithuania reported that the most accurate scoring system was APACHE II, with an AUC of 0.772 (CI 0.714–0.830; p<0.001); each one-point increase in the score increased the odds of mortality by a factor of 1.155 (OR 1.155, 95% CI 1.085–1.229; p<0.001). SOFA had an AUC of 0.679 (CI 0.611–0.747; p<0.001).14 Another cohort of 140 patients with COVID-19 in critical condition found an AUC of 0.890 (CI 0.826–0.955)15 for SOFA in the prediction of mortality; an analysis of the same data showed that a SOFA score ≥3 is a moderately good positive predictor (LR+ 5.35; CI 3.43–8.36) and a strong negative predictor (LR− 0.12; CI 0.03–0.45) of in-hospital mortality.16 These positive findings contrast with reports that show the opposite. A retrospective study of 675 patients on mechanical ventilation in the ICU for COVID-19 found an AUC of 0.59 (CI 0.55–0.63) for SOFA for death,17 similar to the result of a large US cohort of 5122 ventilated patients in 86 ICUs, in which the AUC was 0.66 (CI 0.65–0.67), showing that the SOFA score alone had poor discriminatory accuracy for in-hospital mortality.18 Furthermore, a study that modeled the performance of SOFA to guide the indication of mechanical ventilation found limited utility for decision making.19
Cohorts of patients hospitalized in general wards report similar performance. The Brazilian multi-center cohort RECOVER-SUS, which included 1589 patients hospitalized for COVID-19, showed that a SOFA score ≥10 was independently associated with in-hospital mortality (HR 1.51; CI 1.08–2.10).20 In a retrospective observational study of 237 adults hospitalized for COVID-19 outside the ICU, a SOFA score≥2 had an AUC of 0.732 (0.67–0.79), with a sensitivity of 83.33%, specificity of 65.43%, PPV of 38.1%, and NPV of 93.84% for predicting mortality within 28 days.21
A register that included 1363 patients from Sao Paulo and Barcelona hospitalized for COVID-19 found for CURB-65 an AUC of 0.74 (CI 0.72–0.77) for in-hospital death.22 An extensive Spanish multi-center register called SEMI-COVID-19, which included 10238 patients hospitalized for COVID-19, reported an AUC of 0.825 for CURB-65 for in-hospital mortality, admission to ICU, and use of mechanical ventilation; there was sensitivity of 82.1% (CI 80.4–83.8), specificity of 70.6% (CI 69.6–71.6), PPV of 42.2% (CI 40.6–43.7) and NPV of 93.8% (CI 93.2–94.4) for a score>2.23 A study that compared 682 patients hospitalized for COVID-19 in eight hospitals for adults with 7449 community-acquired pneumonia cases in Louisville, Kentucky, found that CURB-65 had an AUC of 0.79 (CI 95%, 0.75–0.84) and 0.75 (CI 95%, 0.73–0.77), respectively.24 A study that included 481 patients from Turkey found that CURB-65 had better performance to predict in-hospital mortality and ICU requirement in patients hospitalized for COVID-19 (AUC of 0.846 and 0.898, respectively).25
A non-systematic review that assessed the scoring systems with respect to mortality within 90 days in patients hospitalized for COVID-19, which included 76 studies, found that the score for APACHE II had the highest AUC values (0.966 and 0.937 in two studies), followed by SOFA (with 0.915, 0.926 and 0.876 in three studies). This may be explained by the fact that both age and comorbidities are taken into consideration, which are components recognized as prognostic markers for COVID-19.26 In a cohort of 247 patients infected with COVID-19 in Ecuador, the utility of the CURB-65 score≥2 was assessed to predict mortality within 30 days, obtaining sensitivity of 84%, specificity of 54%, PPV of 56% and NPV of 83%, concluding that CURB-65 is adequate to predict mortality in this group of patients.27
The limitations of the present study include the size of the population, since no sample size calculation was carried out. We cannot rule out selection bias, because only patients with complete data for calculating the clinical rules at admission were included. In addition, SOFA scores are available only at admission; we lack daily or periodic measurements of its variables over time. Although this is a single-center study, it can be considered representative of a wide population, since this referral center is located in the south of Bogotá and covers about two million people.
In conclusion, the clinical prediction scores CURB-65, APACHE II, and SOFA, applied to patients hospitalized in the ICU for COVID-19, showed moderate discriminatory ability for death within 60 days (AUC 0.68, 0.65, and 0.70, respectively). At the cut-off that maximizes discrimination (Youden index), these scoring systems have adequate predictive ability, which was confirmed by the Kaplan–Meier curves and the clinical decision curves.
Authors’ contributions
JS, AM and JP conceived and designed the study. JS, AM and OQ performed the statistical analysis. All authors participated in the collection and analysis of information, as well as the writing, revision, and approval of the final manuscript.
Availability of data and materials
All data generated and/or analyzed during this study are available from the corresponding author on reasonable request, subject to review by the institutional ethics and research committee.
Ethics approval and consent to participate
The study was approved by the ethics and research committees of each of the institutions (act number 138 of the Health Services Unit from the El Tunal Hospital, 0498-2020 of the San José Hospital and SDM-026-20 of the University Children's Hospital San Jose) and informed consent was not required.
Consent for publication
Not applicable.
Funding
No funding was received for the creation, development, and analysis of the data.
Conflict of interests
The authors declare that they have no competing interests.
We thank the relatives of the participants for their support. We also want to thank all the health and non-health care personnel involved in the care of COVID-19 patients.