Objective: Cardiovascular disease is the leading cause of death among both urban and rural residents in China. Patients typically seek medical attention only after symptoms appear, and conventional diagnostic methods for cardiovascular disease are complex and costly. This study therefore aimed to identify patients with cardiovascular disease using general demographic characteristics, comorbidities, and routine physical-examination blood test indicators. Methods: The sample comprised 13,420 participants from the CHARLS database. After records with missing values were removed, models were built using logistic regression, decision tree, K-nearest neighbor, random forest, and neural network algorithms. The best-performing model was selected by comparing the area under the receiver operating characteristic curve (ROC_AUC) and was then used to build subgroup models for each type of cardiovascular disease. The SHAP algorithm was used to interpret the models. Results: The logistic regression model performed best, with an ROC_AUC of 0.7644 (95% CI: 0.7397~0.7890). Among the subgroup models, identification of heart disease performed relatively well, with an ROC_AUC of 0.7747. SHAP-based interpretation showed that age, body mass index, diabetes, and smoking history contributed substantially to identifying cardiovascular disease. Conclusion: Machine learning methods can identify patients with cardiovascular disease, so the results of simple examinations can be used to identify high-risk individuals early and to intervene.
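The model-selection step described in the Methods can be illustrated with a minimal sketch. It assumes each candidate model has already produced a predicted probability for every participant in a held-out set. The function names (`roc_auc`, `select_best_model`) and the toy labels and scores are hypothetical and are not taken from the study. ROC_AUC is computed here from its rank interpretation: the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, with ties counted as one half.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


def select_best_model(
    y_true: Sequence[int], model_scores: Dict[str, Sequence[float]]
) -> Tuple[str, Dict[str, float]]:
    """Return the name of the model with the highest AUC, plus all AUCs."""
    aucs = {name: roc_auc(y_true, s) for name, s in model_scores.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Toy data (hypothetical): 1 = cardiovascular disease, 0 = no disease.
y_true = [0, 0, 1, 1, 0, 1]
model_scores = {
    "logistic_regression": [0.1, 0.4, 0.35, 0.8, 0.2, 0.7],
    "decision_tree": [0.5, 0.5, 0.5, 1.0, 0.0, 0.5],
}
best_name, aucs = select_best_model(y_true, model_scores)
print(best_name, aucs)
```

The pairwise loop is quadratic in sample size, which is fine for illustration; for a cohort of this size a rank-based implementation from a statistics library would be the more practical choice.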
Objective: To evaluate the association between the hemoglobin, albumin, lymphocyte, and platelet (HALP) score and the risk of all-cause and cardiovascular mortality among people with cardiovascular disease and diabetes. Methods: This study used data from the National Health and Nutrition Examination Survey (NHANES) from 1999 to 2018. Because HALP scores have a skewed distribution, the natural logarithm-transformed HALP score (LnHALP) was used for the analyses. The associations between LnHALP and all-cause and cardiovascular-specific mortality were examined using weighted multivariable-adjusted Cox regression and Kaplan-Meier survival curves. Restricted cubic spline (RCS) analysis was performed to evaluate potential non-linear relationships. Results: A total of 2621 participants were included in the final analysis. After adjustment for confounders, weighted multivariable Cox regression showed that each 1-unit increase in LnHALP was associated with a 26% lower all-cause mortality [HR = 0.74, 95% CI: 0.64~0.85] and a 33% lower cardiovascular mortality [HR = 0.67, 95% CI: 0.52~0.87]. Compared with the lowest tertile, participants in the highest LnHALP tertile had a 23% lower risk of all-cause death [HR = 0.77, 95% CI: 0.65~0.91] and a 31% lower risk of cardiovascular death [HR = 0.69, 95% CI: 0.53~0.91]. RCS analysis showed a non-linear, J-shaped relationship between LnHALP and the risks of all-cause and cardiovascular death. Conclusion: The HALP score is independently associated with all-cause and cardiovascular mortality among people with cardiovascular disease and diabetes.
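The arithmetic behind the reported figures can be sketched as follows. The abstract does not state how the HALP score is calculated, so the formula below (hemoglobin × albumin × lymphocyte count / platelet count, with the units noted in the comments) is an assumption based on the definition commonly used for this score. The function names and the example laboratory values are hypothetical. The second helper shows how a hazard ratio below 1 maps to the percentage reductions quoted in the Results, namely (1 − HR) × 100.

```python
import math


def halp_score(
    hemoglobin_g_per_l: float,
    albumin_g_per_l: float,
    lymphocytes_1e9_per_l: float,
    platelets_1e9_per_l: float,
) -> float:
    """HALP under the assumed definition: Hb * Alb * Lym / Plt."""
    if platelets_1e9_per_l <= 0:
        raise ValueError("platelet count must be positive")
    return (
        hemoglobin_g_per_l * albumin_g_per_l * lymphocytes_1e9_per_l
        / platelets_1e9_per_l
    )


def ln_halp(hb: float, alb: float, lym: float, plt: float) -> float:
    """Natural-log transform used to reduce the skew of the raw score."""
    return math.log(halp_score(hb, alb, lym, plt))


def percent_risk_reduction(hazard_ratio: float) -> float:
    """Percentage reduction in hazard implied by a hazard ratio below 1."""
    return (1.0 - hazard_ratio) * 100.0


# Hypothetical laboratory values, not taken from NHANES.
example = ln_halp(hb=140.0, alb=42.0, lym=2.0, plt=250.0)
print(f"LnHALP = {example:.3f}")

# Hazard ratios reported in the abstract.
for hr in (0.74, 0.67, 0.77, 0.69):
    print(hr, round(percent_risk_reduction(hr)))
```

The survey-weighted Cox models themselves are not reproduced here; this sketch covers only the score transformation and the reading of the hazard ratios.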