Clinical decision-making and prescription generation for diarrhea in traditional Chinese medicine based on a large language model

  • Abstract:
    Objective To develop a clinical decision and prescription generation system (CDPGS) for diarrhea in traditional Chinese medicine (TCM), built on a specialized large language model (LLM), Qwen-TCM-Dia, in order to standardize syndrome differentiation, diagnosis, and prescription generation.
    Methods Two primary datasets were constructed: an evaluation benchmark and a fine-tuning dataset comprising fundamental diarrhea knowledge, medical records, and chain-of-thought (CoT) reasoning data. After an initial evaluation of 16 open-source LLMs on inference time, accuracy, and output quality, Qwen2.5 was selected as the base model for its superior overall performance. A two-stage low-rank adaptation (LoRA) fine-tuning strategy was then applied, combining continued pre-training on domain-specific knowledge with instruction fine-tuning on CoT-enriched medical records. This approach was designed to embed the clinical logic (symptoms → pathogenesis → therapeutic principles → prescriptions) into the model’s reasoning. The resulting fine-tuned model, specialized for TCM diarrhea, was named Qwen-TCM-Dia. Model performance in disease diagnosis and syndrome type differentiation was evaluated using accuracy, precision, recall, and F1-score, and the quality of the generated prescriptions was compared with that of existing open-source TCM LLMs.
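
As a rough illustration of the low-rank adaptation idea named in the Methods, the sketch below keeps a base weight matrix frozen and learns only a small update of the form (alpha / r) * B * A. It is a minimal toy in pure Python, not the authors' training code: the class name `LoRALinear`, the `merge` step between the two stages, and every numeric value (rank, alpha, matrix entries) are assumptions made for illustration. The abstract does not state the rank, the scaling factor, the adapted modules, or whether the first-stage adapter was merged before the second stage.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class LoRALinear:
    """Hypothetical toy layer: frozen weight W plus a low-rank update (alpha / r) * B @ A."""

    def __init__(self, weight, rank, alpha, seed=0):
        self.weight = [row[:] for row in weight]  # frozen base weight (d_out x d_in)
        self.rank = rank
        self.scale = alpha / rank
        self._rng = random.Random(seed)
        self.reset_adapter()

    def reset_adapter(self):
        d_out, d_in = len(self.weight), len(self.weight[0])
        # Usual LoRA initialisation: A small random, B zero, so a fresh
        # adapter leaves the layer's output unchanged.
        self.A = [[self._rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(self.rank)]
        self.B = [[0.0] * self.rank for _ in range(d_out)]

    def effective_weight(self):
        delta = matmul(self.B, self.A)  # (d_out x r) @ (r x d_in)
        return [[w + self.scale * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)]

    def forward(self, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.effective_weight()]

    def trainable_parameters(self):
        # Only A and B are trained: r * (d_in + d_out) values instead of d_in * d_out.
        return self.rank * (len(self.weight) + len(self.weight[0]))

    def merge(self):
        # Assumed way to chain two stages: fold the stage-1 adapter into the
        # base weight, then start stage 2 from a fresh adapter.
        self.weight = self.effective_weight()
        self.reset_adapter()


layer = LoRALinear([[1.0, 0.0], [0.0, 1.0]], rank=1, alpha=2.0)
before = layer.forward([3.0, 4.0])       # fresh adapter: same as the base layer

# Stand-in for stage 1 (continued pre-training): pretend training produced these values.
layer.A, layer.B = [[0.5, 0.0]], [[1.0], [0.0]]
stage1 = layer.forward([3.0, 4.0])

layer.merge()                            # stage 2 (instruction tuning) would start here
after_merge = layer.forward([3.0, 4.0])  # merging preserves the stage-1 behaviour
```

In practice the adapter values would come from gradient descent on the pre-training corpus and then on the CoT-enriched medical records; the hand-set values above only show that the update is low-rank and that merging leaves the layer's output unchanged.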
    Results Qwen-TCM-Dia outperformed both the base Qwen2.5 model and five other open-source TCM LLMs. It reached an accuracy of 97.05% and an F1-score of 91.48% in disease diagnosis, and an accuracy of 74.54% and an F1-score of 74.21% in syndrome type differentiation. Compared with existing open-source TCM LLMs (BianCang, HuangDi, LingDan, TCMLLM-PR, and ZhongJing), Qwen-TCM-Dia reconstructed the “symptoms → pathogenesis → therapeutic principles → prescriptions” logic chain with higher fidelity. It provided complete prescriptions, whereas the other models often omitted dosages or generated mismatched prescriptions.
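
The gap between accuracy and F1-score in disease diagnosis can look surprising at first. The sketch below shows one way such a gap can arise, assuming macro-averaged precision, recall, and F1 (the abstract does not say which averaging was used). The function name `macro_metrics` and the toy labels are hypothetical and are not taken from the study's benchmark.

```python
def macro_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy, imbalanced example (hypothetical labels): 8 "diarrhea" cases, 2 "other".
toy_true = ["diarrhea"] * 8 + ["other"] * 2
toy_pred = ["diarrhea"] * 8 + ["diarrhea", "other"]  # one minority case is missed
scores = macro_metrics(toy_true, toy_pred)
```

With these toy labels, 9 of 10 predictions are correct, so accuracy is 0.90, while the single missed minority case pulls macro recall down to 0.75 and macro F1 to about 0.80. Accuracy well above F1 is therefore consistent with class imbalance, although the abstract alone does not establish that this explains the reported figures.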
    Conclusion By integrating continued pre-training, CoT reasoning, and a two-stage fine-tuning strategy, this study establishes a CDPGS for diarrhea in TCM. The results demonstrate a synergistic effect between strengthening domain representation through continued pre-training and activating logical reasoning through CoT. This research not only provides critical technical support for the standardized diagnosis and treatment of diarrhea but also offers a scalable paradigm for the digital inheritance of TCM experts' experience and the intelligent transformation of TCM.

     
