In a prototype outpatient diabetes decision-support system, a compact LLM (GLM4-9B) fine-tuned on electronic health record (EHR) data generated clinically reasonable treatment plans plus lab and medication prompts, scoring 67.93 on BLEU-4 (SD 2.74), with ROUGE-1, ROUGE-2, and ROUGE-L scores of 44.30, 27.34, and 37.67. The study fine-tuned three compact models (Llama 3.1-8B, Qwen3-8B, GLM4-9B) on deidentified outpatient records using parameter-efficient low-rank adaptation and deployed them with retrieval-augmented generation in a prototype hospital information system.
Why It Matters To Your Practice
Diabetes mellitus (DM) care depends on individualized choices (therapy, monitoring, labs), and LLMs are being positioned to draft patient-specific plans from routine EHR data.
This work suggests that smaller, locally deployable models, when tuned to your institution’s data, may produce usable drafts rather than generic guidance.
Clinical Implications
Potential near-term use: clinician-in-the-loop drafting of treatment recommendations, lab test suggestions, and medication prompts for outpatient DM visits.
Workflow fit: retrieval-augmented generation can ground outputs in available patient demographics and clinical context, which may speed note preparation and aid decision-making.
Guardrails still required: outputs should be treated as reference recommendations, with clinician verification—especially for medication guidance and patients with complex comorbidities.
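For readers curious how retrieval-augmented grounding works in principle, the sketch below is a toy illustration, not the study's implementation. The source does not describe its retrieval method; here a simple word-overlap ranking stands in for it (production systems typically use embedding-based search). The record snippets, the query, and the `retrieve` and `build_prompt` helpers are all hypothetical.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().split())


def retrieve(query: str, snippets: list[str], k: int = 3) -> list[str]:
    """Rank snippets by how many words they share with the query; keep the top k."""
    query_words = tokenize(query)
    ranked = sorted(
        snippets,
        key=lambda s: len(query_words & tokenize(s)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved patient context ahead of the question sent to the model."""
    context_block = "\n".join(f"- {item}" for item in context)
    return f"Patient context:\n{context_block}\n\nTask: {query}"


# Hypothetical, made-up record snippets (not from the study).
snippets = [
    "hba1c 8.4 percent measured last month",
    "current medication metformin 1000 mg twice daily",
    "influenza vaccine administered in october",
    "egfr 58 with stage 3a chronic kidney disease",
]
query = "adjust diabetes medication given hba1c and kidney function"

top = retrieve(query, snippets, k=3)
prompt = build_prompt(query, top)
print(prompt)
```

In this toy run the vaccination entry shares no words with the query, so it is left out of the prompt. The model would then draft its plan from the retrieved context only, and a clinician would still review the result.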
Insights
Among the three compact models tested, the fine-tuned GLM4-9B performed best at producing clinically reasonable plans and appropriate lab/medication suggestions.
Fine-tuning used a parameter-efficient low-rank adaptation approach, a practical path for health systems that can’t retrain large foundation models end-to-end.
The reported evaluation relied on text-overlap metrics (BLEU/ROUGE), which can track similarity to reference text but may not fully capture clinical correctness or safety.
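To make the limitation of text-overlap metrics concrete, here is a minimal sketch of a simplified ROUGE-1-style score (clipped unigram-overlap F1). It is not the study's evaluation code, and it omits the stemming and tokenization rules of standard ROUGE implementations. The two example sentences are hypothetical and chosen to differ by a single, clinically decisive word.

```python
from collections import Counter


def unigram_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1-style F1 based on clipped unigram overlap."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences: opposite instructions, nearly identical wording.
reference = "increase basal insulin dose and recheck hba1c in three months"
candidate = "decrease basal insulin dose and recheck hba1c in three months"

score = unigram_f1(candidate, reference)
print(round(score, 2))
```

Nine of the ten words match, so the score comes out at about 0.9 even though the candidate recommends the opposite dose change. This is why overlap scores are best read alongside clinician review rather than as a measure of safety.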
The Bottom Line
EHR-tuned compact LLMs, deployed with retrieval-augmented generation, can generate personalized outpatient DM care drafts with strong text-similarity scores (e.g., BLEU-4 67.93), but need clinician oversight and stronger medication/comorbidity handling before routine use.