A regional hospital network operating eleven acute-care facilities across three U.S. states had a problem common to the sector: 30-day readmission rates were trending upward, penalties under the CMS Hospital Readmissions Reduction Program were accumulating, and the organization's existing Cerner deployment held all the data needed to do something about it, but there was no path to turn that data into actionable bedside decisions.
The challenge
The network's Chief Medical Officer framed the problem clearly at our first working session: "We know which patients will be back within thirty days. We know it on paper. We just don't know it in time to do anything about it."
The data told the story. Across the network, 30-day readmissions were running at 16.4% — meaningfully above the CMS benchmark for the clinical mix. Internal analyses, run on historical data by the quality team, had identified the clinical and social factors that predicted readmission: heart failure patients with a specific combination of lab values, pneumonia patients without scheduled follow-up, certain medication combinations, prior-year utilization patterns. The patterns were real. But they were identified retrospectively, in quarterly reports nobody read until months after discharge.
The network had twice attempted to operationalize the analysis. The first attempt built a standalone application outside Cerner; clinicians refused to use it because it required logging into a separate system. The second attempt proposed to implement risk scoring directly inside Cerner using its native capabilities; the vendor quoted a multi-year program and a seven-figure license adjustment. Both attempts were shelved.
Why the earlier approaches failed
Both failures pointed at the same underlying issue. Any solution that required clinicians to change their workflow — to log into a new system, to click into a new screen, to remember to check a new queue — would fail. And any solution that required modifying the EHR itself would take years and bring its own risks in a patient-safety environment.
We proposed a third path: build the intelligence around the EHR, and deliver it into the EHR through the native inbox and alerting mechanisms clinicians already used. No new system for anyone to log into. No changes to Cerner. Just a new class of alert showing up where alerts already showed up.
What we built
The architecture had three components:
- A governed data projection: a near real-time replica of the Cerner clinical database, plus feeds from labs, pharmacy, and scheduling. It was protected by HIPAA-compliant tokenization and role-based access, and it lived inside the client's own AWS tenant; data never left that environment.
- A readmission risk model — an ensemble of gradient-boosted classifiers trained on five years of historical encounters, scoring patients at multiple touchpoints during admission (admission, 24h post-admission, day before expected discharge, discharge). Outputs included a risk score, the specific factors driving it, and a recommended intervention bundle.
- An integration layer writing scores back into Cerner over HL7, where they appeared as a custom flowsheet element in the patient's chart, exactly where the attending physician already looked. High-risk patients also triggered notifications to the care-management team, which could then schedule discharge planning earlier in the stay.
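To make the write-back idea concrete, here is a minimal sketch of how a single risk score could be packaged as an HL7 v2-style observation message. The message type (ORU^R01), the application and facility names, and the READMIT_RISK observation code are all assumptions chosen for illustration; the case study does not specify which message type or identifiers were used.

```python
from datetime import datetime

FIELD_SEP = "|"
SEGMENT_SEP = "\r"  # HL7 v2 segments are terminated by a carriage return


def build_risk_score_message(mrn: str, score: float,
                             sent_at: datetime, control_id: str) -> str:
    """Assemble a minimal ORU^R01-style message carrying one numeric score.

    All names and codes below (RISK_ENGINE, NETWORK, EHR, READMIT_RISK)
    are hypothetical placeholders, not real interface identifiers.
    """
    ts = sent_at.strftime("%Y%m%d%H%M%S")
    obs_code = "READMIT_RISK^30-day readmission risk^L"
    segments = [
        # MSH: header with encoding characters, sender, receiver,
        # timestamp, message type, control id, processing id, version.
        ["MSH", "^~\\&", "RISK_ENGINE", "NETWORK", "EHR", "NETWORK",
         ts, "", "ORU^R01", control_id, "P", "2.5"],
        # PID-3: patient identifier (medical record number).
        ["PID", "1", "", f"{mrn}^^^NETWORK^MR"],
        # OBR-4: what is being reported.
        ["OBR", "1", "", "", obs_code],
        # OBX: numeric value in OBX-5, result status "F" (final) in OBX-11.
        ["OBX", "1", "NM", obs_code, "", f"{score:.2f}",
         "", "", "", "", "", "F"],
    ]
    return SEGMENT_SEP.join(FIELD_SEP.join(fields) for fields in segments)


if __name__ == "__main__":
    msg = build_risk_score_message("12345", 0.7312,
                                   datetime(2024, 1, 2, 3, 4, 5), "MSG0001")
    print(msg.replace(SEGMENT_SEP, "\n"))
```

A real interface would also need acknowledgement handling, escaping of reserved characters, and agreement with the EHR team on codes and routing; none of that is shown here.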
Every model decision was logged with the specific feature contributions that drove it — so any clinician could see, for any patient, exactly why the system flagged them. That was non-negotiable for the clinical review board.
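One way to picture that logging requirement is as an append-only record per score, storing the contributions next to the score itself. The sketch below is illustrative only: the field names, the example feature names, and the JSON-lines format are assumptions, and it does not show how contributions are computed from the model.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ScoreAuditRecord:
    """One logged model decision (all field names are hypothetical)."""
    encounter_id: str
    touchpoint: str        # e.g. "admission", "24h", "pre_discharge", "discharge"
    risk_score: float
    contributions: dict    # feature name -> signed contribution to the score

    def top_factors(self, k: int = 3) -> list:
        """Return the k features with the largest absolute contribution."""
        ranked = sorted(self.contributions.items(),
                        key=lambda item: abs(item[1]), reverse=True)
        return [name for name, _ in ranked[:k]]

    def to_json_line(self) -> str:
        """Serialize to one JSON line, suitable for an append-only log."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    record = ScoreAuditRecord(
        encounter_id="E-001",
        touchpoint="pre_discharge",
        risk_score=0.62,
        contributions={
            "prior_year_admissions": 0.30,
            "no_followup_scheduled": -0.50,
            "medication_combination": 0.10,
        },
    )
    print(record.top_factors(2))
    print(record.to_json_line())
```

Keeping the contributions in the same record as the score means a reviewer can answer "why was this patient flagged?" from the log alone, without re-running the model.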
We chose to display specific interventions alongside each risk score, not just the score itself. Early research showed that physicians engaged substantially more with the tool when it said "consider cardiology consult, early follow-up scheduling, pharmacy reconciliation" than when it simply said "high risk." The model's job wasn't to tell clinicians something was wrong. It was to save them time making a decision they would have made anyway, given the full picture.
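The pairing of driving factors with suggested actions can be sketched as a simple lookup. The factor names and the factor-to-intervention mapping below are hypothetical; they reuse the examples quoted above but are not the clinically reviewed bundles from the engagement.

```python
# Hypothetical mapping from a driving factor to a suggested action.
INTERVENTIONS_BY_FACTOR = {
    "heart_failure_labs": "cardiology consult",
    "no_followup_scheduled": "early follow-up scheduling",
    "medication_combination": "pharmacy reconciliation",
}


def suggest_bundle(top_factors: list, high_risk: bool) -> str:
    """Build the text shown next to a score.

    Returns an empty string for patients who are not high risk, a bare
    "high risk" label when no factor maps to an action, and otherwise a
    "consider ..." list in the order of the driving factors.
    """
    if not high_risk:
        return ""
    suggestions = []
    for factor in top_factors:
        action = INTERVENTIONS_BY_FACTOR.get(factor)
        if action and action not in suggestions:
            suggestions.append(action)
    if not suggestions:
        return "high risk"
    return "consider " + ", ".join(suggestions)


if __name__ == "__main__":
    print(suggest_bundle(
        ["heart_failure_labs", "no_followup_scheduled", "medication_combination"],
        high_risk=True,
    ))
```

Ordering suggestions by the strength of the underlying factor keeps the most relevant action first, which matters when the display space in a chart view is limited.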
The timeline
Eleven weeks end-to-end:
- Weeks 1-2 — Discovery and data access. Read-only credentials to Cerner, mapping of clinical data elements, HIPAA compliance review, security sign-off.
- Weeks 3-5 — Model development. Feature engineering against five years of historical encounters. Three candidate model families evaluated; gradient-boosted ensemble selected based on calibration and explainability.
- Weeks 6-7 — Integration. HL7 write-back developed in parallel to model training. Sandbox environment testing with synthetic patients.
- Week 8 — Clinical review. Presentation to the medical executive committee. Minor adjustments to intervention bundles. Approval to proceed to shadow deployment.
- Weeks 9-10 — Shadow mode. Live scoring with results visible to care-management only. Clinician workflow unchanged. Performance metrics tracked against actual outcomes.
- Week 11 — Full production. Scores visible in all relevant views. Care-management workflows activated. Monitoring dashboards live.
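Since calibration was one of the stated selection criteria in weeks 3-5, a short sketch of how calibration can be checked may be useful. The Brier score and a binned reliability table are common choices; the case study does not say which calibration measures were actually used, so treat the metric choice, the bin count, and the toy data as assumptions.

```python
def brier_score(probs: list, outcomes: list) -> float:
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def reliability_table(probs: list, outcomes: list, n_bins: int = 5) -> list:
    """Group predictions into equal-width bins and compare the mean predicted
    risk with the observed readmission rate in each non-empty bin.

    Returns rows of (bin_index, count, mean_predicted, observed_rate).
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[index].append((p, y))
    rows = []
    for index, members in enumerate(bins):
        if not members:
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(y for _, y in members) / len(members)
        rows.append((index, len(members), mean_predicted, observed_rate))
    return rows


if __name__ == "__main__":
    # Toy data for illustration only, not results from the engagement.
    probs = [0.1, 0.1, 0.9, 0.9]
    outcomes = [0, 0, 1, 1]
    print(brier_score(probs, outcomes))
    for row in reliability_table(probs, outcomes):
        print(row)
```

A well-calibrated model shows mean predicted risk close to the observed rate in every bin, which is what lets a clinician read a score of 0.3 as roughly a 30% chance of readmission.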
Results
We measured against three metrics at three time horizons. Baseline was the trailing twelve-month readmission rate across all eleven hospitals.
The readmission reduction came through two channels. About 60% of the improvement was attributable to earlier discharge planning for high-risk patients — care-management teams were able to start scheduling follow-ups, coordinating with primary care, and arranging home health services two to four days earlier than before. The remaining 40% came from targeted interventions during admission: pharmacy reconciliation, specialist consults, and patient education focused specifically on the factors driving individual patients' risk scores.
What didn't work — and what we changed
Two things didn't work as planned. First, our initial intervention-recommendation logic was too prescriptive — physicians pushed back on what felt like the model telling them what to do. We rebuilt the interface to frame recommendations as "consider" rather than "recommended," and engagement rose substantially. Second, the emergency department found our 24-hour post-admission score less useful than we'd expected; patients in the ED were too early in their trajectory for most of our features to be informative. We ended up building a separate, simpler triage-stage score for ED use.
Broader implications
Three takeaways we believe generalize beyond this engagement:
- The EHR is the intervention surface, not a data source. Any clinical AI system that doesn't deliver its output into the clinician's existing workflow will fail, regardless of how accurate the underlying model is.
- Explainability isn't a feature. It's a precondition. The clinical review board would not have approved deployment without per-patient feature contributions. We build every clinical model with this assumption from day one.
- Shadow mode is worth the two weeks. Running the model live for two weeks before full deployment, with scores visible only to care-management, caught three edge cases we hadn't anticipated. It is cheap insurance.
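As a rough illustration of what tracking shadow-mode performance against actual outcomes can involve, the sketch below compares shadow scores with observed 30-day readmissions. The 0.5 threshold, the choice of sensitivity and positive predictive value as metrics, and the toy data are assumptions for illustration, not figures from the engagement.

```python
def shadow_summary(scored: list, threshold: float = 0.5) -> dict:
    """Summarize shadow-mode scores against observed outcomes.

    `scored` is a list of (risk_score, was_readmitted) pairs. The
    threshold is a hypothetical cut-off for flagging a patient.
    """
    true_pos = false_pos = false_neg = 0
    for score, was_readmitted in scored:
        flagged = score >= threshold
        if flagged and was_readmitted:
            true_pos += 1
        elif flagged and not was_readmitted:
            false_pos += 1
        elif not flagged and was_readmitted:
            false_neg += 1
    readmitted = true_pos + false_neg
    flagged_total = true_pos + false_pos
    return {
        "flagged": flagged_total,
        # Share of actual readmissions that the model flagged.
        "sensitivity": true_pos / readmitted if readmitted else None,
        # Share of flagged patients who were actually readmitted.
        "ppv": true_pos / flagged_total if flagged_total else None,
    }


if __name__ == "__main__":
    # Toy data for illustration only.
    sample = [(0.8, True), (0.7, False), (0.2, True), (0.1, False)]
    print(shadow_summary(sample))
```

Reviewing the missed readmissions and the false alarms case by case during the shadow period is one practical way to surface edge cases before clinicians ever see a score.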