GUIDE

AI-powered clinical guideline generation interface

GUIDE (AutoEvidence)

What is GUIDE 🗺️

GUIDE (Guideline Update Intelligence Decision Engine) is an AI-powered, human-in-the-loop clinical guideline generation interface.

GUIDE Invents AI Generation of Clinical Guidelines ✨✨🤖🧠✨✨

GUIDE supports the evidence-based medicine (EBM) workflow for generating and updating clinical practice guidelines.

The system is designed to support:

GUIDE Advocates Human-in-the-Loop 🧑‍⚕️🩺 🤝 🤖💻

GUIDE is designed as an interactive tool for augmenting human experts. High-volume and rule-based tasks are delegated to specialized AI agents, while medical experts retain oversight, adjudication authority, and final responsibility for judgment-intensive steps.

It is NOT intended to replace:

Quick Start 🚀

The recommended machines for deployment are:

Prerequisites ⚙️🔧

First-Time Machine Setup

For a first-time user on a Linux/macOS machine, run the bootstrap script from the project root:

bash ./run_bootstrap_script.sh
# After the bootstrap succeeds, do a test run in development mode:
bash ./start_development.sh

This installs the local development prerequisites used by this repository:

The development startup flow does not use Docker. start_development.sh starts the local R plumber service on 127.0.0.1:8102 and points the backend at that URL.
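Before debugging the backend, it can help to confirm that the plumber service is actually listening. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical and not part of the repository; the host and port are the ones stated above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:8102 is the address start_development.sh uses for the R plumber service.
    status = "reachable" if is_port_open("127.0.0.1", 8102) else "not reachable"
    print(f"R plumber service is {status}")
```

A successful TCP connection only shows that something is listening on the port; it does not verify that the service behind it is healthy.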

Start-Up in Production Mode

Deploying in production mode requires:

bash start_docker.sh

Notes

We are not making the source code of GUIDE publicly available at this stage owing to the safety implications of unmonitored use of such a system in medical guideline workflows. Inappropriate deployment outside expert-supervised settings could lead to misuse in evidence appraisal, recommendation formulation, or clinical decision-support contexts beyond the validated scope of this study.

For peer review only, we have provided the source code, testing dataset, and demonstration video as supplementary materials for the editors and expert reviewers in the submission system.

Introduction to Each Step

1. Clinical Question to Protocol 🙋 ➡️ 🗺️

GUIDE starts by decomposing an arbitrary clinical question into PICO elements and generates a structured Review Protocol table, which controls all subsequent steps.

The Review Protocol covers the following terms:

To support human-computer interaction, all of the fields above are editable, so human experts can control downstream evidence retrieval, screening, outcome extraction, and later steps.
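One way to picture the editable protocol is as a plain data record that later steps read from. The sketch below is illustrative only: the class names `PICO` and `ReviewProtocol`, the field layout, and the example question are assumptions, not GUIDE's actual schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PICO:
    population: str
    intervention: str
    comparison: str
    outcome: str


@dataclass(frozen=True)
class ReviewProtocol:
    question: str
    pico: PICO
    inclusion_criteria: tuple[str, ...] = ()
    exclusion_criteria: tuple[str, ...] = ()


# Hypothetical AI-drafted protocol for a made-up example question.
draft = ReviewProtocol(
    question="Do statins reduce stroke risk in adults with diabetes?",
    pico=PICO(
        population="Adults with diabetes",
        intervention="Statin therapy",
        comparison="Placebo or no statin",
        outcome="Stroke",
    ),
    inclusion_criteria=("Randomized controlled trials",),
)

# An expert edit yields a new protocol version; the draft is left unchanged.
edited = replace(draft, pico=replace(draft.pico, outcome="Ischemic stroke"))
print(edited.pico.outcome)
```

Frozen records are used here so that every expert edit creates a new version instead of silently mutating the draft, which keeps the AI draft and the human revision comparable.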

Clinical question to protocol interface
Protocol Generation. A free-text clinical question is transformed into editable PICO fields and a structured review protocol.

2. Search Strategy Generation 🗺️ ➡️ 🔍

GUIDE generates database-specific search strategies from the Review Protocol, including:

GUIDE supports dual-agent search strategy generation, in which two agents with different LLM configurations independently generate search strategies. An AI reviewer compares the retrieval results and flags potentially unreliable searches, such as large discrepancies in retrieval volume or low overlap between search outputs.
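The two checks named above can be sketched as simple set arithmetic over the record IDs each search returns. This is an illustration, not GUIDE's reviewer logic: the function name `compare_searches`, the use of Jaccard overlap, and both default thresholds are assumptions.

```python
def compare_searches(ids_a, ids_b, max_volume_ratio=3.0, min_overlap=0.3):
    """Compare two retrieval result sets and return (metrics, flags)."""
    set_a, set_b = set(ids_a), set(ids_b)
    union = set_a | set_b
    # Jaccard overlap: shared records divided by all distinct records.
    overlap = len(set_a & set_b) / len(union) if union else 1.0
    larger = max(len(set_a), len(set_b))
    smaller = min(len(set_a), len(set_b))
    # Volume ratio: how many times larger one result set is than the other.
    if smaller:
        volume_ratio = larger / smaller
    else:
        volume_ratio = 1.0 if larger == 0 else float("inf")
    flags = []
    if volume_ratio > max_volume_ratio:  # hypothetical threshold
        flags.append("volume_discrepancy")
    if overlap < min_overlap:  # hypothetical threshold
        flags.append("low_overlap")
    return {"overlap": overlap, "volume_ratio": volume_ratio}, flags


metrics, flags = compare_searches(range(10), [0, 1])
print(metrics, flags)
```

In this example one search returns five times as many records as the other and only two of ten distinct records are shared, so both flags are raised.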

Dual-agent search strategy generation interface
Dual search strategy generation. Dual agents draft database-specific retrieval logic.

3. Literature Retrieval 🔍 ➡️ 📚📚📚

Execute literature searches through biomedical databases and retrieve:

Supported sources include:

Literature retrieval and screening queue interface
Retrieved Articles. Search results are collected into a working set that supports bulk review, export, and staging for batch screening.

4. Two-Stage Article Screening 📚📚📚 ➡️ ✨📚✨

GUIDE supports sequential evidence screening:

Screening agents evaluate articles against the Protocol-derived inclusion and exclusion criteria and provide:

In dual-agent mode, two independent agents screen the same evidence pool. When disagreement occurs, the AI reviewer triggers expert adjudication.
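The disagreement trigger can be sketched as a comparison of the two agents' decisions per article. The function name, the decision labels, and the article IDs below are placeholders chosen for illustration; GUIDE's internal representation may differ.

```python
def find_disagreements(decisions_a, decisions_b):
    """Return sorted article IDs where the two agents' decisions differ.

    An article screened by only one agent also counts as a disagreement.
    """
    all_ids = set(decisions_a) | set(decisions_b)
    return sorted(
        article_id
        for article_id in all_ids
        if decisions_a.get(article_id) != decisions_b.get(article_id)
    )


agent_a = {"pmid:1": "include", "pmid:2": "exclude", "pmid:3": "include"}
agent_b = {"pmid:1": "include", "pmid:2": "include", "pmid:3": "include"}
needs_adjudication = find_disagreements(agent_a, agent_b)
print(needs_adjudication)  # ['pmid:2']
```

Only the returned articles would be routed to an expert, which is what lets concordant decisions pass without manual review.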

Title and abstract screening workspace
Title-Abstract Screening
Screening reason popover
Agent Rationale Review
Full-text screening workspace
Get PDF Files for Full-text Screening
Full-text Extraction
Batch Upload PDF and Extract Full-text with OCR
Batch PDF upload and article assignment modal
Full-text Screening Results

5. Full-Text Evidence Extraction 📚 ➡️ 📖

For included studies, GUIDE extracts structured evidence from full-text articles to support guideline evidence tables and downstream evidence synthesis.

The extracted information includes:

GUIDE supports extraction of multiple outcome types, including:

GUIDE also extracts numerical results into structured tables according to outcome type, including:

GUIDE supports text and table extraction from PDFs. When information is missing, unclear, or not applicable, the system marks it explicitly to support transparent expert review.
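Explicit marking can be modeled by giving every extracted field a status next to its value, so that an empty cell is never ambiguous. The sketch below is one possible shape under that assumption; `FieldStatus`, `ExtractedField`, and the example field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FieldStatus(Enum):
    REPORTED = "reported"
    MISSING = "missing"
    UNCLEAR = "unclear"
    NOT_APPLICABLE = "not_applicable"


@dataclass(frozen=True)
class ExtractedField:
    name: str
    status: FieldStatus
    value: Optional[str] = None

    def __post_init__(self):
        # A value must be present exactly when the field is marked as reported.
        if (self.status is FieldStatus.REPORTED) != (self.value is not None):
            raise ValueError(f"{self.name}: value and status are inconsistent")


def fields_needing_review(fields):
    """Return names of fields that were not cleanly reported."""
    return [f.name for f in fields if f.status is not FieldStatus.REPORTED]


extracted = [
    ExtractedField("sample_size", FieldStatus.REPORTED, "120"),
    ExtractedField("follow_up_duration", FieldStatus.UNCLEAR),
    ExtractedField("dose", FieldStatus.MISSING),
]
print(fields_needing_review(extracted))
```

Separating status from value prevents a missing number from being confused with a zero or an empty string during later synthesis.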

Data extraction tables
Extracted tables from each included article
Clinical Evidence Summary Table
Outcome discrepancy report
Shows dual-agent discrepancies in extracted outcome fields for human editing.

6. GRADE-Based Effect Estimate and Quality Assessment 💯📈

GUIDE supports GRADE-ready evidence synthesis by selecting an appropriate effect-estimate approach for each outcome before certainty assessment.

For each outcome, GUIDE identifies:

Domain experts group study-level outcomes into comparison- and outcome-level evidence bodies for effect synthesis.
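The grouping decision belongs to the experts, but its result can be pictured as a mapping from a (comparison, outcome) pair to the studies that contribute to it. The record keys and study labels in this sketch are hypothetical.

```python
from collections import defaultdict


def group_evidence(study_outcomes):
    """Group study-level outcome records by (comparison, outcome)."""
    bodies = defaultdict(list)
    for record in study_outcomes:
        key = (record["comparison"], record["outcome"])
        bodies[key].append(record["study"])
    return dict(bodies)


records = [
    {"study": "Study A", "comparison": "Drug X vs placebo", "outcome": "Mortality"},
    {"study": "Study B", "comparison": "Drug X vs placebo", "outcome": "Mortality"},
    {"study": "Study A", "comparison": "Drug X vs placebo", "outcome": "Stroke"},
]
for key, studies in group_evidence(records).items():
    print(key, studies)
```

Each resulting group is one candidate evidence body for effect synthesis; a group with a single study cannot be pooled and would be summarized on its own.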

GUIDE supports effect estimates including:

When appropriate, GUIDE calculates pooled effect estimates using random-effects meta-analysis via an R script.
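To make the pooling step concrete, here is a small Python sketch of random-effects pooling. GUIDE performs this step in R, and the source does not say which between-study variance estimator it uses, so the DerSimonian-Laird estimator below is an assumption chosen because it is a common default. The function name and input values are illustrative.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effects with the DerSimonian-Laird random-effects estimator.

    effects: per-study estimates on an additive scale (e.g. log risk ratios).
    variances: the corresponding within-study variances.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")
    k = len(effects)
    w = [1.0 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted dispersion of study effects around the fixed-effect mean.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "i2": i2,
    }


result = random_effects_pool([0.0, 2.0], [0.1, 0.1])
print(result)
```

The returned `i2` value is the inconsistency statistic as a percentage. With the two deliberately discordant studies above, the between-study variance dominates, so the pooled standard error is much wider than a fixed-effect analysis would give.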

GUIDE then supports GRADE-oriented certainty assessment across key domains:

AI agents can perform preliminary effect-estimate synthesis and grading, while expert review remains central for judgment-intensive domains and final certainty decisions.

GRADE grouping and meta-analysis preparation view
Automatic and manual grouping for meta-analysis preparation
Meta-analysis result and certainty summary
Pooled results and certainty summary. Meta-analysis outputs are presented with effect estimates, inconsistency, and a GRADE-ready narrative interpretation.

7. Body-of-Evidence Assembly

GUIDE organizes extracted study-level evidence into structured bodies of evidence for each clinical outcome.

This includes:

8. Delphi-Style Multi-Agent Consensus

GUIDE includes a multi-agent virtual consensus module that simulates a guideline committee.

Supported roles may include:

Each AI panelist provides an independent role-specific assessment. A committee chair agent synthesizes agreement, disagreement, and unresolved issues. Multiple deliberation rounds can be conducted until consensus criteria are met or expert intervention is required.
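The round structure described above can be sketched as a loop that stops on consensus or hands over to an expert when the rounds run out. Everything specific here is an assumption: the role names, the vote labels, the 80% agreement threshold, and the use of plain callables in place of LLM-backed panelist and chair agents.

```python
from collections import Counter


def run_delphi(panelists, chair_summary, max_rounds=3, agreement_threshold=0.8):
    """Run voting rounds until agreement reaches the threshold or rounds run out.

    panelists: mapping of role name -> callable(round_no, feedback) -> vote string.
    chair_summary: callable(votes) -> feedback string shared in the next round.
    Returns (status, leading_vote, rounds_used).
    """
    if not panelists:
        raise ValueError("at least one panelist is required")
    feedback = ""
    leading_vote = None
    for round_no in range(1, max_rounds + 1):
        votes = {role: vote(round_no, feedback) for role, vote in panelists.items()}
        leading_vote, count = Counter(votes.values()).most_common(1)[0]
        if count / len(votes) >= agreement_threshold:
            return "consensus", leading_vote, round_no
        # The chair's synthesis becomes the input to the next round.
        feedback = chair_summary(votes)
    return "expert_intervention_required", leading_vote, max_rounds


# Stub panelists standing in for role-specific agents (hypothetical roles).
panel = {
    "clinician": lambda rnd, fb: "conditional",
    "methodologist": lambda rnd, fb: "conditional",
    "patient_representative": lambda rnd, fb: "strong" if rnd == 1 else "conditional",
}
outcome = run_delphi(panel, lambda votes: f"Votes: {sorted(votes.values())}")
print(outcome)
```

In this stubbed run, two of three panelists agree in round 1, which is below the threshold, and all three agree in round 2, so the loop ends with consensus after two rounds.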

Delphi-style multi-agent consensus discussion
Virtual committee deliberation. Role-specific panelists debate recommendation wording and certainty until the committee converges on a consensus.

9. Evidence-to-Recommendation Synthesis

GUIDE supports structured Evidence-to-Recommendation synthesis by integrating:

This module helps bridge the gap between evidence appraisal and clinically interpretable recommendations.

Evidence-to-recommendation synthesis report
Recommendation report synthesis. The committee output is consolidated into a structured Evidence-to-Recommendation report with rationale, tradeoffs, and implementation framing.

10. Human-in-the-Loop Reviewer Mechanism

GUIDE is built around a governance model in which an AI reviewer triggers expert intervention.

The AI reviewer can flag:

This design allows experts to focus on high-value arbitration rather than continuously monitoring every automated step.

11. AI Assistant Interface

A LangChain-based conversational agent orchestrates the workflow through natural language.

The Assistant can:

AI assistant interface for workflow orchestration
Conversational workflow control. The assistant translates natural-language instructions into tool-backed workflow actions such as PICO extraction and strategy refinement.