Clarion Sentinel Platform · Hematology Division

Engine Technical Design Document

Architecture, pipeline design, model specification, and performance validation across eight AI detection engines for complete blood count intelligence.

Document Class: Technical Design Specification
Platform: Sentinel Hema · Blood Intelligence
Version: 3.2.0
Classification: Confidential · Internal
Table of Contents
01 · CBC Pattern Intelligence: Multi-parameter constellation analysis
02 · Peripheral Smear AI: Diffusion-based morphology classification
03 · Leukemia Detection: Blast identification & subtype screening
04 · Anemia Classification: Morphological etiology determination
05 · Coagulation Intelligence: Platelet & coagulation cascade analysis
06 · Infection Typing & Severity: Pre-culture pathogen differentiation
07 · Bone Marrow Stress Indicators: Non-invasive marrow function assessment
08 · Longitudinal Trend Intelligence: Temporal trajectory & predictive modeling
Executive Summary

Sentinel Hema is built on a thesis that the complete blood count — the most frequently ordered and most underread laboratory test in medicine — contains layers of diagnostic intelligence that standard reference-range flagging systematically fails to extract. Each CBC generates 37 or more discrete parameters. Evaluated in isolation, these values yield binary normal/abnormal flags. Evaluated as an interconnected constellation, they reveal disease signatures invisible to conventional interpretation.

This document specifies the technical architecture, processing pipeline, model design, and validation performance for each of the platform's eight AI detection engines. Together, these engines transform the CBC from a diagnostic checklist into a continuous surveillance and classification system spanning hematological malignancy, anemia etiology, coagulation risk, infection typing, bone marrow function, and predictive trajectory analysis.

The engine suite employs a shared data ingestion layer for HL7/FHIR interoperability, with each engine operating as an independent analytical module that can trigger cross-engine cascades when its findings implicate a related domain. This architecture enables both real-time clinical decision support and longitudinal monitoring across the full hematological spectrum.

Model architectures range from gradient-boosted ensembles for tabular CBC data (Engine 01) to diffusion-based generative classifiers for morphological image analysis (Engine 02), hybrid CNN–Vision Transformer networks for malignancy screening (Engine 03), and LSTM-based temporal networks for longitudinal pattern detection (Engine 08). All engines undergo multi-center validation with independent external test cohorts and are designed for SMART on FHIR integration with existing EHR infrastructure.

8 Analysis Engines · 37+ CBC Parameters · 500K+ Training Images · <2s End-to-End Inference
Engine 01 · Core Analytic Layer

CBC Pattern Intelligence

Thirty-seven parameters as a constellation — not a checklist. Every CBC tells a story most systems never read.

37+ Parameters · 96.2% AUROC · 0.4s Latency
Processing Pipeline
01 · Data Ingestion
HL7/FHIR intake from Sysmex XN, Beckman DxH, and Abbott Alinity analyzers. Unit harmonization and delta-check flagging across vendor formats.
HL7v2 · FHIR R4 · LOINC
02 · Feature Engineering
22 derived ratio computations, including NLR, dNLR, PLR, LMR, SII, SIRI, AISI, and HPR. RDW-to-MCV coupling. Reticulocyte production index.
22 Ratios · DAG Selection
03 · Pattern Recognition
Gradient-boosted ensemble (CatBoost + XGBoost) across multi-dimensional parameter space. 2.1M anonymized CBC training records.
CatBoost · XGBoost · Ensemble
04 · Constellation Mapping
SHAP-based feature attribution maps parameter clusters to 84 diagnostic phenotypes. Interpretable constellation diagrams.
SHAP · UMAP · t-SNE
05 · Clinical Output
Risk-stratified suggestions with confidence intervals. Downstream engine triggers. EHR alert dispatch via CDS Hooks.
SMART on FHIR · CDS Hooks
Model Architecture

The core of Engine 01 is a gradient-boosted ensemble that processes all 37+ CBC parameters simultaneously rather than evaluating each against isolated reference ranges. Feature importance analysis consistently identifies PDW, immature platelet fraction, neutrophil percentage, and RDW as the most discriminative predictors.

Eight CBC-derived inflammatory ratios — NLR, dNLR, LMR, PLR, SII, SIRI, AISI, and HPR — transform nonspecific individual markers into precise composite signatures. A directed acyclic graph method selects optimal feature combinations, enabling a reduced model with as few as four features to retain AUC above 94%.
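As an illustration of the ratio layer, the sketch below computes seven of the eight named ratios from absolute counts using their commonly published definitions. The `Differential` container, its field names, and the unit convention are assumptions made for this example, not part of the platform's interface. HPR is left out because this document does not define it.

```python
from dataclasses import dataclass


@dataclass
class Differential:
    """Absolute counts in 10^9 cells/L (unit convention is an assumption of this sketch)."""
    wbc: float
    neutrophils: float
    lymphocytes: float
    monocytes: float
    platelets: float


def inflammatory_ratios(d: Differential) -> dict[str, float]:
    """Compute CBC-derived inflammatory ratios from one differential."""
    n, l, m, p = d.neutrophils, d.lymphocytes, d.monocytes, d.platelets
    if l <= 0 or m <= 0 or d.wbc <= n:
        raise ValueError("lymphocytes, monocytes, and (WBC - neutrophils) must be positive")
    return {
        "NLR": n / l,            # neutrophil-to-lymphocyte ratio
        "dNLR": n / (d.wbc - n),  # derived NLR
        "PLR": p / l,            # platelet-to-lymphocyte ratio
        "LMR": l / m,            # lymphocyte-to-monocyte ratio
        "SII": p * n / l,        # systemic immune-inflammation index
        "SIRI": n * m / l,       # systemic inflammation response index
        "AISI": n * m * p / l,   # aggregate index of systemic inflammation
    }
```

A reduced model of the kind described above would consume a small subset of these values as its input features.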

Training & Validation

The training corpus comprises 2.1 million anonymized CBC records from academic medical centers, community hospitals, and ambulatory clinics. Disease representation is balanced through hybrid synthetic data generation based on statistical feature distributions.

Validation follows a discovery-validation cohort design with independent external testing. Analytical precision, expressed as coefficient of variation (CV), is under 3% for WBC, under 2.5% for hemoglobin, and under 6% for RBC, which meets European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) guidelines.
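The precision figures can be checked with a replicate-based CV calculation. The following is a minimal sketch: the CV ceilings come from the text above, while the analyte keys and function names are hypothetical.

```python
from statistics import mean, stdev

# CV ceilings quoted in the text, in percent (keys are illustrative)
CV_LIMITS = {"WBC": 3.0, "HGB": 2.5, "RBC": 6.0}


def coefficient_of_variation(replicates: list[float]) -> float:
    """Return CV% = 100 * sample standard deviation / mean."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicate measurements")
    return 100.0 * stdev(replicates) / mean(replicates)


def meets_precision(analyte: str, replicates: list[float]) -> bool:
    """True when the replicate CV is under the ceiling for this analyte."""
    return coefficient_of_variation(replicates) < CV_LIMITS[analyte]
```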

Diagnostic Phenotype Coverage
  • Iron deficiency (microcytic + elevated RDW) before frank anemia
  • Occult malignancy screening via NLR/PLR inflammatory signatures
  • Sepsis risk stratification through immature granulocyte fraction
  • MDS flagging via multi-lineage dysplasia patterns
  • Hemolysis detection through reticulocyte-haptoglobin coupling
  • Nutritional deficiency profiling (B12, folate, iron trilogy)
  • Chronic inflammatory quantification for autoimmune monitoring
  • Bone marrow production stress from output indices
Integration Architecture

Engine 01 triggers downstream engines according to the type of finding: morphological anomalies activate Engine 02 (Peripheral Smear AI), inflammatory abnormalities cascade to Engine 06 (Infection Typing), and lineage-specific cytopenias trigger Engine 07 (Bone Marrow Stress).
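The cascade described here amounts to a routing table from finding type to downstream engine. A minimal sketch follows; the `Finding` enum and the function name are hypothetical, and only the three mappings come from the text.

```python
from enum import Enum


class Finding(Enum):
    MORPHOLOGICAL_ANOMALY = "morphological_anomaly"
    INFLAMMATORY_ABNORMALITY = "inflammatory_abnormality"
    LINEAGE_CYTOPENIA = "lineage_cytopenia"


# Routing table taken from the text: finding type -> downstream engine
CASCADE = {
    Finding.MORPHOLOGICAL_ANOMALY: "Engine 02 (Peripheral Smear AI)",
    Finding.INFLAMMATORY_ABNORMALITY: "Engine 06 (Infection Typing)",
    Finding.LINEAGE_CYTOPENIA: "Engine 07 (Bone Marrow Stress)",
}


def downstream_engines(findings: set[Finding]) -> list[str]:
    """Return the engines to trigger, in a stable sorted order."""
    return sorted(CASCADE[f] for f in findings)
```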

All outputs are structured as FHIR DiagnosticReport resources, with CDS Hooks for real-time EHR integration. SMART on FHIR launch supports in-context clinical display alongside native analyzer results.
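For orientation, a CDS Hooks response carries one or more cards, each with a short summary, an indicator, and a source label. The sketch below assembles such a payload as a plain dictionary. The source label text and the function name are assumptions, and the engine's actual card content is not specified in this document.

```python
import json

ALLOWED_INDICATORS = {"info", "warning", "critical"}


def build_cds_card(summary: str, detail: str, indicator: str = "warning") -> dict:
    """Assemble a CDS Hooks style response containing a single card."""
    if indicator not in ALLOWED_INDICATORS:
        raise ValueError(f"indicator must be one of {sorted(ALLOWED_INDICATORS)}")
    if len(summary) > 140:
        raise ValueError("card summary should stay under 140 characters")
    return {
        "cards": [
            {
                "summary": summary,
                "indicator": indicator,
                "detail": detail,
                # Hypothetical label; the real source string is not given in this document
                "source": {"label": "Sentinel Hema Engine 01"},
            }
        ]
    }


if __name__ == "__main__":
    card = build_cds_card(
        "Possible early iron deficiency",
        "RDW elevated with low-normal MCV; consider iron studies.",
    )
    print(json.dumps(card, indent=2))
```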

Performance Validation
Metric | Score
Overall AUROC | 96.2%
Anemia Detection | 97.8%
Leukemia Flagging | 94.1%
Infection Typing | 92.6%
Reduced Model (4 features) | 94.9%
Clinical Impact Assessment

By analyzing 37+ parameters as an interconnected constellation, Engine 01 identifies patterns that reference-range checks miss — including early malignancy signatures in inflammatory ratios and pre-anemic iron depletion visible only through RDW–MCV coupling dynamics.

  • 23%: More early iron deficiency detections vs. standard flagging
  • 3.2×: Increase in confirmed subclinical malignancy referrals
  • 41%: Reduction in unnecessary repeat CBC orders
Engine 02 · Visual Morphology Layer

Peripheral Smear AI

Humans cannot examine every cell in a smear. This engine can — and knows when it is uncertain.

500K+ Training Images · 97.4% Accuracy · 26 Cell Subtypes
Processing Pipeline
01 · Digital Capture
100× oil-immersion digitization. Wright-Giemsa stain quality validation. Z-stack 3D imaging beyond diffraction limits.
100× Immersion · Z-Stack 3D
02 · Segmentation
U-Net isolates individual cells from complex backgrounds. 98.1% extraction accuracy handling overlaps, artifacts, debris.
U-Net · Instance Seg.
03 · Generative Classification
Diffusion-based generative classifier models the full morphology distribution: accurate classification with anomaly detection and domain-shift resistance.
Diffusion Model · Generative
04 · Anomaly Detection
OOD scoring identifies rare morphologies. Uncertainty quantification calibrated to surpass clinical expert benchmarks.
OOD Scoring · UQ
05 · Clinical Triage
Routine smears auto-cleared with audit trail. Abnormal cells flagged with annotated morphology gallery for hematologist review.
Auto-Verify · Human-in-Loop
Model Architecture

A diffusion-based generative classifier models the full distribution of blood cell morphology rather than learning only the decision boundaries between classes. This yields accurate classification combined with anomaly detection, resistance to domain shift, and uncertainty quantification that surpasses clinical expert benchmarks.

Each cell is processed through a denoising diffusion probabilistic framework generating per-class likelihood scores — inherently data-efficient and adaptable to staining and imaging variation across institutions.
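To make the role of per-class likelihood scores concrete, the sketch below turns per-class log-likelihoods into a posterior with Bayes' rule, summarizes uncertainty as normalized entropy, and flags out-of-distribution cells when even the best class explains the cell poorly. The uniform class prior, the entropy measure, and the `floor` threshold are assumptions of this example. How the diffusion model produces the log-likelihoods is outside its scope.

```python
import math


def posterior_from_log_likelihoods(log_lik: dict[str, float]) -> dict[str, float]:
    """Bayes' rule with a uniform class prior: a softmax over log-likelihoods."""
    peak = max(log_lik.values())  # subtract the max for numerical stability
    weights = {c: math.exp(v - peak) for c, v in log_lik.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def normalized_entropy(posterior: dict[str, float]) -> float:
    """0 means fully confident, 1 means uniform over the classes."""
    k = len(posterior)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in posterior.values() if p > 0)
    return h / math.log(k)


def is_out_of_distribution(log_lik: dict[str, float], floor: float) -> bool:
    """Flag a cell whose best class log-likelihood falls below a chosen floor."""
    return max(log_lik.values()) < floor
```

Because the classifier is generative, the same likelihood scores serve both classification and anomaly detection. A purely discriminative model would need a separate mechanism for the latter.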

Cell Classification Taxonomy
  • WBC (10): Neutrophil, Band, Hypersegmented, Lymphocyte, Reactive Lymphocyte, Monocyte, Eosinophil, Basophil, Myeloblast, Lymphoblast
  • RBC (16): Normocyte, Microcyte, Macrocyte, Spherocyte, Schistocyte, Target, Teardrop, Sickle, Elliptocyte, Echinocyte, Stomatocyte, Bite, Pencil, Knizocyte, Hypochromic, Normoblast
  • Platelets: Normal, Giant, Clumped, Satellitism
  • Artifacts: Debris, Staining artifact, Bubble, Fiber
Training Dataset

Over 500,000 peripheral blood smear images — the largest curated collection of its kind. Includes common types, rare variants, and features that confuse both automated systems and humans: reactive lymphocytes mimicking blasts, fragments near platelet size, staining artifacts resembling inclusions.

Inter-observer studies show 15–20% discordance between experienced microscopists on identical smears. Engine 02 eliminates this variability with consistent, reproducible classification and calibrated uncertainty.

Uncertainty Quantification

Per-cell confidence distributions route uncertain cases to human review with annotated differential possibilities. Hematologists focus expertise on genuinely ambiguous cells rather than routine classification.

Dual-mode operation (auto-verify routine / flag uncertain) reduces hematologist workload 60–70% while maintaining precision for rare pathologies: circulating blasts, microangiopathic changes, parasitic inclusions.
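The dual-mode behavior can be sketched as a simple routing rule over per-cell calls. The confidence cut-off and the set of always-review classes below are hypothetical values chosen for illustration, not the engine's configured thresholds.

```python
from dataclasses import dataclass

# Hypothetical: classes that always go to a hematologist regardless of confidence
ALWAYS_REVIEW = {"Myeloblast", "Lymphoblast", "Schistocyte"}


@dataclass
class CellCall:
    label: str
    confidence: float  # posterior probability of the predicted class


def triage_smear(cells: list[CellCall], min_confidence: float = 0.95):
    """Auto-verify a smear only if every cell is confident and routine."""
    flagged = [
        c for c in cells
        if c.confidence < min_confidence or c.label in ALWAYS_REVIEW
    ]
    if flagged:
        return "human_review", flagged
    return "auto_verified", []
```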

Performance Validation
Metric | Score
Cell Classification | 97.4%
WBC Differential | 95.8%
RBC Morphology | 93.4%
Anomaly Detection | 96.1%
Cross-Lab Generalization | 94.2%
Clinical Impact Assessment

A standard blood smear contains thousands of cells — far more than any human can examine one by one. Engine 02 automates exhaustive analysis, triages routine cases, and highlights unusual findings, transforming the peripheral smear from bottleneck to rapid diagnostic asset.

  • 65%: Reduction in manual smear review time
  • <2 min: Full smear analysis vs. 15–20 min manual
  • 15–20%: Inter-observer discordance eliminated
Engine 03 · Malignancy Screening Layer

Leukemia Detection

Every hour of delay costs therapeutic options. This engine buys them back.

94.8% Sensitivity · 97.1% Specificity · 4 Leukemia Types
Processing Pipeline
01 · Multi-Signal Intake
Fuses CBC constellation (Engine 01), morphology (Engine 02), and immature cell fractions from automated analyzers.
Multi-Engine Fusion
02 · Blast Identification
CNN-based blast detector differentiates true blasts from reactive lymphocytes and monocyte precursors.
ResNet-152 · Attention
03 · Subtype Classification
Hierarchical classifier: ALL, AML, CLL, CML via N:C ratio, chromatin texture, granulation profiling.
Vision Transformer · Hybrid CNN
04 · MDS Screening
Multi-lineage dysplasia analysis across WBC, RBC, and platelet morphology for early MDS detection.
Multi-Lineage · Ensemble
05 · Urgent Escalation
Critical alerts with immunophenotyping panels. Flow cytometry pre-order. Direct hematologist page for blast >5%.
Critical Alert · Auto-Reflex
Detection Methodology

A hybrid CNN–Vision Transformer captures both local cellular features (nuclear morphology, granulation) and global slide-level patterns (blast percentage, distribution). Transfer learning from Engine 02's 500K+ image corpus provides the backbone, and task-specific fine-tuning enables precise subtype discrimination.
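The slide-level blast percentage feeds the Urgent Escalation step, which pages a hematologist when blasts exceed 5%. A minimal sketch of that aggregation follows, assuming per-cell labels drawn from Engine 02's WBC taxonomy. The function names are hypothetical.

```python
from collections import Counter

BLAST_LABELS = {"Myeloblast", "Lymphoblast"}  # blast classes in the WBC taxonomy
PAGE_THRESHOLD_PCT = 5.0  # "blast >5%" from the Urgent Escalation step


def blast_percentage(wbc_labels: list[str]) -> float:
    """Percentage of classified white cells that carry a blast label."""
    if not wbc_labels:
        raise ValueError("no white cells classified on this slide")
    counts = Counter(wbc_labels)
    blasts = sum(counts[label] for label in BLAST_LABELS)
    return 100.0 * blasts / len(wbc_labels)


def should_page_hematologist(wbc_labels: list[str]) -> bool:
    """Escalate only when the blast percentage strictly exceeds the threshold."""
    return blast_percentage(wbc_labels) > PAGE_THRESHOLD_PCT
```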

Subtype Signatures
  • ALL: Lymphoblasts, high N:C ratio, fine chromatin, PAS-positive cytoplasm
  • AML: Myeloblasts with Auer rods, irregular nuclei, azurophilic granulation
  • CLL: Mature small lymphocytes, smudge cells, monotonous population
  • CML: Full myeloid spectrum, basophilia, dwarf megakaryocytes
  • MDS: Hyposegmented neutrophils, ring sideroblasts, micromegakaryocytes
Performance Validation
Metric | Score
Blast Detection | 94.8%
ALL vs. AML | 91.2%
CLL Screening | 96.3%
MDS Flagging | 89.6%
False Positive Rate | 2.9%
Clinical Impact Assessment

Roughly 62,000 new leukemia cases are diagnosed annually in the US. The CBC is often the first signal, yet subtle blasts and early dysplasia are routinely missed by automated differentials. Engine 03 transforms the CBC into active malignancy surveillance.

  • 8.4 h: Time saved to hematology consult
  • 31%: More MDS detected before transfusion dependence
Engine 04 · Red Cell Intelligence Layer

Anemia Classification

Beyond hemoglobin — determining why the patient is anemic, not merely that they are.

93.4% F1 Score · κ 0.89 Expert Agreement · 12 Subtypes
Processing Pipeline
01 · Index Analysis
MCV/MCH/MCHC clustering with RDW. Mentzer index for thalassemia. Reticulocyte production index for marrow response.
MCV Gating · RPI
02 · Morphology Fusion
Engine 02 RBC data: microcytes, targets, sickle cells, schistocytes, teardrops, spherocytes mapped to etiology clusters.
16 RBC Types · Cross-Engine
03 · Etiology Modeling
Semi-supervised classifier (FixMatch, 25% annotation). 12 subtypes. 93.4% F1, κ = 0.89 with expert diagnoses.
FixMatch · Semi-Supervised
04 · Iron Studies Prediction
Surrogate model predicts ferritin/TIBC/transferrin saturation from CBC morphology for provisional classification.
Surrogate Model
05 · Treatment Guidance
Etiology-specific workup recommendations. Reticulocyte response prediction at 7 and 14 days post-intervention.
Decision Support
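The two indices named in the Index Analysis step have simple closed forms. The sketch below uses the usual textbook definitions: Mentzer index = MCV / RBC count, and RPI = reticulocyte % × (hematocrit / 45) / maturation factor. The stepwise maturation table is one commonly cited variant and is an assumption here, since this document does not say which variant the engine applies.

```python
def mentzer_index(mcv_fl: float, rbc_e12_per_l: float) -> float:
    """MCV (fL) divided by RBC count (10^12/L); values under 13 favor thalassemia trait."""
    if rbc_e12_per_l <= 0:
        raise ValueError("RBC count must be positive")
    return mcv_fl / rbc_e12_per_l


def maturation_factor(hematocrit_pct: float) -> float:
    """Stepwise reticulocyte maturation time in days (assumed table)."""
    if hematocrit_pct >= 40:
        return 1.0
    if hematocrit_pct >= 30:
        return 1.5
    if hematocrit_pct >= 20:
        return 2.0
    return 2.5


def reticulocyte_production_index(retic_pct: float, hematocrit_pct: float) -> float:
    """Reticulocyte % corrected for anemia (normal Hct 45) and maturation time."""
    corrected = retic_pct * hematocrit_pct / 45.0
    return corrected / maturation_factor(hematocrit_pct)
```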
Classification Taxonomy
  • Microcytic: Iron deficiency, thalassemia trait, chronic disease, sideroblastic
  • Normocytic: Acute blood loss, chronic disease, renal insufficiency, mixed deficiency
  • Macrocytic: B12 deficiency, folate deficiency, MDS, hepatic disease
  • Hemolytic: Autoimmune, microangiopathic (TTP/HUS), spherocytosis, sickle cell
Morphological Decision Logic

Microcytic + elevated RDW → iron deficiency. Microcytic + normal RDW + targets → thalassemia trait. Schistocytes → microangiopathic hemolysis workup. Teardrops + nucleated RBCs → marrow infiltration flag.
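These rules translate directly into a small rule function. In the sketch below, the MCV and RDW cut-offs are hypothetical placeholders (the text does not state the engine's thresholds), and the morphology inputs are assumed to arrive as booleans from Engine 02.

```python
def morphology_flags(
    mcv_fl: float,
    rdw_pct: float,
    *,
    target_cells: bool = False,
    schistocytes: bool = False,
    teardrops: bool = False,
    nucleated_rbcs: bool = False,
    mcv_low: float = 80.0,   # hypothetical microcytosis cut-off (fL)
    rdw_high: float = 14.5,  # hypothetical upper reference limit (%)
) -> list[str]:
    """Apply the four decision rules from the text and return every match."""
    flags = []
    microcytic = mcv_fl < mcv_low
    if microcytic and rdw_pct > rdw_high:
        flags.append("iron deficiency")
    if microcytic and rdw_pct <= rdw_high and target_cells:
        flags.append("thalassemia trait")
    if schistocytes:
        flags.append("microangiopathic hemolysis workup")
    if teardrops and nucleated_rbcs:
        flags.append("marrow infiltration flag")
    return flags
```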

The semi-supervised approach achieves κ = 0.89 agreement with expert diagnoses while reducing diagnostic turnaround, which is especially valuable for sickle cell and microcytic populations.
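For reference, the κ statistic quoted here is Cohen's kappa, which corrects raw agreement for the agreement expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch of the calculation on paired model and expert labels follows. The function name and label strings are illustrative.

```python
from collections import Counter


def cohens_kappa(model: list[str], expert: list[str]) -> float:
    """Chance-corrected agreement between two label sequences."""
    if len(model) != len(expert) or not model:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(model)
    observed = sum(a == b for a, b in zip(model, expert)) / n
    model_counts, expert_counts = Counter(model), Counter(expert)
    # Expected agreement if the two raters labeled independently
    expected = sum(model_counts[c] * expert_counts[c] for c in model_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)
```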

Performance Validation
Metric | Score
Overall F1 | 93.4%
Iron Deficiency | 96.7%
Sickle Cell | 95.2%
Thalassemia Trait | 91.8%
Hemolytic Subtypes | 89.3%
Clinical Impact Assessment

Anemia affects one-third of the global population, yet etiology is frequently misclassified. Engine 04 transforms the CBC from a hemoglobin threshold into an etiological classification system — guiding targeted workup rather than empiric iron supplementation.

  • 47%: Reduction in empiric iron for non-iron-deficient anemias
  • 2.1 d: Faster correct etiology determination
Engine 05 · Hemostasis Layer

Coagulation Intelligence

Platelet count alone is a number. This engine reveals the mechanism — and predicts the trajectory.

91.7% DIC Prediction · 88.4% TTP Flagging · 6 h Early Warning
Processing Pipeline
01 · Platelet Profiling
Count, MPV, PDW, IPF, P-LCR. Giant platelet and clump detection from Engine 02.
IPF · P-LCR · MPV
02 · Consumption Analysis
Platelet trajectory slope. Schistocyte quantification. Fibrinogen consumption surrogate from CBC parameters.
Trajectory · Schistocyte %
03 · DIC Scoring
Modified ISTH DIC score from CBC. Bayesian net: platelet trend + schistocytes + IPF kinetics + clinical context.
ISTH Modified · Bayesian Net
04 · TMA Detection
TTP, HUS, HELLP screening via schistocyte-platelet-LDH surrogate coupling and PLASMIC approximation.
TMA Screen · PLASMIC
05 · Intervention Triggers
Transfusion alerts. HIT 4T scoring. Platelet refractoriness via corrected count increment monitoring.
4T Score · CCI
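The corrected count increment (CCI) used for refractoriness monitoring normalizes the post-transfusion platelet rise by body surface area and transfused dose. The sketch below follows the standard formula. The default cut-off is an assumption, since commonly cited 1-hour thresholds vary by protocol and this document does not state the engine's value.

```python
def corrected_count_increment(
    pre_count: float,   # platelets before transfusion, 10^9/L
    post_count: float,  # platelets after transfusion, 10^9/L
    bsa_m2: float,      # body surface area, m^2
    dose_e11: float,    # platelets transfused, in units of 10^11
) -> float:
    """CCI = increment (per microliter) * BSA / dose (10^11 platelets)."""
    if bsa_m2 <= 0 or dose_e11 <= 0:
        raise ValueError("body surface area and dose must be positive")
    increment_per_ul = (post_count - pre_count) * 1_000  # 10^9/L -> cells per microliter
    return increment_per_ul * bsa_m2 / dose_e11


def is_poor_increment(cci: float, threshold: float = 7_500.0) -> bool:
    """Flag a poor response; the default threshold is an assumed, configurable value."""
    return cci < threshold
```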
DIC Detection Architecture

DIC mortality exceeds 40% when treatment is delayed. Engine 05 builds a modified ISTH score from platelet trajectory (not just absolute count), schistocyte percentage, and IPF kinetics as a fibrinogen consumption surrogate.

A Bayesian network enables probabilistic DIC staging (non-overt vs. overt) with a 6-hour early warning, identifying consumption before coagulation panels alarm.
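Scoring trajectory rather than absolute count requires a slope estimate over serial platelet counts. The sketch below fits an ordinary least-squares line to timestamped counts. The function names and the slope limit are hypothetical, and the engine's actual trajectory model is not specified in this document.

```python
def least_squares_slope(hours: list[float], counts: list[float]) -> float:
    """Slope of platelet count (10^9/L) per hour from an ordinary least-squares fit."""
    n = len(hours)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two paired observations")
    mean_t = sum(hours) / n
    mean_y = sum(counts) / n
    sxx = sum((t - mean_t) ** 2 for t in hours)
    if sxx == 0:
        raise ValueError("observations must span more than one time point")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(hours, counts))
    return sxy / sxx


def consumption_flag(hours: list[float], counts: list[float], slope_limit: float = -2.0) -> bool:
    """True when counts fall at least as fast as the (assumed) slope limit."""
    return least_squares_slope(hours, counts) <= slope_limit
```

A patient whose count falls from 180 to 90 over 18 hours is still above many absolute thresholds, yet the slope of −5 per hour already signals consumption.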

Thrombocytopenia Differential
  • Decreased Production: Marrow failure, MDS, chemo — low IPF, normal MPV
  • Increased Destruction: ITP, DIC, TTP/HUS — elevated IPF, large MPV
  • Sequestration: Hypersplenism — moderate TCP with pancytopenia
  • Pseudothrombocytopenia: EDTA clumping — Engine 02 morphology detection
  • HIT: Day 5–10, >50% drop, integrated 4T scoring
Performance Validation
Metric | Score
DIC Prediction | 91.7%
TTP / HUS Flagging | 88.4%
HIT Detection | 85.9%
Pseudo-TCP ID | 97.3%
Clinical Impact Assessment

EDTA-dependent pseudothrombocytopenia accounts for ~17% of low platelet flags. Engine 05 eliminates this artifact while providing coagulopathy risk stratification hours before traditional panels alarm.

  • 6 h: Earlier DIC identification vs. standard protocol
  • 17%: Pseudo-TCP cases correctly reclassified
Engine 06 · Infection Intelligence Layer

Infection Typing & Severity

Before the culture returns — the CBC already holds the answer.

92.3% Etiology Accuracy · 94.6% Severity AUC · 48 h Pre-Culture
Processing Pipeline
01 · WBC Differential
Neutrophils, bands, IG fraction, lymphocyte subtypes, monocytes, eosinophil patterns from analyzer.
5-Part Diff · IG Fraction
02 · Left-Shift Analysis
I:T ratio. Band:seg ratio with toxic granulation, Döhle bodies, vacuolization from Engine 02.
I:T Ratio · Toxic Changes
03 · Pathogen Typing
Bacterial vs. viral vs. parasitic probability on continuous spectrum. Random forest ensemble.
Random Forest · Ensemble
04 · Severity Scoring
NLR severity index. SII and SIRI composites. Bandemia alerts for sepsis cascade risk.
NLR · SII · SIRI
05 · Stewardship Output
Bacterial probability guides empiric therapy. Viral pattern reduces unnecessary ABX. Sentinel Sepsis integration.
ABX Stewardship
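The two left-shift measures in step 02 are simple ratios over the neutrophil series. A minimal sketch under the usual definitions follows: I:T is immature neutrophils over total neutrophils, and band:seg is bands over segmented forms. The argument names are illustrative.

```python
def it_ratio(bands: float, other_immature: float, segmented: float) -> float:
    """Immature-to-total neutrophil ratio; inputs share one unit (counts or %)."""
    immature = bands + other_immature
    total = immature + segmented
    if total <= 0:
        raise ValueError("total neutrophil count must be positive")
    return immature / total


def band_seg_ratio(bands: float, segmented: float) -> float:
    """Band-to-segmented neutrophil ratio."""
    if segmented <= 0:
        raise ValueError("segmented neutrophil count must be positive")
    return bands / segmented
```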
Infection Signatures

Bacterial: neutrophilia + left shift + toxic changes (heavy granulation, Döhle bodies, vacuolization). Viral: lymphocytosis + reactive morphology + relative neutropenia. The engine quantifies a continuous bacterial–viral probability spectrum to handle mixed and atypical presentations.

Antimicrobial Stewardship

Pre-culture bacterial probability score guides empiric therapy — reducing unnecessary antibiotic exposure for viral infections while ensuring rapid coverage for bacterial processes. Integrates with Sentinel Sepsis when severity scores exceed threshold for seamless escalation.

Performance Validation
Metric | Score
Bacterial vs. Viral | 92.3%
Severity Prediction | 94.6%
Left-Shift Detection | 96.8%
Parasitic Pattern | 87.2%
Clinical Impact Assessment

Cultures take 24–72 hours. Engine 06 provides probabilistic pathogen typing from the CBC alone, guiding antibiotic stewardship at the point of maximum clinical uncertainty.

  • 28%: Fewer unnecessary ABX for viral presentations
  • 48 h: Earlier pathogen-class guidance vs. culture
Engine 07 · Production Intelligence Layer

Bone Marrow Stress Indicators

A non-invasive window into marrow function — reading production stress without aspiration.

89.6% MDS Flagging · 91.2% Failure Detection · 3 Lineages
Processing Pipeline
01 · Multi-Lineage Assessment
Erythroid (RBC + reticulocyte), myeloid (neutrophil + IG), megakaryocytic (platelet + IPF) production indices.
Tri-Lineage · Production Index
02 · Dysplasia Scoring
Engine 02 morphology: hyposegmented neutrophils, hypogranulation, megaloblastoid changes, giant platelets.
Dysplasia % · Morphology
03 · Failure Recognition
Aplastic vs. MDS vs. infiltrative differentiation through production kinetics and morphological profiles.
Pattern Match · Kinetic Model
04 · Recovery Monitoring
Post-chemo nadir prediction. Engraftment tracking via reticulocyte and IPF recovery kinetics.
Nadir Predict · Engraftment
05 · Biopsy Recommendation
Evidence-weighted scoring. Risk-benefit analysis based on urgency and non-invasive confidence.
Decision Score
Non-Invasive Marrow Assessment

RPI reflects erythroid output, IG fraction indicates myeloid activity, IPF mirrors megakaryopoietic stress. Combined with Engine 02 dysplasia scoring, the system generates a multi-lineage marrow health report — confirming biopsy need or providing confidence to defer.

Failure Syndrome Differentiation
  • Aplastic: Pancytopenia + low reticulocytes/IG/IPF (empty-marrow pattern)
  • MDS: Cytopenias + dysplasia ≥10%, paradoxical reticulocyte response
  • Infiltrative: Leukoerythroblastic picture + teardrops
  • Nutritional: Megaloblastic + hypersegmented neutrophils — correctable
  • Post-Chemo: Predictable nadir, sequential recovery
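The differential above can be read as an ordered rule set. The sketch below encodes four of the five patterns as boolean checks. Post-chemo recognition is omitted because it depends on treatment history and serial counts. The input structure, the rule order, and the "Indeterminate" fallback are assumptions of this example. Only the ≥10% dysplasia criterion is taken from the text.

```python
from dataclasses import dataclass


@dataclass
class MarrowSignals:
    """Hypothetical summary of peripheral findings for one patient."""
    pancytopenia: bool = False
    any_cytopenia: bool = False
    low_production: bool = False  # low reticulocytes, IG, and IPF together
    dysplasia_pct: float = 0.0
    leukoerythroblastic: bool = False
    teardrops: bool = False
    megaloblastic: bool = False
    hypersegmented: bool = False


def failure_pattern(s: MarrowSignals) -> str:
    """Return the first matching pattern; the order of rules is an assumption."""
    if s.leukoerythroblastic and s.teardrops:
        return "Infiltrative"
    if (s.any_cytopenia or s.pancytopenia) and s.dysplasia_pct >= 10.0:
        return "MDS"
    if s.megaloblastic and s.hypersegmented:
        return "Nutritional"
    if s.pancytopenia and s.low_production:
        return "Aplastic"
    return "Indeterminate"
```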
Performance Validation
Metric | Score
MDS Flagging | 89.6%
Marrow Failure | 91.2%
Engraftment Prediction | 87.8%
Biopsy Recommendation | 92.4%
Clinical Impact Assessment

Engine 07 identifies patients who genuinely require biopsy while sparing those with sufficient peripheral blood clarity. Real-time engraftment monitoring enables precision-timed growth factor and transfusion support.

  • 34%: Fewer unnecessary bone marrow biopsies
  • 1.8 d: Earlier engraftment detection post-transplant
Engine 08 · Temporal Intelligence Layer

Longitudinal Trend Intelligence

A single CBC is a photograph. A series is a motion picture. This engine reads the film.

14 d Avg Warning · 93.1% Trend AUROC
History Depth
Processing Pipeline
01 · Temporal Aggregation
Complete CBC history as multivariate time-series. Cross-vendor normalization for longitudinal consistency.
Time-Series · Normalization
02 · Trajectory Modeling
Hybrid LSTM–Temporal CNN captures acute changes and slow drifts across variable time intervals.
LSTM · Temporal CNN
03 · Change-Point Detection
Bayesian analysis separates meaningful clinical shifts from biological noise.
Bayesian CPD · Anomaly Score
04 · Predictive Forecasting
7-day and 14-day parameter forecasts with confidence intervals. Critical threshold crossing prediction.
Forecast · Confidence Int.
05 · Pattern Alerting
Progression alerts. Treatment response tracking. Relapse signatures. Sickle cell crisis prediction.
Progression · Relapse Detect
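As a simplified illustration of the change-point step, the sketch below computes a posterior over the location of a single mean shift in one parameter's series. It assumes Gaussian noise with a known standard deviation, flat priors on the two segment means, and a uniform prior over change locations. These are assumptions of the example, not a description of the engine's detector, and the sketch does not weigh a "no change" hypothesis.

```python
import math


def _segment_log_evidence(xs: list[float], sigma: float) -> float:
    """Log marginal likelihood of one segment with its mean integrated out (flat prior)."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    return (
        -0.5 * (n - 1) * math.log(2 * math.pi * sigma ** 2)
        - 0.5 * math.log(n)
        - ss / (2 * sigma ** 2)
    )


def change_point_posterior(series: list[float], sigma: float) -> list[float]:
    """Posterior over split positions; entry i means the shift lies between series[i] and series[i + 1]."""
    n = len(series)
    if n < 2 or sigma <= 0:
        raise ValueError("need at least two points and a positive sigma")
    log_post = [
        _segment_log_evidence(series[:k], sigma) + _segment_log_evidence(series[k:], sigma)
        for k in range(1, n)
    ]
    peak = max(log_post)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

A sharply peaked posterior indicates a well-localized shift, while a flat posterior suggests the variation is consistent with noise at the assumed sigma.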
Temporal Architecture

LSTM captures long-range dependencies (gradual hemoglobin decline over months → occult GI loss). Temporal CNN detects acute changes (sudden platelet drops → consumption). Time-aware attention handles variable intervals while preserving historical value.

Clinical Trajectory Patterns
  • Occult Blood Loss: Hgb drift + rising RDW — 14d early warning
  • MDS Progression: Deepening cytopenias + emerging dysplasia
  • Treatment Response: Expected vs. actual recovery curves
  • Sickle Crisis: Pre-crisis WBC/reticulocyte patterns
  • CML Acceleration: Basophil/blast trend → blast crisis
  • Relapse: Post-remission baseline deviation
Performance Validation
Metric | Score
Trend Detection | 93.1%
Change-Point Accuracy | 90.7%
7-Day Forecast | 88.3%
Relapse Prediction | 86.9%
Crisis Prediction | 84.5%
Clinical Impact Assessment

Engine 08 transforms hematological monitoring from snapshots to predictive trajectories — detecting drift before crisis and forecasting where counts are heading, not only where they are.

  • 14 d: Average early warning before critical threshold
  • 42%: Fewer emergency transfusions via proactive monitoring
  • 2.7×: Earlier relapse detection vs. scheduled surveillance