Anaemia in pregnancy remains a critical public health challenge in India, affecting nearly 50% of pregnant women and contributing directly to maternal and neonatal mortality. The gold standard — laboratory haemoglobin measurement — requires venipuncture, trained phlebotomists, and laboratory infrastructure, making frequent monitoring infeasible in resource-limited settings.
In rural primary health centres and community outreach camps, the choice is often binary: perform invasive testing (expensive, slow, and dependent on sample transport and cold chain) or rely on clinical pallor assessment (subjective, with sensitivity below 60%). A non-invasive, smartphone-based screening tool could bridge this gap, enabling community health workers to triage pregnant women for anaemia without drawing blood.
Standardised Image Capture
Smartphone photographs of the palpebral conjunctiva (required) and buccal mucosa (optional) are captured alongside a colour calibration card with known colour patches, enabling white-balance normalisation across different smartphones and lighting conditions.
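The calibration-card normalisation step can be sketched as a least-squares colour correction: the observed RGB values of the card's patches are mapped back to their known reference values, and the fitted matrix is then applied to the tissue pixels. This is a minimal illustration, not the project's exact implementation; the patch colours and colour cast below are made up for the example.

```python
import numpy as np

def fit_colour_correction(observed, reference):
    """Least-squares 3x3 matrix mapping observed patch RGBs to reference RGBs."""
    M, _, _, _ = np.linalg.lstsq(observed, reference, rcond=None)
    return M

def apply_correction(pixels, M):
    """Apply the fitted correction to an (N, 3) array of RGB pixels."""
    return np.clip(pixels @ M, 0, 255)

# Hypothetical card: 6 patches with known reference colours,
# photographed under a simulated warm colour cast.
reference = np.array([[255, 0, 0], [0, 255, 0], [0, 0, 255],
                      [255, 255, 255], [128, 128, 128], [0, 0, 0]], dtype=float)
observed = reference * np.array([0.95, 1.00, 0.85])  # per-channel cast
M = fit_colour_correction(observed, reference)
corrected = apply_correction(observed, M)            # recovers the reference colours
```

In practice the same matrix `M` fitted on the card patches would be applied to the segmented conjunctival pixels before feature extraction.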
YOLOv8 Region-of-Interest Segmentation
A fine-tuned YOLOv8 model automatically segments conjunctival or mucosal tissue from surrounding anatomy, isolating the diagnostically relevant region. Calibration card patches are simultaneously detected for colour normalisation.
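Downstream of segmentation, the model's output reduces to a binary mask over the image; the diagnostically relevant pixels are simply those inside the mask. A minimal numpy sketch (with a toy image and mask standing in for a real photograph and a real YOLOv8 segmentation output):

```python
import numpy as np

def roi_mean_rgb(image, mask):
    """Mean RGB over the segmented region.

    image: (H, W, 3) uint8 array; mask: (H, W) boolean array,
    e.g. a conjunctiva mask produced by the segmentation model.
    """
    pixels = image[mask]  # (N, 3) pixels inside the ROI
    if pixels.size == 0:
        raise ValueError("empty segmentation mask")
    return pixels.mean(axis=0)

# Toy example: a 4x4 image whose ROI is a uniform reddish patch.
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[1:3, 1:3] = [200, 40, 60]   # "conjunctival" pixels
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
print(roi_mean_rgb(image, mask))  # → [200.  40.  60.]
```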
Multi-Modal Feature Extraction
Nine image features (mean RGB, colour histogram statistics, texture metrics) are combined with 52 sociodemographic and clinical features covering age, dietary patterns, obstetric history, and medical history, yielding a 61-feature vector per patient (70 when image features from both sites are combined).
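The feature assembly can be illustrated as follows. The exact nine image features are not enumerated here, so the sketch uses a plausible stand-in (per-channel mean, standard deviation, and histogram entropy); the clinical vector is random placeholder data:

```python
import numpy as np

def image_features(pixels):
    """Nine illustrative features from (N, 3) ROI pixels:
    per-channel mean, std, and histogram entropy."""
    feats = []
    for ch in range(3):
        values = pixels[:, ch]
        feats.append(values.mean())
        feats.append(values.std())
        hist, _ = np.histogram(values, bins=16, range=(0, 255))
        p = hist / hist.sum()
        p = p[p > 0]
        feats.append(float(-(p * np.log(p)).sum()))  # histogram entropy
    return np.array(feats)

rng = np.random.default_rng(0)
roi_pixels = rng.integers(0, 256, size=(500, 3)).astype(float)
clinical = rng.normal(size=52)  # placeholder for the 52 clinical features
x = np.concatenate([image_features(roi_pixels), clinical])
print(x.shape)  # → (61,)
```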
ML Model Ensemble & Prediction
Seven algorithms (Ridge, Lasso, ElasticNet, SVR, Random Forest, Gradient Boosting, XGBoost) are evaluated with 5-fold cross-validation. The model outputs a continuous haemoglobin estimate (g/dL) with severity classification.
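The evaluation loop can be sketched with scikit-learn's cross-validation utilities. The snippet below covers four of the seven algorithms (XGBoost requires an extra dependency) and runs on synthetic stand-in data, not the study dataset:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 61 features, haemoglobin target in g/dL.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 61))
y = 11.0 + 0.8 * X[:, 0] + rng.normal(scale=1.0, size=200)  # toy Hb values

models = {
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "Random Forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingRegressor(random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5,
                             scoring="neg_mean_absolute_error")
    print(f"{name}: MAE = {-scores.mean():.3f} g/dL")
```

The best model's continuous prediction is then thresholded into anaemia severity categories for reporting.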
Try the Live Demo
Upload a conjunctival image, enter clinical parameters, and receive an instant haemoglobin estimate.
Open on HuggingFace →

Among 600 pregnant women (mean age 25.34 ± 3.89 years), 32.17% were primigravida and 39.2% were anaemic by laboratory confirmation. Palpebral conjunctiva imaging with Random Forest regression achieved the best overall performance.
Best model: Random Forest on palpebral conjunctiva — MAE = 1.018 g/dL, AUC-ROC = 0.610, with 57.3% of predictions within ±1.0 g/dL of laboratory values.
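Both headline metrics (MAE and the fraction of predictions within ±1.0 g/dL of the laboratory value) are straightforward to compute; the haemoglobin values below are made up for illustration:

```python
import numpy as np

def screening_metrics(y_true, y_pred):
    """MAE (g/dL) and fraction of predictions within ±1.0 g/dL of the lab value."""
    err = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return err.mean(), (err <= 1.0).mean()

# Toy example with hypothetical haemoglobin values (g/dL).
y_true = np.array([10.2, 11.5, 9.0, 12.1])
y_pred = np.array([11.0, 11.2, 10.5, 12.0])
mae, within_1 = screening_metrics(y_true, y_pred)
print(f"MAE = {mae:.3f} g/dL, within ±1.0 g/dL = {within_1:.0%}")
```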
Clinical Accuracy — Palpebral Conjunctiva
Site Comparison (Random Forest)
| Imaging Site | MAE (g/dL) | R² | AUC-ROC | Sensitivity | Specificity |
|---|---|---|---|---|---|
| Palpebral Conjunctiva | 1.018 | 0.022 | 0.610 | 0.557 | 0.608 |
| Buccal Mucosa | 1.039 | 0.014 | 0.593 | 0.536 | 0.627 |
| Combined | 1.028 | 0.017 | 0.599 | 0.549 | 0.600 |
Honest assessment: The low R² values (0.014–0.022) indicate that the current pipeline explains very little variance in haemoglobin levels. The primary contributions are the validated 600-patient paired dataset, the end-to-end pipeline infrastructure, and the comparative evidence that conjunctival imaging outperforms buccal mucosa. These represent a foundation for deeper architectures, not a deployment-ready diagnostic.
Despite modest model performance, the contribution is infrastructural. This work establishes a reproducible, colour-calibrated capture protocol that transfers across smartphone models; a curated dataset of 600 paired image–laboratory observations from a real clinical cohort; and evidence that the palpebral conjunctiva provides a stronger imaging signal than the buccal mucosa, directing future work toward the more informative anatomical site.
In postpartum haemorrhage triage, where even an approximate haemoglobin estimate outperforms having no information at all, a screening tool with an MAE of ~1 g/dL could separate severe anaemia from normal levels, supporting triage decisions that currently rest on subjective pallor assessment alone.
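Mapping a continuous haemoglobin estimate to a triage category is a simple thresholding step. The sketch below uses the widely cited WHO cut-offs for pregnancy (severe < 7.0, moderate 7.0–9.9, mild 10.0–10.9, normal ≥ 11.0 g/dL); confirm the exact thresholds used in any deployment against current guidelines.

```python
def anaemia_severity(hb_g_dl: float) -> str:
    """Severity class from a haemoglobin estimate, using WHO
    cut-offs for pregnant women (g/dL)."""
    if hb_g_dl < 7.0:
        return "severe"
    if hb_g_dl < 10.0:
        return "moderate"
    if hb_g_dl < 11.0:
        return "mild"
    return "normal"

print(anaemia_severity(6.5))   # → severe
print(anaemia_severity(10.4))  # → mild
print(anaemia_severity(12.0))  # → normal
```

With an MAE near 1 g/dL, predictions close to a threshold are unreliable, so a deployed tool would flag borderline estimates for confirmatory laboratory testing rather than report a single category.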
600 standardised, colour-calibrated smartphone images paired with laboratory haemoglobin values from pregnant women in a resource-limited Indian clinical setting — a dataset designed for foundation model fine-tuning and transfer learning that could substantially improve on the classical ML baseline reported here.
The immediate next step is applying pretrained medical vision foundation models (BiomedCLIP, PubMedCLIP) via transfer learning, replacing handcrafted colour features with deep learned representations. We are also exploring whether a vision transformer fine-tuned on conjunctival images alone — without sociodemographic features — can outperform the current multi-modal pipeline, simplifying deployment to a single-photo screening tool.
Longer-term goals include multi-site validation across diverse populations and smartphone hardware, integration with the COGNIT semantic protocol for real-time triage transmission, and a prospective clinical trial comparing smartphone-based screening against standard laboratory testing in community health worker settings.