This systematic literature review synthesizes evidence on computational methods for drug-drug interaction (DDI) prediction published between 2022 and 2025. The authors report that recent models consistently improve AUROC/AUPR on DrugBank-derived, TWOSIDES-like, and DDIExtraction benchmarks, but they do not provide specific effect sizes, p-values, or confidence intervals.
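To illustrate what a confidence interval around a reported AUROC could look like, the following is a minimal sketch using a percentile bootstrap. It is not taken from the review or from any study it covers; the function names, the toy labels and scores, and the choice of 1000 resamples are all assumptions made for illustration.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random
    negative, counting ties as 0.5 (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data (made up): 1 = interacting pair, 0 = non-interacting pair.
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.65, 0.6, 0.4, 0.35, 0.3, 0.2, 0.1]

point = auroc(labels, scores)
lower, upper = bootstrap_ci(labels, scores)
print(f"AUROC = {point:.2f}, 95% bootstrap CI = [{lower:.2f}, {upper:.2f}]")
```

With so few examples the interval is wide, which is the point: a bare AUROC value says little about how stable a reported improvement is.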
The review identifies several key limitations. Most models rely on a small set of public datasets, and split protocols are heterogeneous and sometimes optimistic. External and prospective validation is limited, and uncertainty quantification, label quality assessment, and end-to-end integration into prescribing workflows all remain underexplored.
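One common reason a split protocol can be optimistic is that a random split over drug pairs lets the same drugs appear in both training and test sets. The sketch below contrasts that with a drug-level (cold-start) split. It is an illustrative assumption about how such splits are often built, not a protocol described in the review; the drug identifiers and helper names are hypothetical.

```python
import random
from itertools import combinations


def random_pair_split(pairs, test_frac=0.2, seed=0):
    """Hold out a random fraction of pairs; drugs may appear on both sides."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def cold_start_split(pairs, test_frac=0.2, seed=0):
    """Hold out a fraction of drugs; any pair touching a held-out drug goes
    to the test set, so training never sees those drugs."""
    rng = random.Random(seed)
    drugs = sorted({d for pair in pairs for d in pair})
    rng.shuffle(drugs)
    n_held = max(1, int(len(drugs) * test_frac))
    held_out = set(drugs[:n_held])
    train = [p for p in pairs if p[0] not in held_out and p[1] not in held_out]
    test = [p for p in pairs if p[0] in held_out or p[1] in held_out]
    return train, test, held_out


def unseen_drug_fraction(train, test):
    """Fraction of test pairs with at least one drug absent from training."""
    seen = {d for pair in train for d in pair}
    if not test:
        return 0.0
    return sum(1 for a, b in test if a not in seen or b not in seen) / len(test)


# Hypothetical toy data: 20 drugs, every unordered pair treated as a sample.
drugs = [f"D{i}" for i in range(20)]
pairs = list(combinations(drugs, 2))

train_rp, test_rp = random_pair_split(pairs)
train_cs, test_cs, held_out = cold_start_split(pairs)

print("random pair split, unseen-drug fraction:", unseen_drug_fraction(train_rp, test_rp))
# By construction, every cold-start test pair contains a held-out drug.
print("cold-start split, unseen-drug fraction:", unseen_drug_fraction(train_cs, test_cs))
```

A model evaluated only under the first split is mostly tested on new combinations of familiar drugs, which tends to be an easier task than predicting interactions for a drug it has never seen.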
The authors do not report a specific study population, sample size, or intervention comparator. Safety data, including adverse events, are not reported. The review does not quantify the clinical impact of these computational methods.
Practice relevance is not reported, and the authors do not make specific recommendations for clinical adoption. The evidence is preliminary, and the limitations noted suggest that current models are not ready for routine clinical use.
Original Abstract
Introduction: Drug-drug interactions (DDIs) are a major cause of preventable harm in polypharmacy and remain difficult to anticipate as formularies, indication profiles, and interaction labels evolve. Over the last few years, the DDI modeling landscape has shifted rapidly toward graph-native, multimodal, and contrastive or self-supervised learning, alongside renewed interest in extraction, decision support, and pharmacovigilance pipelines.
Objective: This systematic literature review (SLR) synthesizes computational work on DDI prediction, event-type classification, text extraction, and safety signal detection published between 2022 and 2025. We aim to (i) organize recent methods into a feature–method taxonomy, (ii) compare their evaluation setups and reported performance, and (iii) assess progress on generalization, explainability, and clinical translation.
Methods: Using a prespecified review protocol and PRISMA 2020 reporting guidance, we searched major bibliographic databases and screened peer-reviewed studies that proposed or evaluated computational methods for DDIs or closely related interaction tasks. Eligible work spans molecular graph and descriptor models, multimodal pharmacological representations, heterogeneous and knowledge graphs, text-based extraction and retrieval, and real-world evidence from EHRs, FAERS, and similar sources. We grouped methods into similarity and matrix-factorization baselines, conventional machine learning, deep neural architectures (CNNs, RNNs, and Transformers), graph neural networks and knowledge-graph representation learning, multimodal fusion, contrastive/self-supervised objectives, and emerging LLM-based frameworks. For each study, we extracted feature modalities, tasks, datasets and splits, metrics, explainability tools, and any form of clinical or user-centred evaluation.
Results: Recent work consistently reports improved AUROC/AUPR on DrugBank-derived, TWOSIDES-like, and DDIExtraction benchmarks, driven by substructure-aware GNNs, KG-augmented architectures, multimodal fusion, and inductive or out-of-distribution training regimes. However, most models still rely on a small set of public datasets, heterogeneous and sometimes optimistic split protocols, and limited external or prospective validation. Event-level and long-tailed risk modeling, prompt- or prototype-based learning, and LLM-assisted extraction strengthen coverage of rare but clinically important interaction types, yet uncertainty quantification, label quality assessment, and end-to-end integration into prescribing workflows remain underexplored.
Discussion: Between 2022 and 2025, DDI modeling has moved decisively toward graph-centric, multimodal, and contrastive/self-supervised paradigms that clearly advance benchmark performance but only partially close the gap to reliable, mechanism-aware clinical decision support. We distill design guidelines and a research agenda around transparent dataset construction, realistic and standardized evaluation protocols, mechanism- and direction-aware modeling, robustness to novel drugs and regimens, and prospective, clinician-in-the-loop validation.