This is an observational, methodological study using single-cell RNA sequencing data from the ROSMAP cohort for Alzheimer's disease. The method under evaluation was a generative AI-assisted Bayesian-frequentist hybrid inference framework that uses ChatGPT-4o for prior elicitation in single-cell RNA sequencing analysis. The comparators were the frequentist and full Bayes estimators.
The main result was that the framework identifies biologically coherent pathways (such as gamma-secretase pathways) that were previously undetected. No specific effect sizes, absolute numbers, p-values, or confidence intervals were reported for this outcome.
Safety and tolerability were not reported, as this was a methodological study with no clinical intervention. Key limitations include that this is a preprint or early-phase methodological study (not a clinical trial), validation was simulation-based only, and it was applied to a single cohort (ROSMAP) without external validation.
The practice relevance is that the framework offers a principled and computationally scalable approach to genome-wide Bayesian analysis, with potential application across omics platforms and disease settings. However, the findings support association only; no causal claims are made, and certainty is limited because results are based on a single observational cohort and simulations.
Original abstract:
Alzheimer's disease genomics and other high-dimensional omics studies demand powerful statistical methods, yet Bayesian inference remains underutilized despite its advantages in small-sample settings, owing to the prohibitive cost of eliciting reliable priors across thousands or millions of parameters. We propose an AI-assisted Bayesian-frequentist hybrid inference framework that couples large language model-based prior elicitation with the hybrid inference theory of Yuan (2009). ChatGPT-4o is queried via a standardized prompt to assess the strength of evidence linking each gene to a disease of interest, and the response is mapped to an informative normal prior via a standardized effect-size calibration. Parameters for covariates of secondary interest are treated as frequentist parameters, preserving efficiency and avoiding sensitivity to mis-specified priors. We derive closed-form hybrid estimators under uniform and conjugate normal priors in linear models, establish their asymptotic equivalence to the frequentist and full Bayes estimators, and show in simulations that hybrid inference using unconditional variance estimation leads to high statistical power while accurately controlling the Type I error rate. Applied to single-cell RNA sequencing data from the ROSMAP cohort for Alzheimer's disease as an example, the framework identifies biologically coherent pathways (such as gamma-secretase pathways) previously undetected. The proposed framework offers a principled and computationally scalable approach to genome-wide Bayesian analysis, with potential for broad application across omics platforms and disease settings.
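
To make the idea of mapping an evidence rating to a normal prior more concrete, here is a minimal Python sketch of a textbook conjugate normal update for a single gene effect. It is not the paper's hybrid estimator, which the abstract does not spell out. The evidence labels, the calibration table (borrowing Cohen's conventional 0.2 / 0.5 / 0.8 effect sizes), the default prior standard deviation, and both function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical calibration: evidence label -> prior mean on a standardized
# effect-size scale (Cohen's conventional small/medium/large values).
# These labels and numbers are illustrative assumptions, not the paper's.
EVIDENCE_TO_PRIOR_MEAN = {"none": 0.0, "weak": 0.2, "moderate": 0.5, "strong": 0.8}


def normal_prior_from_evidence(label, prior_sd=0.5):
    """Return (mean, sd) of a normal prior for one gene's effect.

    prior_sd is an assumed default, not a value taken from the source.
    """
    return EVIDENCE_TO_PRIOR_MEAN[label], prior_sd


def conjugate_normal_update(beta_hat, se, prior_mean, prior_sd):
    """Combine an estimate beta_hat (standard error se, treated as known)
    with a N(prior_mean, prior_sd^2) prior. Returns posterior (mean, sd)."""
    data_precision = 1.0 / se**2
    prior_precision = 1.0 / prior_sd**2
    post_var = 1.0 / (data_precision + prior_precision)
    # Posterior mean is the precision-weighted average of estimate and prior mean.
    post_mean = post_var * (data_precision * beta_hat + prior_precision * prior_mean)
    return post_mean, math.sqrt(post_var)


# Made-up numbers: a gene rated "moderate" with an observed effect of 0.3.
prior_mean, prior_sd = normal_prior_from_evidence("moderate")
post_mean, post_sd = conjugate_normal_update(
    beta_hat=0.3, se=0.25, prior_mean=prior_mean, prior_sd=prior_sd
)
print(f"posterior mean={post_mean:.3f}, sd={post_sd:.3f}")
```

In this sketch the informative prior pulls the estimate toward the prior mean in proportion to the relative precisions. As `prior_sd` grows, the posterior mean approaches `beta_hat`, which loosely mirrors the abstract's statement that the hybrid estimators are asymptotically equivalent to the frequentist ones.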