Machine learning quantification of Gleason Pattern 4 showed higher discrimination for adverse pathology than grade group in prostate cancer

AI Measures Prostate Cancer Risk More Accurately Than Standard Methods

AI-generated summary of the cited source, checked by automated accuracy review.

Key Takeaway
In this proof-of-principle cohort, ML quantification of GP4 discriminated adverse pathology slightly better than Grade Group (AUC 0.627 vs 0.608).

This proof-of-principle cohort study included 726 patients with grade group 2-4 prostate cancer identified on systematic biopsy who subsequently underwent radical prostatectomy. The setting and specific funding sources were not reported. The exposure of interest was machine learning (ML) quantification of Gleason Pattern 4 (GP4) with the PAIGE-AI algorithm, using multiple measurement approaches that were compared with one another and with standard Grade Group (GG) assessment. The outcomes assessed were adverse radical prostatectomy pathology and biochemical recurrence.

For adverse radical prostatectomy pathology, the pixel-counting method showed the highest discrimination, with an area under the curve (AUC) of 0.648. ML quantification of GP4 outperformed standard GG assessment (AUC 0.627 versus 0.608). Findings for biochemical recurrence were consistent with those for adverse pathology, though specific effect sizes were not reported. The amount of GP3 was not predictive once the amount of GP4 was known. Safety data, including adverse events and tolerability, were not reported.
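To make the AUC figures above concrete, the following is a minimal sketch of how discrimination can be computed: the AUC equals the probability that a randomly chosen patient with adverse pathology has a higher score than a randomly chosen patient without it, with ties counted as half. The function name `auc` and the example values are illustrative assumptions, not the study's code or data.

```python
def auc(scores, labels):
    """Pairwise-concordance AUC: share of (positive, negative) pairs in which
    the positive case has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up values for illustration only (not taken from the study).
gp4_amount = [0.5, 2.0, 1.0, 4.0, 3.0, 0.2]  # hypothetical GP4 measurements
adverse = [0, 1, 0, 1, 0, 0]                 # 1 = adverse pathology at surgery
example_auc = auc(gp4_amount, adverse)
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect separation, so the reported values of 0.608 to 0.648 all sit in the modest range.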

Key limitations include the use of a convenience sample and the study's designation as a proof-of-principle effort. Practice relevance is therefore currently limited to supporting the use of ML as a research tool to compare different GP4 quantification approaches. The authors intend to apply the method to larger cohorts to determine which measurement approach best predicts oncologic outcome.

John, 62, sat in the urologist’s office, heart pounding. The biopsy showed prostate cancer. But how dangerous was it? The answer could change everything — his treatment, his anxiety, his future.

For millions of men like John, a prostate cancer diagnosis brings uncertainty. Doctors rely on a system called Grade Group to predict how aggressive the cancer is. But it’s not perfect — two men with the same score can have very different outcomes.

One key factor is Gleason Pattern 4, or GP4. This describes how abnormal the cancer cells look under the microscope. More GP4 usually means higher risk — but measuring it has always been tricky.

Doctors used to eyeball how much GP4 was on the slide. Now, AI may do it better — and more consistently. A new study tested whether machine learning can measure GP4 with precision.

The AI sees what doctors miss

The study used an AI system called PAIGE-AI to analyze digitized biopsy slides from 726 men. All had Grade Group 2 to 4 cancer — the middle to higher-risk range. Every man had a radical prostatectomy, so researchers could compare AI findings to actual surgery results.

The AI tried 15 different ways to measure GP4. Some measured the length of GP4 areas. Others counted pixels — tiny dots of color — to calculate how much GP4 was truly there.

One method stood out: counting pixels. It was the most accurate at predicting what surgeons found during prostate removal. It also did better than the standard Grade Group in forecasting cancer return.

Think of it like measuring rain in a field. Old methods were like guessing from a few puddles. Pixel counting is like measuring every drop — giving a fuller picture.

The AI didn’t just match standard grading; it edged ahead. For predicting aggressive cancer found at surgery, pixel counting scored 0.648 on a scale where 0.5 is a coin flip and 1.0 is perfect. Standard grading scored 0.608, a small gap.

There was another finding.

The amount of less aggressive Gleason Pattern 3 didn’t matter once GP4 was known. This suggests GP4 is the real driver of risk — and measuring it right is critical. That’s where AI could make the biggest difference.

This doesn't mean the technology is available in clinics yet.

This isn’t about replacing pathologists; it’s about helping them. AI acts like a super-magnifying glass, spotting patterns humans might overlook. It could one day help doctors decide who needs surgery and who can safely wait.

For patients, this means more personalized care. A man with a low amount of GP4 might avoid aggressive treatment. Another with more GP4 could start sooner — with better confidence in the decision.

But the system isn’t ready for clinics yet. The study only looked at men who already had surgery — not those on active surveillance. And the AI hasn’t been tested across diverse hospitals or populations.

Also, AI needs high-quality digital slides. Not all labs have them. Some biopsies are still read the old way — under a microscope, by hand.

The next step is testing in larger, more diverse groups. The researchers say they intend to apply the method to larger cohorts. The goal: find the way of measuring GP4 that best predicts how patients do.

Right now, AI is a research tool — not part of your doctor’s report. But it’s moving fast. One day, your biopsy might be read by both a pathologist and a smart algorithm.

That future isn’t here yet. But for men facing prostate cancer, more precise answers may be closer than ever. And that could mean fewer guesses — and better choices.

Study Details

Study type: Cohort
Sample size: n = 726
Evidence: Level 3
Published: Apr 2026

Original Abstract

Objective: To demonstrate the proof of principle that machine learning (ML) can be used to quantify Gleason Pattern (GP) 4 on digitized biopsy slides using multiple measurement approaches, allowing direct comparison of their prognostic performance.

Methods: We assembled a convenience sample of 726 patients with grade group 2-4 prostate cancer on systematic biopsy who underwent radical prostatectomy between 2014 and 2023. Digitized biopsy slides were analyzed using a machine-learning algorithm (PAIGE-AI) to quantify GP4 using multiple measurement approaches, particularly with respect to how gaps between cancer foci (interfocal stroma) were handled. GP4 extent was quantified using linear measurements or a pixel-based area metric. Discrimination of each GP4 quantification approach, along with Grade Group (GG), was assessed for adverse radical prostatectomy pathology and biochemical recurrence (BCR).

Results: We identified 15 different quantification approaches and observed differences between their discrimination. The highest discrimination was in the pixel-counting method (AUC 0.648). GP4 quantification outperformed GG for predicting adverse pathology (AUC 0.627 vs 0.608). Amount of GP3 was non-predictive once GP4 was known. These findings were consistent for BCR.

Conclusions: We were able to measure slides using 15 distinct measurement approaches and replicated prior findings using ML to quantify GP4. Our findings support the use of ML as a research tool to compare different GP4 quantification approaches. We intend to use our method on larger cohorts to determine which measurement approach best predicts oncologic outcome.