Narrative review reports Onca model performance on pancreatic cancer tasks versus other LLMs
This source is a narrative review and model development report on Onca, an open 9B-parameter dense language model fine-tuned from Qwopus3.5-9B-v3. The model was compared against Woollie-7B, CancerLLM-7B, OpenBioLLM-8B, and the unmodified Qwopus base. The scope covers secondary outcomes including trial eligibility screening, case-specific clinical reasoning, structured pathology report extraction, and molecular variant evidence reasoning. The training dataset comprised 37,364 rows.
Key synthesized findings indicate that Onca demonstrated the strongest overall results across all evaluated tasks: an F1 score of 81.6 on trial screening, a composite score of 14.1 on clinical reasoning, and a field exact-match score of 30.5 on structured pathology report extraction. The model also reached a macro-F1 of 68.3 on PubMedQA Cancer and 66.5 on PubMedQA.
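For readers interpreting the PubMedQA numbers above: macro-F1 averages the per-class F1 scores with equal weight, so minority answer classes count as much as majority ones. The report does not publish its scoring code; the following is a minimal standard implementation of the metric, not the authors' own evaluation harness.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per class, then take the unweighted mean,
    so each class contributes equally regardless of its frequency."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for c in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        f1_scores.append(f1)
    return sum(f1_scores) / len(f1_scores)
```

For a three-way QA task such as PubMedQA (yes/no/maybe), a rarely predicted "maybe" class with poor F1 pulls the macro average down sharply, which is why macro-F1 is a stricter summary than accuracy on class-imbalanced benchmarks.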
The authors acknowledge that many existing oncology-focused language models depend on private institutional corpora, which limits reproducibility and practical reuse across centers. The report concludes that clinically targeted pancreatic-cancer language models can be built from open data with competitive performance while remaining practical to train on a single workstation-scale GPU setup. Practice relevance is tempered by the note that clinicians should not infer clinical efficacy or patient outcomes from model performance metrics on unstructured text tasks.