Probing PSA and PREVENT Again
Volume 12, Issue 26 | December 11, 2025
ALSO IN THIS ISSUE
A better way to reduce unnecessary prostate biopsies
Why false-positive PSAs matter
Predicting CVD with scoring systems alone is inadequate (PREVENT and more)
Sensing spiciness with a synthetic tongue
AI for chest X-rays still not yet ready for prime time
A better way to reduce unnecessary prostate biopsies
FDA recently approved a more sophisticated variant of the traditional prostate-specific antigen (PSA) test - one that clinical studies suggest can cut unnecessary biopsies by up to 50%. (The test had previously been available as an LDT.) Whereas PSA levels can rise for any of several benign reasons, this test looks for the changes in PSA chemistry that only result from a malignant tumor. Several clinical trials of this test have demonstrated its effectiveness, the most recent published in July 2025 together with an editorial in Urology.
COMMENTARY: This PSA structure test is about twice as accurate as the traditional high total PSA test. When followed by multi-parametric MRI, it is very effective at ruling out prostate cancer, with just 1.5% false negatives. The false-positive rate remains somewhat high at 36%, but that compares with 75% for high total PSA alone. Not perfect, but given the high prevalence of prostate cancer in older men, this is an important advance in reducing unnecessary biopsies and their nontrivial risks.
Sensing spiciness with a synthetic tongue
In this newsletter, we typically write about tests and tools that are used to diagnose humans, or occasionally, animals. This tool is different - it’s used to diagnose plants. Peppers, to be specific.
The tool is a gel-based artificial tongue, and it’s intended to measure how spicy peppers are. (Okay, maybe that sort of assessment doesn’t exactly count as diagnosis. Whatever. Close enough.) Why would such a thing be necessary? Because tasting peppers on the truly spicy end of the spectrum hurts. Plus, humans are subjective on the subject of taste - an objective tool could enable more precise and consistent measurement.
The mechanism behind the artificial tongue was inspired by the fact that milk decreases the perceived intensity of spicy flavor in foods. And in fact, the gel itself contains milk powder — along with acrylic acid and choline chloride, which provide chloride and hydrogen ions that conduct electricity. As Nature described it, “Capsaicin — the compound that gives chili peppers their spice — interacts with the milk proteins in the gel to form bulky complexes that disrupt ion flow. As a result, when the tongue encounters capsaicin, its conductivity drops.” The bigger the change in conductivity, the spicier the pepper.
COMMENTARY: Why false-positive PSAs matter
The universal challenge of screening is a high false-positive rate. When 98 - 99% of a tested population is disease-free, even the best screening test will generate more false than true positives, often by an order of magnitude.
Of course, the objective of screening is to spot cases early enough for successful treatment (which requires high sensitivity), but high sensitivity can only be achieved at the cost of high false positives. That being said, better tests generate fewer of those, while less accurate tests generate more.
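To make the base-rate point concrete, here is a minimal sketch of the arithmetic. The 1% prevalence comes from the 98 - 99% disease-free range above; the 90% sensitivity and 90% specificity are assumed values chosen purely for illustration, not figures for PSA or any other real test.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Return expected (true_positives, false_positives) for a screened group."""
    diseased = population * prevalence
    healthy = population - diseased
    true_positives = diseased * sensitivity          # diseased people correctly flagged
    false_positives = healthy * (1 - specificity)    # healthy people wrongly flagged
    return true_positives, false_positives


# Illustrative inputs only: sensitivity and specificity are assumptions.
tp, fp = screening_outcomes(
    population=100_000, prevalence=0.01, sensitivity=0.90, specificity=0.90
)
ratio = fp / tp            # false positives per true positive
ppv = tp / (tp + fp)       # share of positive results that are real disease

print(f"true positives: {tp:,.0f}, false positives: {fp:,.0f}")
print(f"false-to-true ratio: {ratio:.1f}, positive predictive value: {ppv:.1%}")
```

Under these assumed inputs, a test that is right 90% of the time in both directions still produces roughly eleven false positives for every true one, which is the order-of-magnitude gap described above. Raising the specificity in the call shows how quickly a better test shrinks that gap.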
So what happens when healthy men with a high PSA (4 - 10 ng/mL) are referred for biopsy follow-up? Since only about 25 - 33% of biopsies confirm cancer, the other 67 - 75% of men undergo biopsy without benefit (one of our readers forwarded a detailed list of references so that we could highlight this question - thank you). Performing multiparametric MRI after PSA cuts “unnecessary” biopsies in half, but MRI is expensive and has only recently become standard of care.
Setting aside substantial anxiety and cost, there are inevitable health consequences after most biopsies. Nearly all those biopsied suffer at least some temporary bleeding. More concerningly, 2 - 7% will get an infection, and half of those require hospitalization. Of all folks biopsied, 0.1% die, most frequently from sepsis. That may seem like a small number, but with about one million biopsies performed in the US each year, that translates to approximately 1,000 deaths a year, most in healthy men. On the positive front, there is some evidence that biopsy complications have been declining - this chart of post-biopsy hospitalizations is one example.
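For readers who want to check the numbers, the sketch below simply multiplies out the rates quoted in the paragraph above. All inputs are the rounded figures from the text; the constant and function names are ours, and the output is a back-of-the-envelope estimate rather than a published statistic.

```python
BIOPSIES_PER_YEAR = 1_000_000         # "about one million" US biopsies
INFECTION_RATE_RANGE = (0.02, 0.07)   # 2 - 7% develop an infection
HOSPITALIZED_SHARE = 0.5              # half of infections need hospitalization
DEATH_RATE = 0.001                    # 0.1% of those biopsied


def expected_harms(biopsies):
    """Return expected yearly harms as counts; ranges are (low, high) tuples."""
    low, high = INFECTION_RATE_RANGE
    infections = (biopsies * low, biopsies * high)
    hospitalizations = tuple(n * HOSPITALIZED_SHARE for n in infections)
    deaths = biopsies * DEATH_RATE
    return {
        "infections": infections,
        "hospitalizations": hospitalizations,
        "deaths": deaths,
    }


harms = expected_harms(BIOPSIES_PER_YEAR)
for name in ("infections", "hospitalizations"):
    low_count, high_count = harms[name]
    print(f"{name}: {low_count:,.0f} to {high_count:,.0f} per year")
print(f"deaths: {harms['deaths']:,.0f} per year")
```

With these inputs, the estimate works out to 20,000 to 70,000 infections, 10,000 to 35,000 hospitalizations, and about 1,000 deaths a year, matching the death figure quoted above.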
PSA is the poster child of “ineffective” screening tests (see our reviews here and here). Both US and UK authorities recommend against its use for general screening and have officially declared that, on a population level, its harms exceed its benefits. Unfortunately, PSA is the only test we have for prostate cancer, which remains the most prevalent cancer diagnosis in men, second only to lung cancer in mortality (about 36,000 US deaths in 2025). PSA advocates point out that the test has been the primary reason why prostate-cancer mortality has dropped by more than half since 1992. That’s 39,000 lives saved each year, versus about 1,000 deaths from biopsy. It is starkly clear why PSA remains controversial.
Predicting CVD with scoring systems alone is inadequate
A few weeks ago, we covered the American Heart Association’s new cardiovascular risk screening tool (PREVENT). Since then, a comparison report announced that PREVENT performed worse than the tool it replaced (ASCVD+). This news caught our attention because both these methods grew out of the 2013 pooled cohort system, they’re both based on millions of people’s experiences, and they work with basically the same inputs.
The comparison report asked how 465 admitted patients suffering an acute coronary syndrome would have been scored if they had been assessed 48 hours earlier, using these scoring systems. Neither PREVENT nor ASCVD+ would have provided early warning for about half of these patients (61% under PREVENT, 41% under ASCVD+ - the difference isn’t statistically significant). The authors suggest that both scoring systems underweight high blood pressure and overweight increasing age.
So if these algorithms don’t work, and the disease is all too often asymptomatic until it becomes deadly, what are we to do? The study’s authors say it may be time to move away from these types of risk scores altogether and instead use imaging (calcium scoring and coronary CT) to detect early arterial plaque formation.
COMMENTARY: In the research performed to develop PREVENT, that tool scored (slightly) better than ASCVD+ at identifying those at higher risk. This report does not come at the problem from the other end, i.e., what happened to those who were scored as higher risk, which was how both systems were developed. What it does show clearly is that about half of acute cardiac events cannot be predicted by any current scoring system. Nor are symptoms a useful guide: 53% of patients in the comparison study had no symptoms before their cardiac event.
Of course, the most important point is that few of those with high blood pressure and/or early ischemic disease get tested at all, so two-thirds of those at risk have no idea of their status and thus remain untreated. As for the question of whether to use imaging as a general screening tool, that’s already standard of care for those at elevated risk, but extending these more expensive techniques to folks who aren’t high-risk is a more complex question.
AI for chest X-rays still not yet ready for prime time
Interpreting 2D chest X-rays ought to be low-hanging fruit for AI applications, but apparently we need a taller ladder than we thought. According to STAT News’ report on the recent annual meeting of the Radiological Society of North America, not a single radiologist in the audience thought that AI was ready to read chest X-rays without human supervision or review.
The problems they cited are all the usual AI suspects: wild hallucinations (AKA making stuff up) in 15 - 20% of cases, flagging overwhelming numbers of concerns (up to 124, for one model), and inadequate and out-of-date training data (AKA data drift). While one report indicated that using AI provided a 15% time savings for clinicians, another 2025 assessment of clinical accuracy demonstrated the dilemma: Most of the time the AI report is mostly right, but 10 - 20% of the time it is horribly wrong. As a result, every report has to be reviewed by a human, dramatically reducing the time saved.

Thanks Mara. Just to say dCXR is already used extensively without supervision as a TB screening tool in high-burden countries.