Methylation-Based AI Model Classifies Tumors of Unknown Origin

An artificial intelligence (AI) model using DNA methylation patterns was able to classify tumors of unknown origin with high accuracy, according to the results of a study presented at the American Association for Cancer Research (AACR) Annual Meeting 2026 (Abstract 3869). 

“One of the most important findings from our study is that we were able to accurately predict the origin of many different cancer types using a very small subset of DNA markers, about 1,000 CpG regions selected from hundreds of thousands across the genome,” said Marco A. De Velasco, PhD, Faculty Member in the Department of Genome Biology, Kindai University, Japan. “This is important because it shows that we can simplify complex molecular data while still maintaining strong predictive performance.” 

Study Methods 

Researchers developed and validated a prediction model for classifying cancers of unknown primary based on DNA methylation profiling. They collected methylation data (Infinium HumanMethylation450) from 7,476 patients across 21 cancer types from The Cancer Genome Atlas and other public data sets, which were separated into training and test cohorts. 

“Instead of relying on large and complex data sets, we aimed to identify a smaller, more practical set of markers that still retains strong predictive power,” said Dr. De Velasco. “The long-term goal is to create a tool that could support physicians in identifying the likely tissue of origin and helping inform more effective treatment decisions.”

To identify informative CpG regions, the researchers used a hybrid feature selection approach that combined gradient boosting for improved accuracy with Shapley values for explainability. The Louvain method, a community detection algorithm, was used for clustering to explore tumor phenotypes and to determine associations between heterogeneity and prediction performance.

Independent validation was conducted on 31 cases across 17 cancer types from Kindai University using Infinium MethylationEPIC v2.0 data.

Key Findings 

The researchers selected 1,000 CpG regions. 

Of all the AI models tested, the one using ridge regression performed best. In the training cohort, that model achieved an average classification accuracy of 95.4%, an area under the curve of 0.998, an F1 score of 0.953, and a Matthews correlation coefficient of 0.951 across classes.

In the test cohort, the classification accuracy was 94.7%, the area under the curve was 0.998, the F1 score was 0.945, and the Matthews correlation coefficient was 0.943.

Independent validation showed a classification accuracy of 87.1%, an area under the curve of 0.9993, an F1 score of 0.847, and a Matthews correlation coefficient of 0.867.
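
For readers less familiar with these metrics, the sketch below shows how accuracy, F1 score, and the Matthews correlation coefficient can be computed for a multiclass classifier. It is an illustration only, not the researchers' code: the function name and the toy tumor labels are hypothetical, the F1 score is macro-averaged (the report does not say which averaging was used), and the Matthews correlation coefficient uses the standard multiclass generalization.

```python
from collections import Counter
from math import sqrt


def classification_metrics(y_true, y_pred):
    """Return accuracy, macro-averaged F1, and multiclass MCC.

    Hypothetical helper for illustration; macro averaging is an assumption.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    n_samples = len(y_true)
    true_counts = Counter(y_true)   # how often each class truly occurs
    pred_counts = Counter(y_pred)   # how often each class is predicted
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n_correct = sum(hits.values())

    # Per-class F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (true count + predicted count)
    per_class_f1 = [
        2 * hits[k] / (true_counts[k] + pred_counts[k]) for k in labels
    ]
    macro_f1 = sum(per_class_f1) / len(per_class_f1)

    # Multiclass MCC computed from the class totals
    cov_tp = n_correct * n_samples - sum(pred_counts[k] * true_counts[k] for k in labels)
    cov_tt = n_samples ** 2 - sum(true_counts[k] ** 2 for k in labels)
    cov_pp = n_samples ** 2 - sum(pred_counts[k] ** 2 for k in labels)
    denom = sqrt(cov_tt * cov_pp)
    mcc = cov_tp / denom if denom else 0.0

    return {"accuracy": n_correct / n_samples, "macro_f1": macro_f1, "mcc": mcc}


# Toy example with made-up labels: one breast tumor is misclassified as colon.
y_true = ["lung", "lung", "breast", "breast", "colon", "colon"]
y_pred = ["lung", "lung", "breast", "colon", "colon", "colon"]
metrics = classification_metrics(y_true, y_pred)
```

Unlike accuracy, the Matthews correlation coefficient accounts for how predictions are distributed across all classes, which makes it informative when cancer types are unevenly represented. As a point of arithmetic, the reported 87.1% accuracy in 31 independent cases is consistent with 27 correct classifications (27/31 ≈ 0.871), although the report does not state that count.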

Unsupervised analysis identified 20 Louvain clusters, demonstrating heterogeneity across cancer types. Heterogeneity and purity were correlated with the Matthews correlation coefficient, but regression analysis could not confirm any independent predictive effect after adjustment for confounding variables.
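
The Louvain method groups samples by searching for a partition of a similarity graph that maximizes a quantity called modularity. The sketch below computes modularity for a toy unweighted graph to show what that objective measures. It is not the study's pipeline: the function name, the six-node graph, and the cluster labels are all hypothetical, and how the researchers built their graph from methylation data is not described in the report.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph must contain at least one edge")

    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1

    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Toy graph: two triangles of samples joined by a single bridging edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}

q_two = modularity(edges, two_clusters)
q_one = modularity(edges, one_cluster)
```

In this toy case, splitting the graph into its two triangles yields a higher modularity than treating all six samples as one group, which is the kind of partition a modularity-maximizing method favors.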

“Overall, we see this research as part of a broader effort to better understand cancer using molecular information, with the goal of supporting more informed and personalized care in the future. However, this work is still in the research stage,” added Dr. De Velasco, noting that the model was developed on cancers with known origins rather than true cancers with unknown origins. “We next have to evaluate how well this approach performs in a prospective analysis of patients with true cancers of unknown primary.”

DISCLOSURES: Funding for this study was provided by the Japan Society for the Promotion of Science. Dr. De Velasco reported no conflicts of interest. For full disclosures of the other study authors, visit abstractsonline.com.

The content in this post has not been reviewed by the American Society of Clinical Oncology, Inc. (ASCO®) and does not necessarily reflect the ideas and opinions of ASCO®.