Abstract
The primary function of microRNAs (miRNAs) is to maintain cell homeostasis. In cancerous tissues, miRNA expression undergoes drastic alterations. In this study, we use miRNA expression profiles from The Cancer Genome Atlas covering 24 cancer types and 3 healthy tissues, collected from >8500 samples. We seek to classify the cancer's origin and identify the tissue using the expression of 1046 reported miRNAs. Despite the apparently uniform appearance of miRNAs among cancerous samples, we recover indispensable information about the cancer/tissue types from lowly expressed miRNAs. Multiclass support vector machine classification yields an average recall of 58% in identifying the correct tissue and tumor types. Data discretization led to substantial improvement, reaching an average recall of 91% (95% median). We propose a straightforward protocol as a crucial step in classifying tumors of unknown primary origin. Our counter-intuitive conclusion is that in almost all cancer types, highly expressed miRNAs mask the significant signal that lowly expressed miRNAs provide.
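As a rough illustration of the masking effect described in the abstract, the sketch below binarizes toy expression profiles so that a single highly expressed miRNA no longer dominates the comparison between samples. The threshold, the toy values, and the function names are assumptions made for illustration only; the abstract does not specify the discretization scheme, and a plain Euclidean distance stands in here for the multiclass support vector machine.

```python
import math

def discretize(profile, threshold=1.0):
    """Map each expression value to 1 (detected) or 0 (not detected).

    The threshold is a hypothetical choice, not a value from the study.
    """
    return [1 if value >= threshold else 0 for value in profile]

def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy profiles: the first feature is a highly expressed miRNA,
# the remaining three are lowly expressed.
sample_a = [10000.0, 0.0, 3.0, 0.0]  # hypothetical tissue type 1
sample_b = [12000.0, 0.0, 2.0, 0.0]  # hypothetical tissue type 1
sample_c = [10050.0, 4.0, 0.0, 5.0]  # hypothetical tissue type 2

# On raw values, the highly expressed feature drives the distances.
raw_ab = euclidean(sample_a, sample_b)
raw_ac = euclidean(sample_a, sample_c)

# After discretization, only the detected/not-detected pattern matters.
bin_ab = euclidean(discretize(sample_a), discretize(sample_b))
bin_ac = euclidean(discretize(sample_a), discretize(sample_c))

print("raw:       ", "a~b" if raw_ab < raw_ac else "a~c")
print("discretized:", "a~b" if bin_ab < bin_ac else "a~c")
```

In this toy setting, the raw distances group sample A with sample C because of the dominant first feature, while the discretized profiles group A with B, which share the same pattern of lowly expressed miRNAs.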
Original language | English |
---|---|
Pages (from-to) | 5048-5060 |
Number of pages | 13 |
Journal | Nucleic Acids Research |
Volume | 45 |
Issue number | 9 |
State | Published - 19 May 2017 |
Bibliographical note
Publisher Copyright: © The Authors 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.