TY - JOUR
T1 - MORE-Q, a dataset for molecular olfactorial receptor engineering by quantum mechanics
AU - Chen, Li
AU - Medrano Sandonas, Leonardo
AU - Traber, Philipp
AU - Dianat, Arezoo
AU - Tverdokhleb, Nina
AU - Hurevich, Mattan
AU - Yitzchaik, Shlomo
AU - Gutierrez, Rafael
AU - Croy, Alexander
AU - Cuniberti, Gianaurelio
N1 - Publisher Copyright: © 2025. The Author(s).
PY - 2025/2/22
Y1 - 2025/2/22
N2 - We introduce the MORE-Q dataset, a quantum-mechanical (QM) dataset encompassing the structural and electronic data of non-covalent molecular sensors formed by combining 18 mucin-derived olfactorial receptors with 102 body odor volatilome (BOV) molecules. To have a better understanding of their intra- and inter-molecular interactions, we have performed accurate QM calculations in different stages of the sensor design and, accordingly, MORE-Q splits into three subsets: i) MORE-Q-G1: QM data of 18 receptors and 102 BOV molecules, ii) MORE-Q-G2: QM data of 23,838 BOV-receptor configurations, and iii) MORE-Q-G3: QM data of 1,836 BOV-receptor-graphene systems. Each subset involves geometries optimized using GFN2-xTB with D4 dispersion correction and up to 39 physicochemical properties, including global and local properties as well as binding features, all computed at the tightly converged PBE+D3 level of theory. By addressing BOV-receptor-graphene systems from a QM perspective, MORE-Q can serve as a benchmark dataset for state-of-the-art machine learning methods developed to predict binding features. This, in turn, can provide valuable insights for developing the next-generation mucin-derived olfactory receptor sensing devices.
UR - http://www.scopus.com/inward/record.url?scp=85219600221&partnerID=8YFLogxK
U2 - 10.1038/s41597-025-04616-6
DO - 10.1038/s41597-025-04616-6
M3 - Article
C2 - 39987132
AN - SCOPUS:85219600221
SN - 2052-4463
VL - 12
SP - 324
JO - Scientific Data
JF - Scientific Data
IS - 1
ER -