
Extrapolative prediction of small-data molecular properties: quantum mechanics-assisted machine learning

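Materials science has benefited greatly from advances in machine learning and deep learning. These techniques have transformed the prediction of molecular properties and prompted changes to traditional computational methods. As indispensable tools for data-driven materials science, machine learning and deep learning continue to improve in both the accuracy and the speed of property prediction.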
Fig. 1 Overview of extrapolative prediction of molecular property based on the range of molecular properties and the diversity of molecular structures.
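However, machine learning and deep learning still face a fundamental contradiction rooted in their inherent difficulty with extrapolation, that is, with predicting beyond the available data. The main goal of data-driven materials exploration is to identify high-performance molecules and materials that do not yet appear in databases. Machine learning and deep learning models must therefore be able to infer unknown data from existing data alone.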


Fig. 2 Model description used for the benchmark.
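Materials datasets, however, usually consist of small sets of experimental results and are therefore inevitably biased. It is crucial to determine whether machine learning and deep learning models can overcome these biases and extrapolate molecular properties effectively.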

Fig. 3 Evaluation methods for assessing interpolation and extrapolative performance.

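Hajime Shimakawa and colleagues, from the Department of Electrical Engineering and Information Systems, School of Engineering, University of Tokyo, Japan, proposed a comprehensive benchmark for evaluating extrapolative performance on 12 organic molecular properties. Their large-scale benchmark shows that conventional machine learning models degrade significantly outside the training distribution in both property range and molecular structure, especially for properties with small datasets.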
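To make the idea of testing outside the training property range concrete, here is a minimal sketch of one way such a split could be built: the molecules with the highest property values are held out, so the model never sees values in that range during training. The function name `property_range_split`, the 20% holdout fraction, and the sample values are all illustrative assumptions; the article does not spell out the exact protocol used in the benchmark.

```python
from typing import List, Sequence, Tuple


def property_range_split(
    values: Sequence[float], holdout_fraction: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx) where the test set holds the highest values.

    Every test value is >= every training value, so a model evaluated on the
    test indices must predict outside the property range it was trained on.
    This is an illustrative split, not necessarily the benchmark's protocol.
    """
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    # Indices of the molecules, ordered from lowest to highest property value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    n_test = max(1, round(len(values) * holdout_fraction))
    return order[:-n_test], order[-n_test:]


# Hypothetical property values for ten molecules (illustrative only).
y = [1.2, 3.4, 0.5, 2.8, 4.1, 0.9, 3.9, 2.2, 1.7, 4.6]
train_idx, test_idx = property_range_split(y, holdout_fraction=0.2)
```

A structure-based variant would hold out whole clusters of similar molecules instead of the top of the value range; that needs a molecular similarity measure and is not sketched here.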


Fig. 4 Evaluation results of the interpolation test using all data points of each dataset and extrapolation tests of property range and molecular structure (cluster) at data size for interpolation N_in = 200 (50 for EBD) with RMSE relative to σ_all, where σ_all represents the standard deviation of each dataset as listed in Table 1.
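The metric named in this caption, RMSE divided by the standard deviation of the whole dataset, can be sketched as below. The function name `relative_rmse` and the sample numbers are illustrative, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the caption does not say whether the population or sample form is meant.

```python
import math
import statistics
from typing import Sequence


def relative_rmse(
    y_true: Sequence[float], y_pred: Sequence[float], all_values: Sequence[float]
) -> float:
    """RMSE on (y_true, y_pred) divided by the standard deviation of all_values.

    all_values is the property column of the full dataset, so the same scale
    is used for every test split drawn from that dataset.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    # Assumption: population standard deviation of the full dataset.
    sigma_all = statistics.pstdev(all_values)
    return math.sqrt(mse) / sigma_all


# Hypothetical dataset and predictions on a held-out high-value split.
dataset = [1.0, 2.0, 3.0, 4.0, 5.0]
score = relative_rmse([4.0, 5.0], [3.0, 4.0], dataset)
```

Dividing by the dataset spread makes errors comparable across properties with different units. On the full dataset, a model that always predicts the dataset mean scores exactly 1 under this definition.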

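To address this challenge, they introduced a quantum-mechanical descriptor dataset called QMex, together with an interactive linear regression that includes interaction terms between quantum-mechanical descriptors and categorical information about molecular structure. The QMex-based interactive linear regression achieves state-of-the-art extrapolative performance while remaining interpretable.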
Fig. 5 Ratio of models ranking within the top three for each data size N_in.
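Their benchmark results, the QMex dataset, and the proposed model are highly valuable for improving extrapolative prediction on small experimental datasets and for discovering new materials and molecules that surpass existing candidates. The article was recently published in npj Computational Materials 10: 11 (2024).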

Fig. 6 Model performance comparison for extrapolation tests.

Editorial Summary

Extrapolative prediction of small-data molecular property: Quantum mechanics-assisted machine learning

Materials science has greatly benefited from advancements in machine learning (ML) and deep learning (DL) techniques. These techniques have revolutionized the prediction of molecular properties, leveraging traditional computational approaches. ML/DL techniques continue to enhance the accuracy and speed of property prediction, serving as indispensable tools for data-driven materials science.

Fig. 7 Summary of ML/DL model selection for interpolation and extrapolation of molecular property prediction.

However, a fundamental contradiction persists in ML/DL techniques regarding their inherent extrapolation difficulty, i.e., the difficulty of predicting beyond the available data. The primary objective of data-driven materials exploration is to identify high-performance molecules/materials that are not yet represented in databases. Hence, ML/DL models must be able to extrapolate to unexplored data solely from the available data. Materials datasets, however, often consist of small sets of experimental results, which inevitably carry biases. It is crucial to determine whether ML/DL models can overcome these biases and effectively extrapolate molecular properties.
Fig. 8 Model performance comparison between QMex-LR and QMex-ILR.
Hajime Shimakawa et al. from the Department of Electrical Engineering & Information Systems, School of Engineering, University of Tokyo, presented a comprehensive benchmark for assessing extrapolative performance across 12 organic molecular properties. Their large-scale benchmark revealed that conventional ML models exhibit remarkable performance degradation beyond the training distribution of property range and molecular structures, particularly for small-data properties. To address this challenge, they introduced a quantum-mechanical (QM) descriptor dataset, called QMex, and an interactive linear regression (ILR), which incorporates interaction terms between QM descriptors and categorical information pertaining to molecular structures. The QMex-based ILR achieved state-of-the-art extrapolative performance while preserving its interpretability. Their benchmark results, QMex dataset, and proposed model serve as valuable assets for improving extrapolative predictions with small experimental datasets and for the discovery of novel materials/molecules that surpass existing candidates. This article was recently published in npj Computational Materials 10: 11 (2024).
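The idea of a linear regression with interaction terms between continuous descriptors and categorical structure information can be sketched as follows. This is a generic illustration, not the authors' ILR implementation: the function names, the ordinary least-squares fit, the column layout, and the synthetic data are all assumptions, and the real QMex descriptors are not used here.

```python
import numpy as np


def interaction_design_matrix(X: np.ndarray, C: np.ndarray) -> np.ndarray:
    """Stack the columns [1, X, C, X*C].

    X: (n, d) continuous descriptors (stand-ins for QM descriptors).
    C: (n, k) binary indicators of structural categories.
    The X*C block holds every product x_j * c_m, so each structural category
    can shift the slope of each descriptor while the model stays linear.
    """
    n = X.shape[0]
    interactions = (X[:, :, None] * C[:, None, :]).reshape(n, -1)
    return np.hstack([np.ones((n, 1)), X, C, interactions])


def fit_ilr(X: np.ndarray, C: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Ordinary least-squares fit of the interaction model (an assumption)."""
    A = interaction_design_matrix(X, C)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ilr(coef: np.ndarray, X: np.ndarray, C: np.ndarray) -> np.ndarray:
    return interaction_design_matrix(X, C) @ coef


# Synthetic, noise-free data: two descriptors and one binary structure flag.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
C = rng.integers(0, 2, size=(40, 1)).astype(float)
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * C[:, 0] + 3.0 * X[:, 0] * C[:, 0]

# Coefficient order: intercept, x0, x1, c0, x0*c0, x1*c0.
coef = fit_ilr(X, C, y)
```

Because every coefficient belongs to a named descriptor, category, or descriptor-category product, the fitted model can be read term by term, which is the sense in which a model of this form stays interpretable.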
Original abstract and its translation
Extrapolative prediction of small-data molecular property using quantum mechanics-assisted machine learning
Hajime Shimakawa, Akiko Kumada & Masahiro Sato
Abstract Data-driven materials science has realized a new paradigm by integrating materials domain knowledge and machine-learning (ML) techniques. However, ML-based research has often overlooked the inherent limitation in predicting unknown data: extrapolative performance, especially when dealing with small-scale experimental datasets. Here, we present a comprehensive benchmark for assessing extrapolative performance across 12 organic molecular properties. Our large-scale benchmark reveals that conventional ML models exhibit remarkable performance degradation beyond the training distribution of property range and molecular structures, particularly for small-data properties. To address this challenge, we introduce a quantum-mechanical (QM) descriptor dataset, called QMex, and an interactive linear regression (ILR), which incorporates interaction terms between QM descriptors and categorical information pertaining to molecular structures. The QMex-based ILR achieved state-of-the-art extrapolative performance while preserving its interpretability. Our benchmark results, QMex dataset, and proposed model serve as valuable assets for improving extrapolative predictions with small experimental datasets and for the discovery of novel materials/molecules that surpass existing candidates.
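Abstract (translation): Data-driven materials science has established a new paradigm by integrating materials domain knowledge with machine learning (ML) techniques. However, ML-based research has often overlooked an inherent limitation in predicting unknown data, namely extrapolative performance, especially when working with small-scale experimental datasets. Here, we present a comprehensive benchmark for evaluating extrapolative performance on 12 organic molecular properties. Our large-scale benchmark shows that conventional ML models degrade significantly outside the training distribution in both property range and molecular structure, especially for small-data properties. To address this challenge, we introduce a quantum-mechanical (QM) descriptor dataset called QMex, together with an interactive linear regression (ILR) that includes interaction terms between QM descriptors and categorical information about molecular structure. The QMex-based ILR achieves state-of-the-art extrapolative performance while remaining interpretable. Our benchmark results, the QMex dataset, and the proposed model are highly valuable for improving extrapolative prediction on small experimental datasets and for discovering new materials and molecules that surpass existing candidates.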

Original article by 計算搬磚工程師. If you repost it, please credit 華算科技 as the source and cite: http://www.xiubac.cn/index.php/2024/03/26/75f1454dda/
