
Geometry-enhanced Pretraining on Interatomic Potentials

Venue: arXiv

Taoyong Cui1,2†, Chenyu Tang1,3,4†, Mao Su1*, Shufei Zhang1*, Yuqiang Li1*,

Lei Bai1, Yuhan Dong2, Xingao Gong5,6, Wanli Ouyang1

1Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China.

2Shenzhen International Graduate School, Tsinghua University, Shenzhen,

518055, China.

3CAS Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, 

Chinese Academy of Sciences, Beijing, 100190, China.

4School of Physical Sciences, University of Chinese Academy of Sciences,

Beijing, 100049, China.

5Key Laboratory for Computational Physical Sciences (MOE), State Key

Laboratory of Surface Physics, Department of Physics, Fudan University,

Shanghai, 200433, China.

6Shanghai Qi Zhi Institute, Shanghai, 200232, China.

 

*Corresponding author(s). E-mail(s): sumao@pjlab.org.cn;

zhangshufei@pjlab.org.cn;

†These authors contributed equally to this work. This work was done during their internship at Shanghai Artificial Intelligence Laboratory.

 

Abstract

Machine learning interatomic potentials (MLIPs) describe the interactions between atoms in materials and molecules by learning them from a reference database generated by ab initio calculations. MLIPs can accurately and efficiently predict such interactions and have been applied to various fields of physical science. However, high-performance MLIPs rely on a large amount of labelled data, which are costly to obtain by ab initio calculations. Here we propose a geometric structure learning framework that leverages unlabelled configurations to improve the performance of MLIPs. Our framework consists of two stages: first, using classical molecular dynamics simulations to generate unlabelled configurations of the target molecular system, and second, applying geometry-enhanced self-supervised learning techniques, including masking, denoising and contrastive learning, to capture structural information. We evaluate our framework on various benchmarks ranging from small molecule datasets to complex periodic molecular systems with a larger variety of elements. We show that our method significantly improves the accuracy and generalization of MLIPs at only a small additional computational cost and is compatible with different invariant or equivariant graph neural network architectures. Our method enhances MLIPs and advances the simulations of molecular systems.
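The three pretext tasks named in the abstract (masking, denoising, contrastive learning) all start from corrupted views of an unlabelled atomic configuration. The sketch below illustrates one plausible way to build those inputs; the function names, the mask token, and the noise scale are our own assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def mask_atoms(species, mask_ratio=0.15, mask_token=0, rng=None):
    """Masking task: hide a fraction of atomic species; the model
    is then trained to recover the original species at those sites."""
    rng = rng or np.random.default_rng()
    masked = species.copy()
    n_mask = max(1, int(len(species) * mask_ratio))
    idx = rng.choice(len(species), size=n_mask, replace=False)
    masked[idx] = mask_token  # hypothetical mask token
    return masked, idx

def add_coordinate_noise(positions, sigma=0.05, rng=None):
    """Denoising task: perturb Cartesian coordinates with Gaussian
    noise; the model is trained to predict the added noise (or,
    equivalently, the clean geometry)."""
    rng = rng or np.random.default_rng()
    noise = rng.normal(0.0, sigma, size=positions.shape)
    return positions + noise, noise

def contrastive_views(positions, sigma=0.05, rng=None):
    """Contrastive task: two independently perturbed views of the
    same configuration form a positive pair; views of different
    configurations serve as negatives."""
    view1, _ = add_coordinate_noise(positions, sigma, rng)
    view2, _ = add_coordinate_noise(positions, sigma, rng)
    return view1, view2
```

In this reading, the pretraining losses would be a species-classification loss on the masked sites, a regression loss on the predicted noise, and a contrastive loss pulling the two views' embeddings together; the pretrained encoder is then fine-tuned on the labelled ab initio energies and forces.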