Discovery of new and improved materials is an essential aspect of materials science. However, a common challenge across disciplines is a search space so large that both high-throughput experiments and simulations become intractable. One potential solution is to employ active learning, a semi-supervised machine learning approach, to explore the search space efficiently with a minimal number of candidate evaluations. In this tutorial, we will demonstrate the use of active learning via Bayesian optimization (BO) to identify ideal molecular candidates for an energy storage application. Our step-by-step walkthrough of the code will cover the following:
- Preprocessing the candidate database:
  - Feature generation
  - Dimensionality reduction
- Running the BO cycles:
  - Uncertainty prediction via Gaussian Process Regression
  - Acquisition function evaluation
  - Selection of a new candidate
- Tuning the model performance
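The steps outlined above can be illustrated with a minimal sketch. It is not the tutorial's own code: the candidate features and target property are synthetic, all function names (`pca_reduce`, `gp_predict`, `expected_improvement`, `run_bo`) are hypothetical, and the kernel (RBF), its fixed length scale, and the acquisition function (expected improvement) are assumptions chosen for brevity. The Gaussian process is written directly in NumPy to keep the example self-contained; in practice a library such as scikit-learn (`GaussianProcessRegressor`) would typically be used and its kernel hyperparameters tuned.

```python
import math
import numpy as np

def pca_reduce(features, n_components):
    """Dimensionality reduction: standardize, then project onto top principal components."""
    scaled = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-12)
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return scaled @ vt[:n_components].T

def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel with unit prior variance."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-6):
    """Gaussian process regression: posterior mean and standard deviation."""
    y_mean = y_train.mean()
    y_std = y_train.std() or 1.0          # guard against a constant target
    y_scaled = (y_train - y_mean) / y_std
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_scaled))
    v = np.linalg.solve(L, K_s)
    var = np.clip(1.0 - (v**2).sum(axis=0), 1e-12, None)
    return K_s.T @ alpha * y_std + y_mean, np.sqrt(var) * y_std

def expected_improvement(mu, sigma, best, xi=0.01):
    """Acquisition function for maximization (xi is an exploration margin)."""
    improvement = mu - best - xi
    z = improvement / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return improvement * cdf + sigma * pdf

def run_bo(X, y, n_init=5, n_cycles=20, seed=0):
    """BO cycles: fit GP on evaluated candidates, score the rest, pick the best."""
    rng = np.random.default_rng(seed)
    evaluated = [int(i) for i in rng.choice(len(X), size=n_init, replace=False)]
    for _ in range(n_cycles):
        remaining = np.setdiff1d(np.arange(len(X)), evaluated)
        mu, sigma = gp_predict(X[evaluated], y[evaluated], X[remaining])
        scores = expected_improvement(mu, sigma, y[evaluated].max())
        # "Evaluating" a candidate here is just a lookup of the synthetic target.
        evaluated.append(int(remaining[np.argmax(scores)]))
    return evaluated

# Synthetic stand-in for a candidate database: 200 candidates, 10 raw features.
data_rng = np.random.default_rng(42)
raw_features = data_rng.normal(size=(200, 10))
X = pca_reduce(raw_features, n_components=3)
y = -np.sum((X - 0.5) ** 2, axis=1)       # hypothetical property to maximize

picked = run_bo(X, y)
print("evaluated:", len(picked), "best found:", y[picked].max(), "true best:", y.max())
```

In this sketch, tuning the model would amount to adjusting `length_scale`, `noise`, `xi`, the number of principal components, and the number of initial candidates, then comparing how many evaluations are needed to reach the best candidate.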
Hieu A. Doan is a postdoctoral researcher in the Materials Science Division at Argonne National Laboratory. He received a Ph.D. in Chemical Engineering from the University of Houston in 2015 under the supervision of Professor Lars Grabow. Prior to joining Argonne, he held a postdoctoral position at Northwestern University (with Prof. Randall Snurr) from 2016 to 2018. Dr. Doan’s research interests include fundamental understanding of chemical reactions at interfaces and accelerated discovery of functional materials for chemical production, energy storage, and emission control. Density functional theory, kinetic modeling, and machine learning form his primary toolbox. Currently, Dr. Doan is a member of the Molecular Materials Group (advisor: Dr. Rajeev Assary) and carries out collaborative research with the Joint Center for Energy Storage Research (JCESR) and the Consortium for Computational Physics and Chemistry (CCPC).
Garvit Agarwal is a postdoctoral researcher in the Materials Science Division at Argonne National Laboratory. Dr. Agarwal’s research interests include atomic-scale modeling, high-throughput computing, materials informatics, and machine learning to design and discover novel materials for clean energy applications. His current research focuses on fundamental understanding of electrode-electrolyte interfacial reactions for the development of next-generation multivalent battery technologies and on materials screening for redox flow batteries.
Cite this work
Researchers should cite this work as follows: