Active Learning via Bayesian Optimization for Materials Discovery

By Hieu Doan¹, Garvit Agarwal¹

1. Materials Science Division, Argonne National Laboratory, Lemont, IL

Published on


Discovering new and improved materials is central to materials science. A challenge common to many disciplines, however, is that the search space is often so large that both high-throughput experiments and simulations become intractable. One potential solution is active learning, a semi-supervised machine learning approach that explores the search space efficiently with a minimal number of candidate evaluations. In this tutorial, we demonstrate active learning via Bayesian optimization (BO) to identify ideal molecular candidates for an energy storage application. Our step-by-step walkthrough of the code covers the following:

  1. Preprocessing the candidate database:
    • Feature generation
    • Dimensionality reduction
  2. Running the BO cycles:
    • Uncertainty prediction via Gaussian Process Regression
    • Acquisition function evaluation
    • Selection of new candidate
  3. Tuning the model performance

This tutorial uses the Bayesian Optimization Tutorial, a Jupyter notebook available on nanoHUB.
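
The BO cycle outlined above (uncertainty prediction, acquisition function evaluation, selection of a new candidate) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code from the nanoHUB notebook: it assumes the candidates have already been featurized and reduced to short numeric vectors, it uses a hand-written Gaussian process with a fixed RBF kernel, and it takes expected improvement as the acquisition function. The function names, the kernel length scale, the noise level, and the toy objective are all hypothetical choices made for illustration.

```python
import math
import random

def rbf(a, b, length=0.2):
    """Squared-exponential kernel; the length scale is an assumed value."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Return lower-triangular L with L L^T = A (A positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i, row in enumerate(L):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x

def solve_upper_t(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_posterior(X_train, y_train, X_query, noise=1e-4):
    """GP regression: posterior (mean, std) at each query point."""
    y_mean = sum(y_train) / len(y_train)
    yc = [y - y_mean for y in y_train]  # center targets, zero-mean prior
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)] for i, a in enumerate(X_train)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, yc))  # alpha = K^-1 yc
    out = []
    for q in X_query:
        k_star = [rbf(q, x) for x in X_train]
        mu = y_mean + sum(k * a for k, a in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = max(rbf(q, q) - sum(t * t for t in v), 1e-12)
        out.append((mu, math.sqrt(var)))
    return out

def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected improvement over `best` for a maximization problem."""
    z = (mu - best - xi) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    return (mu - best - xi) * cdf + sigma * pdf

def run_bo(candidates, evaluate, n_init=3, n_cycles=10, seed=0):
    """Run BO cycles over a finite candidate pool; returns {index: value}."""
    rng = random.Random(seed)
    evaluated = {i: evaluate(candidates[i])
                 for i in rng.sample(range(len(candidates)), n_init)}
    for _ in range(n_cycles):
        pool = [i for i in range(len(candidates)) if i not in evaluated]
        if not pool:
            break
        idx = list(evaluated)
        post = gp_posterior([candidates[i] for i in idx],
                            [evaluated[i] for i in idx],
                            [candidates[i] for i in pool])
        best = max(evaluated.values())
        scores = [expected_improvement(m, s, best) for m, s in post]
        pick = pool[max(range(len(pool)), key=scores.__getitem__)]
        evaluated[pick] = evaluate(candidates[pick])  # the costly step
    return evaluated

# Toy usage: 50 one-dimensional "candidates" and a made-up objective.
grid = [[i / 49] for i in range(50)]
result = run_bo(grid, lambda x: -(x[0] - 0.7) ** 2)
best_index = max(result, key=result.get)
print(len(result), "evaluations; best candidate:", grid[best_index])
```

Each cycle refits the surrogate on everything evaluated so far, scores only the unevaluated candidates, and evaluates the single highest-scoring one. In practice, a library Gaussian process with fitted hyperparameters would normally replace the fixed-kernel version shown here, and `evaluate` would stand in for an experiment or simulation.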


Hieu A. Doan is a postdoctoral researcher in the Materials Science Division at Argonne National Laboratory. He received a Ph.D. in Chemical Engineering from the University of Houston in 2015 under the supervision of Professor Lars Grabow. Prior to joining Argonne, he held a postdoctoral position at Northwestern University (with Prof. Randall Snurr) from 2016 to 2018. Dr. Doan’s research interests include fundamental understanding of chemical reactions at interfaces and accelerated discovery of functional materials for chemical production, energy storage, and emission control. Density functional theory, kinetic modeling, and machine learning form his primary toolbox. Currently, Dr. Doan is a member of the Molecular Materials Group (advisor: Dr. Rajeev Assary) and carries out collaborative research with the Joint Center for Energy Storage Research (JCESR) and the Consortium for Computational Physics and Chemistry (CCPC).

Garvit Agarwal is a postdoctoral researcher in the Materials Science Division at Argonne National Laboratory. Dr. Agarwal’s research interests include atomic-scale modeling, high-throughput computing, materials informatics, and machine learning to design and discover novel materials for clean energy applications. His current research focuses on fundamental understanding of electrode-electrolyte interfacial reactions for the development of next-generation multivalent battery technologies and on materials screening for redox flow batteries.

Sponsored by

Cite this work

Researchers should cite this work as follows:

  • Hieu Doan, Garvit Agarwal (2021), "Active Learning via Bayesian Optimization for Materials Discovery,"
