Bayesian optimization tutorial using Jupyter notebook

Active learning via Bayesian optimization for materials discovery


Version 1.01 - published on 17 Jun 2021

doi:10.21981/8RQS-B991

Open source



Discovery of new and improved materials is an essential aspect of materials science. However, a common challenge across disciplines is the large search space, which often renders both high-throughput experiments and simulations intractable. One potential solution is to employ active learning, a semi-supervised machine learning approach, to efficiently explore the search space with a minimal number of candidate evaluations. In this tutorial, we demonstrate the use of active learning via Bayesian optimization (BO) to identify ideal molecular candidates for an energy storage application. Our step-by-step walkthrough of the code covers the following:

  1. Preprocessing the candidate database:
  • Feature generation
  • Dimensionality reduction
  2. Running the BO cycles:
  • Uncertainty prediction via Gaussian Process Regression
  • Acquisition function evaluation
  • Selection of a new candidate
  3. Tuning the model performance
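
The steps above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the notebook's actual code: the feature matrix and target property are synthetic stand-ins for real molecular descriptors and computed properties, the Gaussian process uses a fixed RBF kernel rather than fitted hyperparameters, and expected improvement is assumed as the acquisition function. All function names (`pca_reduce`, `gp_predict`, `expected_improvement`, `run_bo`) are hypothetical.

```python
import math
import numpy as np

def pca_reduce(X, n_components):
    """Standardize features and project them onto the top principal components."""
    Xc = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale=2.0, noise=1e-4):
    """GP posterior mean and standard deviation (fixed hyperparameters, an assumption)."""
    y_mean = y_train.mean()
    y_std = y_train.std() or 1.0  # avoid dividing by zero for constant targets
    y_norm = (y_train - y_mean) / y_std
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, length_scale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_norm))
    v = np.linalg.solve(L, K_s)
    var = np.maximum(1.0 - np.sum(v**2, axis=0), 1e-12)
    return y_mean + y_std * (K_s.T @ alpha), y_std * np.sqrt(var)

def expected_improvement(mu, sigma, best, xi=0.01):
    """Expected improvement for a maximization problem."""
    imp = mu - best - xi
    z = imp / sigma
    cdf = 0.5 * (1.0 + np.vectorize(math.erf, otypes=[float])(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)
    return imp * cdf + sigma * pdf

def run_bo(X, objective, n_init=5, n_cycles=15, seed=0):
    """Run BO cycles over a finite candidate pool; returns evaluated indices and values."""
    rng = np.random.default_rng(seed)
    evaluated = rng.choice(len(X), size=n_init, replace=False).tolist()
    y = [objective(i) for i in evaluated]
    for _ in range(n_cycles):
        seen = set(evaluated)
        remaining = np.array([i for i in range(len(X)) if i not in seen])
        # 1) predict mean and uncertainty for every unevaluated candidate
        mu, sigma = gp_predict(X[evaluated], np.array(y), X[remaining])
        # 2) score candidates with the acquisition function
        ei = expected_improvement(mu, sigma, max(y))
        # 3) select and "evaluate" the most promising candidate
        pick = int(remaining[np.argmax(ei)])
        evaluated.append(pick)
        y.append(objective(pick))
    return evaluated, y

# Synthetic stand-ins (assumptions): random "descriptors" and a made-up property.
data_rng = np.random.default_rng(42)
features = data_rng.normal(size=(200, 10))
target = -np.sum((features[:, :3] - 0.5) ** 2, axis=1)

X_reduced = pca_reduce(features, n_components=5)
evaluated, y = run_bo(X_reduced, lambda i: float(target[i]))
print(f"evaluated {len(evaluated)} of {len(features)} candidates; best value found: {max(y):.3f}")
```

In this sketch, the knobs most relevant to tuning are the kernel `length_scale`, the exploration parameter `xi`, the number of retained principal components, and the size of the initial random sample. A practical workflow would typically fit the kernel hyperparameters to the data instead of fixing them.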


Hieu A. Doan is a postdoctoral researcher in the Materials Science Division at Argonne National Laboratory. Dr. Doan’s research interests include computational catalysis, battery chemistry, cheminformatics, and machine learning. He currently focuses on developing accelerated materials screening methods for applications in biomass conversion and redox flow batteries.

Garvit Agarwal is a postdoctoral researcher in the Materials Science Division at Argonne National Laboratory. Dr. Agarwal’s research interests include atomic-scale modeling, high-throughput computing, materials informatics, and machine learning to design and discover novel materials for clean energy applications. His current research focuses on the fundamental understanding of electrode-electrolyte interfacial reactions for the development of next-generation multivalent battery technologies, and on materials screening for redox flow batteries.


Doan, H. A., Agarwal, G., Qian, H., Counihan, M. J., Rodríguez-López, J., Moore, J. S., & Assary, R. S. (2020). Quantum Chemistry-Informed Active Learning to Accelerate the Design and Discovery of Sustainable Energy Storage Materials. Chemistry of Materials.

Cite this work

Researchers should cite this work as follows:

  • Hieu Doan, Garvit Agarwal (2021), "Bayesian optimization tutorial using Jupyter notebook," (DOI: 10.21981/8RQS-B991).
