Citrine Tools for Materials Informatics

Jupyter notebooks for sequential learning in the context of materials design. Run your own models, explore various methods and adapt the notebooks to your needs.

Version 1.9 - published on 28 Jun 2023

doi:10.21981/D6GP-0860

Open source

The Jupyter Notebooks in this tool implement methods developed by Citrine Informatics for materials design. Users can modify the notebooks to explore different models, try new ideas and adapt them for their own problems.

The examples under Sequential Learning seek to solve materials design problems (posed as maximization or minimization problems) with as few experiments as possible. In the examples below, we start with a given number of previously performed experiments, a random subset of which is revealed at the beginning of the exercise. The goal is to find the optimal experiment in the smallest possible number of trials. The sequential learning approach used here has three main steps:

  • Step 1. Establish the state of knowledge from the available data using machine learning. Capturing uncertainties is important at this step.
  • Step 2. Use the model from Step 1 to decide which experiment to carry out next (in our toy examples, this means revealing the result of one of the pre-performed experiments). This step uses a so-called information acquisition function that seeks to maximize the amount of information gained by the selected experiment. The notebooks below compare various approaches.
  • Step 3. Query the information source selected in Step 2. If the optimal material has not been found, add the new result to the available data and start the process again at Step 1.
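The three steps above can be sketched as a short loop. The code below is a minimal illustration under stated assumptions, not the notebooks' implementation: a bootstrap ensemble of nearest-neighbour predictors stands in for the random forest or neural network models, the acquisition function is an upper confidence bound (predicted mean plus `kappa` times the ensemble spread), and the function names, the parameter `kappa`, and the one-dimensional toy data are all hypothetical. It uses only the Python standard library.

```python
import random
import statistics


def fit_ensemble(X, y, n_models=30, rng=None):
    """Step 1: build a bootstrap ensemble from the known experiments.

    Each member is a bootstrap resample of the known (x, y) pairs and acts
    as a 1-nearest-neighbour predictor. This is a stand-in model chosen for
    brevity, not the model used in the notebooks.
    """
    rng = rng or random.Random(0)
    n = len(X)
    return [
        [(X[i], y[i]) for i in (rng.randrange(n) for _ in range(n))]
        for _ in range(n_models)
    ]


def predict(ensemble, x):
    """Return the ensemble mean and spread (the uncertainty estimate) at x."""
    preds = []
    for sample in ensemble:
        nearest = min(
            sample, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
        )
        preds.append(nearest[1])
    return statistics.mean(preds), statistics.pstdev(preds)


def sequential_learning(X, y, n_initial=5, kappa=1.0, seed=0):
    """Count the trials needed to reveal the best experiment (maximization)."""
    rng = random.Random(seed)
    best = max(range(len(y)), key=y.__getitem__)
    known = set(rng.sample(range(len(y)), n_initial))
    trials = 0
    while best not in known:
        idx = sorted(known)
        ensemble = fit_ensemble([X[i] for i in idx], [y[i] for i in idx], rng=rng)
        hidden = [i for i in range(len(y)) if i not in known]

        # Step 2: score every hidden experiment with the acquisition function
        # (assumed here: upper confidence bound, mean + kappa * std).
        def score(i):
            mean, std = predict(ensemble, X[i])
            return mean + kappa * std

        chosen = max(hidden, key=score)
        # Step 3: "run" the chosen experiment by revealing its stored result.
        known.add(chosen)
        trials += 1
    return trials


# Hypothetical toy data: 50 pre-performed experiments with one descriptor each.
X = [(i / 49,) for i in range(50)]
y = [-(x[0] - 0.7) ** 2 for x in X]
print("trials needed:", sequential_learning(X, y))
```

Because the loop stops only when the known optimum has been revealed, it mirrors the toy setting described above, in which every experiment has already been performed. In a real campaign the optimum is unknown, so a budget or convergence criterion would replace that stopping rule.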

The notebooks below use sequential learning to identify the materials with the highest bulk modulus and the highest ionic conductivity. They all obtain their data from Citrination databases, build models using random forests or neural networks, and compare different information acquisition strategies against random searches.
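To make the idea of competing acquisition strategies concrete, the sketch below selects the next candidate from model predictions in four different ways. The strategy names, the `select` function, and the numbers in the example are illustrative assumptions; the notebooks may define their strategies differently. The probability-of-improvement rule additionally assumes that each prediction is normally distributed around its mean.

```python
import math
import random


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def select(strategy, means, stds, best_so_far, rng=None):
    """Return the index of the candidate to test next (maximization).

    means, stds  : predicted value and uncertainty for each hidden candidate
    best_so_far  : best measured value among the experiments already revealed
    """
    idx = range(len(means))
    if strategy == "highest_mean":  # pure exploitation
        return max(idx, key=lambda i: means[i])
    if strategy == "highest_uncertainty":  # pure exploration
        return max(idx, key=lambda i: stds[i])
    if strategy == "probability_of_improvement":
        def poi(i):
            if stds[i] == 0:
                return float(means[i] > best_so_far)
            return normal_cdf((means[i] - best_so_far) / stds[i])
        return max(idx, key=poi)
    if strategy == "random":  # baseline: ignore the model entirely
        return (rng or random.Random()).randrange(len(means))
    raise ValueError(f"unknown strategy: {strategy}")


# Made-up predictions for three hidden candidates (for example, bulk modulus
# in GPa) and a made-up best value measured so far.
means = [200.0, 180.0, 150.0]
stds = [5.0, 40.0, 60.0]
best_so_far = 210.0

for name in ("highest_mean", "highest_uncertainty", "probability_of_improvement"):
    print(name, "->", select(name, means, stds, best_so_far))
```

With these numbers the three model-based strategies each pick a different candidate, which shows why their performance has to be compared empirically, with random selection as the baseline.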

Powered by

Citrination, Matminer, PyMatGen, TensorFlow, Keras, scikit-learn.


Strachan group webpage: 

Sponsored by

This effort was supported by the US National Science Foundation, DMREF program under contract number 1922316-DMR.


References

Ling J, Hutchinson M, Antono E, Paradiso S, Meredig B. High-dimensional materials and process optimization using data-driven experimental design with well-calibrated uncertainty estimates. Integrating Materials and Manufacturing Innovation. 2017 Sep 1;6(3):207-17.

Cite this work

Researchers should cite this work as follows:

  • Juan Carlos Verduzco Gastelum, Alejandro Strachan (2023), "Citrine Tools for Materials Informatics," (DOI: 10.21981/D6GP-0860).