Module 3: Materials Descriptors for Data Science

By Alejandro Strachan¹; Juan Carlos Verduzco Gastelum¹; Zachary D McClure¹

1. Materials Engineering, Purdue University, West Lafayette, IN



Run the Tool: Feature Selection for Machine Learning

This module focuses on using descriptors to improve how materials are represented in machine learning. Augmenting the input parameters with appropriate descriptors (a process sometimes called featurization) can often significantly improve the accuracy of predictive models. Ideal descriptors are strongly correlated with the quantity of interest (QoI) and are relatively easy to obtain. They can come from simple calculations with periodic table data, from physics-based simulations or models, or even from experiments that are easier to conduct than a direct measurement of the QoI. In this hands-on module you will learn about descriptors and explore their importance using interactive computing in nanoHUB.
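As a rough illustration of featurization with periodic table data, the sketch below augments an alloy composition with composition-weighted averages of elemental properties. It is not taken from the module's tool: the `ELEMENT_DATA` table, the `featurize` function, and the choice of properties are hypothetical, and the electronegativities are approximate Pauling values for only three elements.

```python
# Hypothetical lookup table (an assumption for illustration): atomic number
# and approximate Pauling electronegativity for a few elements.
ELEMENT_DATA = {
    "Al": {"atomic_number": 13, "electronegativity": 1.61},
    "Fe": {"atomic_number": 26, "electronegativity": 1.83},
    "Ni": {"atomic_number": 28, "electronegativity": 1.91},
}


def featurize(composition):
    """Return composition-weighted means of the tabulated elemental properties.

    `composition` maps element symbols to atomic fractions; the fractions
    are normalized here, so they do not need to sum to one.
    """
    total = sum(composition.values())
    features = {}
    for prop in ("atomic_number", "electronegativity"):
        features["mean_" + prop] = sum(
            (fraction / total) * ELEMENT_DATA[element][prop]
            for element, fraction in composition.items()
        )
    return features


# Example: an equiatomic Ni-Al composition.
alloy = {"Ni": 0.5, "Al": 0.5}
descriptors = featurize(alloy)
print(descriptors)
```

The resulting values can be appended to the original inputs (here, the composition) before a model is trained. Whether a given descriptor actually helps depends on how strongly it correlates with the QoI.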

This end-to-end module is designed to be self-contained and easy to incorporate into existing courses or to use for self-study. The module consists of three components:

This module is part of a series on data science and machine learning for engineering and the physical sciences. Users can run interactive code online in nanoHUB, with no need to download or install any software.

Learning objectives. After completing this module, you will be able to:

  • Enhance materials models using descriptors
    • Periodic table data
    • Surrogate properties and physics-based models
  • Analyze descriptors and calculate correlations to rank them
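The last objective, ranking descriptors by correlation, can be sketched as follows. This is a minimal example that assumes the Pearson correlation coefficient as the ranking metric; the module's tool may use a different measure. The function names and the data are hypothetical, and the numbers are synthetic values chosen only to make the example run.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def rank_descriptors(descriptors, qoi):
    """Sort descriptors by the absolute value of their correlation with the QoI."""
    scores = {name: pearson(values, qoi) for name, values in descriptors.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Synthetic data (not real measurements): three candidate descriptors and a
# quantity of interest, each evaluated for five hypothetical materials.
descriptors = {
    "descriptor_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "descriptor_b": [2.0, 1.0, 4.0, 3.0, 5.0],
    "descriptor_c": [3.0, 5.0, 1.0, 4.0, 2.0],
}
qoi = [2.1, 3.9, 6.2, 8.1, 9.8]

ranking = rank_descriptors(descriptors, qoi)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Ranking by the absolute value keeps strongly anti-correlated descriptors near the top, since they can be as informative as positively correlated ones. Pearson correlation only captures linear relationships, so a low score does not rule out a useful nonlinear dependence.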



Cite this work

Researchers should cite this work as follows:

  • Alejandro Strachan, Juan Carlos Verduzco Gastelum, Zachary D McClure (2021), "Module 3: Materials Descriptors for Data Science,"
