Hands-On Data Science and Machine Learning in Undergraduate Education

By Alejandro Strachan¹, Saaketh Desai¹, Juan Carlos Verduzco Gastelum¹, Michael N Sakano¹, Zachary D McClure¹, Joseph M. Cychosz², Jared Gray West²

1. Materials Engineering, Purdue University, West Lafayette, IN
2. Network for Computational Nanotechnology, Purdue University, West Lafayette, IN

Licensed under General Performance Usage

Category: Courses

Abstract

This series of modules introduces key concepts in data science in the context of applications in materials science and engineering. Each end-to-end module includes:

  • A recorded lecture that introduces each topic and provides background material,
  • A hands-on tutorial with step-by-step instructions to perform interactive online activities and run interactive code,
  • A homework assignment designed to help users explore the concepts using online models and simulations and adapt the code to problems of their interest.

The modules are self-contained and designed for easy incorporation into existing courses or for self-study.

All interactive computing is performed in the cloud on nanoHUB, so there is no need to download or install any software. All resources are open and free.

Knowledge and Skills

  1. Data handling
    • Data collection, completeness, & provenance – See Module 1 and Module 2
    • Data storage and sharing – See Module 1 and Module 2
    • Data querying, organization, and filtering – See Module 2
  2. Predictive modeling
    • Data visualization – See Module 2 and Module 3
    • Digital representation and descriptors for materials – See Module 3
    • Simple regression models – See Module 4
    • Machine learning models for regression and classification – See Module 5
    • Random forests and decision trees – See Module 7
  3. Decision making
    • Uncertainty quantification – See Module 6
    • Active learning for design of experiments – See Module 7

Pre-requisites

The interactive computing is performed using Python through Jupyter notebooks. Basic programming skills are required. An introductory tutorial on Jupyter, Python, and plotting is available at: https://nanohub.org/resources/33266

Cite this work

Researchers should cite this work as follows:

  • Alejandro Strachan, Saaketh Desai, Juan Carlos Verduzco Gastelum, Michael N Sakano, Zachary D McClure, Joseph M. Cychosz, Jared Gray West (2020), "Hands-On Data Science and Machine Learning in Undergraduate Education," https://nanohub.org/resources/34285.


Hands-on Learning Modules on Data Science and Machine Learning in Engineering

Lecture Number/Topic | Online Lecture | Video | Lecture Notes | Supplemental Material | Suggested Exercises
Querying Materials Data Repositories
  • Online Lecture: View
  • Video: YouTube
  • Lecture Notes: Notes (pdf), Notes (pptx)
  • Supplemental Material: Hands-on Tutorial
  • Suggested Exercises: Homework Assignment
This module introduces modern tools for data acquisition, including performing large queries using application programming interfaces (APIs), with hands-on online workflows.
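
As a rough illustration of what querying a repository through an API involves, the sketch below builds a query URL and filters a JSON response for complete records. It is not the module's own code: the endpoint `BASE_URL`, the field names, and the sample payload are all hypothetical placeholders, and the response is canned so the example runs without network access.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration; not a real repository URL.
BASE_URL = "https://example.org/api/v1/materials"


def build_query_url(base_url, **criteria):
    """Encode search criteria as URL query parameters (sorted for reproducibility)."""
    return f"{base_url}?{urlencode(sorted(criteria.items()))}"


def parse_records(payload, required_fields):
    """Keep only the records that have a non-null value for every required field."""
    records = json.loads(payload)["records"]
    return [
        record for record in records
        if all(record.get(field) is not None for field in required_fields)
    ]


# Canned response standing in for what a server might return.
# The identifiers, formulas, and values are made-up placeholders, not real data.
SAMPLE_PAYLOAD = json.dumps({"records": [
    {"id": "sample-001", "formula": "AB2", "band_gap_ev": 1.5},
    {"id": "sample-002", "formula": "A3C", "band_gap_ev": None},
    {"id": "sample-003", "formula": "BC"},
]})

url = build_query_url(BASE_URL, property="band_gap", limit=100)
complete = parse_records(SAMPLE_PAYLOAD, ["formula", "band_gap_ev"])
print(url)
print([record["id"] for record in complete])
```

In a real workflow the payload would come from an HTTP request to the repository's documented endpoint, usually with an API key, and the completeness filter would be followed by a record of where and when the data were retrieved.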

Linear Regression Models
  • Online Lecture: View
  • Video: YouTube
  • Lecture Notes: Notes (pdf), Notes (pptx)
  • Supplemental Material: Hands-on Tutorial - Young's Modulus, Hands-on Tutorial - Correlations
  • Suggested Exercises: Homework Assignment - Young's Modulus, Homework Assignment - Correlations
This module introduces linear regression in the context of materials science and engineering.
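
A minimal sketch of the idea, assuming a synthetic stress-strain data set in which the slope of the elastic region plays the role of Young's modulus. The data, the noise level, and the value `TRUE_MODULUS_MPA` are invented for illustration and are not taken from the module's tutorials.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([x, np.ones_like(x)])  # columns: x and a constant term
    (slope, intercept), *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(slope), float(intercept)


def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (slope * x + intercept)
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)


# Synthetic data: an arbitrary modulus plus small Gaussian noise (assumed values).
TRUE_MODULUS_MPA = 200_000.0
rng = np.random.default_rng(0)
strain = np.linspace(0.0, 0.002, 11)
stress = TRUE_MODULUS_MPA * strain + rng.normal(0.0, 2.0, size=strain.size)

slope, intercept = fit_line(strain, stress)
print(f"fitted modulus: {slope:.0f} MPa, R^2 = {r_squared(strain, stress, slope, intercept):.4f}")
```

The fitted slope is only meaningful over the range where the linear model holds; with measured data, points beyond the elastic region would need to be excluded before fitting.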

Neural Networks for Regression and Classification
  • Online Lecture: View
  • Video: YouTube
  • Supplemental Material: Hands-on Tutorial - Regression, Hands-on Tutorial - Classification
  • Suggested Exercises: Homework Assignment - Regression, Homework Assignment - Classification
This module introduces neural networks for materials science and engineering with hands-on online simulations. Neural networks are a subset of machine learning models used to learn mappings between...
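
To make the regression case concrete, here is a small NumPy sketch of a one-hidden-layer network trained by full-batch gradient descent on a toy function. It is an illustrative assumption rather than the module's code: the architecture, the hyperparameters, and the target function y = x² are chosen only to keep the example short, and the tutorials may use a dedicated deep learning library instead.

```python
import numpy as np


def train_mlp(x, y, hidden=8, lr=0.1, epochs=500, seed=0):
    """Train a 1-hidden-layer tanh network on (x, y); returns weights and loss history."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.5, size=(x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    n = x.shape[0]
    losses = []
    for _ in range(epochs):
        # Forward pass
        h = np.tanh(x @ w1 + b1)
        pred = h @ w2 + b2
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared error loss
        grad_pred = 2.0 * err / n
        grad_w2 = h.T @ grad_pred
        grad_b2 = grad_pred.sum(axis=0)
        grad_z = (grad_pred @ w2.T) * (1.0 - h ** 2)  # derivative of tanh
        grad_w1 = x.T @ grad_z
        grad_b1 = grad_z.sum(axis=0)
        # Gradient descent update
        w1 -= lr * grad_w1
        b1 -= lr * grad_b1
        w2 -= lr * grad_w2
        b2 -= lr * grad_b2
    return (w1, b1, w2, b2), losses


def predict(params, x):
    """Evaluate the trained network on new inputs."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


# Toy data: learn y = x^2 on [-1, 1] (an assumed example target).
x_train = np.linspace(-1.0, 1.0, 41).reshape(-1, 1)
y_train = x_train ** 2
params, losses = train_mlp(x_train, y_train)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

For classification, the same structure would typically end in a sigmoid or softmax output with a cross-entropy loss in place of the mean squared error.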

Active Learning for Design of Experiments
  • Online Lecture: View
  • Video: YouTube
  • Lecture Notes: Notes (pdf), Notes (pptx)
  • Supplemental Material: Hands-on Tutorial
  • Suggested Exercises: Homework Assignment
This module introduces active learning in the context of materials discovery with hands-on online simulations.
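
The loop behind active learning can be sketched as: fit a surrogate model to the experiments done so far, estimate its uncertainty, choose the next experiment with an acquisition function, and repeat. The example below is a simplified stand-in, not the module's implementation. It assumes a hypothetical one-dimensional "experiment" `run_experiment`, replaces the model with a leave-one-out ensemble of quadratic fits, and uses an upper-confidence-bound rule with an arbitrary weight `kappa`.

```python
import numpy as np


def run_experiment(x):
    """Hypothetical stand-in for an expensive measurement or simulation."""
    return np.sin(3.0 * x)


def ensemble_predict(x_train, y_train, x_query, degree=2):
    """Leave-one-out ensemble of polynomial fits; returns mean and spread of predictions."""
    preds = []
    for leave_out in range(len(x_train)):
        keep = np.arange(len(x_train)) != leave_out
        coeffs = np.polyfit(x_train[keep], y_train[keep], degree)
        preds.append(np.polyval(coeffs, x_query))
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)


def active_learning(candidates, initial_idx, rounds=5, kappa=1.0):
    """Iteratively pick the candidate with the highest mean + kappa * std score."""
    measured = list(initial_idx)
    y_measured = [float(run_experiment(candidates[i])) for i in measured]
    for _ in range(rounds):
        mean, std = ensemble_predict(
            candidates[measured], np.array(y_measured), candidates
        )
        score = mean + kappa * std
        score[measured] = -np.inf  # never repeat an experiment
        next_idx = int(np.argmax(score))
        measured.append(next_idx)
        y_measured.append(float(run_experiment(candidates[next_idx])))
    best = int(np.argmax(y_measured))
    return measured, float(candidates[measured[best]]), y_measured[best]


# Candidate "compositions" on a grid and four starting experiments (assumed setup).
CANDIDATES = np.linspace(0.0, 1.0, 21)
INITIAL = [0, 7, 13, 20]
measured, best_x, best_y = active_learning(CANDIDATES, INITIAL)
print(f"ran {len(measured)} experiments, best x = {best_x:.2f}, best y = {best_y:.3f}")
```

Larger values of `kappa` favor exploring uncertain regions, while smaller values favor exploiting the current best prediction; other acquisition functions, such as expected improvement, follow the same loop.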