Hands-on Data Science and Machine Learning Training Series

By Alejandro Strachan (1); Saaketh Desai (2); Arun Kumar Mannodi Kanakkithodi (1)

1. Materials Engineering, Purdue University, West Lafayette, IN 2. Purdue University, West Lafayette, IN

Category

Courses


Abstract

This series of workshops introduces participants to important concepts and techniques in data science and machine learning in the context of engineering and physical sciences applications. All workshops include hands-on activities in which participants apply the techniques to solve real problems using online resources at nanoHUB, with no need to install any software.

The hands-on tutorials are designed to jump-start your use of data science and machine learning in research or teaching. This series will cover the following topics:

  • Learn how to use Jupyter notebooks for your research
  • Interact with data repositories and manage data
  • Train neural networks and random forests
  • Use unsupervised learning techniques such as Principal Component Analysis
  • Perform Design of Experiments using machine learning

All exercises will use nanoHUB cloud computing resources, so there is no need to download or install any software. All you need is an internet connection and a browser. After the training sessions, you will be able to continue using nanoHUB for your research or class.

Upcoming workshops and registration information can be found on the Hands-on Data Science and Machine Learning Training page. Registration and online compute resources are free of charge.

Be notified of future workshops - sign up for the nanoHUB newsletter to receive news, events, and more.

Target audience: The workshops are designed for students, researchers, and industrial practitioners interested in exploring data science in a hands-on manner. The offerings assume little prior experience with machine learning and minimal programming experience. The Spring 2020 tutorials contain introductory material.


Cite this work

Researchers should cite this work as follows:

  • Alejandro Strachan, Saaketh Desai, Arun Kumar Mannodi Kanakkithodi (2020), "Hands-on Data Science and Machine Learning Training Series," https://nanohub.org/resources/33245.


Location

Purdue University, West Lafayette, IN

Tags

Data Science and Machine Learning


Lecture Number/Topic | Online Lecture | Video | Lecture Notes | Supplemental Material | Suggested Exercises
Introduction to Jupyter Notebooks, Data Organization and Plotting (1st offering) View on YouTube View Notes (pdf)
This tutorial gives an introductory demonstration of how to create and use Jupyter notebooks. It showcases the libraries Pandas to manipulate and organize data with functionalities similar to those...

Introduction to Jupyter Notebooks, Data Organization and Plotting (2nd offering) View on YouTube View Notes (pdf)
This tutorial gives an introductory demonstration of how to create and use Jupyter notebooks. It showcases the libraries Pandas to manipulate and organize data with functionalities similar to those...

Repositories and Data Management (1st offering) View on YouTube View Notes (pdf)
This tutorial introduces database infrastructure and APIs for performing different scales of querying. You will learn how to access different suites of information from three prominent databases,...

Repositories and Data Management (2nd offering) View on YouTube View Notes (pdf)
This tutorial introduces database infrastructure and APIs for performing different scales of querying. You will learn how to access different suites of information from three prominent databases,...

Hands-on Supervised Learning: Part 1 - Linear Regression and Neural Networks View on YouTube View Notes (pdf)
This tutorial introduces supervised learning via Jupyter notebooks on nanoHUB.org. You will learn how to set up a basic linear regression in a Jupyter notebook and then create and train a...
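As a rough illustration of the basic linear regression mentioned above, the sketch below fits a straight line by ordinary least squares in plain Python. The function name `fit_line` and the data values are invented for illustration and are not taken from the tutorial notebooks, which may use a different library or approach.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With noisy measurements the same function returns the line that minimizes the sum of squared vertical residuals.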

Hands-on Supervised Learning: Part 2 - Classification and Random Forests (1st offering) View on YouTube View Notes (pdf)
This tutorial introduces neural networks for classification tasks and random forests for regression tasks via Jupyter notebooks on nanoHUB.org. You will learn how to create and train a neural...

Hands-on Supervised Learning: Part 2 - Classification and Random Forests (2nd offering) View on YouTube View Notes (pdf)
This tutorial introduces neural networks for classification tasks and random forests for regression tasks via Jupyter notebooks on nanoHUB.org. You will learn how to create and train a neural...

Hands-on Unsupervised Learning using Dimensionality Reduction via Matrix Decomposition (1st offering) View on YouTube View Notes (pdf)
This tutorial introduces unsupervised machine learning algorithms through dimensionality reduction via matrix decomposition techniques in the context of chemical decomposition of reactive...

Hands-on Unsupervised Learning using Dimensionality Reduction via Matrix Decomposition (2nd offering) View on YouTube View Notes (pdf)
This tutorial introduces unsupervised machine learning algorithms through dimensionality reduction via matrix decomposition techniques in the context of chemical decomposition of reactive...
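To give a feel for dimensionality reduction, here is a minimal sketch of finding the first principal component of a small data set. It uses power iteration on the sample covariance matrix in plain Python, which is a stand-in chosen for brevity; the tutorial itself may rely on a library decomposition instead. The function name `first_principal_component` and the data points are assumptions made for this example.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, variance) of the leading principal component."""
    n = len(rows)
    d = len(rows[0])
    # Center each feature at zero mean
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d)
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and normalize
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured along the direction v (Rayleigh quotient)
    variance = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, variance


# Hypothetical 2D points scattered closely around the line y = x
points = [(-2.0, -2.1), (-1.0, -0.9), (0.0, 0.1), (1.0, 0.9), (2.0, 2.0)]
direction, variance = first_principal_component(points)
print(direction, variance)
```

Projecting each centered point onto `direction` reduces the two features to a single coordinate that retains most of the variance in this example.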

Hands-on Sequential Learning and Design of Experiments View on YouTube View Notes (pdf)
This tutorial introduces the concept of sequential learning and information acquisition functions and how these algorithms can help reduce the number of experiments required to find an optimal...
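One widely used acquisition function is expected improvement, sketched below for a maximization problem. This is only an illustration under stated assumptions: the tutorial may use other acquisition functions, and the candidate names, predicted means, and uncertainties here are made up. The sketch assumes a surrogate model has already supplied a Gaussian prediction (mean and standard deviation) for each untested candidate.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected improvement over best_so_far for a Gaussian prediction."""
    if std <= 0.0:
        # No uncertainty: the improvement is deterministic
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mean - best_so_far) * cdf + std * pdf


# Hypothetical surrogate predictions: name -> (predicted mean, predicted std)
candidates = {"A": (1.0, 0.1), "B": (0.9, 0.5), "C": (0.5, 0.2)}
best_so_far = 1.0  # best value measured so far (assumed)

scores = {
    name: expected_improvement(m, s, best_so_far)
    for name, (m, s) in candidates.items()
}
# The next experiment is the candidate with the largest expected improvement
next_experiment = max(scores, key=scores.get)
print(next_experiment, scores)
```

Candidate B is selected even though its predicted mean is below the current best, because its larger uncertainty leaves more room for improvement. This trade-off between exploiting good predictions and exploring uncertain ones is what lets sequential learning reduce the number of experiments.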

Machine Learning in Materials - Center for Advanced Energy Studies and Idaho National Laboratory Notes (pdf)
Notes (pptx)
This hands-on tutorial will introduce participants to modern tools to manage, organize, and visualize data as well as machine learning techniques to extract information from it. ...

nanoHUB: Online Simulation and Data Notes (pdf)
Notes (pptx)
These slides introduce nanoHUB, an open platform for online simulations and collaboration.

Hands-on Deep Learning for Materials Science: Convolutional Networks and Variational Autoencoders View on YouTube View Notes (pdf)
This tutorial introduces deep learning techniques such as convolutional neural networks and variational autoencoders from a materials standpoint.

Machine Learning Framework for Impurity Level Prediction in Semiconductors View on YouTube View
In this work, we perform screening of functional atomic impurities in Cd-chalcogenide semiconductors using high-throughput computations and machine learning.

Unsupervised Clustering Methods for Image Segmentation: Application to Scanning Electron Microscopy Images of Graphene View on YouTube View Hands-on Tutorial
Pre-Workshop Tutorial: Setting up a nanoHUB account and generating API keys for databases
This tutorial will introduce you to some basic image segmentation techniques driven by unsupervised machine learning techniques such as the Gaussian mixture model and k-means clustering. You will...
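The idea behind clustering-based segmentation can be sketched with a tiny k-means implementation that groups pixels by intensity alone. Everything here is an assumption for illustration: the function name `kmeans_1d`, the 3x4 grayscale "image", and the evenly spaced initial centers are invented, and the tutorial works with real microscopy images and may use library implementations.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar values into k groups; returns (labels, centers)."""
    lo, hi = min(values), max(values)
    # Spread the initial centers evenly across the intensity range
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center
        labels = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Update step: each center moves to the mean of its members
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels, centers


# Hypothetical grayscale image: dark region at upper left, bright at lower right
image = [
    [10, 12, 11, 200],
    [9, 13, 198, 205],
    [11, 201, 199, 202],
]
width = len(image[0])
pixels = [p for row in image for p in row]

labels, centers = kmeans_1d(pixels, k=2)
# Reshape the flat label list back into the image layout
segmented = [labels[i * width:(i + 1) * width] for i in range(len(image))]
for row in segmented:
    print(row)
```

Each pixel ends up labeled 0 (dark cluster) or 1 (bright cluster). A Gaussian mixture model follows the same pattern but assigns soft, probability-weighted memberships instead of hard labels.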

U-Net Convolutional Neural Networks for Image Segmentation: Application to Scanning Electron Microscopy Images of Graphene View on YouTube View
This tutorial introduces you to U-Net, a popular convolutional neural network originally developed for image segmentation in biomedicine. Using an assembled data set, you will learn how to create and...

Convenient and efficient development of Machine Learning Interatomic Potentials View on YouTube View
This tutorial introduces the concepts of machine learning interatomic potentials (ML-IAPs) in materials science, including two components of local environment atomic descriptors and machine...

Constructing Accurate Quantitative Structure-Property Relationships via Materials Graph Networks View on YouTube View
This tutorial covers materials graph networks for modeling crystal and molecular properties. We will introduce the graph representation of crystals and molecules and how the convolutional...

Parsimonious Neural Networks Learn Interpretable Physical Laws View on YouTube View Notes (pdf)
Handout

An Introduction to Machine Learning for Materials Science: A Basic Workflow for Predicting Materials Properties View HTML
View Notes (pdf)
This tutorial will introduce core concepts of machine learning through the lens of a basic workflow to predict material bandgaps from material compositions.

The Materials Simulation Toolkit for Machine Learning (MAST-ML): Automating Development and Evaluation of Machine Learning Models for Materials Property Prediction View HTML
View Notes (pdf)
This tutorial contains an introduction to the use of the Materials Simulation Toolkit for Machine Learning (MAST-ML), a Python package designed to broaden and accelerate the use of machine learning...

A Hands-on Introduction to Physics-Informed Neural Networks View HTML
View Notes (pdf)

Batch Reification Fusion Optimization (BAREFOOT) Framework View HTML
View Notes (pdf)

Active Learning via Bayesian Optimization for Materials Discovery View on YouTube View Notes (pdf)
In this tutorial, we will demonstrate the use of active learning via Bayesian optimization (BO) to identify ideal molecular candidates for an energy storage application.

A Machine Learning Aided Hierarchical Screening Strategy for Materials Discovery View HTML
View Notes (pdf)
In this tutorial, we illustrate this approach using the example of wide band gap oxide perovskites. We will sequentially search a very large domain space of single and double oxide perovskites to...

Debugging Neural Networks View HTML
View Notes (pdf)
The presentation will start with an overview of deep learning theory to motivate the logic in NetDebugger and end with a hands-on NetDebugger tutorial involving PyTorch, RDKit, and polymer data.

Autonomous Neutron Diffraction Experiments with ANDiE View on YouTube View
This tutorial will cover the working principles of ANDiE, how physics was encoded into the design, and demonstrate how ANDiE can be used to autonomously control neutron diffraction experiments.

Integrating Machine Learning with a Genetic Algorithm for Materials Exploration View HTML
View Notes (pdf)
In this talk, we will explore how this algorithm can be used for materials discovery.

Data Analysis with MATLAB View HTML
View
Learn how MATLAB can be used to visualize and analyze data, perform numerical computations, and develop algorithms. Through live demonstrations and examples, you will see how MATLAB can help you...

Machine Learning with MATLAB View HTML
View
In this session, we explore the fundamentals of machine learning using MATLAB. We introduce machine learning techniques available in MATLAB to quickly explore your data, evaluate machine learning...

Gaussian Process Regression for Surface Interpolation View HTML
View
This tutorial will introduce the fundamentals of GPR and its application to surface interpolation. We will also introduce a new technique called filtered kriging (FK), which uses a pre-filter to...

Machine Learning Predicts Additive Manufacturing Part Quality: Tutorial on Support Vector Regression View HTML
View
This tutorial introduces and demonstrates the use of machine learning (ML) to address this need. Using data collected from an AM factory, you will train a support vector regression (SVR) model to...

Introduction to a Basic Machine Learning Workflow for Predicting Materials Properties View HTML
View
This tutorial will introduce core concepts of machine learning through the lens of a basic workflow to predict material bandgaps from material compositions.

The Materials Simulation Toolkit for Machine Learning (MAST-ML): Automating Development and Evaluation of Machine Learning Models for Materials Property Prediction View HTML
View
In hands-on activities, we will use MAST-ML to (1) import materials datasets from online databases and clean and examine our input data, (2) conduct feature engineering analysis, including generation,...