Unsupervised learning using dimensionality reduction via matrix decomposition

By Michael N Sakano (1); Alejandro Strachan (1)

1. Purdue University

Learn PCA and NMF via a chemistry example

Version 1.0.2 - published on 20 Apr 2020

doi:10.21981/8ZRP-9821

Open source: license | download

Category

Tools

Abstract

Matrix decomposition (factorization) of multivariate data is a useful procedure for reducing the dimensionality of the information by representing the data matrix as a product of two smaller matrices. The most common method is principal component analysis (PCA), which applies singular value decomposition to the data matrix to extract possibly correlated features. While useful, the PCA outputs are rarely interpretable, in part because the components can contain negative values. A similar decomposition method, non-negative matrix factorization (NMF or NNMF), therefore imposes the additional constraint that the values of the sub-matrices must be non-negative. These two techniques are applied to a simulation of the thermal decomposition of the energetic material RDX. Because the chemical pathways become increasingly intricate as the decomposition evolves, we aim to describe the entire process with three features. As will be seen, NMF yields features that are easier to interpret in terms of chemical bonds.
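As a rough illustration of the two decompositions described above, here is a minimal NumPy sketch. It is not the code used in the tool: the data are synthetic stand-ins rather than the RDX simulation, and the helper names `pca_svd` and `nmf_multiplicative` are hypothetical. PCA is computed from the SVD of the mean-centered data, and NMF uses the classic multiplicative-update scheme, which is only one of several possible solvers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, non-negative stand-in data (NOT the RDX simulation):
# 200 samples x 12 observed variables, generated from 3 hidden features.
n_samples, n_variables, n_features = 200, 12, 3
true_W = rng.random((n_samples, n_features))
true_H = rng.random((n_features, n_variables))
X = true_W @ true_H + 0.01 * rng.random((n_samples, n_variables))


def pca_svd(X, k):
    """PCA via SVD of the mean-centered data; returns (scores, components)."""
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    scores = U[:, :k] * S[:k]   # sample coordinates, shape (n_samples, k)
    components = Vt[:k]         # orthonormal directions, shape (k, n_variables)
    return scores, components


def nmf_multiplicative(X, k, n_iter=500, eps=1e-10, seed=0):
    """NMF X ~ W @ H with W, H >= 0, using multiplicative updates
    that reduce the Frobenius-norm reconstruction error."""
    local_rng = np.random.default_rng(seed)
    W = local_rng.random((X.shape[0], k))
    H = local_rng.random((k, X.shape[1]))
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so signs are preserved.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


pca_scores, pca_components = pca_svd(X, n_features)
nmf_W, nmf_H = nmf_multiplicative(X, n_features)

# Relative reconstruction errors of the two rank-3 products.
X_mean = X.mean(axis=0)
pca_rel_err = np.linalg.norm(X - X_mean - pca_scores @ pca_components) / np.linalg.norm(X)
nmf_rel_err = np.linalg.norm(X - nmf_W @ nmf_H) / np.linalg.norm(X)

print("PCA relative error:", pca_rel_err)
print("NMF relative error:", nmf_rel_err)
print("PCA scores contain negatives:", bool((pca_scores < 0).any()))
print("NMF factors contain negatives:", bool((nmf_W < 0).any() or (nmf_H < 0).any()))
```

Both methods write the data as a product of two matrices with three features. The PCA factors mix positive and negative values, while the NMF factors stay non-negative, which is what allows them to be read as additive contributions.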

For more information about PCA and NMF in Python, see:

https://scikit-learn.org/stable/modules/decomposition.html#decompositions

This tool was used in the Hands-on Machine Learning and Data Science Training Workshop conducted by nanoHUB in April 2020. Offerings for the tutorial can be found in nanoHUB resources.

Cite this work

Researchers should cite this work as follows:

  • Michael N Sakano, Alejandro Strachan (2020), "Unsupervised learning using dimensionality reduction via matrix decomposition," https://nanohub.org/resources/dimredmatdecmp. (DOI: 10.21981/8ZRP-9821).

Tags