About this Abstract |
Meeting |
2020 TMS Annual Meeting & Exhibition |
Symposium |
ICME Gap Analysis in Materials Informatics: Databases, Machine Learning, and Data-Driven Design |
Presentation Title |
Gaps, Limitations, and Pitfalls of Materials Informatics |
Author(s) |
Taylor D. Sparks |
On-Site Speaker (Planned) |
Taylor D. Sparks |
Abstract Scope |
Materials informatics, the application of data science to materials research challenges, has become an important part of computational materials science and the Materials Genome Initiative. However, despite the enormous promise and potential of materials informatics, significant gaps, limitations, and pitfalls still prevent successful implementation. In this talk I will go over several of these challenges and discuss best practices for applying materials informatics. Specific examples will include the following: How to avoid model overfitting during hyperparameter tuning by incorporating leave-one-cluster-out cross-validation. Insight from chemistry-encoded data visualization. Bias and imbalance in training data and fundamental data engineering approaches. Physics-informed feature engineering. Regression versus classification when extrapolating using machine learning. Model deployment, repositories, and code availability. Using standardized benchmark data sets when developing features, algorithms, and cross-validation approaches. |
Proceedings Inclusion? |
Planned: Supplemental Proceedings volume |