Abstract Scope |
Machine learning (ML) offers transformative potential in materials science by enabling data-driven discovery and optimization within complex materials systems. However, training high-quality models requires access to data and metadata, which are often sparse. This presentation will explore how CALPHAD can address key data challenges in developing ML models for materials science, such as data cleaning, data quantity, feature selection, interpretability, and generalization.
Through thermodynamic and kinetic simulations, CALPHAD can be used to fill data gaps, generate extensive datasets, and ensure data consistency by identifying outliers and validating assumptions. Additionally, CALPHAD enhances feature selection by identifying key physical attributes and aids interpretability by grounding ML predictions in physics-based insights. By generating data across diverse compositions and conditions, CALPHAD also improves model generalization, enabling broader applicability. This talk highlights CALPHAD’s role in supporting reliable, interpretable, and generalizable ML models while also cautioning against potentially unreasonable uses of CALPHAD data. |