Abstract Scope |
In 2018, we presented the concept of an “ocean of data” as a new paradigm for CALPHAD modeling that incorporates machine learning into a sustainable data ecosystem[1]. Its three pillars are data generation for configurations based on MPDD[2] and DFTTK[3], data processing for individual phases based on CALPHAD[4,5], and property data of materials based on ULTERA[6]; they are integrated through Mat-X with high-throughput tools[7]. MPDD contains over 4.4 million DFT-relaxed or experimental configurations, together with several ML models and a connection to DFTTK. The DFT-validated ML data can either be used directly as inputs for CALPHAD modeling or be further processed by zentropy theory[8]. ULTERA features a robust data curation infrastructure with a set of data validation, processing, and aggregation tools[9–11] as part of our open-source tools[12], including oxidation[13,14]. [1] https://doi.org/10.1007/s11669-018-0654-z. [2] https://mpdd.org/. [3] https://www.dfttk.org/. [4] https://pycalphad.org. [5] https://espei.org. [6] https://ultera.org. [7] http://mat-x.org. [8] https://doi.org/10.1088/1361-648X/ad4762. [9] https://pyqalloy.readthedocs.io/. [10] https://arxiv.org/abs/2403.02340. [11] https://arxiv.org/abs/2402.03528. [12] https://github.com/PhasesResearchLab/SoftwareProjects. [13] https://doi.org/10.1111/jace.18707. [14] http://arxiv.org/abs/2403.00705. |