Abstract Scope |
Machine learning (ML) models for predicting materials properties are increasingly important in a number of areas, including energy storage and conversion. Previous ML studies have drawn their input data from various literature sources or from density functional theory calculations. Recently, we began using the large starrybase dataset, which contains over 200,000 thermoelectric data points. Among the several supervised ML models we implemented, eXtreme Gradient Boosting (XGBoost) performed best in five-fold cross-validation, closely followed by the faster Light Gradient Boosting Machine (LightGBM). We successfully tested the models trained on the starrybase data on three additional datasets available in the literature, as well as on our own data collected over roughly the last decade. In addition, feature selection and feature importance analysis identified useful chemical features that ultimately led to higher accuracy on the test sets. |