About this Abstract |
Meeting |
2022 TMS Annual Meeting & Exhibition |
Symposium |
AI/Data Informatics: Computational Model Development, Validation, and Uncertainty Quantification |
Presentation Title |
Extracting and Making Use of Materials Data from Millions of Journal Articles via Natural Language Processing Techniques |
Author(s) |
Anubhav Jain |
On-Site Speaker (Planned) |
Anubhav Jain |
Abstract Scope |
Historically, both data and knowledge in the materials domain have been recorded mainly as text, figures, or tables in journal articles. Such data is critical to developing, training, and validating machine learning models. In this talk, I will describe some of our efforts to extract information from the research literature automatically using natural language processing techniques. For example, data on the dopability of materials is difficult to simulate and is not part of a standard database, but it is present, either implicitly or explicitly, in many published research studies. Similarly, data on materials synthesis can be difficult to simulate and compile but can be extracted from the historical research literature. The talk will summarize our most recent progress towards extracting both individual data items and "knowledge" in various areas. |
Proceedings Inclusion? |
Planned: |
Keywords |
Machine Learning, Computational Materials Science & Engineering |