About this Abstract |
Meeting |
2025 TMS Annual Meeting & Exhibition |
Symposium |
Novel Strategies for Rapid Acquisition and Processing of Large Datasets from Advanced Characterization Techniques |
Presentation Title |
From Chaos to Clarity: Managing the Materials Data Surge |
Author(s) |
Taylor D. Sparks, Ramsey Issa, Layla Purdy, Federico Ottomano |
On-Site Speaker (Planned) |
Taylor D. Sparks |
Abstract Scope |
Data has long been the Achilles' heel of materials informatics, but recent advances along multiple fronts are making materials data increasingly abundant. For example, natural language processing now allows relatively accurate and simple harvesting of materials data from the literature. Similarly, high-throughput experimentation and autonomous labs are increasing the volume of data generated. The community will increasingly need to ask difficult questions like “How do we best use this data?” or “How can we quantify the trade-offs between model performance, data acquisition cost, and dataset augmentation and selection?” Previous work has shown that simple strategies for aggregating disparate, heterogeneous datasets result in subpar performance. Here, I will demonstrate new approaches for generating a curated dataset both from existing datasets and from the experimental design space, with case studies in NMR peak modelling and materials property aggregation. I’ll also describe a new approach to multi-attempt acquisition function calculation. |
Proceedings Inclusion? |
Planned: |
Keywords |
Machine Learning, Modeling and Simulation, Computational Materials Science & Engineering |