
Machine-learning-assisted Materials Discovery Using Failed Experiments

Paul Raccuglia, Katherine C. Elbert, Philip Adler, Casey Falk, Malia B. Wenny, Aurelio Mollo, Matthias Zeller, Sorelle A. Friedler, Joshua Schrier, Alexander J. Norquist
Published 2016 · Computer Science, Medicine
Inorganic–organic hybrid materials such as organically templated metal oxides, metal–organic frameworks (MOFs) and organohalide perovskites have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table. Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility, photovoltaic properties, gas adsorption capacity or lithium-ion intercalation) to identify promising target candidates for synthetic efforts; determination of the structure–property relationship from large bodies of experimental data, enabled by integration with high-throughput synthesis and measurement tools; and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification or gas adsorption properties). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on ‘dark’ reactions—failed or unsuccessful hydrothermal syntheses—collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success. 
When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully predicted conditions for new organically templated inorganic product formation with a success rate of 89 per cent. Inverting the machine-learning model reveals new hypotheses regarding the conditions for successful product formation.
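The workflow described in the abstract (record both successful and failed reactions, describe each with numeric features, train a classifier to predict success) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three features, the toy records, and the choice of a plain logistic regression written with the standard library only. The abstract does not name the model or descriptors the authors used, and none of the numbers below come from the paper.

```python
import math

# Invented toy records for illustration only, NOT data from the paper.
# Features: (pH, temperature in deg C, amine amount in mmol).
# Label: 1 = crystalline product formed, 0 = failed ("dark") reaction.
REACTIONS = [
    ((1.0, 90.0, 1.0), 0),
    ((2.0, 110.0, 2.0), 0),
    ((3.0, 90.0, 4.0), 0),
    ((2.5, 130.0, 3.0), 0),
    ((5.0, 110.0, 2.0), 1),
    ((6.0, 90.0, 3.0), 1),
    ((5.5, 130.0, 1.0), 1),
    ((7.0, 110.0, 4.0), 1),
]


def column_stats(rows):
    """Return per-feature means and population standard deviations."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0  # guard constant columns
        for col, m in zip(columns, means)
    ]
    return means, stds


def standardize(row, means, stds):
    """Scale one feature row to zero mean and unit variance."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(records, epochs=2000, lr=0.5):
    """Fit a logistic regression by batch gradient descent."""
    rows = [features for features, _ in records]
    labels = [label for _, label in records]
    means, stds = column_stats(rows)
    xs = [standardize(r, means, stds) for r in rows]
    weights = [0.0] * len(means)
    bias = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return {"weights": weights, "bias": bias, "means": means, "stds": stds}


def predict_success(model, features):
    """Return the estimated probability that a reaction yields crystals."""
    x = standardize(features, model["means"], model["stds"])
    return sigmoid(model["bias"] + sum(w * v for w, v in zip(model["weights"], x)))


model = train(REACTIONS)
```

Calling `predict_success(model, (6.5, 110.0, 2.5))` scores a hypothetical untested condition. The failed reactions matter here because a classifier trained only on successes would have no negative examples from which to learn a decision boundary.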

This paper is referenced by
Machine-learning quantum mechanics: Solving quantum mechanics problems using radial basis function networks
Peiyuan Teng (2018)
Processing Parameter Correlations in Material Extrusion Additive Manufacturing
Daniel J. Braconnier (2020)
Biomimetic molecular design tools that learn, evolve, and adapt
David A. Winkler (2017)
Holistic computational structure screening of more than 12 000 candidates for solid lithium-ion conductor materials
Austin D Sendek (2017)
Unsupervised learning using topological data augmentation
Oleksandr Balabanov (2019)
New frontiers for the materials genome initiative
Juan J. de Pablo (2019)
Solving Materials’ Small Data Problem with Dynamic Experimental Databases
Michael McBride (2018)
Linking synthesis and structure descriptors from a large collection of synthetic records of zeolite materials
Koki Muraoka (2019)
Metaheuristic-based inverse design of materials – A survey
T. Warren Liao (2020)
Machine learning in materials science
Jing Wei (2019)
Data analysis of multi-dimensional thermophysical properties of liquid substances based on clustering approach of machine learning
Gota Kikugawa (2019)
Kreativ per Mausklick
Brigitte Osterath (2019)
Can artificial intelligence create the next wonder material?
Nicola Nosengo (2016)
Deep learning bandgaps of topologically doped graphene
Yuan Dong (2018)
An updated roadmap for the integration of metal-organic frameworks with electronic devices and chemical sensors.
Ivo Stassen (2017)
Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning
Edward Kim (2017)
Direct imaging of molecular rotation with high-order-harmonic generation
Yanqing He (2019)
Crystal Structure Prediction via Deep Learning.
Kevin Ryan (2018)
Systems of natural-language-facilitated human-robot cooperation: A review
Rui Liu (2017)
Data mining assisted prediction of liquidus temperature for primary crystallization of different electrolyte systems
Hui Lu (2020)
Uncovering the eutectics design by machine learning in the Al-Co-Cr-Fe-Ni high entropy system
Qingfeng Wu (2020)
Predicting the onset temperature (Tg) of GexSe1−x glass transition: a feature selection based two-stage support vector regression method
Yanjv Liu (2019)
Machine learning driven synthesis of few-layered WTe2
Manzhang Xu (2019)
The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules
Justin S. Smith (2020)
Prediction and understanding of AIE effect by quantum mechanics-aided machine-learning algorithm.
Jia Qiu (2018)
Machine Detection of Enhanced Electromechanical Energy Conversion in PbZr0.2Ti0.8O3 Thin Films.
Joshua C. Agar (2018)
Predicting the outcomes of organic reactions via machine learning: are current descriptors sufficient?
Grzegorz Skoraczyński (2017)
An informatics approach to transformation temperatures of NiTi-based shape memory alloys
Dezhen Xue (2017)
Machine Learning Protocol for Surface Enhanced Raman Spectroscopy.
Wei Hu (2019)
Identifying superionic conductors by materials informatics and high-throughput synthesis
Masato Matsubara (2020)
The Directed Migration of Neutrophil-Like Cells Through Engineered Chemokine Secretion
Francisca Vasconcelos (2020)