Machine-learning-assisted Materials Discovery Using Failed Experiments

Paul Raccuglia, Katherine C. Elbert, Philip Adler, Casey Falk, Malia B. Wenny, Aurelio Mollo, Matthias Zeller, Sorelle A. Friedler, Joshua Schrier, Alexander J. Norquist
Published 2016 · Computer Science, Medicine
Inorganic–organic hybrid materials such as organically templated metal oxides, metal–organic frameworks (MOFs) and organohalide perovskites have been studied for decades, and hydrothermal and (non-aqueous) solvothermal syntheses have produced thousands of new materials that collectively contain nearly all the metals in the periodic table. Nevertheless, the formation of these compounds is not fully understood, and development of new compounds relies primarily on exploratory syntheses. Simulation- and data-driven approaches (promoted by efforts such as the Materials Genome Initiative) provide an alternative to experimental trial-and-error. Three major strategies are: simulation-based predictions of physical properties (for example, charge mobility, photovoltaic properties, gas adsorption capacity or lithium-ion intercalation) to identify promising target candidates for synthetic efforts; determination of the structure–property relationship from large bodies of experimental data, enabled by integration with high-throughput synthesis and measurement tools; and clustering on the basis of similar crystallographic structure (for example, zeolite structure classification or gas adsorption properties). Here we demonstrate an alternative approach that uses machine-learning algorithms trained on reaction data to predict reaction outcomes for the crystallization of templated vanadium selenites. We used information on ‘dark’ reactions—failed or unsuccessful hydrothermal syntheses—collected from archived laboratory notebooks from our laboratory, and added physicochemical property descriptions to the raw notebook information using cheminformatics techniques. We used the resulting data to train a machine-learning model to predict reaction success. 
When carrying out hydrothermal synthesis experiments using previously untested, commercially available organic building blocks, our machine-learning model outperformed traditional human strategies, and successfully predicted conditions for new organically templated inorganic product formation with a success rate of 89 per cent. Inverting the machine-learning model reveals new hypotheses regarding the conditions for successful product formation.
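The training step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The descriptor columns (pH, temperature, amine molecular weight), the toy records, and the names `fit_reaction_model` and `predict_success` are all hypothetical. A plain linear support vector machine trained by batch sub-gradient descent stands in for the model used in the paper.

```python
from statistics import mean, pstdev

# Hypothetical toy records: (pH, temperature in deg C, amine molecular weight, outcome).
# outcome: 1 = crystalline product formed, 0 = failed ("dark") reaction.
# All values are invented for illustration and are NOT taken from the paper.
REACTIONS = [
    (2.0, 90.0, 60.1, 1),
    (3.0, 110.0, 88.2, 1),
    (2.5, 100.0, 74.1, 1),
    (3.5, 90.0, 102.2, 1),
    (7.0, 90.0, 60.1, 0),
    (8.0, 110.0, 88.2, 0),
    (6.5, 100.0, 74.1, 0),
    (7.5, 90.0, 102.2, 0),
]


def _scale(row, stats):
    """Standardize one descriptor row with precomputed (mean, std) pairs."""
    return [(v - m) / s for v, (m, s) in zip(row, stats)]


def fit_reaction_model(records, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM (hinge loss + L2 penalty) on successes and failures.

    Returns (weights, bias, stats), where stats holds the per-column
    mean and standard deviation used to standardize the descriptors.
    """
    rows = [r[:-1] for r in records]
    labels = [1 if r[-1] == 1 else -1 for r in records]  # SVM uses -1/+1
    stats = [(mean(c), pstdev(c) or 1.0) for c in zip(*rows)]
    X = [_scale(row, stats) for row in rows]
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, labels):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to hinge loss
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b, stats


def predict_success(model, conditions):
    """Return 1 if the model predicts product formation, else 0."""
    w, b, stats = model
    x = _scale(conditions, stats)
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


MODEL = fit_reaction_model(REACTIONS)

if __name__ == "__main__":
    # Query two untested (hypothetical) reaction conditions.
    for conditions in [(2.2, 100.0, 74.1), (7.8, 100.0, 74.1)]:
        print(conditions, "->", predict_success(MODEL, conditions))
```

The point of the sketch is that failed reactions enter the training set as negative examples alongside the successes, so the classifier learns a boundary between the two rather than only a profile of what worked. In practice the descriptor set would be far larger, and the model would be evaluated on held-out reactions.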