Deep Learning on Microstructural Images

It’s well known that microstructure plays a key role in determining material properties.  One common way of assessing material microstructure is via Scanning Electron Microscopy (SEM) images.  On Citrination, we have the capability to use these microstructural images as inputs to our data-driven models.  

We have developed customized deep learning techniques to automatically detect which textures are present in the images.  Those textures can then be used as inputs to machine learning models to label the microstructure and predict material properties.  This framework is shown schematically in the figure below.  

This schematic illustrates the deep learning framework for featurizing SEM images. The SEM image on the left shows steel with pearlite microstructure. That image is transformed through deep learning into a vector of textures. A machine learning model is then able to correctly label the microstructure of this image with high confidence.

A tutorial video showing how SEM images can be ingested onto the platform and used to build models is available here.  This tutorial used data from the Ultra High Carbon Steel Database,1 which is accessible here on the public Citrination platform, complete with deep learning texture vectors.  

This capability is an example of how Citrine’s platform provides cutting-edge artificial intelligence solutions specialized for materials science use cases.

Dr. Julia Ling, Principal Scientist at Citrine

1 DeCost, Brian L., et al. “UHCSDB: UltraHigh Carbon Steel Micrograph DataBase.” Integrating Materials and Manufacturing Innovation (2017): 1-9.

Mengfei Yuan Shares Her Researcher in Residence Experience

During my three weeks as a researcher in residence, I explored how my research interests could be combined with the Citrination tool. My study seeks to use machine learning algorithms through Citrination to establish a “fast-acting reduced-order crystal plasticity model” for polycrystalline materials. Such a model is usually underdetermined when it has more fitting parameters than there are experiments available to calibrate it. At the preliminary design stage, it is important to have an early approximation of mechanical properties and microstructure information based on the experimental stress-strain data and EBSD results of samples. The final texture and the optimal crystal plasticity model for given initial, boundary, and loading conditions can be predicted with Citrination by learning the relationship between the development of microscale texture during deformation and the physical properties. The overarching goal of this study can be extended to a “data-driven material design tool” for designing microstructures with desired crystal plasticity properties, given an initial texture, processing techniques, and conditions.

I also learned how to design training processes and evaluate the quality of data on the Citrination platform. Based on the machine learning results, I needed to adjust my dataset for better predictions and analyze the theoretical issues that might exist in my training dataset. In addition, I worked through exercises from “learn-citrination”, such as writing PIFs from computational calculations, designing experiments for optimization problems, and batched property prediction using queries.

Mengfei Yuan, Ohio State University

Mengfei Yuan joined the Citrine Research team over the summer as part of the Researcher in Residence program. 

Learn Citrination to generate a useful data analysis

In this Learn Citrination tutorial, we’re going to learn to use Citrination to generate a useful data analysis called t-SNE. This data visualization technique enables you to represent a high dimensional set of data in fewer dimensions in a way that preserves the local structure of the data. In materials informatics, this allows you to create a two-dimensional plot of a set of materials where points corresponding to similar materials are grouped together in two-dimensional space. More information on t-SNE here.
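
As a rough illustration of the technique itself, outside of Citrination, the sketch below computes a two-dimensional t-SNE embedding with scikit-learn. The descriptor values, the two material families, and the perplexity setting are made-up assumptions for illustration; this is not the Citrination implementation or the data used in the tutorial.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Hypothetical data: 60 "materials", each described by 8 numeric descriptors.
# Two families are drawn around different centers so that each material
# has a set of similar neighbors.
family_a = rng.normal(loc=0.0, scale=0.5, size=(30, 8))
family_b = rng.normal(loc=3.0, scale=0.5, size=(30, 8))
descriptors = np.vstack([family_a, family_b])

# Project the 8-dimensional descriptors down to 2 dimensions.
# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(descriptors)

# One (x, y) point per material, ready for a scatter plot.
print(embedding.shape)
```

Each row of `embedding` can be plotted as a point. Materials with similar descriptors should land close together, while distances between far-apart groups in a t-SNE plot are generally not meaningful.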

This tutorial will teach you to create and export a two-dimensional t-SNE plot for any data on Citrination. The first step is to create a data view on Citrination. Instructions for creating a data view can be found in this tutorial.

For this tutorial, we’ll be using this data view (view id 787), which includes a model predicting experimental band gaps based on data compiled by W.H. Strehlow and E.L. Cook. The underlying data can be viewed in this dataset.

See the full tutorial notebook with step-by-step instructions here.

– E Antono, Citrine Research

Cutting Edge Uncertainty Quantification for Data-Driven Materials Models

In many applications of machine learning, the machine learning model accuracy is the most important consideration, and knowing the uncertainty of those predictions is not critical.  For example, for a clothing recommendation engine, it is important that on average it suggests clothes that a customer would like to buy.  It is acceptable for it to occasionally recommend an article of clothing that a customer dislikes, as long as its average performance is high.

At Citrine, we recognize that building accurate models for materials properties is not enough.

In order for data-driven models to be useful in materials science applications, it is critical to have a reliable estimate of model uncertainty reported with every prediction.

For example, say that we have trained a model to predict band gap based on the Strehlow and Cook experimental dataset.  We want to make predictions for the band gaps of a couple of new compounds, tin monoxide (SnO) and nickel oxide (NiO).  Our model predicts values of 2.4 eV and 2.8 eV respectively. The key question is, “How confident is our model in these predictions?”

There are many different sources of uncertainty in data-driven models.  If the model was fit to noisy training data, then that noise will cause uncertainty in the model.  If the model is fit to only a small number of data points, it will also have higher uncertainty.  Another important source of uncertainty is extrapolation.  For example, if we trained a model on the blue dots in the figure to the right, then tried to make a prediction at the red X, our prediction would have high uncertainty.  Similarly, data-driven models are unreliable at making predictions on materials that are significantly different from any of the materials in the training set.
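
To make the extrapolation effect concrete, here is a minimal sketch using a bootstrap ensemble of straight-line fits. This is a generic technique chosen for illustration, not a description of Citrine's uncertainty quantification methods, and the training data, noise level, and query points are all invented.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_mean = statistics.fmean(xs)
    y_mean = statistics.fmean(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def bootstrap_spread(xs, ys, x_query, n_models=200, seed=0):
    """Mean and standard deviation of predictions at x_query over an
    ensemble of lines, each fit to a resample (with replacement) of the data."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    while len(preds) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if max(bx) == min(bx):
            continue  # skip a degenerate resample with a single x value
        intercept, slope = fit_line(bx, by)
        preds.append(intercept + slope * x_query)
    return statistics.fmean(preds), statistics.stdev(preds)


# Invented training data: 30 noisy points with x between 0 and 1.
data_rng = random.Random(42)
train_x = [i / 29 for i in range(30)]
train_y = [2.0 * x + data_rng.gauss(0.0, 0.2) for x in train_x]

# Query once inside the training range and once far outside it.
_, spread_inside = bootstrap_spread(train_x, train_y, x_query=0.5)
_, spread_outside = bootstrap_spread(train_x, train_y, x_query=3.0)

print(f"spread at x=0.5 (interpolation): {spread_inside:.3f}")
print(f"spread at x=3.0 (extrapolation): {spread_outside:.3f}")
```

The ensemble members agree closely inside the training range and fan out away from it, so the spread at the extrapolated point comes out noticeably larger. Increasing the noise level or reducing the number of training points widens both spreads, which mirrors the other two sources of uncertainty described above.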

At Citrine, all our predictions come with uncertainty estimates.  We have developed, implemented, and validated cutting-edge uncertainty quantification methods for data-driven materials models.  For more details on our uncertainty quantification techniques and how they can be used to accelerate materials design, please see our recent paper.1

In the cases of SnO and NiO, our predictions are shown below.


These plots show the probability density function for each prediction.  For example, in the case of SnO, the mean value of the distribution is 2.45 eV, and the uncertainty of 0.78 eV is the spread of the distribution at one standard deviation. Because the uncertainty estimates are based on the standard deviation of the distribution, they correspond to a 68% confidence interval, i.e. the probability that the true value is within 0.78 eV of the prediction (2.45 eV) is 68%.

The model uncertainty for NiO (1.41 eV) is much higher than for SnO (0.78 eV), in part because the training set included far fewer compounds containing nickel than tin.  The higher uncertainty in the NiO predictions reflects the fact that the model is extrapolating at this point.  The true band gap for SnO is approximately 2.5 eV and for NiO is approximately 3.8 eV.2
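
The arithmetic behind the 68% figure can be checked with a short sketch. It assumes the predictive distribution is Gaussian, which this post does not state explicitly; the means, uncertainties, and reference band gaps are the values quoted in the text.

```python
import math


def prob_within(k):
    """P(|X - mean| <= k * sigma) for a normal distribution."""
    return math.erf(k / math.sqrt(2.0))


# Values quoted in the text, in eV. "reference" is the approximate true band gap.
predictions = {
    "SnO": {"mean": 2.45, "sigma": 0.78, "reference": 2.5},
    "NiO": {"mean": 2.8, "sigma": 1.41, "reference": 3.8},
}

coverage = prob_within(1.0)
print(f"P(within 1 sigma) = {coverage:.3f}")

# How many standard deviations the reference value sits from each prediction.
z_scores = {}
for name, p in predictions.items():
    z_scores[name] = abs(p["reference"] - p["mean"]) / p["sigma"]
    print(f"{name}: reference is {z_scores[name]:.2f} sigma from the prediction")
```

Under that Gaussian assumption, both reference values fall inside the one-standard-deviation interval: the NiO prediction is further off in absolute terms, but its larger reported uncertainty covers the gap.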

At Citrine, we know that uncertainty estimates are critical for assessing model confidence when using data-driven models for real engineering applications.  We are proud to be leading the field by providing well-calibrated uncertainty estimates for all our predictions.3

J Ling, Citrine Research

  1. Ling, Julia, et al. “High-Dimensional Materials and Process Optimization using Data-driven Experimental Design with Well-Calibrated Uncertainty Estimates.” Integrating Materials and Manufacturing Innovation (2017).
  2. Wong, Terence KS, et al. “Current status and future prospects of copper oxide heterojunction solar cells.” Materials 9.4 (2016): 271.
  3. This work was funded in part by Argonne National Laboratory through contract 6F-31341, associated with the R2R Manufacturing Consortium funded by the Department of Energy Advanced Manufacturing Office.

Citrine Informatics Nominated to the World Materials Forum Start Up Challenge

Citrine Informatics is pleased to announce its nomination to the 2017 World Materials Forum Start Up Challenge. As one of 12 start up companies selected to attend this year’s World Materials Forum in Nancy, France, Citrine will have the opportunity to showcase its platform, Citrination, and how its artificial intelligence technology can dramatically accelerate the materials and chemicals development and deployment process.


Citrine Informatics Selected to 2017 AI100

Citrine Informatics Selected to the 2017 AI 100, Highlighting Advancements in Artificial Intelligence for Materials and Chemicals Manufacturing and Development

Santa Barbara, CA, January 11, 2017 — CB Insights today selected Citrine Informatics to the prestigious Artificial Intelligence 100 list (“AI 100”), a select group of emerging private companies working on ground breaking artificial intelligence technology. CB Insights CEO and co-founder, Anand Sanwal, revealed the winners during The Innovation Summit, a gathering of top executives and investors to explore the industries of the future.
