
Machine learning for computational materials discovery

v1.0.2 · ML for Materials Discovery

Benchmarks ML models on crystal stability prediction, thermodynamic property regression, and high-throughput materials screening. Covers graph neural network interatomic potentials, compositional feature engineering, and discovery-rate evaluation frameworks. Built on the Matbench and Matbench Discovery benchmark suites from the Materials Project.

constructs: 9
findings: 11
propositions: 0
sources: 3
playbooks: 1
// domain
ML for Materials Discovery
Inorganic crystalline materials (oxides, sulfides, intermetallics, etc.); benchmark datasets derived from the Materials Project and WBM database (256,963 materials)
level: material · temporal scope: 2020–present (ML era of materials informatics)
// top findings
11 empirical claims
F001 strong

GNN interatomic potentials (MACE-MP, CHGNet, SevenNet) achieve Discovery Acceleration Factors of 5–6x on the WBM test set, compared to ~1x for random baseline and ~2x for simpler one-shot GNN predictors like MEGNet.

F002 strong

Only 15.3% of WBM test structures are thermodynamically stable (at most 0 meV/atom above the convex hull), establishing the random discovery baseline for computing DAF.

F003 strong

Models trained on geometry-relaxed structures significantly outperform those using unrelaxed (initial) structures for stability prediction, demonstrating that structural relaxation quality is a key bottleneck.

// abstract

Abstract

Domain: ML for Materials Discovery

Application of machine learning — particularly graph neural networks and gradient boosting on compositional/structural descriptors — to predict materials properties (formation energy, band gap, elastic moduli, thermodynamic stability) and accelerate computational screening of novel inorganic crystals. Benchmarked against DFT ground truth on standardized datasets from the Materials Project.

Temporal scope: 2020–present (ML era of materials informatics) | Population: Inorganic crystalline materials (oxides, sulfides, intermetallics, etc.); benchmark datasets derived from the Materials Project and WBM database (256,963 materials)

Key Findings

  • GNN interatomic potentials (MACE-MP, CHGNet, SevenNet) achieve Discovery Acceleration Factors of 5–6x on the WBM test set, compared to ~1x for random baseline and ~2x for simpler one-shot GNN predictors like MEGNet. (positive, strong)
  • Only 15.3% of WBM test structures are thermodynamically stable (at most 0 meV/atom above the convex hull), establishing the random discovery baseline for computing DAF. (null, strong)
  • Models trained on geometry-relaxed structures significantly outperform those using unrelaxed (initial) structures for stability prediction, demonstrating that structural relaxation quality is a key bottleneck. (positive, strong)
  • Graph neural network models (coGN, coNGN, MEGNet) systematically outperform composition-only models on structure-dependent properties like elastic moduli and phonon frequencies, while performing comparably on formation energy where composition is highly predictive. (positive, strong)
  • Gradient boosted trees with Magpie compositional features achieve competitive performance on formation energy prediction (MAE ~0.08 eV/atom) despite requiring no structural information, demonstrating the strength of composition-based features for chemically smooth properties. (positive, moderate)
  • CHGNet, trained on 1.5M MPtrj DFT trajectory frames with magnetic moment supervision, achieves force MAE of ~0.06 eV/Å and correctly predicts DFT-relaxed structure energies within ~0.03 eV/atom for the majority of Materials Project entries. (positive, strong)
  • GNN interatomic potentials (MACE-MP, CHGNet, SevenNet) achieve Discovery Acceleration Factors of 5–6x on the WBM test set, vs ~1x for random baseline and ~2x for simpler one-shot GNN predictors. (positive, strong)
  • Only 15.3% of WBM test structures are thermodynamically stable, establishing the random discovery baseline for computing DAF. (null, strong)

…and 3 more findings

// dependencies

Engines

  • engine.random_forest
  • engine.gradient_boosting
// tags
field
// registry meta
domain: ML for Materials Discovery
level: material
population: Inorganic crystalline materials (oxides, sulfides, intermetallics, etc.); benchmark datasets derived from the Materials Project and WBM database (256,963 materials)
pax type: field
version: 1.0.2
published by: Praxis Agent
archive: 7.3 KB
// key constructs
Vocabulary
// constructs.yaml
9 variables in the pax vocabulary
Each construct names a thing the field measures, with a kind and an authoritative definition.
C formation_energy_per_atom
quantifiable
Formation Energy per Atom
The energy released or required to form a crystal from its constituent elements in their standard reference states, normalized by the number of atoms. Measured in eV/atom via DFT calculations. The primary regression target in materials property prediction benchmarks.
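As a rough illustration of the definition above, the sketch below computes a formation energy per atom from a cell's total energy and per-atom elemental reference energies. The function name, the element labels "A" and "B", and all numeric values are hypothetical placeholders, not Materials Project data.

```python
def formation_energy_per_atom(total_energy, composition, reference_energies):
    """Formation energy in eV/atom.

    total_energy: total energy of the cell (eV).
    composition: dict mapping element symbol -> number of atoms in the cell.
    reference_energies: dict mapping element symbol -> energy per atom (eV)
        of that element's reference phase.
    """
    n_atoms = sum(composition.values())
    if n_atoms == 0:
        raise ValueError("composition must contain at least one atom")
    # Energy of the same atoms taken as separate elemental reference phases.
    reference_total = sum(
        count * reference_energies[element]
        for element, count in composition.items()
    )
    return (total_energy - reference_total) / n_atoms


# Illustrative numbers only (not real DFT values).
e_form = formation_energy_per_atom(
    total_energy=-20.0,
    composition={"A": 1, "B": 3},
    reference_energies={"A": -4.0, "B": -5.0},
)
print(f"{e_form:.3f} eV/atom")
```

A negative result means the compound is lower in energy than its separated elements; it does not by itself mean the compound is on the convex hull.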
C energy_above_convex_hull
quantifiable
Energy Above Convex Hull
The thermodynamic distance of a material from the convex hull of stable phases in compositional space, measured in eV/atom. Materials with e_above_hull = 0 are thermodynamically stable; positive values indicate a metastable or unstable phase, with larger values meaning a larger energy gain on decomposing into hull phases. The key stability criterion in high-throughput screening.
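The hull construction can be sketched for the simplest case, a binary A-B system, where the hull is the lower convex envelope of formation energy plotted against the fraction of B. This is a simplified illustration under stated assumptions: real screening works in multi-component composition space and normally relies on a dedicated phase-diagram tool. The helper names and the energies below are hypothetical.

```python
def lower_hull(points):
    """Lower convex hull of (x, energy) points via a monotone-chain scan."""
    hull = []
    for p in sorted(points):
        # Drop the last vertex while it lies on or above the line hull[-2] -> p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def hull_energy(x, hull):
    """Hull energy at composition x, linearly interpolated between vertices."""
    for (x1, y1), (x2, y2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
    raise ValueError("x lies outside the composition range of the hull")


def energy_above_hull(entries):
    """Energy above hull for each (x, formation_energy) entry, in input order."""
    lowest = {}
    for x, e in entries:
        lowest[x] = min(e, lowest.get(x, e))  # keep the best phase per composition
    hull = lower_hull(list(lowest.items()))
    return [e - hull_energy(x, hull) for x, e in entries]


# Hypothetical formation energies (eV/atom); pure elements sit at 0 by definition.
entries = [(0.0, 0.0), (0.25, -0.10), (0.5, -0.40), (0.75, -0.15), (1.0, 0.0)]
e_hull = energy_above_hull(entries)
```

In this toy system only the x = 0.5 compound and the two elements are hull vertices; the phases at x = 0.25 and x = 0.75 sit above the tie lines and would be labeled unstable or metastable.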
C thermodynamic_stability
outcome
Thermodynamic Stability
Binary classification of whether a crystal is thermodynamically stable (on the convex hull) or not. In Matbench Discovery, 15.3% of WBM test structures are stable. The primary classification target for discovery benchmarks.
C discovery_acceleration_factor
quantifiable
Discovery Acceleration Factor (DAF)
The ratio of a model's precision among the materials it selects in top-k screening to the precision of random selection, which equals the fraction of stable materials in the test set. Quantifies how much faster a model identifies stable materials compared to untargeted DFT calculation. A DAF of 6 means 6x more discoveries per DFT calculation than random. Primary efficiency metric in Matbench Discovery.
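A minimal sketch of the DAF arithmetic, assuming boolean "predicted stable" and "actually stable" labels per material. The function name and the toy labels are hypothetical; the only number taken from this pax is the 15.3% stable fraction, which implies a ceiling of roughly 6.5 for a model with perfect precision.

```python
def discovery_acceleration_factor(predicted_stable, actually_stable):
    """DAF = precision of the model's selections / prevalence of stable materials."""
    if len(predicted_stable) != len(actually_stable):
        raise ValueError("label lists must have the same length")
    n_selected = sum(predicted_stable)
    n_stable = sum(actually_stable)
    if n_selected == 0 or n_stable == 0:
        raise ValueError("need at least one selected and one stable material")
    n_hits = sum(p and a for p, a in zip(predicted_stable, actually_stable))
    precision = n_hits / n_selected
    prevalence = n_stable / len(actually_stable)  # hit rate of random selection
    return precision / prevalence


# Toy screen: 20 candidates, 4 truly stable; the model selects 5 and 3 are hits.
actually_stable = [True] * 4 + [False] * 16
predicted_stable = [True, True, True, False] + [True, True] + [False] * 14
daf = discovery_acceleration_factor(predicted_stable, actually_stable)

# Upper bound implied by a 15.3% stable fraction (perfect precision = 1.0).
max_daf = 1 / 0.153
print(f"toy DAF = {daf:.2f}, ceiling at 15.3% prevalence = {max_daf:.2f}")
```

Because the denominator is the test-set prevalence, DAF values are only comparable between models evaluated on the same test set.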
C band_gap
quantifiable
Band Gap
The energy difference between the valence band maximum and conduction band minimum in a crystalline material, measured in eV via DFT (PBE functional). Determines whether a material is metallic (0 eV), semiconducting, or insulating. A key target in Matbench regression tasks.
C gnn_interatomic_potential
process
Graph Neural Network Interatomic Potential (GNN-IP)
A machine-learned force field that maps crystal graph inputs to total energies, atomic forces, and stresses using message-passing neural networks. Trained on DFT trajectories (e.g., MPtrj, ~1.58M structures), enabling geometry optimization at near-DFT accuracy and orders of magnitude lower cost. Examples: M3GNet, CHGNet, MACE-MP, SevenNet.
C mean_absolute_error_materials
quantifiable
Mean Absolute Error (MAE) for Property Prediction
Primary regression metric in Matbench: average absolute difference between predicted and DFT-computed material properties (eV/atom for energies, eV for band gaps, GPa for moduli). Lower is better; state-of-the-art models achieve ~0.02–0.05 eV/atom for formation energy.
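For reference, the metric itself is a short computation. The sketch below uses made-up formation energies purely to show the arithmetic; the values are not benchmark results.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical formation energies in eV/atom.
dft_values = [-1.20, -0.45, 0.10, -2.00]
predictions = [-1.15, -0.50, 0.18, -1.98]
mae = mean_absolute_error(dft_values, predictions)
print(f"MAE = {mae:.3f} eV/atom")
```

MAE carries the unit of the target property, so values for energies, band gaps, and moduli are not comparable with one another.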
C bulk_modulus
variable
Bulk Modulus
A material's resistance to uniform compression, defined as the ratio of an infinitesimal pressure increase to the resulting relative volume decrease; a structure-dependent mechanical property predicted by ML models.
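The definition above corresponds to K = -V dP/dV. A minimal numerical sketch, assuming a pressure-volume function is available, estimates the derivative with a central finite difference. The toy equation of state and its parameters are hypothetical and chosen only because their exact bulk modulus is known.

```python
import math


def bulk_modulus(pressure_at, volume, dv=1e-4):
    """Estimate K = -V * dP/dV with a central finite difference."""
    dp = pressure_at(volume + dv) - pressure_at(volume - dv)
    return -volume * dp / (2 * dv)


# Toy equation of state P(V) = K0 * ln(V0 / V), for which K equals K0 at every volume.
K0 = 100.0  # hypothetical bulk modulus, GPa
V0 = 10.0   # hypothetical equilibrium volume, cubic angstroms


def toy_pressure(volume):
    return K0 * math.log(V0 / volume)


k_est = bulk_modulus(toy_pressure, V0)
print(f"estimated K = {k_est:.3f} GPa")
```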
C crystal_structure_representation
variable
Crystal Structure Representation
The featurization scheme used to encode a crystalline material for ML models, ranging from composition-only vectors to graph-based representations of atomic geometry.
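The composition-only end of that range can be illustrated with a fractional-composition vector over a fixed element list. This is a simplified stand-in, not the Magpie feature set, and the function name and element ordering are assumptions made for the example.

```python
def fractional_composition(composition, elements):
    """Encode a composition as atomic fractions over a fixed element ordering."""
    unknown = set(composition) - set(elements)
    if unknown:
        raise ValueError(f"elements not in vocabulary: {sorted(unknown)}")
    total = sum(composition.values())
    if total <= 0:
        raise ValueError("composition must contain at least one atom")
    return [composition.get(element, 0) / total for element in elements]


elements = ["Fe", "O", "Ti"]  # hypothetical, tiny element vocabulary

fe2o3 = fractional_composition({"Fe": 2, "O": 3}, elements)

# Rutile and anatase are both TiO2, so a composition-only encoding cannot
# tell them apart; any structure-dependent difference is invisible to it.
rutile = fractional_composition({"Ti": 2, "O": 4}, elements)
anatase = fractional_composition({"Ti": 4, "O": 8}, elements)
```

Graph-based representations avoid this collision by encoding atomic positions and neighbor relationships, at the cost of requiring a structure as input.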
// findings.yaml
11 empirical claims
Each finding cites a source and reports effect size, standard error, p-value, and sample size where available.
F002 strong

Only 15.3% of WBM test structures are thermodynamically stable (at most 0 meV/atom above the convex hull), establishing the random discovery baseline for computing DAF.

// method: DFT convex hull analysis of WBM dataset (256,963 materials)
F005 moderate

Gradient boosted trees with Magpie compositional features achieve competitive performance on formation energy prediction (MAE ~0.08 eV/atom) despite requiring no structural information, demonstrating the strength of composition-based features for chemically smooth properties.

// method: Matbench cross-validation, gradient boosting with Magpie featurization
F006 strong

CHGNet, trained on 1.5M MPtrj DFT trajectory frames with magnetic moment supervision, achieves force MAE of ~0.06 eV/Å and correctly predicts DFT-relaxed structure energies within ~0.03 eV/atom for the majority of Materials Project entries.

// method: held-out test set evaluation on Materials Project data; phonon benchmark
F008 strong

Only 15.3% of WBM test structures are thermodynamically stable, establishing the random discovery baseline for computing DAF.

// method: DFT convex hull analysis of WBM dataset (256,963 materials)
// propositions.yaml
0 theoretical claims
Propositions are the field's reusable rules of thumb — they span findings without being tied to a single study.
// no propositions
This pax does not declare propositions. Propositions capture theoretical claims linking constructs.
// sources.yaml
3 citations
The evidentiary backing — papers, datasets, reports — every finding can be traced to one of these.
S001
Riebesell, J., Goodall, R.E.A., Jain, A., Benner, P., Persson, K.A., Lee, A.A. (2025). A framework to evaluate machine learning crystal stability predictions.
S002
Dunn, A., Wang, Q., Ganose, A., Dopp, D., Jain, A. (2020). Benchmarking materials property prediction methods: the Matbench test set and Automatminer reference algorithm.
S003
Deng, B., Zhong, P., Jun, K., Riebesell, J., Han, K., Bartel, C.J., Ceder, G. (2023). CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling.
// playbooks/
1 analytical recipe
Step-by-step recipes that wire constructs to engines. An MCP-aware agent runs them end-to-end.
B Quick Start
5 steps
Five-step workflow covering the full Matbench benchmark methodology: from linear baselines through gradient boosting (the key tabular benchmark) to random forest stability classification. GNN-based engines (CGCNN, MEGNet, MACE-MP, CHGNet) are cataloged but require external PyTorch infrastructure to run.
  • engine.gradient_boosting
  • engine.ridge_regression
  • engine.ols_regression
  • engine.logistic_regression
  • engine.random_forest
// playbook step bodies live in the .pax archive; download to inspect.
// relationships.yaml
0 construct edges
The pax's causal graph — which constructs are claimed to drive which others, and how strongly.
// no construct relationships
This pax does not declare causal or correlational links between constructs.
// pax.yaml manifest
name: ml-materials-discovery
version: 1.0.2
pax_type: field
published_by: Praxis Agent
domain: ml_materials_discovery
constructs:
  - formation_energy_per_atom
  - energy_above_convex_hull
  - thermodynamic_stability
  - discovery_acceleration_factor
  - band_gap
  - gnn_interatomic_potential
  - mean_absolute_error_materials
  - bulk_modulus
  - crystal_structure_representation
engines:
  - random_forest
  - gradient_boosting
counts:
  constructs: 9
  findings: 11
  propositions: 0
  playbooks: 1
  sources: 3