Andrés Guzmán-Cordero

Flow Matching Meets Exponential Families for Tabular Data

Motivation

Tabular data is awkward for modern generative modeling because a single table usually mixes continuous, binary, and categorical variables. Standard flow matching works naturally in continuous spaces, but a table is not a single Euclidean object: each column has its own geometry and its own natural loss.

Our goal was to make flow matching work natively on mixed-type tables rather than forcing everything into a continuous encoding. The key insight: treat each variable through an appropriate exponential-family distribution — Gaussian for continuous columns, Bernoulli for binary …
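The per-column idea above can be sketched in a few lines: pick an exponential family for each column and use its negative log-likelihood as that column's natural loss. This is only an illustrative sketch (the helper names, the fixed-scale Gaussian, and the tiny example table are my assumptions, not the paper's implementation):

```python
import numpy as np

def gaussian_nll(x, mu, sigma=1.0):
    """Natural loss for a continuous column: Gaussian negative log-likelihood
    with a fixed scale (assumption for simplicity)."""
    return 0.5 * ((x - mu) / sigma) ** 2 + np.log(sigma * np.sqrt(2 * np.pi))

def bernoulli_nll(x, p, eps=1e-12):
    """Natural loss for a binary column: Bernoulli NLL (binary cross-entropy)."""
    p = np.clip(p, eps, 1 - eps)
    return -(x * np.log(p) + (1 - x) * np.log(1 - p))

def column_family(dtype, n_unique):
    """Pick an exponential family per column: Gaussian for continuous,
    Bernoulli for binary; a categorical family would cover the rest."""
    if np.issubdtype(dtype, np.floating):
        return "gaussian"
    if n_unique == 2:
        return "bernoulli"
    return "categorical"

# Tiny mixed-type table: one standardized continuous column, one binary column.
age = np.array([0.3, -1.2, 0.7])
smoker = np.array([1.0, 0.0, 1.0])

# Each column contributes its own natural loss; the table's loss is their sum.
loss = gaussian_nll(age, mu=0.0).mean() + bernoulli_nll(smoker, p=0.5).mean()
```

The point of the sketch is only that each column keeps its own geometry and loss, rather than being squeezed into one continuous encoding.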


Making Natural Gradient Descent Practical

Motivation

Physics-Informed Neural Networks (PINNs) are often hard to train because the optimization landscape is badly conditioned. Natural-gradient and second-order methods can help a lot, but the main obstacle is cost: the curvature matrix lives in parameter space, so solving the natural-gradient step can become prohibitively expensive for modern networks.
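To make the cost obstacle concrete, here is a minimal sketch of a naive damped natural-gradient step in parameter space (not ENGD itself; the Jacobian shape, damping, and sizes are illustrative assumptions). With P parameters, forming and solving against the P-by-P curvature matrix is the O(P^3) bottleneck the paper targets:

```python
import numpy as np

P = 50                                 # tiny stand-in for the parameter count
N = 200                                # number of samples
rng = np.random.default_rng(0)

J = rng.normal(size=(N, P))            # per-sample Jacobian (assumed shape)
residual = rng.normal(size=N)          # per-sample residuals
grad = J.T @ residual / N              # ordinary gradient

# Damped Gauss-Newton curvature, living in parameter space: a P x P matrix.
F = J.T @ J / N + 1e-3 * np.eye(P)

# The natural-gradient step solves F @ delta = grad; for modern networks
# P is in the millions, so this dense solve is prohibitively expensive.
delta = np.linalg.solve(F, grad)
```

At this toy size the solve is instant, but the cubic scaling in P is what makes the parameter-space formulation impractical for large networks.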

This paper asks a practical question: can we keep the optimization benefits of Energy Natural Gradient Descent (ENGD) while making it cheap enough to use routinely? The answer is yes, through three ideas: compute the update in sample …


Is 3D a Step Too Far for Optimizing Molecules?

Motivation

Molecules are intrinsically 3D objects, so it is tempting to assume that 3D molecular representations should dominate 1D and 2D representations in Bayesian optimization for materials discovery. In practice, however, chemists often use SMILES strings, fingerprints, or 2D graph features instead.

Instead of proposing yet another molecular model, we decided to study the question empirically and mathematically through the lens of Bayesian optimization (BO): what matters is not just predictive accuracy, but whether a representation helps a surrogate model find high-value molecules in as …