In the early days of the internet, we were promised that it would change everything. Three decades later there are indeed a few internet-only companies, but the other winners are those who integrated the internet into what they already did well. Machine Learning will probably be similar.

How should physical scientists use machine learning?

It helps to unpack the scientific method into physics, modelling, and analysis:

  • Physics means "what the world does";
  • Modelling means "how we model the physics of a particular part of the world";
  • Analysis means "how we predict the behaviour of that model".

The classical scientific approach is to identify the pertinent physics, build an appropriate model, perform some appropriate analysis, derive some useful results, test the model experimentally, and ideally discover something non-obvious but observable in the underlying physics.

Data-driven machine learning bypasses the modelling phase by applying analysis directly to experimental or numerical data. This works by fitting the data with a model (e.g. a neural network) that contains many degrees of freedom. The model can then recognize behaviour close to that which it has seen before. The downsides are that this model cannot extrapolate reliably and, if it has been overfitted, can be significantly wrong.
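
To make these failure modes concrete, here is a minimal sketch (in Python with NumPy; the sampled function, noise level, and polynomial degree are illustrative assumptions) in which a polynomial with many degrees of freedom stands in for the neural network:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = np.sin(2 * np.pi * x) + 0.05 * rng.normal(size=x.size)  # noisy "experiments"

# A model with many degrees of freedom: a degree-15 polynomial fitted to
# the data (a stand-in for a neural network).
model = np.polynomial.Polynomial.fit(x, y, deg=15)

# Interpolation: behaviour close to that seen before is recognized well.
print(model(0.5), np.sin(2 * np.pi * 0.5))   # fit vs truth: both near 0
# Extrapolation: just outside the observed range, the fit diverges.
print(model(1.2), np.sin(2 * np.pi * 1.2))
```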

Our view is that:

  • data-driven techniques are developing rapidly and give us new capabilities;
  • the physical knowledge and modelling expertise that have accumulated over centuries should not be discarded;
  • the most durable advances will combine physics-based and data-driven approaches.

In our research we combine the classical scientific approach with the data-driven machine learning approach. We fit physics-based models rather than (say) neural networks to data. In other words, we take a qualitatively-accurate physics-based model and render it quantitatively-accurate by assimilating data. This model can then extrapolate to situations that it has not seen before but that share the same physics. The concept is explained here and some of the machinery is explained here. The examples below illustrate the concept.

Example 1 - the natural frequency of a clock's pendulum.

Classical engineering approach

Physics: The pendulum is stiff but flexible and has distributed mass. The pendulum experiences mechanical friction at the pivot and aerodynamic drag along its length. Twice each period it receives an impulse from the clock mechanism.

Modelling: To calculate the pendulum's natural frequency we model it as a point mass m on a massless rigid rod of length l, rotating around a frictionless pivot in a vacuum with gravitational acceleration g. Considering the angular displacement, theta, around the pivot and using Newton's second law we obtain l d^2(theta)/dt^2 = -g sin(theta), which is a second order nonlinear ordinary differential equation (ODE).

Analysis: For small angles, we approximate sin(theta) = theta to obtain a second order linear ODE, which we solve to obtain the angular frequency, omega = sqrt(g/l). For large angles, an analytical solution can be found in terms of elliptic integrals and shows that omega also depends on the amplitude, A.
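
For concreteness, here is a minimal sketch of this analysis in Python with SciPy (the rod length is an arbitrary illustrative value):

```python
import numpy as np
from scipy.special import ellipk

g = 9.81    # gravitational acceleration (m/s^2)
l = 0.994   # rod length (m); roughly a 2 s period

# Small-angle result: omega = sqrt(g/l), independent of amplitude and mass.
omega_small = np.sqrt(g / l)

# Large-angle result: the exact period is T = 4*sqrt(l/g)*K(sin^2(A/2)),
# where K is the complete elliptic integral of the first kind, so omega
# falls as the amplitude A grows.
for A in np.deg2rad([1.0, 10.0, 45.0, 90.0]):
    T = 4.0 * np.sqrt(l / g) * ellipk(np.sin(A / 2.0) ** 2)
    print(f"A = {np.rad2deg(A):5.1f} deg: omega = {2 * np.pi / T:.4f} rad/s "
          f"(small-angle value {omega_small:.4f})")
```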

Experiments: To test the model's predicted result that omega = sqrt(g/l), independent of m, we would measure omega for a handful of oscillations with different l and m. For small amplitudes we should find that omega = sqrt(g/l). For large amplitudes we should also find the dependence on amplitude, A.

Comments: To calculate the natural frequency we do not model air resistance, friction, distributed masses, forcing from the clock mechanism, or vibrations in the rod, because they have little influence on the frequency, they make the analysis harder, and they obscure the pertinent physics. Note that the model can reliably extrapolate to different values of g without any observations at different values of g.

Data-driven engineering approach

A machine learning algorithm would need to observe omega over several thousand experiments at different values of m, l, and amplitude, A. It would fit a model to the data and, if successful, this model would be a black box with two influential inputs (l and A) and one redundant input (m). As users, we would have no physical insight, but we would be able to obtain omega as a function of l and A as long as l and A lie within the ranges already observed. The machine learning algorithm would not know the dependence on g because it would never observe it.
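
A minimal sketch of this black-box approach, assuming scikit-learn is available (the synthetic "experiments" are generated from the exact solution at fixed g, and the network size and parameter ranges are arbitrary choices):

```python
import numpy as np
from scipy.special import ellipk
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
g, n = 9.81, 5000
m = rng.uniform(0.1, 2.0, n)   # mass: the redundant input
l = rng.uniform(0.2, 1.5, n)   # length: influential
A = rng.uniform(0.05, 1.5, n)  # amplitude (rad): influential
omega = 2 * np.pi / (4 * np.sqrt(l / g) * ellipk(np.sin(A / 2) ** 2))

X = np.column_stack([m, l, A])
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                   random_state=0).fit(X, omega)

# Interpolation (inside the training ranges): typically accurate.
print(net.predict([[1.0, 1.0, 0.5]]))  # true value is about 3.08 rad/s
# Extrapolation (l outside the training range): unreliable; and the
# dependence on g can never be learned, because g was never varied.
print(net.predict([[1.0, 5.0, 0.5]]))
```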

Example 2 - the limit cycle amplitude of a clock's pendulum

Classical engineering approach

Physics: As before, the pendulum is stiff but flexible and has distributed mass. The pendulum experiences mechanical friction at the pivot and aerodynamic drag along its length. Twice each period it receives an impulse from the clock mechanism.

Modelling: To calculate the pendulum's limit cycle amplitude we retain the rigid rod model but now need to model the impulse from the clock mechanism, the aerodynamic drag, and the friction at the pivot. Even with this limited extra physics, the model becomes quite elaborate and requires several more assumptions and parameters. For example, we might assume that the steady-flow drag coefficient can be applied to this unsteady flow, such that the aerodynamic drag is proportional to (d(theta)/dt)^2 at high speeds and to d(theta)/dt at low speeds, with a constant of proportionality that must be measured for that pendulum shape and that depends somewhat on d(theta)/dt through the Reynolds number. We also need to model both static and dynamic friction at the pivot. We obtain a second order nonlinear ODE, as before, but it contains several extra forcing terms and several more parameters.

Analysis: We would calculate the limit cycle amplitude as a function of the impulse from the clock's mechanism. The problem is no longer analytically tractable, so a numerical solution is required (see the sketch below). If we are wise, we will also calculate the sensitivity of the amplitude to the model parameters in order to determine which parameters have the most influence and therefore which need to be determined most accurately.
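
The following sketch shows one way to compute the limit cycle numerically. The damping coefficients and the idealisation of the escapement as a small velocity kick at each zero crossing are illustrative assumptions, not measured values:

```python
import numpy as np

g, l = 9.81, 1.0
c1, c2 = 0.02, 0.05   # linear and quadratic drag coefficients (assumed)
kick = 0.01           # velocity increment per escapement impulse (assumed)
dt, n_steps = 1e-3, 200_000

theta, dtheta, peaks = 0.3, 0.0, []
for _ in range(n_steps):
    # Semi-implicit Euler step for theta'' = -(g/l)*sin(theta) - drag
    drag = c1 * dtheta + c2 * dtheta * abs(dtheta)
    dtheta_new = dtheta + dt * (-(g / l) * np.sin(theta) - drag)
    theta_new = theta + dt * dtheta_new
    if theta * theta_new < 0.0:      # zero crossing: escapement impulse
        dtheta_new += kick * np.sign(dtheta_new)
    if dtheta * dtheta_new < 0.0:    # turning point: record the amplitude
        peaks.append(abs(theta_new))
    theta, dtheta = theta_new, dtheta_new

# After the transient, the amplitude settles where the energy injected
# per cycle balances the energy lost to drag.
print("limit-cycle amplitude ~", round(np.mean(peaks[-20:]), 4), "rad")
```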

Experiments: To test the model's predictions, we would measure the limit cycle amplitude as a function of the impulse over a range of l, m and pendulum shapes.

Comments: On the positive side, the model can extrapolate to different air densities, air viscosities, and pendulum shapes (if we measure their drag coefficients). On the negative side, we may need to obtain these model parameters extremely accurately in order for the model itself to be accurate.

Data-driven engineering approach

A machine learning algorithm would observe the amplitude over several thousand experiments. As before, the output would be a black box. On the positive side, it would give accurate results when interpolating between experiments it had seen before. On the negative side, it would not know the dependence on air density, air viscosity, or pendulum shape.

Which is better: physics-based or data-driven?

It depends. In the first example, the classical engineering approach with a physics-based model is clearly better. It gives more physical insight, can be expressed in terms of easily-measured parameters (g and l), and produces a general result that extrapolates reliably.

In the second example, the physics-based model has become cumbersome; the analysis is difficult and it contains many parameters, some of which are quite influential and many of which are not well known or easy to discover. On the other hand, the machine learning model is accurate, but only when interpolating between results it has already observed. In summary, both models have their problems.

At this stage, particularly if there is disagreement between the experimental results and the model predictions, it is tempting to try to include more physical phenomena in the hope that an important phenomenon has been omitted. This may help but, of course, makes the model more cumbersome. Alternatively, the model can be altered such that some influential parameters are no longer required. For example, we could replace the steady flow aerodynamic drag coefficients with an unsteady numerical simulation of the flow around the pendulum. Note, however, that this introduces other influential parameters, such as turbulence or subgrid scale model parameters used in the simulations. There is a danger that the dependence on one set of parameters is replaced with a less-visible dependence on another set of parameters and that the qualitatively-important physical phenomena will be masked by unimportant phenomena.

Our approach combines machine learning and physics-based approaches and can best be called inverse uncertainty quantification.

We start from the premise that we understand the physics qualitatively well and that we know how these physical phenomena scale. For example, we would assume that the aerodynamic drag can be modelled as the sum of a component proportional to rho*U^2*D^2 and a component proportional to mu*U*D, where rho and mu are the density and viscosity of the air, U is the pendulum's speed, and D is its characteristic size.

Then we perform several thousand experiments and use the data-driven approach to learn the model parameters and their uncertainties. The experiments have to be carefully designed so that every parameter is visible and distinguishable. (One advantage is that parameters that are highly influential are also highly visible.) This renders the qualitative model quantitatively accurate over the range studied and, because the model is physics-based, it can extrapolate to other situations in which the physics remains the same. This is an improvement on the machine learning approach because it can extrapolate, and an improvement on the physics-based approach because it is more accurate.
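
A minimal sketch of this parameter-learning step, using an ensemble Kalman update to infer the pendulum's two drag coefficients from noisy amplitude-decay data (the synthetic data, prior ranges, and noise level are all illustrative assumptions):

```python
import numpy as np

g, l, dt = 9.81, 1.0, 1e-3

def forward(c1, c2, n_steps=40_000):
    # Peak amplitudes of a freely decaying pendulum with linear (c1) and
    # quadratic (c2) drag: the qualitatively-accurate physics-based model.
    theta, dtheta, peaks = 0.5, 0.0, []
    for _ in range(n_steps):
        drag = c1 * dtheta + c2 * dtheta * abs(dtheta)
        dtheta_new = dtheta + dt * (-(g / l) * np.sin(theta) - drag)
        theta_new = theta + dt * dtheta_new
        if dtheta * dtheta_new < 0.0:          # turning point
            peaks.append(abs(theta_new))
        theta, dtheta = theta_new, dtheta_new
    return np.array(peaks[:30])

rng = np.random.default_rng(1)
sigma = 0.005                                   # observation noise (assumed)
y_obs = forward(0.02, 0.05) + sigma * rng.normal(size=30)  # synthetic data

# Ensemble Kalman update: draw an ensemble from a broad prior over
# (c1, c2), then repeatedly shift it towards the data using the sample
# cross-covariance between parameters and predicted observations.
J = 30
params = rng.uniform([0.0, 0.0], [0.1, 0.2], size=(J, 2))
for _ in range(5):
    preds = np.array([forward(c1, c2) for c1, c2 in params])
    dp = params - params.mean(axis=0)
    dy = preds - preds.mean(axis=0)
    C_py = dp.T @ dy / (J - 1)                  # parameter-output covariance
    C_yy = dy.T @ dy / (J - 1) + sigma**2 * np.eye(30)
    noise = sigma * rng.normal(size=(J, 30))    # perturbed observations
    params += (y_obs + noise - preds) @ np.linalg.solve(C_yy, C_py.T)
    params = np.clip(params, 0.0, None)         # keep drag non-negative

print("posterior mean:", params.mean(axis=0))   # approaches (0.02, 0.05)
print("posterior std: ", params.std(axis=0))    # parameter uncertainties
```

Because the learned coefficients multiply physical scalings, the calibrated model can then be run at, say, a different air density, which no black-box fit to the same data could do.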

Why Rocket Science is Rocket Science.

Thermoacoustic (combustion) instability has plagued rocket engines for 90 years. During the Cold War, the USA and the Soviet Union spent billions to eliminate it from their designs. For example, NASA performed 2000 full-scale tests on the F-1 engine of the Apollo Program in order to obtain a stable engine by inspired trial and error. The physical mechanism of the instability was well known and the scientists and engineers devoted to it were highly capable, so why was it so hard to eliminate?

The answer is the subtext of most books and papers on the subject since the 1950s: Thermoacoustic instability is pathologically sensitive to small design changes. This is demonstrated in this Annual Review paper on sensitivity in thermoacoustics:

Sensitivity and Nonlinearity of Thermoacoustic Oscillations
M. P. Juniper, R. I. Sujith
Annual Review of Fluid Mechanics 50, 661-689 (2018), doi:10.1146/annurev-fluid-122316-045125 (Open Access)
Tutorial 1: Obtaining thermoacoustic eigenvalue sensitivities with adjoint methods (Matlab files)
Tutorial 2: Tools from nonlinear dynamics (Matlab files)

Nine decades of rocket engine and gas turbine development have shown that thermoacoustic oscillations are difficult to predict but can usually be eliminated with relatively small ad hoc design changes. These changes can, however, be ruinously expensive to devise. This review explains why linear and nonlinear thermoacoustic behaviour is so sensitive to parameters such as operating point, fuel composition, and injector geometry. It shows how non-periodic behaviour arises in experiments and simulations and discusses how fluctuations in thermoacoustic systems with turbulent reacting flow, which are usually filtered or averaged out as noise, can reveal useful information. Finally, it proposes tools to exploit this sensitivity in the future: adjoint-based sensitivity analysis to optimize passive control designs, and complex systems theory to warn of impending thermoacoustic oscillations and to identify the most sensitive elements of a thermoacoustic system.

This sensitivity arises because the time delay between acoustic perturbations at the fuel injector and subsequent heat release rate perturbations at the flame is of the same order as the acoustic period. Any design change that alters the flame's time delay or the acoustic period therefore strongly influences thermoacoustic stability. The problem is that most design changes alter one or the other.
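
A toy model makes this visible. Suppose (purely for illustration) a single acoustic mode with weak damping, driven by heat release that responds to the acoustics after a time delay tau; sweeping tau through the acoustic period flips the sign of the growth rate:

```python
import numpy as np

# Toy thermoacoustic model (an illustrative assumption, not the model in
# the review): eta'' + 2*zeta*w0*eta' + w0^2*eta = beta*eta(t - tau).
# Its eigenvalues s satisfy s^2 + 2*zeta*w0*s + w0^2 - beta*exp(-s*tau) = 0.
w0, zeta, beta = 2 * np.pi, 0.01, 2.0
T = 2 * np.pi / w0   # acoustic period

def growth_rate(tau):
    s = 1j * w0   # start the search at the undamped acoustic mode
    for _ in range(50):   # Newton iteration on the characteristic equation
        f = s**2 + 2 * zeta * w0 * s + w0**2 - beta * np.exp(-s * tau)
        df = 2 * s + 2 * zeta * w0 + tau * beta * np.exp(-s * tau)
        s -= f / df
    return s.real

# Small changes in tau (or in the acoustic period) flip the stability.
for frac in [0.1, 0.25, 0.4, 0.6, 0.75, 0.9]:
    print(f"tau/T = {frac:.2f}: growth rate = {growth_rate(frac * T):+.4f}")
```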

It is therefore a fool's errand to try to model the physical mechanism of thermoacoustic instability with quantitative accuracy ab initio. The mechanism might be correct and the parameters nearly accurate, but the model will almost certainly not be predictive because of this extreme sensitivity to parameters.

We can assume, however, that the model is qualitatively accurate and we can therefore construct a model with floating parameters. (This assumption is good for simple systems but may become stretched for complex systems.) Some of these parameters may not be observable directly, even though they strongly influence the observed behaviour. We then tune the model parameters based on several hundred thousand observations of thermoacoustic oscillations. This renders the qualitative model quantitatively accurate and predictive.

Once we have a quantitatively-accurate model, we can combine it with adjoint-based control and design in order to work out the smallest design change that will render the system stable.
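
A minimal sketch of the adjoint recipe behind this step: for a linear stability operator A(p), the first-order shift of an eigenvalue with a parameter p is dlam/dp = y^H (dA/dp) x / (y^H x), where x and y are the right and left eigenvectors. The 3x3 matrices below are arbitrary stand-ins, not a thermoacoustic operator:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(3, 3))        # stand-in stability operator
dA_dp = rng.normal(size=(3, 3))    # its sensitivity to a parameter p

lam, X = np.linalg.eig(A)
mu, Y = np.linalg.eig(A.conj().T)  # left eigenvectors of A
k = 0                              # pick one eigenvalue of A
j = np.argmin(np.abs(mu.conj() - lam[k]))   # matching left eigenvector
x, y = X[:, k], Y[:, j]

# Adjoint (first-order) sensitivity of the eigenvalue to p.
dlam_dp = (y.conj() @ dA_dp @ x) / (y.conj() @ x)

# Check against a finite-difference perturbation of the operator.
eps = 1e-6
lam_pert = np.linalg.eigvals(A + eps * dA_dp)
fd = (lam_pert[np.argmin(np.abs(lam_pert - lam[k]))] - lam[k]) / eps
print(dlam_dp, fd)   # the two estimates agree to about 1e-5
```

With one pair of eigenvectors, the sensitivity to every parameter follows at negligible extra cost, which is what makes adjoint methods attractive for optimizing passive control designs.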

Our first papers on this subject are on flame behaviour in the absence of acoustics, in conjunction with Luca Magri:

Data Assimilation and Optimal Calibration in Nonlinear Models of Flame Dynamics
H. Yu, T. Jaravel, M. Ihme, M. P. Juniper, L. Magri
Journal of Engineering for Gas Turbines and Power, GTP-19-1369 (2019) (Open Access)

We propose an on-the-fly statistical learning method to take a qualitative reduced-order model of the dynamics of a premixed flame and make it quantitatively accurate. This physics-informed data-driven method is based on the statistically optimal combination of (i) a reduced-order model of the dynamics of a premixed flame with a level-set method, (ii) high-quality data, which can be provided by experiments and/or high-fidelity simulations, and (iii) assimilation of the data into the reduced-order model to improve the prediction of the dynamics of the premixed flame. The reduced-order model learns the state and the parameters of the premixed flame on the fly with the ensemble Kalman filter, which is a Bayesian filter used, for example, in weather forecasting. The proposed method and algorithm are applied to two test cases with relevance to reacting flows and instabilities. First, the capabilities of the framework are demonstrated in a twin experiment, where the assimilated data is produced from the same model as that used in prediction. Second, the assimilated data is extracted from a high-fidelity reacting-flow direct numerical simulation (DNS), which provides the reference solution. The results are analyzed by using Bayesian statistics, which robustly provide the level of confidence in the calculations from the reduced-order model. The versatile method we propose enables the optimal calibration of computationally inexpensive reduced-order models in real time when experimental data becomes available, for example, from gas-turbine sensors.
Combined state and parameter estimation in level-set methods
H. Yu, M. P. Juniper, L. Magri
Journal of Computational Physics 399, 108950 (2019), doi:10.1016/j.jcp.2019.108950

Reduced-order models based on level-set methods are widely used tools to qualitatively capture and track the nonlinear dynamics of an interface. The aim of this paper is to develop a physics-informed, data-driven, statistically rigorous learning algorithm for state and parameter estimation with level-set methods. A Bayesian approach based on data assimilation is introduced. Data assimilation is enabled by the ensemble Kalman filter and smoother, which are used in their probabilistic formulations. The level-set data assimilation framework is verified in one-dimensional and two-dimensional test cases, where state estimation, parameter estimation and uncertainty quantification are performed. The statistical performance of the proposed ensemble Kalman filter and smoother is quantified by twin experiments. In the twin experiments, the combined state and parameter estimation fully recovers the reference solution, which validates the proposed algorithm. The level-set data assimilation framework is then applied to the prediction of the nonlinear dynamics of a forced premixed flame, which exhibits the formation of sharp cusps and intricate topological changes, such as pinch-off events. The proposed physics-informed statistical learning algorithm opens up new possibilities for making reduced-order models of interfaces quantitatively predictive, any time that reference data is available.

Our current activities include pure data-driven machine learning, physics-based statistical learning, and hybrids of both approaches:

Bayesian Machine Learning for the Prognosis of Combustion Instabilities from Noise
U. Sengupta, C. E. Rasmussen, M. P. Juniper
Journal of Engineering for Gas Turbines and Power (accepted) (2020), doi:10.1115/1.4049762

Experiments are performed on a turbulent swirling flame placed inside a vertical tube whose fundamental acoustic mode becomes unstable at higher powers and equivalence ratios. The power, equivalence ratio, fuel composition and boundary condition of this tube are varied and, at each operating point, the combustion noise is recorded. In addition, short acoustic pulses at the fundamental frequency are supplied to the tube with a loudspeaker and the decay rates of subsequent acoustic oscillations are measured. This quantifies the linear stability of the system at every operating point. Using this data for training, we show that it is possible for a Bayesian ensemble of neural networks to predict the decay rate from a 300 millisecond sample of the (un-pulsed) combustion noise and therefore forecast impending thermoacoustic instabilities. We also show that it is possible to recover the equivalence ratio and power of the flame from these noise snippets, confirming our hypothesis that combustion noise indeed provides a fingerprint of the combustor's internal state. Furthermore, the Bayesian nature of our algorithm enables principled estimates of uncertainty in our predictions, a reassuring feature that prevents it from making overconfident extrapolations. We use the techniques of permutation importance and integrated gradients to understand which features in the combustion noise spectra are crucial for accurate predictions and how they might influence the prediction. This study serves as a first step towards establishing interpretable and Bayesian machine learning techniques as tools to discover informative relationships in combustor data and thereby build trustworthy, robust and reliable combustion diagnostics.
Assimilation of Experimental Data to Create a Quantitatively Accurate Reduced-Order Thermoacoustic Model
F. Garita, H. Yu, M. P. Juniper
Journal of Engineering for Gas Turbines and Power (accepted) (2020), doi:10.1115/1.4048569

We combine a thermoacoustic experiment with a thermoacoustic reduced order model using Bayesian inference to accurately learn the parameters of the model, rendering it predictive. The experiment is a vertical Rijke tube containing an electric heater. The heater drives a base flow via natural convection, and thermoacoustic oscillations via velocity-driven heat release fluctuations. The decay rates and frequencies of these oscillations are measured every few seconds by acoustically forcing the system via a loudspeaker placed at the bottom of the tube. More than 320,000 temperature measurements are used to compute state and parameters of the base flow model using the Ensemble Kalman Filter. A wave-based network model is then used to describe the acoustics inside the tube. We balance momentum and energy at the boundary between two adjacent elements, and model the viscous and thermal dissipation mechanisms in the boundary layer and at the heater and thermocouple locations. Finally, we tune the parameters of two different thermoacoustic models on an experimental dataset that comprises more than 40,000 experiments. This study shows that, with thorough Bayesian inference, a qualitative model can become quantitatively accurate, without overfitting, as long as it contains the most influential physical phenomena.

The techniques are highly versatile (as long as one asks the right questions) and we are also applying variants of them to Carbon Nanotube Aerogel formation and to Magnetic Resonance Velocimetry.