It helps to unpack the scientific method into physics, modelling, and analysis:
The classical scientific approach is to identify the pertinent physics, build an appropriate model, perform the appropriate analysis, derive useful results, test the model experimentally, and ideally discover something non-obvious but observable in the underlying physics.
Data-driven machine learning bypasses the modelling phase by applying analysis directly to experimental or numerical data. This works by fitting the data with a model (e.g. a neural network) that contains many degrees of freedom. The model can then recognize behaviour close to that which it has seen before. The downsides are that this model cannot extrapolate reliably and, if it has been overfitted, can be significantly wrong.
Our view is that the two approaches are complementary:
In our research we combine the classical scientific approach with the data-driven machine learning approach. We fit physics-based models rather than (say) neural networks to data. In other words, we take a qualitatively-accurate physics-based model and render it quantitatively-accurate by assimilating data. This model can then extrapolate to situations that it has not seen before but that share the same physics. The concept is explained here and some of the machinery is explained here. The examples below illustrate the concept.
Physics: The pendulum is stiff but flexible and has distributed mass. The pendulum experiences mechanical friction at the pivot and aerodynamic drag along its length. Twice each period it receives an impulse from the clock mechanism.
Modelling: To calculate the pendulum's natural frequency we model it as a point mass m on a massless rigid rod of length l, rotating around a frictionless pivot in a vacuum with gravitational acceleration g. Considering the angular displacement around the pivot and applying Newton's second law, we obtain l d^2(theta)/dt^2 = -g sin(theta), which is a second order nonlinear ordinary differential equation (ODE).
Analysis: For small angles we approximate sin(theta) ≈ theta to obtain a second order linear ODE, which we solve to obtain the angular frequency omega = sqrt(g/l). For large angles an analytical solution to this simple equation can be found (in terms of elliptic integrals) and shows that omega also depends on the amplitude.
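This result is easy to check numerically. The sketch below (pure Python, RK4 integration; the parameter values are illustrative) integrates the nonlinear pendulum ODE above, released from rest, and measures the period from successive zero crossings:

```python
import math

def pendulum_period(theta0, g=9.81, l=1.0, dt=1e-4):
    """Period of l*theta'' = -g*sin(theta), released from rest at theta0,
    measured as the time between successive downward zero crossings.
    RK4 integration; an illustrative sketch, not production code."""
    def f(state):
        theta, omega = state
        return (omega, -(g / l) * math.sin(theta))

    theta, omega, t = theta0, 0.0, 0.0
    crossings = []
    while len(crossings) < 2:
        k1 = f((theta, omega))
        k2 = f((theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1]))
        k3 = f((theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1]))
        k4 = f((theta + dt * k3[0], omega + dt * k3[1]))
        theta_new = theta + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        omega_new = omega + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        if theta > 0 >= theta_new:          # downward zero crossing
            crossings.append(t + dt)
        theta, omega, t = theta_new, omega_new, t + dt
    return crossings[1] - crossings[0]
```

At small amplitude the measured period matches 2*pi*sqrt(l/g), independent of m (which does not appear at all); at theta0 = 2 rad it is about a third longer, recovering the amplitude dependence without the elliptic-integral analysis. Doubling g twice halves the period, exactly as the model predicts.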
Experiments: To test the model's predicted result that omega = sqrt(g/l), independent of m, we would measure omega for a handful of oscillations with different l and m. For small amplitudes we should find that omega = sqrt(g/l). For large amplitudes we should also find the dependence on amplitude, A.
Comments: To calculate the natural frequency we do not model air resistance, friction, distributed masses, forcing from the clock mechanism, or vibrations in the rod, because they have little influence on the frequency, they make the analysis harder, and they obscure the pertinent physics. Note that the model can reliably extrapolate to different values of g without any observations at different values of g.
A machine learning algorithm would need to observe omega over several thousand experiments at different values of m, l, and amplitude, A. It would fit a model to the data and, if successful, this model would be a black box with two influential inputs (l and A) and one redundant input (m). As users, we would have no physical insight, but we would be able to obtain omega as a function of l and A as long as l and A lie within the ranges already observed. The machine learning algorithm would not know the dependence on g because it would never observe it.
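The interpolation/extrapolation point can be illustrated in a few lines. The sketch below stands in for the black box with a simple least-squares quadratic (any flexible regressor would behave similarly): it fits omega(l) well inside the observed range of l, fails badly outside it, and contains no notion of g at all. All values here are illustrative.

```python
import math

g = 9.81
# "Training data": omega measured (noise-free here) for l in [0.5, 1.5] m.
ls = [0.5 + 0.05 * i for i in range(21)]
omegas = [math.sqrt(g / l) for l in ls]

def fit_quadratic(xs, ys):
    """Least-squares fit y ~ a + b*x + c*x^2 via the normal equations,
    solved by Gaussian elimination with partial pivoting."""
    Sx = [sum(x ** k for x in xs) for k in range(5)]
    A = [[Sx[i + j] for j in range(3)] for i in range(3)]
    rhs = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        rhs[i], rhs[p] = rhs[p], rhs[i]
        for r in range(i + 1, 3):
            factor = A[r][i] / A[i][i]
            for col in range(i, 3):
                A[r][col] -= factor * A[i][col]
            rhs[r] -= factor * rhs[i]
    coeffs = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        coeffs[i] = (rhs[i] - sum(A[i][j] * coeffs[j]
                                  for j in range(i + 1, 3))) / A[i][i]
    return coeffs

a, b, c = fit_quadratic(ls, omegas)
predict = lambda l: a + b * l + c * l * l   # the "black box"
```

Within the observed range (e.g. l = 1.0 m) the fit is excellent; at l = 4.0 m the prediction is wildly wrong. Nothing in the fitted model would change if the experiments were repeated on the Moon, which is exactly the dependence it can never learn.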
Physics: As before, the pendulum is stiff but flexible and has distributed mass. The pendulum experiences mechanical friction at the pivot and aerodynamic drag along its length. Twice each period it receives an impulse from the clock mechanism.
Modelling: To calculate the pendulum's limit cycle amplitude we retain the rigid rod model but now need to model the impulse from the clock mechanism, the aerodynamic drag, and the friction at the pivot. Even with this limited extra physics, the model becomes quite elaborate and requires several more assumptions and parameters. For example, we might assume that the steady flow drag coefficient can be applied to this unsteady flow, such that the aerodynamic drag is proportional to (d(theta)/dt)^2 at high speeds and to d(theta)/dt at low speeds, with a constant of proportionality that must be measured for that pendulum shape and that depends somewhat on d(theta)/dt through the Reynolds number. We also need to model both static and dynamic friction at the pivot. We obtain a second order nonlinear ordinary differential equation, as before, but it contains several extra forcing terms and several more parameters.
Analysis: We would calculate the limit cycle amplitude as a function of the impulse from the clock's mechanism. The analysis is no longer analytically tractable so a numerical solution is required. If we are wise, we will also calculate the sensitivity of the amplitude to the model parameters in order to determine which parameters have most influence and therefore which need to be determined most accurately.
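A minimal numerical sketch of such a calculation is given below. It deliberately simplifies the model further: linear plus quadratic drag, no static friction, and the clock's impulse idealised as an instantaneous kick J to the angular velocity at each zero crossing. All parameter values are invented for illustration.

```python
import math

def limit_cycle_amplitude(J=0.05, theta0=0.2, g=9.81, l=1.0,
                          c_lin=0.02, c_quad=0.05, dt=2e-3, t_end=400.0):
    """Settled peak amplitude of
        theta'' = -(g/l) sin(theta) - c_lin*theta' - c_quad*theta'|theta'|
    with an escapement impulse J added to theta' (in the direction of
    motion) at every zero crossing.  Semi-implicit Euler integration;
    an illustrative sketch with invented parameter values."""
    theta, omega, t = theta0, 0.0, 0.0
    peaks = []
    while t < t_end:
        alpha = (-(g / l) * math.sin(theta)
                 - c_lin * omega - c_quad * omega * abs(omega))
        omega_new = omega + alpha * dt
        theta_new = theta + omega_new * dt
        if theta * theta_new < 0:           # passing the bottom: escapement kick
            omega_new += math.copysign(J, omega_new)
        if omega * omega_new < 0:           # turning point: record peak amplitude
            peaks.append(abs(theta_new))
        theta, omega, t = theta_new, omega_new, t + dt
    tail = peaks[-10:]                      # average the last few peaks
    return sum(tail) / len(tail)
```

The oscillation converges to the same amplitude from different initial conditions (the signature of a limit cycle), and a larger impulse gives a larger amplitude, which is the dependence the analysis would quantify.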
Experiments: To test the model's predictions, we would measure the limit cycle amplitude as a function of the impulse over a range of l, m and pendulum shapes.
Comments: On the positive side, the model can extrapolate to different air densities, air viscosities, and pendulum shapes (if we measure their drag coefficients). On the negative side, we may need to obtain these model parameters extremely accurately in order for the model itself to be accurate.
A machine learning algorithm would observe the amplitude over several thousand experiments. As before, the output would be a black box. On the positive side, it would give accurate results when interpolating between experiments it had seen before. On the negative side, it would not know the dependence on air density, viscosity, or pendulum shape.
It depends. In the first example, the classical scientific approach with a physics-based model is clearly better. It gives more physical insight, can be expressed in terms of easily-measurable parameters (g and l), and creates a general result that can extrapolate.
In the second example, the physics-based model has become cumbersome; the analysis is difficult and it contains many parameters, some of which are quite influential and many of which are not well known or easy to discover. On the other hand, the machine learning model is accurate, but only when interpolating between results it has already observed. In summary, both models have their problems.
At this stage, particularly if there is disagreement between the experimental results and the model predictions, it is tempting to try to include more physical phenomena in the hope that an important phenomenon has been omitted. This may help but, of course, makes the model more cumbersome. Alternatively, the model can be altered such that some influential parameters are no longer required. For example, we could replace the steady flow aerodynamic drag coefficients with an unsteady numerical simulation of the flow around the pendulum. Note, however, that this introduces other influential parameters, such as turbulence or subgrid scale model parameters used in the simulations. There is a danger that the dependence on one set of parameters is replaced with a less-visible dependence on another set of parameters and that the qualitatively-important physical phenomena will be masked by unimportant phenomena.
Our approach combines machine learning and physics-based approaches and can best be called inverse uncertainty quantification.
We start from the premise that we understand the physics qualitatively well and that we know how these physical phenomena scale. For example, we would assume that the aerodynamic drag can be modelled as the sum of a component proportional to rho*U^2*D^2 and a component proportional to mu*U*D.
Then we perform several thousand experiments and use the data-driven approach to learn the model parameters and their uncertainties. The experiments have to be carefully designed so that every parameter is visible and distinguishable. (One advantage is that parameters that are highly influential are also highly visible.) This renders the qualitative model quantitatively accurate over the range studied and, because the model is physics-based, it can extrapolate to other situations in which the physics remains the same. This is an improvement on the machine learning approach because it can extrapolate, and an improvement on the physics based approach because it is more accurate.
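The sketch below shows the idea in miniature: synthetic "experimental" data (successive peak amplitudes of a freely decaying pendulum) are generated from known drag parameters, and a least-squares search over a parameter grid recovers them. A real implementation would use Bayesian inference over noisy data and would also return the parameters' uncertainties; the model, parameter names, and values here are all illustrative.

```python
import math

def decay_peaks(c_lin, c_quad, theta0=1.0, n_peaks=8, g=9.81, l=1.0, dt=2e-3):
    """Successive peak amplitudes of a freely decaying pendulum with
    linear (c_lin) and quadratic (c_quad) drag -- our observable."""
    theta, omega = theta0, 0.0
    peaks = []
    while len(peaks) < n_peaks:
        alpha = (-(g / l) * math.sin(theta)
                 - c_lin * omega - c_quad * omega * abs(omega))
        omega_new = omega + alpha * dt
        theta += omega_new * dt
        if omega * omega_new < 0:           # turning point
            peaks.append(abs(theta))
        omega = omega_new
    return peaks

# "Experimental" data, generated here (noise-free) from known true parameters.
data = decay_peaks(c_lin=0.02, c_quad=0.05)

# Learn the parameters by least squares over a coarse grid.
best, best_err = None, float("inf")
for c_lin in [0.0, 0.01, 0.02, 0.03, 0.04]:
    for c_quad in [0.0, 0.025, 0.05, 0.075, 0.1]:
        model = decay_peaks(c_lin, c_quad)
        err = sum((m - d) ** 2 for m, d in zip(model, data))
        if err < best_err:
            best, best_err = (c_lin, c_quad), err
```

Because the fitted parameters sit inside a physics-based model, the calibrated model can now be run at a different air density or pendulum shape by rescaling the drag terms, which no black-box fit to the same data could do.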
Thermoacoustic (combustion) instability has plagued rocket engines for 90 years. During the Cold War, the USA and the Soviet Union spent billions to eliminate it from their designs. For example, NASA performed 2000 full-scale tests on the F-1 engine of the Apollo Program in order to obtain a stable engine by inspired trial and error. The physical mechanism of the instability was well known, and the scientists and engineers devoted to it were highly capable, so why was it so hard to eliminate?
The answer is the subtext of most books and papers on the subject since the 1950s: thermoacoustic instability is pathologically sensitive to small design changes. This is demonstrated in this Annual Review paper on Sensitivity in Thermoacoustics:
This sensitivity arises because the time delay between acoustic perturbations at the fuel injector and subsequent heat release rate perturbations at the flame is of the same order as the acoustic period. Any design change that alters the flame time delay or the acoustic period therefore strongly influences thermoacoustic stability. The problem is that most design changes alter one or the other.
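A toy calculation illustrates this sensitivity. Below, the growth rate of a single mode is modelled (purely for illustration, with invented numbers; this is not the analysis of the Annual Review paper) as flame driving proportional to cos(omega*tau) minus a fixed acoustic damping: a 5% change in the time delay tau flips the mode between unstable and stable.

```python
import math

def growth_rate(tau, freq=200.0, beta=30.0, damping=10.0):
    """Toy growth rate of a single thermoacoustic mode: the flame adds
    energy when heat-release fluctuations (delayed by tau) are in phase
    with pressure, giving a cos(omega*tau) dependence; 'damping' lumps
    the acoustic losses.  All numbers are invented for illustration."""
    omega = 2 * math.pi * freq          # acoustic angular frequency, rad/s
    return beta * math.cos(omega * tau) - damping

# The acoustic period is 5 ms.  Changing the time delay by only 0.1 ms
# (5% of tau) flips the sign of the growth rate:
print(growth_rate(0.95e-3))   # positive: unstable
print(growth_rate(1.05e-3))   # negative: stable
```

A design change that shifts tau by a fraction of a millisecond therefore changes the stability prediction entirely, which is why quantitative ab initio prediction is so difficult.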
It is therefore a fool's errand to try to model the physical mechanism of thermoacoustic instability with quantitative accuracy ab initio. The mechanism might be correct and the parameters nearly accurate, but the model will almost certainly not be predictive because of this extreme sensitivity to parameters.
We can assume, however, that the model is qualitatively accurate and we can therefore construct a model with floating parameters. (This assumption is good for simple systems but may become stretched for complex systems.) Some of these parameters may not be observable directly, even though they strongly influence the observed behaviour. We then tune the model parameters based on several hundred thousand observations of thermoacoustic oscillations. This renders the qualitative model quantitatively accurate and predictive.
Once we have a quantitatively-accurate model, we can combine it with adjoint-based control and design in order to work out the smallest permissible change that will render the system stable.
Our first papers on this subject are on flame behaviour in the absence of acoustics, in conjunction with Luca Magri:
Our current activities include pure data-driven machine learning, physics-based statistical learning, and hybrids of both approaches:
The techniques are highly versatile (as long as one asks the right questions) and we are also applying variants of them to Carbon Nanotube Aerogel formation and to Magnetic Resonance Velocimetry.