If you see this, something is wrong
First published on Wednesday, Jun 3, 2026 and last modified on Wednesday, Jun 10, 2026 by François Chaplais.
CD Laboratory for Physics-driven Machine Learning in Industrial Applications, Graz, Austria and The Institute of Thermodynamics and Sustainable Propulsion Systems, Graz University of Technology, Graz, Austria
Know Center Research GmbH, Graz, Austria
CD Laboratory for Physics-driven Machine Learning in Industrial Applications, Graz, Austria and The Institute of Thermodynamics and Sustainable Propulsion Systems, Graz University of Technology, Graz, Austria
Machine Learning, ICML
From neural ODEs to continuous-time machine learning, differentiable solvers allow physics, optimization, and simulation to become trainable components within deep learning systems. This has opened the path to a new generation of deep learning frameworks for scientific computing, with many promising applications still emerging. In this paper, we integrate a differentiable chemistry solver into a modified physics-informed neural network to solve parameterized reaction systems that are inherently stiff. The proposed framework introduces several key components required to overcome limitations of standard physics-informed neural networks. These include a differentiable chemistry solver, a network architecture for parameterized solutions, and residual weighting tailored to stiff reactions. We evaluate the framework on a set of differential equations related to hydrogen combustion, which include initial/boundary value problems, inverse parameter identification, and a parameterized partial differential equation. Our results highlight the ability of the proposed approach to extend physics-informed neural networks to stiff chemical systems that were previously inaccessible.
Physics-informed machine learning (PIML) has emerged as a powerful paradigm in scientific computing, combining data-driven learning with prior knowledge derived from physical laws [1]. By embedding known physical constraints directly into modern machine learning models, PIML aims to improve generalization, data efficiency, and physical consistency in the modeling of complex systems. Over the past decade, a growing body of work has demonstrated that purely data-driven approaches often struggle to achieve the accuracy and robustness required for challenging physical problems, motivating the integration of physics into the learning process.
Early developments such as Hamiltonian Neural Networks [2], Lagrangian Neural Networks [3], and Neural Ordinary Differential Equations [4] have highlighted the importance of respecting underlying physical structure during learning. Among these approaches, physics-informed neural networks (PINNs) have become a central workhorse of PIML. PINNs encode governing equations, constitutive relations, and constraints directly into the loss function through physics-based residual terms, enabling the solution of forward and inverse problems governed by differential equations [5, 6].
Since their introduction, PINNs have attracted significant attention and inspired numerous extensions that build upon the same principle of physics-based loss formulations. These developments have pushed the boundaries of scientific machine learning toward problems that were traditionally considered intractable using classical numerical methods alone. Notable examples include deep operator learning approaches such as DeepONets [7], Fourier Neural Operators [8], and physics-informed neural operators [9], which aim to learn entire solution operators rather than individual realizations of physical systems. These operator-learning frameworks have been successfully applied to a wide range of systems, including parametric fluid dynamics problems [8, 10], stochastic and uncertainty-aware PDEs [7, 11], and multiscale modeling in materials science and solid mechanics [12, 13].
For PIML reaction–diffusion systems constitute a particularly challenging class of physical systems, describing the interplay between chemical reactions and transport processes such as diffusion. These systems arise in numerous scientific and engineering applications, including combustion, catalysis, biological pattern formation, and materials science. Depending on the underlying reaction mechanisms and involved species, reaction–diffusion systems can become highly multi-scale and exhibit stiff dynamics, characterized by rapid temporal variations over very short time scales. Accurately resolving such behavior often requires specialized numerical solvers, or even operator splitting approaches that couple solvers tailored to fast chemical kinetics with those designed for slower transport processes [14].
For systems exhibiting stiff dynamics, standard PINN formulations face well-documented limitations, including optimization difficulties, poor convergence, and imbalanced training across temporal or spatial domains [15, 16]. To address these challenges, several modifications to the original PINN framework have been proposed, each targeting specific failure modes such as spectral bias [17, 18], stiffness [19, 20], or training imbalance [21, 22]. Despite these advances, existing PINN studies on reaction–diffusion systems have largely focused on simplified reaction terms, such as Fisher’s equation [23, 24] or related low-dimensional problems [25, 15]. Prior efforts applying PINNs to combustion are valuable but do not address the unsteady, forward reaction–diffusion setting [26]. They typically target steady configurations [27], inverse parameter inference [28, 28], or neglecting the chemistry in the loss term formulation [28].
Recent advances in differentiable programming and differentiable numerical solvers provide a key opportunity to overcome these limitations [29, 28, 30, 31]. In particular, differentiable chemistry solvers enable backpropagation through complex, stiff reaction kinetics, allowing chemical source terms to be incorporated into physics-based loss functions in an end-to-end differentiable manner. This capability fundamentally expands the scope of PINNs, enabling the seamless integration of high-fidelity chemistry models into neural network training without sacrificing physical accuracy or consistency.
In this work, we present a practical, end-to-end differentiable PINN framework for stiff combustion systems and demonstrate its effectiveness on hydrogen flamelet equations, which we propose as a challenging benchmark for future PINN research. Our framework introduces several essential modifications to the standard PINN formulation that are required to achieve accurate and stable solutions in the presence of stiff chemical kinetics. These include a fully differentiable chemistry solver, residual-based loss weighting strategies, and the enforcement of hard physical constraints such as mass conservation, boundary and initial conditions, and inert species mass fractions.
We evaluate the proposed framework on a range of problem settings for hydrogen combustion, including both ordinary and partial differential equations. The considered tasks span forward problems—such as initial value problems, boundary value problems, and operator learning—as well as an inverse problem involving parameter identification. Furthermore, we compare our approach against the standard PINN framework and against a state-of-the-art enhancement technique for time-dependent problems which utilizes causality-based loss scaling [32]. Our results demonstrate that the proposed differentiable PINN framework significantly improves robustness and accuracy for stiff reaction–diffusion systems, highlighting the critical role of differentiable solvers in enabling next-generation physics-informed learning.
In summary, our main contributions are as follows:
PINNs are a deep learning–based framework for solving forward and inverse problems governed by partial and ordinary differential equations [5]. They employ neural networks as universal function approximators [33] to represent the unknown solution of a differential equation. In the forward setting, full knowledge of the governing equations is used to approximate the solution, while in the inverse setting partial physical knowledge together with observational data is leveraged to infer unknown quantities such as equation parameters.
The governing equation is assumed to be of the form
(1)
where \( \mathcal{F}\) denotes a (nonlinear) differential operator, \( u(x)\) is the (vector-valued) solution, \( x\) is a multidimensional input variable, \( \mu\) is a vector of physical parameters, and \( \Omega\) denotes the computational domain.
The solution is approximated by a neural network \( u_\theta(x)\) with parameters \( \theta\) . The corresponding physics residual, capturing the mismatch between the model’s physical behavior and what is dictated by Eq. (1), is defined as
(2)
Training is performed by minimizing a composite loss function of the form
(3)
where the physics loss enforces the governing equations at collocation points sampled from the interior of the domain:
(4)
The initial and boundary condition loss penalizes deviations from prescribed conditions,
(5)
which are required to render the problem well posed.
In the presence of observational data, an additional data loss term is included,
(6)
where \( \hat{u}(x)\) denotes observed data.
In the forward setting, the data loss term is not required provided that sufficient initial and boundary conditions are specified to render the problem well posed. In contrast, in the inverse setting the objective is to infer unknown physical parameters \( \mu\) from observed data. In this case, the data loss term is essential, and the parameters \( \mu\) are typically treated as additional learnable variables jointly optimized with the network parameters.
To apply PINNs to reactive (combusting) systems, the governing equation (1) is augmented by a chemical source term,
(7)
where \( Y_i(x)\) denotes the \( i\) -th component of the species mass fractions vector \( \boldsymbol{Y}\) . The source term \( \dot{\omega}_i\) depends on the full composition and the thermodynamic state, yielding a tightly coupled, nonlinear system across all species. Under constant-pressure conditions (\( p=\) const), the thermodynamic state is characterized by the temperature \( T\) together with \( \boldsymbol{Y}\) (or, equivalently, by a prescribed enthalpy \( h\) from which \( T\) is recovered). Moreover, the kinetics exhibit Arrhenius temperature sensitivity, so that, in mass-action form,
(8)
which contributes strongly to stiffness due to the exponential dependence on \( 1/T\) . Classical numerical simulation discretizes the differential equations in time and in case of PDE, in spatial coordinate. Chemical kinetics are usually solved by suitable chemistry solvers such as cantera [28]. Through the source term dependency on the solution of the differential equation, special numerical treatment is required which ranges from linearization to operator splitting schemes in case of reaction-diffusion PDE. Furthermore, the stiff dynamics of reacting systems requires stability constraints leading to limiting step sizes and thus increased simulation times. By contrast, PINNs enables the treatment of the chemistry "instantaneously" at the level of residual satisfaction rather than via sequential time stepping. However, this requires that the solution of the underlying chemistry is part of the computational graph and thus be differentiable. In our proposed PINN, the reaction source terms are determined through the differentiable chemistry backend reactorch [34]. Given the predicted species mass fractions, the mixture enthalpy \( h\) is first computed from the boundary stream states, and the corresponding temperature \( T\) is recovered implicitly by solving the thermochemical equilibrium relation. The recovered temperature and species composition are then used to evaluate reaction rates via a differentiable implementation of detailed chemical kinetics. This formulation enables end-to-end gradient propagation through the chemistry evaluation and is essential for both forward and inverse problem settings (Figure 1). Further details on the chemical kinetics computations can be found in the appendix in Section A.
Solving stiff reaction systems requires several modifications to the training procedure, without which stable convergence is not achievable. Due to the extreme stiffness of the governing equations, we incorporate residual weighting as proposed in [24]. Specifically, each residual term in the physics loss computation (cf. Eq. (4)) is scaled according to the corresponding reaction coefficient,
(9)
where \( \omega_i(\boldsymbol{Y}^{(j)})\) denotes the reaction rate of species \( i\) evaluated for the mass-fraction vector \( \boldsymbol{Y}^{j}\) , and \( \lambda\) is a hyperparameter. This weighting reduces the contribution of regions with very high reaction rates to the loss function, thereby mitigating instabilities commonly encountered in stiff reactive systems.
In the parameterized setting, we do not sample the parameter independently for each collocation point. Instead, for a given parameter value, we sample a full set of collocation points covering the entire spatiotemporal domain. This reflects the fact that each parameter corresponds to a single global solution of the governing equations, which must satisfy the PDE constraints throughout the domain. Sampling collocation points in this structured manner is essential for enforcing global physical consistency and was found to be crucial for stable training in the stiff reactive setting considered here. For the parameterized case, we adopt the network architecture proposed in [23] rather than a standard fully connected architecture.
Initial and boundary conditions are enforced through hard constraints. The inert species (N\( _2\) ) mass fraction is constrained throughout the domain, and the predicted mass fractions of the reactive species are constrained via a softmax transformation to ensure mass conservation. We employ the Adam optimizer for training and use only the physics loss in all forward problem settings. Details of the network architecture and training procedure are provided in Section C.
We evaluate the proposed framework on three distinct systems derived from hydrogen combustion chemistry, modeled using a global one-step Arrhenius mechanism to represent the underlying reaction kinetics. The considered problems include an initial value problem, a boundary value problem, and a reaction–diffusion partial differential equation. Hydrogen combustion is used throughout as a deliberately challenging test case due to its extreme stiffness and wide separation of chemical time scales. Rather than varying the chemical mechanism, we focus on evaluating the proposed framework across fundamentally different problem classes (IVPs, BVPs, and reaction–diffusion PDEs), each of which typically requires a distinct numerical treatment. Successful application to hydrogen chemistry thus serves as a strong indicator of robustness and expected generalizability to less stiff combustion systems. In the PDE setting, we study both forward problems, including a parameterized formulation in which a family of solutions indexed by the strain rate is learned, and an inverse problem, where the strain rate is inferred from observed solution data. While each of these systems typically requires a fundamentally different numerical solution strategy, the corresponding PINN formulations differ only minimally, highlighting the generality of the proposed approach.
Since the proposed framework incorporates several training and formulation choices beyond a standard PINN, we evaluate its performance on forward problems against a basic PINN baseline as well as against one well known PINN modification to serve as a competitive baseline. The vanilla PINN baseline enforces only hard initial and boundary conditions and does not include any additional problem-specific constraints, such as mass-fraction normalization or inert-species enforcement, serving as a minimal reference implementation. In addition, we compare against causality-based loss scaling [32] which is a widely known PINN training modification, well suited for time-dependent systems.
For a fair comparison, all methods share the same network architectures including the hard constraints for initial and boundary conditions and optimization settings. In the case of the competitive baseline, only training modifications that are not orthogonal to the respective method are removed, while all other components of the framework are kept identical. Implementation details are provided in Appendix C, and additional experimental details are described in Appendix D.
We first consider an initial value problem arising from hydrogen combustion chemistry, formulated as a coupled system of ordinary differential equations
(10)
where \( \boldsymbol{Y}\) denotes the vector of species mass fractions and \( \omega_i(\boldsymbol{Y})\) represents the reaction rate of species \( i\) computed through detailed chemical kinetics.
We solve this system using a PINN by enforcing the governing equations at collocation points sampled over the temporal domain. To account for the sharp time derivatives occurring at early times, collocation points are sampled in logarithmic time coordinates, and logarithmic transformation is applied to the input. This setting serves as a baseline test for assessing the ability of the proposed framework to handle stiff reaction dynamics in a purely forward setting.
Next, we evaluate the proposed framework on a boundary value problem corresponding to the steady-state formulation of the reaction–diffusion PDE describing hydrogen combustion. The system reduces to a second-order ordinary differential equation of the form
(11)
where \( Z \in [0,1] \) is the mixture fraction and \( \frac{\chi(Z, \alpha)}{2}\) is the scalar dissipation rate which is a function of strain rate \( \alpha\) and mixture fraction \( Z\) . Boundary value problems are known to be particularly challenging for both PINNs and classical numerical solvers.
We consider a reaction–diffusion partial differential equation describing hydrogen combustion in a flamelet configuration. The governing equation couples diffusion in mixture–fraction space with nonlinear reaction terms arising from detailed chemical kinetics and is given by
(12)
In the forward setting, the strain rate is fixed and the network is trained to approximate the corresponding flamelet solution for a given parameter value. This fixed-parameter formulation is used as the primary benchmark for performing the ablation study. In the inverse setting, the strain rate is treated as an unknown parameter and is inferred by simultaneously minimizing the physics-informed loss and the data loss.
Beyond these fixed-parameter experiments, we additionally consider a parameterized formulation in which the strain rate is provided as an explicit input to the network. This enables learning a continuous family of flamelet solutions across a prescribed range of strain rates using a single neural network. The parameterized setting serves as an extension of the core methodology and illustrates the flexibility of the proposed framework in handling forward problem across parameter space.
In Table 1 we show the quantitative results for all of the forward problem settings we considered in our evaluation. Namely, we compare our proposed framework with regular PINN training (without any chemistry specific modifications), and also with causality-based loss scaling approach as a competitive baseline. We compute the mean absolute error and also the relative \( L_2\) error on initial value problem, boundary value problem, reaction-diffusion PDE in both non-parametrized setting with a fixed strain rate and with the parametrized setting with strain rate used as input. Our results show that the PINN with our proposed modifications outperforms both baselines across all the test cases.
| Vanilla PINN | Causality-based PINN | Proposed framework | ||||
| MAE | Rel. \( L_2\) | MAE | Rel. \( L_2\) | MAE | Rel. \( L_2\) | |
| IVP ODE | 6.60e-01 | 3.81e-00 | 7.91e-03 | 4.80e-02 | 1.10e-03 | 4.37e-03 |
| BVP ODE (\( \alpha{=}1\) ) | 2.17e-00 | 7.27e-00 | – | – | 2.80e-03 | 1.71e-02 |
| BVP ODE (\( \alpha{=}100\) ) | 2.03e-00 | 6.79e-00 | – | – | 5.20e-03 | 2.26e-02 |
| PDE (\( \alpha{=}1\) ) | 8.81e-02 | 7.04e-01 | 4.03e-03 | 2.62e-02 | 2.41e-04 | 1.90e-03 |
| PDE (\( \alpha{=}100\) ) | 7.17e-02 | 5.19e-01 | 3.32e-03 | 2.22e-02 | 6.59e-05 | 3.79e-04 |
| Parametrized PDE (\( \alpha \in [1,100]\) ) | 6.33e-03 | 4.10e-02 | 2.40e-03 | 1.52e-02 | 8.74e-04 | 7.63e-03 |
Figure 2 shows the solution of the stiff initial value problem obtained using the proposed PINN. Despite the presence of rapid transients at early times, the network accurately captures the evolution of all species across the full temporal domain. This experiment demonstrates that the proposed framework can robustly handle highly stiff reaction dynamics in a purely forward setting.
Figure 3 presents the solution of the boundary value problem ODE corresponding to the steady-state solution of the PDE flamelet formulation. We note that although this problem does not exhibit sharp temporal transients and the solution appears relatively smooth, training without residual scaling led to severe convergence issues and inaccurate results (not shown).
Figure 4 shows the solution of the reaction-diffusion flamelet PDE for a fixed strain rate. The proposed PINN accurately reproduces the species evolution across the full spatiotemporal domain without exhibiting spurious oscillations or instability. Importantly, stable convergence is achieved in a regime characterized by extremely stiff and sharply varying reaction dynamics, which is widely regarded as challenging for standard PINN formulations. To the best of our knowledge, this constitutes the first successful application of PINNs to reaction-diffusion PDE with detailed chemistry. This experiment therefore provides a representative benchmark for evaluating stabilization strategies in stiff reactive PDEs.
Table 2 reports the results of the inverse problem for the reaction-diffusion flamelet PDE, where the strain rate \( \alpha\) is inferred from observed solution data. The evaluation is performed for three representative ground-truth values, \( \alpha\in{1,10,100}\) , and different additive noise levels applied to the observational data.
Across all cases, the PINN accurately recovers the true strain rate, even in the cases with large levels of additive noise. These results demonstrate that the proposed framework is able to identify global physical parameters directly from the PDE residual, even in the presence of stiff reaction dynamics and highly nonlinear source terms.
| True strain rate \( \alpha\) | |||
| 1.0 | 10.0 | 100.0 | |
| 0.0 | 0.994 | 10.002 | 99.624 |
| 0.01 | 0.995 | 9.992 | 99.598 |
| 0.1 | 0.998 | 10.022 | 99.281 |
| 1.0 | 1.027 | 10.199 | 94.657 |
Figure 6 compares the performance of the parameterized PINN against a vanilla PINN baseline across a range of strain rates. The error is resolved both by strain rate and by individual chemical species. Two trends are immediately apparent. First, lower strain rates are consistently more difficult for PINNs to learn, which is consistent with the increased stiffness of the underlying reaction dynamics in this regime. Second, the relative \( L_2\) error is notably higher for hydrogen compared to other species. This can be attributed to its very low mass fraction over large parts of the domain, which leads to an imbalance in the residual contributions and makes accurate learning more challenging for standard PINN formulations.
In contrast, the proposed framework significantly reduces the error across all strain rates and species. In particular, the improvement is most pronounced at low strain rates, where stiffness effects are strongest and vanilla PINNs struggle to converge to physically meaningful solutions.
To further illustrate the quality of the learned parameterized representation, Figure 5 shows the predicted flamelet solution evaluated at a representative strain rate of \( \alpha = 100\) . The parameterized PINN accurately reproduces the spatiotemporal structure of the solution, demonstrating that a single network can represent a continuous family of flamelet solutions.
In practical combustion simulations, flamelet-based models often rely on precomputed solution tables spanning a range of strain rates and additional physical parameters. While tabulation is effective in low-dimensional settings, the size of such tables can grow rapidly as more parameters are introduced, leading to increased memory requirements and interpolation complexity. Although the present experiments consider a regime where tabulation remains practical, the parameterized PINN formulation explored here suggests a potential path toward memory efficient representation of the flamelet equation solutions. With further development, such neural representations could enable simulations that would otherwise be impractical due to prohibitive memory requirements.
For the forward reaction-diffusion PDE, we perform an ablation study to assess the contribution of chemistry-related modifications introduced in the proposed framework. Starting from the full model, we selectively remove individual components while keeping all other training settings fixed.
Specifically, we examine the impact of removing (i) differentiability of the chemistry solver, (ii) residual scaling of the PDE loss, (iii) the hard constraint on the inert species mass fraction, and (iv) the mass conservation constraint.
Table 3 reports the mean absolute error (MAE) and relative \( L_2\) error for the full model and the ablated variants. Removing any of the proposed components leads to a significant degradation of accuracy.
| Method | MAE | Rel. \( L_2\) |
| Proposed PINN | 2.41e-04 | 1.90e-03 |
| Non-diff. chem. solver | 6.77e-03 | 3.26e-02 |
| No residual scaling | 2.85e-03 | 1.71e-02 |
| No inert specie constraint | 2.50e-03 | 1.65e-02 |
| No mass conservation | 3.19e-03 | 2.03e-02 |
In this work, we proposed a general framework for integrating differentiable chemistry solvers with physics-informed neural networks to solve combustion-related differential equations involving detailed chemical kinetics. A recent comprehensive review [26] confirms that, to the best of our knowledge, this represents the first successful demonstration of PINNs applied to time-dependent reaction-diffusion PDEs involving physically grounded chemical kinetics in a purely forward setting using physics-based losses only.
We further demonstrated that the proposed framework can be extended to a parameterized formulation, enabling a single neural network to approximate a continuous family of solutions across a wide range of strain rates. This capability suggests a potential pathway toward integrating PINN-based chemistry models into high-fidelity CFD simulations, where detailed chemical kinetics are traditionally represented through discrete precomputed lookup tables. Such neural representations could offer a more flexible and memory-efficient alternative for modeling combustion chemistry across parameter spaces.
The present study is limited to global one-step Arrhenius reaction mechanisms. While the proposed framework is, in principle, directly applicable to multi-step and detailed chemical kinetics, extending it to such settings introduces significant additional challenges for PINN optimization due to increased stiffness and system dimensionality. Addressing these challenges remains an important direction for future work.
The financial support by the Austrian Federal Ministry of Economy, Energy and Tourism, and the Christian Doppler Research Association is gratefully acknowledged. Furthermore, the authors want to thank Prof. Ricardo Novella and Prof. José Maria García-Oliver from the Clean Mobility & Thermofluids department of the Universitat Politècnica de València for their input on flamelet data generation.
The work Franz M. Rohrhofer was partially funded by the European Union’s HORIZON Research and Innovation Programme under grant agreement No 101120657, project ENFIELD (European Lighthouse to Manifest Trustworthy and Green AI). Know Center is a COMET competence center that is financed by the Austrian Federal Ministry of Innovation, Mobility and Infrastructure (BMIMI), the Austrian Federal Ministry of Economy, Energy and Tourism (BMWET), the Province of Styria, the Steirische Wirtschaftsförderungsgesellschaft m.b.H. (SFG), the Vienna business agency and the Standortagentur Tirol . The COMET programme is managed by the Austrian Research Promotion Agency FFG.
The underlying chemistry of the reacting system can be described as follows. Let \( N_r\) be the number of reactions and \( N_s\) the number of species, a general reaction takes the form
(13)
where \( \nu'_{i,r}\) and \( \nu''_{i,r}\) are the molar stoichiometric coefficients with
(14)
The reaction rate \( \dot{q}_r\) at which each reaction progresses is the difference of the forward and backward reaction rates as
(15)
where \( k_{f,r}\) and \( k_{b,r}\) denote the forward and backward reaction rates, respectively. The specific reaction rates are determined by the Arrhenius law
(16)
where \( A\) and \( \beta\) are constants, \( E_a\) is the activation energy, \( R\) is the universal gas constant, and \( T\) is the temperature. From this law, the strong influence of the temperature on the reaction can be seen. At low temperatures, the reaction velocity is very low while with increasing temperature, reactions become extremely fast leading to the considerable stiffness of the formalism to be solved. The species reaction rate \( \dot{\omega}_i\) is calculated by
(17)
with \( W_i\) being the molecular weight of species \( i\) .
Additionally to the chemical kinetics, the modeling of the thermodynamic properties of the desired species is required. This is utilized by applying the NASA 7-coefficient polynomial parameterization to compute the species standard state thermodynamic properties \( \hat{c}^0_p\left(T\right)\) , \( \hat{h}^0\left(T\right)\) , and \( \hat{s}^0\left(T\right)\) .
In the present study, we focused on hydrogen combustion, since its fast kinetics represents major challenges for the PINN in solving the resulting stiffness of the problem. We model the chemistry with a single-step global reaction
(18)
with the Arrhenius constants, cf.(16), \( A=1.8\mathrm{e}19\) , \( \beta=0\) and \( E_a=17614\) . The thermodynamic properties are taken from the well-known San Diego hydrogen mechanism [28].
In order to use the differentiable solver reactorch within our boundary-value and reaction–diffusion formulations, we extended the library to support an enthalpy-based thermodynamic state. The original API specifies the state via pressure \( p\) , temperature \( T\) , and species mass fractions \( \boldsymbol{y}\) , whereas our formulation prescribes mixture enthalpy as a function of mixture fraction and the enthaply values of the fuel and oxidizer, respectively, as
(19)
We therefore added an interface that accepts \( \left(p,h,\boldsymbol{y}\right)\) using a root finder for solving the NASA 7-coefficient polynomial. Of course, this root finder has to be differentiable to maintain the end-to-end gradient propagation.
Across all cases, the reference data are produced with classical numerical solvers. Tolerances and resolutions are chosen such that the remaining discretization/solver error is negligible compared to the reported PINN errors.
The reference trajectories are generated with SciPy’s solve_ivp integrator. We use an implicit method (Radau) to resolve the stiff behavior of the hydrogen reaction. Absolute and relative tolerances are set to stringent values (atol=1e-12, rtol=1e-8) and the solution is evaluated on a dense temporal grid.
For generating the reference data to validate the boundary value problem and the reaction-diffusion PDE, we used the flamelet solver ZLFLAM [28]. The solver uses Chemkin to calculate the reaction source terms \( \dot\omega_i\left(\boldsymbol{Y}\right)\) based on the specified chemical mechanism. Furthermore, the "Twopnt program for boundary value problems" presented in [35] was used for solving the boundary value problem in \( Z\) . The unsteady solution is performed applying the DDASSL solver [36] using the exact block tridiagonal Jacobian matrix. For the discretization of \( Z\) , we use a non-conformal resolution 201 in the boundary value problem and the reaction-diffusion PDE.
The scalar dissipation rate \( \chi\) , denoting the diffusion coefficient in (11) and (12) is calculated according to [28] by
(20)
with \( \mathrm{erfc}\left(x\right)=1-\mathrm{erf}\left(x\right) = \left(2/\sqrt{\pi}\right)\int_x^\infty \exp\left(-y^2\right)\mathrm{d}y\) .
The solution for the initial value problem is approximated using a fully connected neural network with four hidden layers and 50 neurons per layer. All hidden layers use the hyperbolic tangent activation function, while an exponential activation is applied to the output layer to ensure non-negativity of the predicted mass fractions and also to allow applying constraint for the mass fraction sum. The network is trained using the Adam optimizer with a fixed learning rate of \( 10^{-3}\) for 20{,}000 epochs.
To improve numerical stability in the presence of sharp temporal transients at early times, the temporal input is transformed to logarithmic scale. Specifically, the network takes as input \( \log(t + \varepsilon)\) , where \( \varepsilon > 0\) is a small constant to avoid singularities at \( t=0\) . This transformation allows the network to better resolve the stiff initial dynamics commonly encountered in reactive systems.
Hard constraints are imposed to enforce the initial conditions, mass conservation, and the constancy of the inert species mass fraction. Let \( \boldsymbol{y}(t) \in \mathbb{R}^N\) denote the vector of species mass fractions, and let \( y_{\mathrm{N_2}}\) denote the inert nitrogen mass fraction. The network predicts only the reactive species mass fractions through an unconstrained neural output \( \boldsymbol{f}_\theta(\log t + \varepsilon) \in \mathbb{R}^{N-1}\) . The reactive species mass fractions are constructed as
(21)
which enforces the initial condition exactly at \( t=0\) . The inert species mass fraction is fixed as
(22)
and mass conservation is enforced by normalizing the reactive species such that
(23)
The full mass-fraction vector is then obtained by concatenating the reactive and inert species. This construction guarantees that all predicted mass fractions are non-negative, satisfy the prescribed initial conditions, and sum to \( 1\) for all \( t\) .
Physics-informed training is performed by minimizing the mean squared residual of the governing ODE evaluated at collocation points sampled logarithmically in time. Specifically, we draw samples \( \xi_j \sim \mathcal{U}(0,1)\) and map them to physical time as
(24)
where \( [t_{\min}, t_{\max}]\) denotes the temporal domain, \( N_{\mathrm{col}}\) is the number of collocation points, and \( \varepsilon > 0\) is a small constant to avoid numerical singularities. This sampling strategy concentrates collocation points near early times while still covering the full temporal domain, enabling accurate enforcement of the physics residual in regions with sharp gradients. Number of collocation points at each epoch is set to \( 1024\) .
For solving the boundary value problem we again use the fully connected architecture with 4 hidden layers of 50 units each, hyperbolic tangent activation function in the hidden layers and exponential function in the output layer. In this case we use only soft constraints for the boundary conditions as it, counter-intuitively, improved the training. We again train using the Adam optimizer with a fixed learning rate \( 0.001\) and train the model for \( 20,000\) epochs.
We again use hard constraints for the inert species by defining the mass fraction as
(25)
We use the same constraint outlined in (23) for mass conservation.
To improve conditioning, the input mixture fraction is scaled to \( [-1,1]\) via
(26)
and the network prediction is evaluated as \( \boldsymbol{y}_\theta(Z) = f_\theta(\hat{Z})\) .
We enforce the ODE residual at collocation points sampled uniformly in the domain. Specifically, we draw \( u_j \sim \mathcal{U}(0,1)\) and map to physical mixture fraction via
(27)
Derivatives with respect to \( Z\) are obtained through automatic differentiation.
Dirichlet boundary conditions are enforced at \( Z=0\) and \( Z=1\) using the known stream compositions (coflow and fuel states). Let \( \boldsymbol{y}_{\mathrm{coflow}}\) and \( \boldsymbol{y}_{\mathrm{fuel}}\) denote the corresponding mass-fraction vectors. We include a boundary loss term
(28)
In addition, we impose soft Neumann-type constraints motivated by the physical boundary behavior of the major reactants. Specifically, we penalize the derivatives
(29)
through the auxiliary loss
(30)
where the derivatives are computed by automatic differentiation of the network output with respect to \( Z\) .
The physics-informed loss consists of the mean squared ODE residual evaluated at collocation points, together with boundary penalties:
(31)
where \( \lambda_{\mathrm{f}}\) balances the residual and boundary terms (in our implementation we use a fixed scaling factor of \( 0.1\) ).
In all cases, the input variables are scaled to the interval \( [-1,1]\) . Collocation points in time (\( t\) ) and mixture fraction (\( Z\) ) are sampled using Latin hypercube sampling, with \( 1024\) collocation points per epoch.
Hard constraints for PDE PINN are implemented as
(32)
For both the forward and inverse problems with a fixed strain rate, we use the same network architecture: a fully connected neural network with four hidden layers of 50 units each, hyperbolic tangent activation functions in the hidden layers, and an exponential activation function at the output. In both cases, training is performed using the Adam optimizer with a fixed learning rate of \( 0.0005\) for \( 50{,}000\) epochs. The inert species mass fraction is defined in the same manner as in the boundary value problem, as shown in (25), and mass conservation is enforced as described in (23). The only difference between the two settings is that the forward problem is trained solely using the physics loss, minimizing the scaled PDE residuals, whereas in the inverse setting the loss function is defined as
(33)
where the loss scaling factor \( \lambda_{\mathrm{f}}\) is set to \( 0.001\) . In the inverse setting, the logarithm of the strain rate is treated as a learnable parameter and is used for computing the physics loss.
For the parameterized case, the strain rate is included explicitly as an input to the network. As outlined in the main part of the paper, a single strain rate is sampled per epoch and concatenated with the sampled collocation points for \( t\) and \( Z\) . We employ a specialized architecture for the parameterized case as described in [23]. This architecture uses a branched network structure, in which the parameter input (\( \alpha\) ) and the coordinate inputs (\( t\) , \( Z\) ) are processed by separate encoder networks. The encoded outputs are then concatenated and passed through a decoder network. In our implementation, both encoder networks and the decoder consist of two hidden layers with 50 units each.
For all of the systems we used in our experiments, we need to define pressure, which we set to normal atmospheric pressure of \( 101325.0\) Pa. In the case of initial value problem we set the initial temperature to \( 1000\) K and initial mass fractions to:
For both the boundary value problem and the reaction-diffusion PDE, the flamelet formulation is defined by two opposing streams corresponding to the fuel and coflow (oxidizer) states. The fuel and coflow states are specified by their temperature and composition according to [28] as
Using these definitions, the exact thermochemical states of the fuel and coflow streams are computed with Cantera. The resulting species mass fractions define Dirichlet boundary conditions in mixture-fraction space,
(34)
The corresponding mass-fraction values obtained from the equilibrium evaluation are
For all results reported in this paper, a global random seed of \( 1\) is used for every training run. For the residual loss scaling defined in (9), we use the default hyperparameter value \( \lambda = 1\) and do not perform additional tuning. Training is always preformed on a time window \( t \in [0, 0.02]\) in which most of the reaction has already taken place.
For the baseline comparisons reported in Table 1, all methods share the same network and training settings, except for the differences explicitly described below. In the vanilla PINN baseline, we do not employ the residual scaling defined in (9), the mass conservation constraint in (23), nor the hard constraint on the inert species mass fraction.
For the parametrized problem, the vanilla baseline does not use the parameterized architecture of [23], but instead employs a fully connected neural network with four hidden layers of 70 units each, resulting in the same depth and a comparable number of trainable parameters.
For causality-based training, we retain all components of the proposed framework except for the residual scaling based on the reaction term. The remaining modifications are orthogonal to the causality-based weighting strategy and are therefore kept fixed to ensure a fair comparison. The causality-based loss function is defined as
(35)
with weights
(36)
The hyperparameter \( \gamma\) is manually tuned on the parametrized PDE case and is found to have little influence on training performance within the tested range \( \gamma \in [10^{-4}, 1]\) . Consequently, we fix \( \gamma = 1\) for all experiments involving causality-based training.
As outlined in the main text, inverse problem experiments aim to infer the unknown strain rate from observed solution data. These experiments are conducted both in the noise-free setting and in the presence of additive noise. When noise is introduced, synthetic observations are generated by perturbing each species mass fraction independently according to
(37)
where \( Y_j\) denotes the mass fraction of species \( j\) , \( \epsilon\) is the prescribed noise level, and \( \mathrm{std}(Y_j)\) is the standard deviation of species \( j\) computed over the entire dataset. This construction ensures that the noise magnitude is scaled appropriately for each species based on its variability.
Figures 7, 8, and 9 illustrate the behavior of a vanilla physics-informed neural network applied to the initial value problem ODE, the boundary value problem ODE, and the reaction–diffusion flamelet PDE, respectively. In all cases, training without the chemistry-aware modifications introduced in this work fails to converge to a physically meaningful solution and results in severe inaccuracies across the domain. These qualitative failures highlight the necessity of domain-specific stabilization strategies when applying PINNs to extremely stiff reactive systems.
Figure 10 compares numerical reference solutions with predictions obtained using the proposed parameterized PINN across multiple strain rates. A single trained model accurately represents a continuous family of flamelet solutions over a wide range of strain rates, demonstrating its ability to capture the parametric structure of the solution manifold.
[1] George Em Karniadakis and Ioannis G Kevrekidis and Lu Lu and Paris Perdikaris and Sifan Wang and Liu Yang Physics-informed machine learning Nature Reviews Physics 2021 3 6 422–440
[2] Samuel Greydanus and Misko Dzamba and Jason Yosinski Hamiltonian neural networks Advances in neural information processing systems 2019 32
[3] Miles Cranmer and Sam Greydanus and Stephan Hoyer and Peter Battaglia and David Spergel and Shirley Ho Lagrangian Neural Networks ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations 2020
[4] Ricky TQ Chen and Yulia Rubanova and Jesse Bettencourt and David K Duvenaud Neural ordinary differential equations Advances in neural information processing systems 2018 31
[5] M. Raissi and P. Perdikaris and G.E. Karniadakis Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations Journal of Computational Physics 2019 378 686-707
[6] Salvatore Cuomo and Vincenzo Schiano Di Cola and Fabio Giampaolo and Gianluigi Rozza and Maziar Raissi and Francesco Piccialli Scientific machine learning through physics–informed neural networks: Where we are and what’s next Journal of Scientific Computing 2022 92 3 88
[7] Lu Lu and Pengzhan Jin and Guofei Pang and Zhongqiang Zhang and George Em Karniadakis Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators Nature machine intelligence 2021 3 3 218–229
[8] Zongyi Li and Nikola Borislavov Kovachki and Kamyar Azizzadenesheli and Kaushik Bhattacharya and Andrew Stuart and Anima Anandkumar and others Fourier Neural Operator for Parametric Partial Differential Equations International Conference on Learning Representations 2021
[9] Zongyi Li and Hongkai Zheng and Nikola Kovachki and David Jin and Haoxuan Chen and Burigede Liu and Kamyar Azizzadenesheli and Anima Anandkumar Physics-informed neural operator for learning partial differential equations ACM/IMS Journal of Data Science 2024 1 3 1–27
[10] Nikola Kovachki and Zongyi Li and Burigede Liu and Kamyar Azizzadenesheli and Kaushik Bhattacharya and Andrew Stuart and Anima Anandkumar Neural operator: Learning maps between function spaces with applications to pdes Journal of Machine Learning Research 2023 24 89 1–97
[11] Yuanran Zhu and Yu-Hang Tang and Changho Kim Learning stochastic dynamics with statistics-informed neural network Journal of Computational Physics 2023 474 111819
[12] Seid Koric and Asha Viswantah and Diab W Abueidda and Nahil A Sobh and Kamran Khan Deep learning operator network for plastic deformation with variable loads and material properties Engineering with Computers 2024 40 2 917–929
[13] Shahed Rezaei and Reza Najian Asl and Shirko Faroughi and Mahdi Asgharzadeh and Ali Harandi and Rasoul Najafi Koopas and Gottfried Laschet and Stefanie Reese and Markus Apel A finite operator learning technique for mapping the elastic properties of microstructures to their mechanical deformations International Journal for Numerical Methods in Engineering 2025 126 1 e7637
[14] Bruno Sportisse An analysis of operator splitting techniques in the stiff case Journal of computational physics 2000 161 1 140–168
[15] Aditi Krishnapriyan and Amir Gholami and Shandian Zhe and Robert Kirby and Michael W Mahoney Characterizing possible failure modes in physics-informed neural networks Advances in neural information processing systems 2021 34 26548–26560
[16] Simone Monaco and Daniele Apiletti Training physics-informed neural networks: One learning to rule them all? Results in Engineering 2023 18 101023
[17] Jian Cheng Wong and Chin Chun Ooi and Abhishek Gupta and Yew-Soon Ong Learning in sinusoidal spaces with physics-informed neural networks IEEE Transactions on Artificial Intelligence 2022 5 3 985–1000
[18] Xintao Chai and Wenjun Cao and Jianhui Li and Hang Long and Xiaodong Sun Overcoming the spectral bias problem of physics-informed neural networks in solving the frequency-domain acoustic wave equation IEEE Transactions on Geoscience and Remote Sensing 2024
[19] Arka Daw and Jie Bu and Sifan Wang and Paris Perdikaris and Anuj Karpatne Rethinking the importance of sampling in physics-informed neural networks arXiv preprint arXiv:2207.02338 2022
[20] Weiqi Ji and Weilun Qiu and Zhiyu Shi and Shaowu Pan and Sili Deng Stiff-pinn: Physics-informed neural network for stiff chemical kinetics The Journal of Physical Chemistry A 2021 125 36 8098–8106
[21] Sifan Wang and Yujun Teng and Paris Perdikaris Understanding and mitigating gradient flow pathologies in physics-informed neural networks SIAM Journal on Scientific Computing 2021 43 5 A3055–A3081
[22] Suryanarayana Maddu and Dominik Sturm and Christian L Müller and Ivo F Sbalzarini Inverse Dirichlet weighting enables reliable training of physics informed neural networks Machine Learning: Science and Technology 2022 3 1 015026
[23] Woojin Cho and Minju Jo and Haksoo Lim and Kookjin Lee and Dongeun Lee and Sanghyun Hong and Noseong Park Parameterized Physics-informed Neural Networks for Parameterized PDEs Proceedings of the 41st International Conference on Machine Learning 2024 Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix 235 Proceedings of Machine Learning Research 8510–8533 PMLR
[24] Franz M Rohrhofer and Stefan Posch and Clemens Gö\ssnitzer and Bernhard C Geiger Approximating families of sharp solutions to Fisher's equation with physics-informed neural networks Computer Physics Communications 2025 307 109422
[25] Chengping Rao and Pu Ren and Qi Wang and Oral Buyukozturk and Hao Sun and Yang Liu Encoding physics to learn reaction–diffusion processes Nature Machine Intelligence 2023 5 7 765–779
[26] Jiahao Wu and Xutun Wang and Guihua Zhang and Jiayue Liu and Xin Li and Yang Zhang and Hai Zhang and Junfu Lyu and Bing Wang and Yuxin Wu Physics-informed machine learning for combustion: A review arXiv preprint arXiv:2509.03347 2025
[27] Zhen Cao and Kai Liu and Kun Luo and Sifan Wang and Liang Jiang and Jianren Fan Surrogate modeling of multi-dimensional premixed and non-premixed combustion using pseudo-time stepping physics-informed neural networks Physics of Fluids 2024 36 11
[28]
[29] Philipp Holl and Nils Thuerey \( {\Phi}_{\text{Flow}}\) (PhiFlow): Differentiable Simulations for PyTorch, TensorFlow and Jax International Conference on Machine Learning 2024 PMLR
[30] Yuanming Hu and Luke Anderson and Tzu-Mao Li and Qi Sun and Nathan Carr and Jonathan Ragan-Kelley and Frédo Durand DiffTaichi: Differentiable Programming for Physical Simulation International Conference on Learning Representations 2020
[31] C Daniel Freeman and Erik Frey and Anton Raichuk and Sertan Girgin and Igor Mordatch and Olivier Bachem Brax–a differentiable physics engine for large scale rigid body simulation Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1 2021
[32] Sifan Wang and Shyam Sankaran and Paris Perdikaris Respecting causality is all you need for training physics-informed neural networks arXiv preprint arXiv:2203.07404 2022
[33] Kurt Hornik Approximation capabilities of multilayer feedforward networks Neural networks 1991 4 2 251–257
[34] Weiqi Ji and Sili Deng ReacTorch: A Differentiable Reacting Flow Simulation Package in PyTorch https://github.com/DENG-MIT/reactorch 2020
[35] JF Grcar Sandia National Laboratories Report SAND91-8230 Sandia National Laboratories, Livermore 1991
[36] Linda R Petzold Description of DASSL: a differential/algebraic system solver Sandia National Labs., Livermore, CA (USA) 1982