Prediction of NOx Emissions from Compression Ignition Engines Using Ensemble Learning-Based Models with Physical Interpretability

Harish Panneer Selvam, Shashi Shekhar, William F. Northrop

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation


On-board diagnostics (OBD) data contain valuable information, including real-world measurements of vehicle powertrain parameters. These data can be used to gain a richer, data-driven understanding of complex physical phenomena such as emissions formation during combustion. In this study, we develop a physics-based machine learning framework to predict and analyze trends in engine-out NOx emissions from diesel and diesel-hybrid heavy-duty vehicles. This model differs from the black-box machine learning models presented in previous literature because it incorporates engine combustion parameters that allow physical interpretation of the results. Based on chemical kinetics and the characteristics of diffusive combustion, NOx emissions from compression ignition engines depend primarily, and non-linearly, on three parameters: adiabatic flame temperature, the oxygen concentration in the cylinder when the intake valves are closed, and combustion time duration. Here, these parameters were calculated from available OBD data. Linearizing a physics-based NOx emissions prediction model provides an opportunity to evaluate several machine learning regression techniques. The results show that an ensemble learning bagging-type model such as random forest regression (RFR) is highly effective in predicting engine-out NOx emissions measured by the on-board NOx sensor. We also show that real-world OBD data have high heterogeneity, with clustered co-occurrences of vehicle parameters. In terms of accuracy, the developed model provides an average R² value of 0.72 and a mean absolute error (MAE) of 78 ppm across different vehicle OBD datasets, improvements of 53% and 42%, respectively, over non-linear regression models, and its linkage to physical parameters provides the opportunity to interpret the results. We also perform drop-column feature sensitivity analysis for the RFR model and compare prediction results with black-box deep neural network and non-linear regression models. Based on its high accuracy and interpretability, the developed RFR model has potential for use in on-board NOx prediction in engines of varying displacement and design.
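
The drop-column sensitivity analysis mentioned in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is available. This is not the authors' code or data: the feature names, the synthetic target function, and the helper names (make_synthetic_data, fit_and_score, drop_column_importance) are all hypothetical and chosen only for illustration. The idea is to retrain the random forest once per feature with that feature removed and record how much the held-out R² falls relative to the full model.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the three physical parameters named in the abstract.
FEATURES = ["flame_temp", "o2_conc", "comb_duration"]


def make_synthetic_data(n=2000, seed=0):
    """Build a made-up non-linear target. This is NOT the paper's NOx model."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, len(FEATURES)))
    y = np.exp(2.0 * X[:, 0]) * np.sqrt(X[:, 1] + 0.1) * (0.5 + X[:, 2])
    y = y + rng.normal(0.0, 0.05, size=n)  # small measurement-like noise
    return X, y


def fit_and_score(X_train, y_train, X_test, y_test, seed=0):
    """Fit a random forest and return (R^2, MAE) on the held-out split."""
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return r2_score(y_test, pred), mean_absolute_error(y_test, pred)


def drop_column_importance(X, y, seed=0):
    """Retrain without each feature in turn; importance = loss in held-out R^2."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    base_r2, base_mae = fit_and_score(X_tr, y_tr, X_te, y_te, seed)
    drops = {}
    for j, name in enumerate(FEATURES):
        keep = [k for k in range(X.shape[1]) if k != j]
        r2, _ = fit_and_score(X_tr[:, keep], y_tr, X_te[:, keep], y_te, seed)
        drops[name] = base_r2 - r2
    return base_r2, base_mae, drops


if __name__ == "__main__":
    X, y = make_synthetic_data()
    base_r2, base_mae, drops = drop_column_importance(X, y)
    print(f"baseline R^2 = {base_r2:.3f}, MAE = {base_mae:.3f}")
    for name, drop in sorted(drops.items(), key=lambda kv: -kv[1]):
        print(f"{name}: R^2 drop = {drop:.3f}")
```

Unlike impurity-based importances, the drop-column approach requires one full retraining per feature, so its cost grows with the number of features; with only three physical inputs, as described in the abstract, that cost stays small.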

Original language: English (US)
Journal: SAE Technical Papers
Issue number: 2021
State: Published - Sep 5 2021
Event: SAE 15th International Conference on Engines and Vehicles, ICE 2021 - Capri, Italy
Duration: Sep 12 2021 to Sep 16 2021

Bibliographical note

Funding Information:
This material is based upon work supported by the National Science Foundation under Grant No. 1901099. We thank the U.S. Department of Energy’s National Renewable Energy Laboratory for their Fleet DNA support and assistance. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.

Publisher Copyright:
© 2021 SAE International. All Rights Reserved.

