Decoding the ‘black box’ of data-driven models in desalination technology|Asia Research News

Data-driven models precisely predict the charging phase of a desalination technology, bridging data-driven insights and physicochemical principles and laying a strong foundation for optimizing system parameters.

By National Taiwan University

Electro-driven water desalination technologies are recognized as essential solutions to global freshwater scarcity. Among these, membrane capacitive deionization (MCDI) is considered an energy-efficient method for treating brackish and low-salinity water.

However, as MCDI systems scale up from laboratory investigations to pilot-scale demonstrations, optimizing operational parameters heavily relies on empirical knowledge and “trial-and-error” approaches. This inefficiency significantly hinders broader commercial development.

A study published in the journal Water Research by a research team led by Prof. Chia-Hung Hou of National Taiwan University applies a data-driven model to an MCDI system to predict its desalination performance.

The team used machine learning (ML) models to predict the effluent conductivity of a scaled-up MCDI system and employed Shapley Additive Explanations (SHAP) to quantify each feature's contribution, decoding both the “black box” behavior of the ML models and the mechanisms of the MCDI system.

The researchers developed a reliable ML workflow tailored to MCDI process prediction. After rigorous data preprocessing, four algorithms were adopted: two ML models, random forest (RF) and extreme gradient boosting (XGBoost), and two deep learning (DL) models, multilayer perceptron (MLP) and long short-term memory (LSTM).

Through systematic hyperparameter tuning, the team effectively reduced the risk of overfitting, striking an optimal balance between predictive accuracy and model stability.
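The article does not describe the search space or validation scheme the team used. Purely as an illustration of the general idea, the sketch below tunes a single hyperparameter by k-fold cross-validation in plain Python. A k-nearest-neighbour regressor stands in for the study's models, and the synthetic data, the candidate values, and all function names are assumptions made for this example.

```python
import math
import random


def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_val_rmse(data, k, n_folds=5):
    """Mean RMSE of a k-NN regressor over n_folds held-out folds."""
    fold_scores = []
    for fold in range(n_folds):
        # Every n_folds-th sample is held out; the rest is used for fitting.
        held_out = data[fold::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        sq_err = [(knn_predict(train, x, k) - y) ** 2 for x, y in held_out]
        fold_scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(fold_scores) / n_folds


# Synthetic (charging time, conductivity) pairs, invented for illustration.
rng = random.Random(0)
data = []
for _ in range(100):
    t = rng.uniform(0.0, 60.0)
    data.append((t, 1000.0 - 8.0 * t + rng.gauss(0.0, 20.0)))

# Hypothetical grid: a small k tends to fit noise, a large k tends to oversmooth.
candidate_ks = [1, 3, 5, 10, 20, 50]
cv_rmse = {k: cross_val_rmse(data, k) for k in candidate_ks}
best_k = min(cv_rmse, key=cv_rmse.get)
print(best_k, round(cv_rmse[best_k], 2))
```

Scoring each candidate only on data it was not fitted to is what lets the search penalize settings that memorize the training set.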

Among the four models, the RF model demonstrated superior robustness and predictive accuracy, as validated by comprehensive metrics including the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE).
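For readers unfamiliar with these three metrics, the sketch below computes them from their standard definitions. The conductivity values are invented for illustration and are not data from the study, and the function name is an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, and MAE for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n,
    }


# Hypothetical effluent conductivity readings (uS/cm), invented for illustration.
observed = [500.0, 450.0, 400.0, 350.0]
predicted = [490.0, 460.0, 410.0, 340.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

R² measures the share of variance explained (closer to 1 is better), while RMSE and MAE are errors in the units of the target (closer to 0 is better); RMSE weights large errors more heavily than MAE.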

The SHAP analysis then revealed that charging time was the most influential variable, followed by current and pH. This ranking aligns with experimental observations and known electrochemical behavior, indicating that the model captured the underlying physics.
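To make the idea of a feature contribution concrete, the sketch below computes exact Shapley values for a toy three-feature model by enumerating every feature coalition. It is not the study's model or the SHAP library: the toy formula, the feature names, and the single baseline point are assumptions chosen for illustration, whereas SHAP typically averages over a background dataset and uses faster model-specific algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance`, relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for feature in names:
        others = [k for k in names if k != feature]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                gain = value(set(subset) | {feature}) - value(set(subset))
                total += weight * gain
        phi[feature] = total
    return phi


def toy_model(x):
    """Hypothetical conductivity model, invented for this example only."""
    return (1000.0
            - 4.0 * x["charging_time_min"] * x["current_a"]
            + 10.0 * (x["ph"] - 7.0))


instance = {"charging_time_min": 30.0, "current_a": 1.5, "ph": 7.5}
baseline = {"charging_time_min": 10.0, "current_a": 1.0, "ph": 7.0}
phi = shapley_values(toy_model, instance, baseline)

# Rank features by the magnitude of their contribution.
ranking = sorted(phi, key=lambda k: abs(phi[k]), reverse=True)
print(phi, ranking)
```

In this toy case the contributions come out at about -100 for charging time, -40 for current, and +5 for pH, and they sum to the difference between the model's output at the instance and at the baseline (-135). The ordering mirrors the one reported in the study only because the toy formula was built that way; it is not evidence for the finding.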

Taken together, this study demonstrates how interpretable data-driven models can be applied to scaled-up MCDI systems for desalination predictions, paving the way for parameter optimization and for smarter, more efficient, and scalable water treatment solutions.

“With the ML workflow established, we are already planning a follow-up study that incorporates broader performance metrics, such as removal efficiency, water recovery, and energy consumption, to optimize MCDI parameters,” says Huei-Cih Liu, the study’s first author.

“By integrating data-driven models with a scaled-up MCDI system, this study represents a cornerstone for developing AI-assisted, practical control and decision-making for MCDI operations in desalination or water reuse in Taiwan,” says Dr. Chia-Hung Hou, distinguished professor of environmental engineering and corresponding author of the study.

 

Prof. Chia-Hung Hou's email address: chiahunghou@ntu.edu.tw
