Frontiers of Environmental Science & Engineering

ISSN 2095-2201

ISSN 2095-221X(Online)

CN 10-1013/X

Postal Subscription Code 80-973

2018 Impact Factor: 3.883

Front. Environ. Sci. Eng.    2016, Vol. 10 Issue (2) : 299-310    https://doi.org/10.1007/s11783-015-0825-7
RESEARCH ARTICLE
Evaluation of the k-nearest neighbor method for forecasting the influent characteristics of wastewater treatment plant
Minsoo KIM1, Yejin KIM2, Hyosoo KIM3, Wenhua PIAO1, Changwon KIM1,*
1. Department of Civil and Environmental Engineering, Pusan National University, Busan 609-735, Republic of Korea
2. Department of Civil and Environmental Engineering, Catholic University of Pusan, Busan 609-757, Republic of Korea
3. EnvironSoft Co., Ltd. 511 Industry-University Co., Bld., Pusan National University, Busan 609-735, Republic of Korea
Abstract

The k-nearest neighbor (k-NN) method was evaluated for predicting the influent flow rate and four water quality parameters, namely chemical oxygen demand (COD), suspended solids (SS), total nitrogen (T-N) and total phosphorus (T-P), at a wastewater treatment plant (WWTP). The search range and the approach for determining the number of nearest neighbors (NNs) under dry and wet weather conditions were first optimized based on the root mean square error (RMSE). In terms of data size, the optimum search range was one year. The square root-based (SR) approach was superior to the distance factor-based (DF) approach in determining the appropriate number of NNs. However, the results for both approaches varied slightly depending on the water quality parameter and the weather conditions. The influent flow rate was accurately predicted, within one standard deviation of the measured values. Influent water qualities were well predicted, as assessed by the mean absolute percentage error (MAPE), under both wet and dry weather conditions. For the seven-day prediction, the difference in predictive accuracy was less than 5% in dry weather conditions and slightly worse in wet weather conditions. Overall, the k-NN method was verified to be useful for predicting WWTP influent characteristics.
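The abstract names the method and its error measures but gives no formulas, so the following Python sketch is only an illustration under stated assumptions. It assumes that each query is a window of the most recent `lag` values, that similarity is Euclidean distance, that the forecast is the plain average of the values that followed the nearest windows, and that the SR approach sets k to the integer part of the square root of the number of data points in the search range. The function names and the `lag` parameter are hypothetical and do not come from the article.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def mape(measured, predicted):
    """Mean absolute percentage error in percent (measured values must be nonzero)."""
    n = len(measured)
    return 100.0 * sum(abs((m - p) / m) for m, p in zip(measured, predicted)) / n


def knn_forecast(series, search_range, lag=3):
    """Forecast the next value of a daily series with a simple k-NN rule.

    Assumptions (not taken from the article): lag-window feature vectors,
    Euclidean distance, unweighted mean of the neighbors' successors.
    """
    history = series[-search_range:]
    if len(history) <= lag:
        raise ValueError("search range must be longer than the lag window")

    # Assumed SR rule: k = floor(sqrt(N)), N = data points in the search range.
    k = max(1, math.isqrt(len(history)))
    query = history[-lag:]

    # Every candidate window needs a known successor value.
    candidates = []
    for i in range(len(history) - lag):
        window = history[i:i + lag]
        dist = math.sqrt(sum((w - q) ** 2 for w, q in zip(window, query)))
        candidates.append((dist, history[i + lag]))

    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(successor for _, successor in nearest) / len(nearest)
```

Calling `knn_forecast(flow_history, search_range=365)` on a list of daily flow rates would correspond to the one-year search range reported as optimal; comparing a run of such forecasts against measurements with `rmse` and `mape` mirrors the evaluation described in the abstract.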

Keywords influent wastewater      prediction      data-driven model      k-nearest neighbor method (k-NN)     
Corresponding Author(s): Changwon KIM   
Online First Date: 11 December 2015    Issue Date: 01 February 2016
 Cite this article:   
Minsoo KIM,Yejin KIM,Hyosoo KIM, et al. Evaluation of the k-nearest neighbor method for forecasting the influent characteristics of wastewater treatment plant[J]. Front. Environ. Sci. Eng., 2016, 10(2): 299-310.
 URL:  
https://academic.hep.com.cn/fese/EN/10.1007/s11783-015-0825-7
https://academic.hep.com.cn/fese/EN/Y2016/V10/I2/299
item                  flow rate (m3·d−1)    COD (mg·L−1)    SS (mg·L−1)    T-N (mg·L−1)    T-P (mg·L−1)
maximum               406461.0              95.3            338.3          48.8            7.1
minimum               242953.0              36.9            56.7           13.9            1.8
average               306204.5              61.5            122.7          33.8            3.9
standard deviation    35681.7               10.3            25.6           5.9             0.8
Tab.1  Statistical properties of influent wastewater flow rate and compositions for the N WWTP
weather    search range (d)    DF 1.62    DF 2.0     DF 2.5     DF 5.0     DF 8.0     DF 15.0    DF 30.0    SR
dry        30                  28528.9    28594.5    28623.7    28836.4    29428.1    29326.4    29280.2    27392.7
dry        90                  31260.2    30795.7    30975.7    29914.3    30673.9    31465.0    31628.0    28222.6
dry        365                 32280.4    31745.3    29986.7    27474.4    26769.9    27125.5    26992.9    26695.1
dry        731                 34096.6    33300.4    32185.0    28667.8    27586.2    27763.7    28422.4    27166.0
wet        30                  29731.8    28852.9    29273.3    31968.7    32676.8    33064.5    33838.4    27571.3
wet        90                  29404.0    28898.4    29281.4    35437.7    37489.0    40878.6    44564.0    28683.9
wet        365                 33752.9    31768.6    27258.4    27483.7    31354.8    41215.1    53193.6    26038.8
wet        731                 31293.7    29472.2    28184.0    26949.2    27535.8    32491.2    45090.8    26280.7
Tab.2  RMSE of the influent flow rate prediction for each search range and each approach to setting the number of NNs (DF approach with seven distance factors, SR approach) under dry and wet weather conditions
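As a worked check of the abstract's conclusions against Tab.2, the short Python sketch below transcribes the table and looks up the lowest RMSE for each weather condition. The values are copied from Tab.2; the names `DF_FACTORS`, `TABLE2`, and `best_setting` are hypothetical helpers introduced only for this illustration.

```python
# Distance factors of the DF approach, in the column order of Tab.2.
DF_FACTORS = [1.62, 2.0, 2.5, 5.0, 8.0, 15.0, 30.0]

# (weather, search range in days) -> (RMSE for each DF factor, RMSE for SR).
TABLE2 = {
    ("dry", 30): ([28528.9, 28594.5, 28623.7, 28836.4, 29428.1, 29326.4, 29280.2], 27392.7),
    ("dry", 90): ([31260.2, 30795.7, 30975.7, 29914.3, 30673.9, 31465.0, 31628.0], 28222.6),
    ("dry", 365): ([32280.4, 31745.3, 29986.7, 27474.4, 26769.9, 27125.5, 26992.9], 26695.1),
    ("dry", 731): ([34096.6, 33300.4, 32185.0, 28667.8, 27586.2, 27763.7, 28422.4], 27166.0),
    ("wet", 30): ([29731.8, 28852.9, 29273.3, 31968.7, 32676.8, 33064.5, 33838.4], 27571.3),
    ("wet", 90): ([29404.0, 28898.4, 29281.4, 35437.7, 37489.0, 40878.6, 44564.0], 28683.9),
    ("wet", 365): ([33752.9, 31768.6, 27258.4, 27483.7, 31354.8, 41215.1, 53193.6], 26038.8),
    ("wet", 731): ([31293.7, 29472.2, 28184.0, 26949.2, 27535.8, 32491.2, 45090.8], 26280.7),
}


def best_setting(weather):
    """Return (search range, approach label, RMSE) with the lowest RMSE."""
    best = None
    for (w, search_range), (df_values, sr_value) in TABLE2.items():
        if w != weather:
            continue
        options = [(f"DF({f})", v) for f, v in zip(DF_FACTORS, df_values)]
        options.append(("SR", sr_value))
        for label, value in options:
            if best is None or value < best[2]:
                best = (search_range, label, value)
    return best


for weather in ("dry", "wet"):
    print(weather, best_setting(weather))
```

For both weather conditions the lookup selects the SR approach with the 365-day search range, which agrees with the abstract's statements that a one-year search range was optimal and that the SR approach outperformed the DF approach for the flow rate. Both minima are also smaller than the flow-rate standard deviation of 35681.7 listed in Tab.1.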
Fig.1  Effects of the number of NNs on forecasting the flow rate analyzed using the RMSE based on the DF approach (◆, ●, ▲, █) and the SR approach (◇, ○, △, □) in dry (a) and wet (b) weather conditions
Fig.2  Results for predicting the influent flow rate by using the k-NN method in dry and wet weather conditions
weather    subject          COD         SS          T-N         T-P
dry        search range     365 days    90 days     30 days     731 days
dry        approach         DF (8)      SR          DF (2.5)    SR
dry        number of NNs    74.4*       9.0         7.2*        27.0
dry        RMSE             3.91        7.69        2.02        0.15
wet        search range     731 days    365 days    731 days    365 days
wet        approach         DF (5)      SR          SR          DF (8)
wet        number of NNs    64.1*       19.0        27.0        137.6*
wet        RMSE             5.82        8.91        2.17        0.34
Tab.3  Derivation of appropriate conditions for the search range and the number of NNs to predict influent water qualities (COD, SS, T-N, and T-P)
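This page does not state how the SR approach computes the number of NNs. One reading that reproduces every SR entry in Tab.3 is the integer part of the square root of the search range in days. The Python sketch below checks that reading against the table; it is an inference from the tabulated values, not the authors' stated formula, and `sr_neighbors` is a hypothetical name. The DF entries (marked with * in the table) are not reproduced here because the page gives no rule for them.

```python
import math

# SR rows of Tab.3: (weather, parameter, search range in days, tabulated number of NNs).
SR_ROWS = [
    ("dry", "SS", 90, 9.0),
    ("dry", "T-P", 731, 27.0),
    ("wet", "SS", 365, 19.0),
    ("wet", "T-N", 731, 27.0),
]


def sr_neighbors(search_range_days):
    """Assumed SR rule: floor of the square root of the search range."""
    return math.isqrt(search_range_days)


for weather, parameter, days, tabulated in SR_ROWS:
    matches = sr_neighbors(days) == int(tabulated)
    print(weather, parameter, days, sr_neighbors(days), tabulated, matches)
```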
Fig.3  A comparison of predicted influent qualities (COD, SS, T-N, and T-P) with measured data in dry and wet weather conditions
Fig.4  An evaluation of the accuracy of long-term predictions of influent flow rates and water qualities based on the k-NN method
Fig.5  A comparison of box plots for the application of two approaches in dry and wet weather conditions