Frontiers of Environmental Science & Engineering

ISSN 2095-2201

ISSN 2095-221X(Online)

CN 10-1013/X

Postal Subscription Code 80-973

2018 Impact Factor: 3.883

Front. Environ. Sci. Eng.    2024, Vol. 18 Issue (1) : 8    https://doi.org/10.1007/s11783-024-1768-7
RESEARCH ARTICLE
A novel deep learning framework with variational auto-encoder for indoor air quality prediction
Qiyue Wu1, Yun Geng1, Xinyuan Wang1, Dongsheng Wang2, ChangKyoo Yoo5, Hongbin Liu1,3,4
1. Jiangsu Co-Innovation Center of Efficient Processing and Utilization of Forest Resources, Nanjing Forestry University, Nanjing 210037, China
2. College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
3. Guangxi Key Laboratory of Clean Pulp & Papermaking and Pollution Control, College of Light Industry and Food Engineering, Guangxi University, Nanning 530004, China
4. Laboratory for Comprehensive Utilization of Paper Waste of Shandong Province, Shandong Huatai Paper Co. Ltd., Dongying 257335, China
5. Department of Environmental Science and Engineering, College of Engineering, Kyung Hee University, Yongin 446701, Republic of Korea
Abstract

● PLS-VAER is proposed for modeling PM2.5 concentration.

● Data are decomposed by PLS to capture nonlinear features.

● VAER can improve the predictive performance by variational inference.

● The proposed model provides a novel method for monitoring indoor air quality.

Exposure to poor indoor air poses significant risks to human health and increases morbidity and mortality. Soft measurement modeling offers a stable and accurate way to monitor air pollutants and, in turn, to improve air quality. We propose an indoor air quality prediction model that combines partial least squares (PLS) with a variational auto-encoder regression (VAER) algorithm. In the first step, PLS extracts latent variables from the original data to reduce the negative effects of noise. The extracted variables are then used as inputs to VAER, which improves the accuracy and robustness of the model. A comparison with traditional methods shows that the PLS-VAER model achieves better prediction performance and stability. The root mean square error (RMSE) of PLS-VAER is reduced by 14.71%, 26.47%, and 12.50% compared to single VAER, PLS-SVR, and PLS-ANN, respectively. Additionally, the coefficient of determination (R²) of PLS-VAER improves by 13.70%, 30.09%, and 11.25% compared to single VAER, PLS-SVR, and PLS-ANN, respectively. This research offers an innovative and environmentally friendly approach to monitoring and improving indoor air quality.
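
The first step described in the abstract, extracting latent variables with PLS before any regression, can be illustrated with a short sketch. The code below is a minimal, dependency-free version of single-output PLS (PLS1) using the NIPALS iteration. It is not the authors' implementation: the function name `pls1_scores` and the toy data are hypothetical, and the VAER regressor that would consume the scores is not shown.

```python
import math

def pls1_scores(X, y, n_components):
    """Return PLS1 score vectors (one list per latent variable) via NIPALS."""
    n, p = len(X), len(X[0])
    # Mean-center the inputs and the output.
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]
    y_mean = sum(y) / n
    f = [v - y_mean for v in y]
    scores = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in X that co-varies with y
            break
        w = [v / norm for v in w]
        # Score vector: projection of each sample onto the weight vector.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        # Loadings for X and regression coefficient for y on this score.
        loading = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so the next component explains what is left.
        E = [[E[i][j] - t[i] * loading[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        scores.append(t)
    return scores

# Hypothetical toy data: 6 samples, 3 input variables, 1 output.
X_toy = [[1.0, 2.0, 0.0], [2.0, 1.0, 1.0], [3.0, 5.0, 2.0],
         [4.0, 3.0, 1.0], [5.0, 8.0, 4.0], [6.0, 6.0, 2.0]]
y_toy = [1.0, 1.5, 3.9, 3.1, 6.2, 4.8]

scores = pls1_scores(X_toy, y_toy, n_components=2)
# Successive NIPALS score vectors are mutually orthogonal.
overlap = sum(a * b for a, b in zip(scores[0], scores[1]))
print(len(scores), abs(overlap) < 1e-9)
```

In a PLS-VAER style pipeline, the columns formed by `scores` would replace the raw measurements as regressor inputs; any downstream model could stand in for VAER in an experiment.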

Keywords: Indoor air quality; PM2.5 concentration; Variational auto-encoder; Latent variable; Soft measurement modeling
Corresponding Author(s): Hongbin Liu   
About author:

* These authors contributed equally to this work.

Issue Date: 30 August 2023
 Cite this article:   
Qiyue Wu, Yun Geng, Xinyuan Wang, et al. A novel deep learning framework with variational auto-encoder for indoor air quality prediction[J]. Front. Environ. Sci. Eng., 2024, 18(1): 8.
 URL:  
https://academic.hep.com.cn/fese/EN/10.1007/s11783-024-1768-7
https://academic.hep.com.cn/fese/EN/Y2024/V18/I1/8
Fig.1  The structure of VAE.
Fig.2  The structure of VAER.
Fig.3  The flow chart of the PLS-VAER model.
Fig.4  Variables in the IAQ data measured from a metro station.
Variable           Interpretation           Unit     Mean      Std.
NO                 Nitrogen monoxide        ppm      0.068     0.079
NO2                Nitrogen dioxide         ppm      0.046     0.024
CO                 Carbon monoxide          ppm      1.28      0.69
CO2                Carbon dioxide           ppm      492.19    73.95
T                  Temperature              °C       18.76     8.28
H                  Humidity                 %        44.74     14.02
PM10 (hall)        PM10 in the hall         µg/m³    72.49     40.62
PM2.5 (hall)       PM2.5 in the hall        µg/m³    44.19     23.39
PM10 (platform)    PM10 on the platform     µg/m³    80.41     46.71
PM2.5 (platform)   PM2.5 on the platform    µg/m³    59.10     35.97
Tab.1  The statistical information of the indoor air quality data
Method      Training set               Test set
            MAE      RMSE     R²       MAE      RMSE     R²
CCA         0.141    0.200    0.338    0.143    0.211    0.329
SVR         0.110    0.172    0.514    0.114    0.187    0.476
PLS         0.113    0.170    0.522    0.116    0.182    0.504
ANN         0.108    0.146    0.648    0.113    0.166    0.586
VAER        0.086    0.137    0.689    0.097    0.156    0.635
CCA-VAER    0.113    0.170    0.526    0.116    0.179    0.518
PLS-SVR     0.092    0.137    0.689    0.103    0.172    0.555
PLS-ANN     0.091    0.125    0.743    0.102    0.153    0.649
PCA-VAER    0.068    0.096    0.847    0.093    0.143    0.693
PLS-VAER    0.066    0.098    0.842    0.092    0.136    0.722
Tab.2  Prediction results of different models in case 1
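
The percentage comparisons quoted in the abstract can be checked against the test-set columns of Tab. 2. The sketch below is only an arithmetic check, not part of the authors' method; the variable names are illustrative. It computes the RMSE gap with two possible denominators, because which one the authors used is not stated in this listing.

```python
# Test-set values transcribed from Tab. 2 (case 1).
rmse_test = {"VAER": 0.156, "PLS-SVR": 0.172, "PLS-ANN": 0.153, "PLS-VAER": 0.136}
r2_test = {"VAER": 0.635, "PLS-SVR": 0.555, "PLS-ANN": 0.649, "PLS-VAER": 0.722}

PROPOSED = "PLS-VAER"
baselines = ["VAER", "PLS-SVR", "PLS-ANN"]

# RMSE gap expressed relative to the proposed model's RMSE.
rmse_gap_vs_proposed = {
    m: round(100 * (rmse_test[m] - rmse_test[PROPOSED]) / rmse_test[PROPOSED], 2)
    for m in baselines
}
# RMSE reduction expressed relative to each baseline's own RMSE.
rmse_cut_vs_baseline = {
    m: round(100 * (rmse_test[m] - rmse_test[PROPOSED]) / rmse_test[m], 2)
    for m in baselines
}
# R² improvement expressed relative to each baseline's R².
r2_gain_vs_baseline = {
    m: round(100 * (r2_test[PROPOSED] - r2_test[m]) / r2_test[m], 2)
    for m in baselines
}

print(rmse_gap_vs_proposed)
print(rmse_cut_vs_baseline)
print(r2_gain_vs_baseline)
```

With these inputs, the R² gains relative to each baseline reproduce the abstract's 13.70%, 30.09%, and 11.25%. The abstract's RMSE figures (14.71%, 26.47%, 12.50%) match the gap taken relative to the PLS-VAER value; taken relative to each baseline's own RMSE, the reductions come out at roughly 12.82%, 20.93%, and 11.11%.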
Fig.5  The prediction results of different single models on the test set.
Number of           Input variables (X)                       Output variables (Y)
latent variables    Variance (%)   Cumulative variance (%)    Variance (%)   Cumulative variance (%)
1                   35.60          35.60                      43.81          43.81
2                   10.34          45.94                       6.18          49.99
3                   10.42          56.36                       1.42          51.41
4                   14.90          71.26                       0.18          51.59
5                    9.59          80.85                       0.09          51.68
6                    4.59          85.44                       0.05          51.73
7                    5.24          90.68                       0.01          51.74
8                    4.19          94.87                       0             51.74
Tab.3  Variance and cumulative variance results by PLS
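
The cumulative columns of Tab. 3 are running sums of the per-component variance columns. A short sketch makes that relationship explicit and shows how a variance threshold could be used to pick the number of latent variables. The 90% threshold is an assumption for illustration only; this listing does not state whether the authors used such a rule.

```python
from itertools import accumulate

# Per-component explained variance (%) transcribed from Tab. 3.
x_var = [35.60, 10.34, 10.42, 14.90, 9.59, 4.59, 5.24, 4.19]
y_var = [43.81, 6.18, 1.42, 0.18, 0.09, 0.05, 0.01, 0.0]

# Running sums reproduce the cumulative-variance columns.
x_cum = [round(v, 2) for v in accumulate(x_var)]
y_cum = [round(v, 2) for v in accumulate(y_var)]

# Hypothetical selection rule: smallest number of latent variables
# whose cumulative X variance reaches the threshold.
threshold = 90.0  # assumption, not taken from the paper
n_lv_at_threshold = next(i + 1 for i, v in enumerate(x_cum) if v >= threshold)

print(x_cum)
print(y_cum)
print(n_lv_at_threshold)
```

Note that the Y variance saturates near 51.74% after only a few components while the X variance keeps growing, so a rule based on X variance alone says little about predictive value.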
Fig.6  The prediction results of different hybrid models on the test set.
Fig.7  The radar plot of different models on the test set.
Number of LVs         Training set               Test set
(latent variables)    MAE      RMSE     R²       MAE      RMSE     R²
3                     0.105    0.155    0.605    0.115    0.179    0.516
4                     0.091    0.133    0.709    0.104    0.158    0.626
5                     0.081    0.118    0.769    0.098    0.150    0.662
6                     0.073    0.107    0.812    0.096    0.143    0.694
7                     0.066    0.098    0.842    0.092    0.136    0.722
8                     0.063    0.092    0.860    0.093    0.142    0.697
Tab.4  The prediction results of PLS-VAER with different LVs
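
Tab. 4 can be read as a small model-selection exercise: choose the number of latent variables that gives the best held-out performance. The sketch below applies that rule to the transcribed RMSE and R² values. It is an illustrative reading of the table, and the dictionary names are hypothetical.

```python
# Number of LVs -> RMSE / R² transcribed from Tab. 4.
train_rmse = {3: 0.155, 4: 0.133, 5: 0.118, 6: 0.107, 7: 0.098, 8: 0.092}
test_rmse = {3: 0.179, 4: 0.158, 5: 0.150, 6: 0.143, 7: 0.136, 8: 0.142}
test_r2 = {3: 0.516, 4: 0.626, 5: 0.662, 6: 0.694, 7: 0.722, 8: 0.697}

# Pick the LV count with the lowest test RMSE and the highest test R².
best_by_test_rmse = min(test_rmse, key=test_rmse.get)
best_by_test_r2 = max(test_r2, key=test_r2.get)

# Training error alone would favor the largest model.
best_by_train_rmse = min(train_rmse, key=train_rmse.get)

print(best_by_test_rmse, best_by_test_r2, best_by_train_rmse)
```

Both test-set criteria select 7 latent variables, whereas the training RMSE keeps falling through 8. That divergence between training and test error at 8 LVs is the usual sign of overfitting.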
Method      Training set               Test set
            MAE      RMSE     R²       MAE      RMSE     R²
CCA         0.132    0.167    -0.315   0.132    0.166    -0.239
PLS         0.075    0.097    0.553    0.076    0.105    0.552
SVR         0.074    0.095    0.577    0.074    0.096    0.581
ANN         0.071    0.090    0.620    0.074    0.095    0.596
VAER        0.056    0.074    0.741    0.061    0.081    0.704
CCA-VAER    0.076    0.100    0.532    0.077    0.101    0.539
PLS-SVR     0.064    0.079    0.705    0.066    0.083    0.691
PLS-ANN     0.060    0.076    0.729    0.065    0.082    0.694
PCA-VAER    0.044    0.060    0.829    0.056    0.077    0.734
PLS-VAER    0.034    0.049    0.887    0.045    0.065    0.810
Tab.5  Prediction results of different models in case 2
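
The negative R² values reported for CCA in Tab. 5 are possible under the standard definition of the coefficient of determination, which drops below zero whenever a model's squared error exceeds that of a constant mean predictor. The sketch below uses the textbook definitions of MAE, RMSE, and R²; that the paper uses exactly these forms is an assumption, and the toy data are hypothetical.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data: predictions that run opposite to the targets.
y_true = [1.0, 2.0, 3.0, 4.0]
y_bad = [4.0, 3.0, 2.0, 1.0]
y_mean_only = [2.5, 2.5, 2.5, 2.5]  # constant mean predictor

print(mae(y_true, y_bad), rmse(y_true, y_bad), r2(y_true, y_bad))
print(r2(y_true, y_mean_only))
```

Here the reversed predictions give an R² of -3, while the mean predictor gives exactly 0, so a negative entry in the table means the model fits worse than predicting the average.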
[1] Yang Xie, Hua Zhong, Zhixiong Weng, Xinbiao Guo, Satbyul Estella Kim, Shaowei Wu. PM2.5 concentration declining saves health expenditure in China[J]. Front. Environ. Sci. Eng., 2023, 17(7): 90-.
[2] Cong Liu, Yinping Zhang. Relations between indoor and outdoor PM2.5 and constituent concentrations[J]. Front. Environ. Sci. Eng., 2019, 13(1): 5-.
[3] Can DONG, Lingxiao YANG, Chao YAN, Qi YUAN, Yangchun YU, Wenxing WANG. Particle size distributions, PM2.5 concentrations and water-soluble inorganic ions in different public indoor environments: a case study in Jinan, China[J]. Front Envir Sci Eng, 2013, 7(1): 55-65.