Frontiers of Computer Science


Front. Comput. Sci.    2023, Vol. 17 Issue (1) : 171601    https://doi.org/10.1007/s11704-021-1080-7
RESEARCH ARTICLE
Effective ensemble learning approach for SST field prediction using attention-based PredRNN
Baiyou QIAO1,2, Zhongqiang WU1, Ling MA1, Yicheng ZHOU1, Yunjiao SUN1
1. School of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
2. Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang 110169, China
Abstract

Accurate prediction of sea surface temperature (SST) is extremely important for forecasting oceanic environmental events and for ocean studies. However, existing SST prediction methods consider neither the seasonal periodicity and abnormal fluctuation characteristics of SST nor the varying importance of historical SST data from different times; thus, they suffer from low prediction accuracy. To solve this problem, we comprehensively consider the effects of the seasonal periodicity and abnormal fluctuation characteristics of SST data, as well as the influence of historical data from different periods, on prediction accuracy. We propose a novel ensemble learning approach that combines the Predictive Recurrent Neural Network (PredRNN) and an attention mechanism for effective SST field prediction. In this approach, an XGBoost model learns the long-period fluctuation law of SST and extracts seasonal periodic features from the SST data. The exponential smoothing method mitigates the impact of severely abnormal SST fluctuations and extracts a priori features of the SST data. The outputs of these two models are stacked with the original SST data and used as inputs for the next model, the PredRNN network. PredRNN is a recently developed spatiotemporal deep learning network that models both spatial and temporal representations and transfers memory across layers and time steps; we therefore use it to extract the spatiotemporal correlations of the SST data and predict future SSTs. Finally, an attention mechanism captures the importance of historical SST data from different times, weights the output of each step of the PredRNN network, and improves the prediction accuracy. Experimental results on two ocean datasets confirm that the proposed approach achieves higher training efficiency and prediction accuracy than existing SST field prediction approaches.
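The ensemble stage described above amounts to channel stacking before the PredRNN stage. Below is a minimal NumPy sketch, assuming daily SST fields are stored as a (T, H, W) array; the function name and channel layout are illustrative assumptions, not the paper's exact interface:

```python
import numpy as np

def stack_model_inputs(sst, seasonal, smoothed):
    """Stack the raw SST fields with the XGBoost seasonal-feature
    fields and the exponentially smoothed fields as channels,
    three (T, H, W) arrays -> (T, 3, H, W), to feed the PredRNN stage."""
    return np.stack([sst, seasonal, smoothed], axis=1)

# sst:      (T, H, W) raw daily SST fields
# seasonal: (T, H, W) seasonal periodic features from the XGBoost model
# smoothed: (T, H, W) a priori features from exponential smoothing
# x = stack_model_inputs(sst, seasonal, smoothed)
```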

Keywords SST prediction      ensemble learning      XGBoost      PredRNN      attention mechanism     
Corresponding Author(s): Baiyou QIAO   
Just Accepted Date: 12 July 2021   Issue Date: 01 March 2022
 Cite this article:   
Baiyou QIAO, Zhongqiang WU, Ling MA, et al. Effective ensemble learning approach for SST field prediction using attention-based PredRNN[J]. Front. Comput. Sci., 2023, 17(1): 171601.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-021-1080-7
https://academic.hep.com.cn/fcs/EN/Y2023/V17/I1/171601
Fig.1  Illustration of SST field prediction problem
Fig.2  The framework of the ELA-PredRNN-AT approach
Fig.3  Process flow of seasonal periodic feature extraction
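One plausible realization of the extraction in Fig.3 is to fit XGBoost to the seasonal cycle of each grid point from calendar encodings and take its fitted values as the seasonal channel. A hedged sketch using the xgboost package; the features and hyperparameters here are illustrative, not the paper's:

```python
import numpy as np
import xgboost as xgb

def seasonal_features(doy, sst_series):
    """Fit XGBoost to the long-period (seasonal) fluctuation of one
    grid point's SST using day-of-year encodings, and return the
    fitted seasonal component (one plausible reading of Fig.3).

    doy: (T,) day-of-year; sst_series: (T,) SST at one grid point.
    """
    X = np.stack([np.sin(2 * np.pi * doy / 365.25),
                  np.cos(2 * np.pi * doy / 365.25)], axis=1)
    model = xgb.XGBRegressor(n_estimators=200, max_depth=4,
                             learning_rate=0.1)
    model.fit(X, sst_series)
    return model.predict(X)
```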
Fig.4  An example of the data size change before and after processing by the two models
Variable | Definition and explanation | Dimension
$\mathcal{X}_t$ | The input vector at time $t$ | 5-D
$H_{t-1}^{l}$ | The $l$th hidden-layer output of the PredRNN at time $t-1$ | 5-D
$H_{t}^{l}$ | The $l$th hidden-layer output of the PredRNN at time $t$ | 5-D
$C_{t-1}^{l}$ | The $l$th-layer standard temporal cell at time $t-1$ | 5-D
$C_{t}^{l}$ | The $l$th-layer standard temporal cell, delivered from the previous node at time $t-1$ to the current time step within each ST-LSTM unit | 5-D
$M_{t}^{l-1}$ | The $(l-1)$th-layer spatiotemporal memory cell at time $t$ | 5-D
$M_{t}^{l}$ | The spatiotemporal memory, conveyed vertically from layer $l-1$ to the current node at the same time step $t$; for the bottom ST-LSTM layer ($l=1$), $M_{t}^{l-1}=M_{t-1}^{L}$ | 5-D
$f_t$ | Output of the forget gate; each element lies between 0 and 1. It controls the temporal information forgotten from the old cell state $C_{t-1}^{l}$ | 5-D
$f_t'$ | Output of the spatiotemporal forget gate; each element lies between 0 and 1. It controls the spatiotemporal information forgotten from the old memory state $M_{t}^{l-1}$ | 5-D
$i_t$ | Output of the input gate; each element lies between 0 and 1. It controls how much of the temporal information $g_t$ is stored in the new state $C_{t}^{l}$ | 5-D
$i_t'$ | Output of the spatiotemporal input gate; each element lies between 0 and 1. It controls how much of the spatiotemporal information $g_t'$ is stored in the new state $M_{t}^{l}$ | 5-D
$o_t$ | Output of the output gate; each element lies between 0 and 1. It controls the amount of information output to $H_{t}^{l}$ from the current states $C_{t}^{l}$ and $M_{t}^{l}$ | 5-D
Tab.1  Definitions and explanations of the variables used in Eq. (3)
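Eq. (3) itself is not reproduced on this page. For reference, these are the ST-LSTM update equations from the original PredRNN paper [16], to which the variables in Tab.1 correspond ($*$ denotes convolution, $\odot$ the Hadamard product, $\sigma$ the sigmoid; for layers $l>1$, $H_t^{l-1}$ takes the place of $\mathcal{X}_t$):

```latex
\begin{aligned}
g_t &= \tanh\big(W_{gx} * \mathcal{X}_t + W_{gh} * H_{t-1}^{l} + b_g\big), \\
i_t &= \sigma\big(W_{ix} * \mathcal{X}_t + W_{ih} * H_{t-1}^{l} + b_i\big), \\
f_t &= \sigma\big(W_{fx} * \mathcal{X}_t + W_{fh} * H_{t-1}^{l} + b_f\big), \\
C_t^{l} &= f_t \odot C_{t-1}^{l} + i_t \odot g_t, \\
g_t' &= \tanh\big(W_{gx}' * \mathcal{X}_t + W_{gm} * M_t^{l-1} + b_g'\big), \\
i_t' &= \sigma\big(W_{ix}' * \mathcal{X}_t + W_{im} * M_t^{l-1} + b_i'\big), \\
f_t' &= \sigma\big(W_{fx}' * \mathcal{X}_t + W_{fm} * M_t^{l-1} + b_f'\big), \\
M_t^{l} &= f_t' \odot M_t^{l-1} + i_t' \odot g_t', \\
o_t &= \sigma\big(W_{ox} * \mathcal{X}_t + W_{oh} * H_{t-1}^{l} + W_{co} * C_t^{l} + W_{om} * M_t^{l} + b_o\big), \\
H_t^{l} &= o_t \odot \tanh\big(W_{1\times 1} * [C_t^{l}, M_t^{l}]\big).
\end{aligned}
```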
Fig.5  Custom attention layer
Fig.6  A layer-wise schematic of the PredRNN-Attention layer
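A minimal PyTorch sketch of one plausible attention layer of the kind shown in Fig.5 and Fig.6, weighting the per-step hidden states of the PredRNN; the class name and the scoring function are illustrative assumptions, not the paper's exact layer:

```python
import torch
import torch.nn as nn

class StepAttention(nn.Module):
    """Weights PredRNN per-step outputs by learned importance,
    one plausible form of the custom attention layer in Fig.5."""

    def __init__(self, channels, height, width):
        super().__init__()
        # One scalar score per time step from its flattened features.
        self.score = nn.Linear(channels * height * width, 1)

    def forward(self, h):
        # h: (batch, steps, channels, height, width) PredRNN outputs.
        b, s = h.shape[:2]
        e = self.score(h.reshape(b, s, -1))   # (b, s, 1) step scores
        a = torch.softmax(e, dim=1)           # attention weights over steps
        # Weighted sum over steps -> context field used for prediction.
        return (a.unsqueeze(-1).unsqueeze(-1) * h).sum(dim=1)
```

Scoring each step with a single linear map is a common simplification of global attention; the paper's layer may use a different scoring function.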
Datasets Total Training set Validation set Testing set
Bohai Sea 13514 12784 365 365
South China Sea 13514 12784 365 365
Tab.2  Number of samples in the two datasets
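The split in Tab.2 is chronological, with the last two years held out. A one-line sketch, assuming the daily fields in `sst` are ordered oldest to newest:

```python
# 12784 + 365 + 365 = 13514 daily samples per dataset (Tab.2).
train, val, test = sst[:12784], sst[12784:13149], sst[13149:]
```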
Fig.7  Change in the mean square error (MSE) of the four models with sliding window size. (a) South China Sea; (b) Bohai Sea
Fig.8  The daily average SST trends of the selected sites in the Bohai and South China Seas. (a) Bohai Sea; (b) South China Sea
Fig.9  Change in the mean square errors (MSEs) of the four models with different values of the smoothing constant α. (a) South China Sea; (b) Bohai Sea
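Simple exponential smoothing follows the recursion s_t = α·x_t + (1−α)·s_{t−1}, and Fig.9 sweeps α to pick the value with the lowest validation MSE. A hedged sketch; the candidate grid and the `validate` helper are hypothetical:

```python
import numpy as np

def exp_smooth(x, alpha):
    """Simple exponential smoothing along the time axis:
    s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    s = np.empty_like(x, dtype=float)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1 - alpha) * s[t - 1]
    return s

# Sweep alpha as in Fig.9 and keep the lowest validation MSE
# (validate() is a hypothetical helper returning that MSE).
# best_mse, best_alpha = min(
#     (validate(exp_smooth(sst, a)), a) for a in np.arange(0.1, 1.0, 0.1))
```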
Approach         Bohai Sea dataset                 South China Sea dataset
                 MSE     RMSE    MAE     R²        MSE     RMSE    MAE     R²
PredRNN          0.210   0.458   0.323   0.981     0.103   0.325   0.258   0.937
PredRNN-TF       0.208   0.457   0.316   0.981     0.092   0.303   0.229   0.946
PredRNN-ExpS     0.202   0.450   0.313   0.982     0.096   0.311   0.234   0.943
PredRNN-AT       0.198   0.445   0.318   0.982     0.090   0.299   0.226   0.947
ELA-PredRNN-AT   0.183   0.425   0.315   0.982     0.081   0.285   0.215   0.952
Tab.3  Prediction results on the Bohai Sea and the South China Sea SST datasets
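The metrics in Tab.3 through Tab.7 can be computed with scikit-learn. A sketch under the assumption that predicted and observed fields are flattened over all grid points and days before scoring (the paper does not state the exact aggregation):

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

def scores(y_true, y_pred):
    """MSE, RMSE, MAE and R² over flattened SST fields."""
    y_true, y_pred = y_true.ravel(), y_pred.ravel()
    mse = mean_squared_error(y_true, y_pred)
    return (mse, np.sqrt(mse),
            mean_absolute_error(y_true, y_pred),
            r2_score(y_true, y_pred))
```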
Fig.10  Comparison of short-term prediction errors of several models. (a) South China Sea; (b) Bohai Sea
Fig.11  The comparison of medium-term forecasting errors on two datasets. (a) South China Sea; (b) Bohai Sea
Model            Step   RMSE     MAE      R²
SVR              1      0.6434   0.4660   0.9786
SVR              2      0.7844   0.5747   0.9758
SVR              3      0.8866   0.6452   0.9735
FC-LSTM          1      0.6245   0.4585   0.9789
FC-LSTM          2      0.7661   0.5655   0.9762
FC-LSTM          3      0.8771   0.6405   0.9737
CNN-LSTM         1      0.5708   0.4280   0.9798
CNN-LSTM         2      0.7227   0.5378   0.9771
CNN-LSTM         3      0.8406   0.6192   0.9746
ConvLSTM         1      0.6288   0.4847   0.8207
ConvLSTM         2      0.6901   0.4980   0.9777
ConvLSTM         3      0.8282   0.5876   0.9748
ELA-PredRNN-AT   1      0.5397   0.4084   0.9812
ELA-PredRNN-AT   2      0.6262   0.5145   0.9710
ELA-PredRNN-AT   3      0.7342   0.6038   0.9758
Tab.4  The scores of the models applied to the Bohai Sea dataset
Model            Step   RMSE     MAE      R²
SVR              1      0.4901   0.3226   0.9029
SVR              2      0.4705   0.3359   0.8708
SVR              3      0.5594   0.4271   0.8436
FC-LSTM          1      0.3909   0.2990   0.9110
FC-LSTM          2      0.4369   0.3369   0.8887
FC-LSTM          3      0.4723   0.3661   0.8695
CNN-LSTM         1      0.3790   0.2912   0.9162
CNN-LSTM         2      0.4365   0.3389   0.8892
CNN-LSTM         3      0.4905   0.3821   0.8593
ConvLSTM         1      0.3493   0.2688   0.9291
ConvLSTM         2      0.4100   0.3159   0.9018
ConvLSTM         3      0.4483   0.3478   0.8854
ELA-PredRNN-AT   1      0.3452   0.2637   0.9305
ELA-PredRNN-AT   2      0.4046   0.3135   0.9046
ELA-PredRNN-AT   3      0.4429   0.3446   0.8854
Tab.5  The scores of the models applied to the South China Sea dataset
Fig.12  The comparison of long-term forecasting errors. (a) MSE on the South China Sea; (b) MSE on the Bohai Sea; (c) RMSE on the South China Sea; (d) RMSE on the Bohai Sea
Prediction step size SVR FC-LSTM CNN-LSTM ConvLSTM ELA-PredRNN-AT
1 0.979 0.979 0.9805 0.979 0.981
2 0.977 0.976 0.977 0.976 0.977
3 0.974 0.973 0.975 0.976 0.975
4 0.972 0.971 0.973 0.972 0.974
5 0.970 0.969 0.9713 0.968 0.972
6 0.969 0.967 0.97 0.965 0.975
7 0.968 0.966 0.969 0.966 0.971
Tab.6  R² scores of the five models on the Bohai Sea dataset at different prediction step sizes
Prediction step size SVR FC-LSTM CNN-LSTM ConvLSTM ELA-PredRNN-AT
1 0.918 0.936 0.943 0.940 0.952
2 0.887 0.895 0.912 0.910 0.916
3 0.859 0.862 0.889 0.880 0.892
4 0.841 0.832 0.873 0.860 0.876
5 0.826 0.802 0.858 0.835 0.865
6 0.805 0.770 0.844 0.815 0.817
7 0.787 0.742 0.830 0.795 0.844
Tab.7  R² scores of the five models on the South China Sea dataset at different prediction step sizes
1 F J Wentz, C Gentemann, D Smith, D Chelton. Satellite measurements of sea surface temperature through clouds. Science, 2000, 288(5467): 847–850
2 H U Solanki, D Bhatpuria, P Chauhan. Integrative analysis of AltiKa-SSHa, MODIS-SST, and OCM-chlorophyll signatures for fisheries applications. Marine Geodesy, 2015, 38(S1): 672–683
3 C C Funk, A Hoell. The leading mode of observed and CMIP5 ENSO-residual sea surface temperatures and associated changes in Indo-Pacific climate. Journal of Climate, 2015, 28(11): 4309–4329
4 S G Aparna, S D'souza, N B Arjun. Prediction of daily sea surface temperature using artificial neural networks. International Journal of Remote Sensing, 2018, 39(12): 4214–4231
5 Y Liu, W Fu. Assimilating high-resolution sea surface temperature data improves the ocean forecast potential in the Baltic Sea. Ocean Science, 2018, 14(3): 525–541
6 T N Stockdale, M A Balmaseda, A Vidard. Tropical Atlantic SST prediction with coupled ocean–atmosphere GCMs. Journal of Climate, 2006, 19(23): 6047–6061
7 Y Xue, A Leetmaa. Forecasts of tropical Pacific SST and sea level using a Markov model. Geophysical Research Letters, 2000, 27(17): 2701–2704
8 I D Lins, M Araujo, M das Chagas Moura. Prediction of sea surface temperature in the tropical Atlantic by support vector machines. Computational Statistics & Data Analysis, 2013, 61: 187–198
9 K Patil, M C Deo, M Ravichandran. Prediction of sea surface temperature by combining numerical and neural techniques. Journal of Atmospheric and Oceanic Technology, 2016, 33(8): 1715–1726
10 Q He, C Zha, M Sun, X Y Jiang, F M Qi, D M Huang, W Song. Surface temperature parallel prediction algorithm under Spark platform. Marine Science Bulletin, 2019, 38(3): 280–289
11 Y LeCun, Y Bengio, G Hinton. Deep learning. Nature, 2015, 521(7553): 436–444
12 Q Zhang, H Wang, J Dong, G Zhong, X Sun. Prediction of sea surface temperature using long short-term memory. IEEE Geoscience and Remote Sensing Letters, 2017, 14(10): 1745–1749
13 Y Yang, J Dong, X Sun, E Lima, Q Mu, X Wang. A CFCC-LSTM model for sea surface temperature prediction. IEEE Geoscience and Remote Sensing Letters, 2018, 15(2): 207–211
14 C Xiao, N Chen, C Hu, K Wang, Z Xu, Y P Cai, L Xu, Z Chen, J Gong. A spatiotemporal deep learning model for sea surface temperature field prediction using time-series satellite data. Environmental Modelling & Software, 2019, 120: 104502
15 L Wei, L Guan, L Qu. Prediction of sea surface temperature in the South China Sea by artificial neural networks. IEEE Geoscience and Remote Sensing Letters, 2020, 17(4): 558–562
16 Y Wang, M Long, J Wang. PredRNN: recurrent neural networks for predictive learning using spatiotemporal LSTMs. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 879–888
17 X Shi, Z Chen, H Wang, D Y Yeung, W K Wong, W C Woo. Convolutional LSTM network: a machine learning approach for precipitation nowcasting. In: Proceedings of the 28th International Conference on Neural Information Processing Systems. 2015, 802–810
18 T Chen, C Guestrin. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016, 785–794
19 J H Friedman. Greedy function approximation: a gradient boosting machine. The Annals of Statistics, 2001, 29(5): 1189–1232
20 J Feng, Y Yu, Z-H Zhou. Multi-layered gradient boosting decision trees. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. 2018, 3555–3565
21 A Vaswani, N Shazeer, N Parmar, J Uszkoreit, L Jones, A N Gomez, Ł Kaiser, I Polosukhin. Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. 2017, 6000–6010
22 J Xie, J Zhang, J Yu, L Xu. An adaptive scale sea surface temperature predicting method based on deep learning with attention mechanism. IEEE Geoscience and Remote Sensing Letters, 2020, 17(5): 740–744
23 X Shi, D-Y Yeung. Machine learning for spatiotemporal sequence forecasting: a survey. 2018, arXiv preprint arXiv: 1808.06865