Frontiers of Earth Science

ISSN 2095-0195

ISSN 2095-0209(Online)

CN 11-5982/P

Postal Subscription Code 80-963

2018 Impact Factor: 1.205

Front. Earth Sci.    2023, Vol. 17 Issue (2) : 620-631    https://doi.org/10.1007/s11707-021-0962-1
RESEARCH ARTICLE
Variational quality control of non-Gaussian innovations and its parametric optimizations for the GRAPES m3DVAR system
Jie HE1, Yang SHI2, Boyang ZHOU3, Qiuping WANG4, Xulin MA4
1. Guangzhou Institute of Tropical and Marine Meteorology, China Meteorological Administration, Guangzhou 510640, China
2. Guangdong Meteorological Observatory, Guangzhou 510640, China
3. Qingdao Air Traffic Management Station of Civil Aviation of China, Qingdao 266108, China
4. Collaborative Innovation Center on Forecast and Evaluation of Meteorological Disasters, Key Laboratory of Meteorological Disaster, Nanjing University of Information Science and Technology, Nanjing 210044, China
Abstract

The magnitude and distribution of observation innovations strongly affect analysis accuracy and are therefore critical variables in data assimilation. Variational quality control (VarQC) based on the contaminated Gaussian distribution (CGD) of observation innovations is now widely used in data assimilation because the CGD represents the probability density function of innovations more reasonably and lets the analysis absorb observations sufficiently by assigning them different weights iteratively. However, inaccurate parameters prevent VarQC from delivering its expected advantages in the GRAPES (Global/Regional Assimilation and PrEdiction System) m3DVAR system, so parameter optimization is critical to improving VarQC. In this paper, we describe two candidate CGDs that account for the non-Gaussian distribution of actual observation errors: the Gaussian plus flat distribution and the Huber norm distribution. Potential methods for optimizing the parameters are introduced in detail for the different VarQCs. With different parameter configurations, the optimization analysis shows that the Gaussian plus flat distribution and the Huber norm distribution are more consistent with the long-tail distribution of actual innovations than the Gaussian distribution is. The cost and gradient functions of VarQC with the Huber norm distribution are more reasonable, whereas the cost function of VarQC with the Gaussian plus flat distribution may converge to different minima because it is non-convex. The weight functions of the two VarQCs decrease gradually as the innovation increases but have different shapes, and VarQC with the Huber norm distribution is more flexible in assimilating observations with a high contamination rate. Moreover, we reveal a general derivation relationship between the CGDs and VarQCs and present a novel schematic interpretation that classifies the assimilated data into three categories in VarQC. These results are conducive to the development of new VarQC methods in the future.

Keywords: data assimilation; variational quality control; contaminated Gaussian distribution; non-Gaussian distribution; innovation
Corresponding Author(s): Xulin MA   
Online First Date: 30 June 2022    Issue Date: 04 August 2023
 Cite this article:   
Jie HE,Yang SHI,Boyang ZHOU, et al. Variational quality control of non-Gaussian innovations and its parametric optimizations for the GRAPES m3DVAR system[J]. Front. Earth Sci., 2023, 17(2): 620-631.
 URL:  
https://academic.hep.com.cn/fesci/EN/10.1007/s11707-021-0962-1
https://academic.hep.com.cn/fesci/EN/Y2023/V17/I2/620
Fig.1  Histogram of brightness temperature innovations from the geostationary meteorological satellite Meteosat-7, with the fitted Gaussian distributions. The green boxes mark the non-Gaussian long tails of the histogram. The dashed line represents the approximation between observation and background (ERA5 reanalysis). The x-axis shows the size of the innovation; the y-axis shows frequency. S is the total number of observations at 1200 UTC 10 June 2015.
Fig.2  Histograms of normalized innovations of (a) horizontal wind and (b) temperature from sounding and aircraft-reported observations, fitted with a pure Gaussian distribution (red line, Gaussian fit) and a CGD (black line, CGD fit). The x-axis shows the size of the innovation; the y-axis shows frequency. S is the total number of observations from August 2013.
Fig.3  Histogram (a) and transformed histogram (b) of temperature departures from the analysis for AIREP observations collected from 1 to 14 August 2013. The slope of the straight lines fitted in (b) defines the standard deviation of the Gaussian curve (dashed line) drawn in (a). The red dashed curve is a Gaussian with a mean of 0.06 and a variance of 0.64. The solid red lines are given by y = |x/0.77|. S (top) is the total number of observations.
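The transformation behind panel (b) is not spelled out on this page. A minimal sketch, assuming the commonly used transform y = sqrt(-2 ln(f / max(f))), under which a zero-mean Gaussian histogram becomes the straight lines y = |x|/σ, could look as follows. The function names and the synthetic histogram are hypothetical; the peak value 17565 and σ = 0.77 are taken from the AIREP temperature row of Tab.1 purely for illustration.

```python
import math

def transformed_histogram(freqs):
    """Map bin frequencies f to sqrt(-2 ln(f / max(f))); empty bins become None."""
    f_max = max(freqs)
    out = []
    for f in freqs:
        if f > 0:
            # max(0.0, ...) guards against a tiny negative value from rounding
            out.append(math.sqrt(max(0.0, -2.0 * math.log(f / f_max))))
        else:
            out.append(None)
    return out

def fit_sigma(centers, transformed):
    """Least-squares fit of y = |x| / sigma through the origin; returns sigma."""
    pairs = [(abs(x), y) for x, y in zip(centers, transformed) if y is not None]
    num = sum(ax * y for ax, y in pairs)
    den = sum(ax * ax for ax, _ in pairs)
    return den / num  # slope is num / den = 1 / sigma

# Synthetic, exactly Gaussian histogram (assumption: zero mean, sigma = 0.77).
sigma_true = 0.77
centers = [-3.0 + 0.25 * i for i in range(25)]  # bin centers from -3 to 3
freqs = [17565 * math.exp(-0.5 * (x / sigma_true) ** 2) for x in centers]
sigma_fit = fit_sigma(centers, transformed_histogram(freqs))
```

With real innovations the tails bend away from the straight lines, and that departure is what indicates non-Gaussian behavior.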
| Obs. | Variable | Sample | max(f) | λ | α | Approximate rejection threshold (Flat-VarQC) | Approximate rejection threshold (BgQC) |
|---|---|---|---|---|---|---|---|
| TEMP | wind u | 177089 | 16162 | 1.81 | 4.0 | 7.24 m/s | 7.2–12 m/s |
| TEMP | wind v | 177089 | 15064 | 1.81 | 4.0 | 7.24 m/s | 7.2–12 m/s |
| TEMP | rh | 101827 | 7712 | 9.13 | 3.5 | 32.0% | 59.5% |
| TEMP | pressure | 148670 | 18451 | 0.30 | 2.5 | 0.75 hPa | 0.4–2.3 hPa |
| AIREP | wind u | 170138 | 18296 | 1.75 | 4.0 | 7.00 m/s | 10–16 m/s |
| AIREP | wind v | 170138 | 17416 | 1.75 | 4.0 | 7.00 m/s | 10–16 m/s |
| AIREP | temp | 169550 | 17565 | 0.77 | 4.0 | 3.08 K | 4.8–5.6 K |
| SYNOP | rh | 64767 | 5066 | 9.21 | 4.0 | 36.8% | 52–92% |
| SYNOP | pressure | 26345 | 1930 | 0.60 | 4.0 | 2.40 hPa | 2.9–3.2 hPa |
| SATOB | wind u | 8984 | 795 | 2.27 | 4.0 | 9.08 m/s | 8–20 m/s |
| SATOB | wind v | 8984 | 949 | 2.27 | 4.0 | 9.08 m/s | 8–20 m/s |

Tab.1  Statistics of rejection threshold for Flat-VarQC and BgQC in the GRAPES m3DVAR system. The horizontal wind (u and v), relative humidity (rh), pressure, and temperature (temp) of different observations are obtained from the standard GTS observations over the Chinese mainland (10°N–60°N, 70°E–140°E). These observations are collected four times every day spanning the period from 1 to 14 August 2013.
| ε | 0.002 | 0.005 | 0.010 | 0.020 | 0.050 | 0.100 | 0.200 | 0.300 | 0.500 |
|---|---|---|---|---|---|---|---|---|---|
| c | 2.435 | 2.160 | 1.945 | 1.717 | 1.399 | 1.140 | 0.862 | 0.685 | 0.436 |
Tab.2  The corresponding relationship between the contamination rate ε and the transition point c
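The page does not state how c follows from ε. The tabulated pairs are consistent with the classical relation for Huber's least favorable distribution, 2φ(c)/c − 2Φ(−c) = ε/(1 − ε), where φ and Φ are the standard normal density and distribution function. The sketch below assumes that relation and solves it by bisection; the function names are hypothetical.

```python
import math

def std_normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    """Standard normal distribution function Phi(x), via the complementary error function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def huber_transition_point(eps, lo=1e-6, hi=10.0, tol=1e-10):
    """Solve 2*phi(c)/c - 2*Phi(-c) = eps/(1 - eps) for c (assumed relation)."""
    target = eps / (1.0 - eps)

    def g(c):
        return 2.0 * std_normal_pdf(c) / c - 2.0 * std_normal_cdf(-c) - target

    # g decreases monotonically in c, so plain bisection is sufficient.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for eps in (0.05, 0.1, 0.2):
    print(f"eps = {eps:5.3f}  c = {huber_transition_point(eps):.3f}")
```

Under this assumption, a larger contamination rate gives a smaller transition point, so the quadratic (Gaussian) part of the Huber norm narrows as more observations are treated as contaminated.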
Fig.4  A schematic diagram of observation classification for variational quality control. The green dots represent the valid observations (VOs); the black dots represent the available observations (AOs), which appear as outliers but are not gross errors; the red dots represent the deleterious observations (DOs), which appear as outliers and are gross errors. The green circle denotes the perfect threshold for a pure Gaussian error domain; the red circle represents the empirical (extended) rejection threshold used for gross errors in conventional quality control. VarQC iteratively pushes the observations in the gray zone (the black dots) into the green circle so that they can be ingested.
Fig.5  The profiles of the probability density distribution of observation error for the pure Gaussian and two CGDs (Gaussian + flat and Huber norm). The parameters used are ε = 0.1, d = 5, σ_o = 1, c = 1.14.
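The three densities compared in this figure can be sketched as follows. The exact expressions used in the paper are not reproduced on this page, so the forms below are assumptions: the Gaussian plus flat density is taken as a (1 − ε) Gaussian plus a uniform density of total mass ε on |x| ≤ d·σ_o, and the Huber norm density as Huber's least favorable distribution with transition point c. All function names are hypothetical.

```python
import math

EPS, D, SIGMA_O, C = 0.1, 5.0, 1.0, 1.14  # parameter values quoted in the caption

def gaussian_pdf(x, sigma=SIGMA_O):
    return math.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def gauss_plus_flat_pdf(x, eps=EPS, d=D, sigma=SIGMA_O):
    """(1 - eps) * Gaussian plus a flat part of total mass eps on |x| <= d * sigma."""
    flat = eps / (2.0 * d * sigma) if abs(x) <= d * sigma else 0.0
    return (1.0 - eps) * gaussian_pdf(x, sigma) + flat

def huber_pdf(x, eps=EPS, c=C, sigma=SIGMA_O):
    """Gaussian core for |x|/sigma <= c, exponential tails beyond (assumed form)."""
    z = abs(x) / sigma
    rho = 0.5 * z * z if z <= c else c * z - 0.5 * c * c
    return (1.0 - eps) * math.exp(-rho) / (math.sqrt(2.0 * math.pi) * sigma)

def integrate(f, lo=-60.0, hi=60.0, n=24000):
    """Composite trapezoidal rule, used here only to check normalization."""
    h = (hi - lo) / n
    inner = sum(f(lo + i * h) for i in range(1, n))
    return h * (0.5 * f(lo) + inner + 0.5 * f(hi))

total_flat = integrate(gauss_plus_flat_pdf)
total_huber = integrate(huber_pdf)
```

Both contaminated densities integrate to approximately one and sit well above the pure Gaussian in the tails, for example at x = 4.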
Fig.6  The cost function, gradient function, and analysis weight function of (a) Flat-VarQC (blue line) and (b) Huber-VarQC (black line). The red lines represent the observational cost function of the pure 3DVAR under the assumption of Gaussian distribution. The parameters are the same as in Fig. 5.
Fig.7  Same as the analysis weight functions in Fig. 6(a) but for the sensitivity to d (the width of the flat distribution) and γ ($\gamma = \varepsilon\sqrt{2\pi}/[2d(1-\varepsilon)]$) in Flat-VarQC. The fixed parameters are (a) ε = 0.1 and (b) d = 5; σ_o = 1 in both panels.
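For a single observation, the cost, gradient, and weight functions of the two schemes can be sketched as below. This is a sketch under assumptions, not the paper's implementation: the Flat-VarQC expressions follow the usual Gaussian plus flat formulation with γ = ε√(2π)/[2d(1 − ε)] and ignore the cutoff of the flat part beyond d·σ_o, and the Huber-VarQC weight is taken as 1 inside the transition point c and c/|z| outside. Function names are hypothetical.

```python
import math

def gamma_flat(eps, d):
    """gamma = eps * sqrt(2*pi) / (2 * d * (1 - eps))."""
    return eps * math.sqrt(2.0 * math.pi) / (2.0 * d * (1.0 - eps))

def flat_varqc(x, eps=0.1, d=5.0, sigma_o=1.0):
    """Return (cost, gradient, weight) of Flat-VarQC for innovation x (assumed form)."""
    j_n = 0.5 * (x / sigma_o) ** 2           # pure Gaussian cost
    g = gamma_flat(eps, d)
    e = math.exp(-j_n)
    cost = -math.log((g + e) / (g + 1.0))
    weight = 1.0 - g / (g + e)
    gradient = weight * x / sigma_o ** 2     # Gaussian gradient scaled by the weight
    return cost, gradient, weight

def huber_varqc(x, c=1.14, sigma_o=1.0):
    """Return (cost, gradient, weight) of Huber-VarQC for innovation x (assumed form)."""
    z = abs(x) / sigma_o
    if z <= c:
        cost, weight = 0.5 * z * z, 1.0      # quadratic core
    else:
        cost, weight = c * z - 0.5 * c * c, c / z  # linear tails
    gradient = weight * x / sigma_o ** 2
    return cost, gradient, weight

for x in (0.0, 1.0, 2.0, 4.0, 6.0):
    w_flat = flat_varqc(x)[2]
    w_huber = huber_varqc(x)[2]
    print(f"x = {x:3.1f}  W_flat = {w_flat:.4f}  W_huber = {w_huber:.4f}")
```

In this form the Flat-VarQC gradient falls back toward zero for large innovations, which is what makes its cost function non-convex, while the Huber-VarQC gradient stays constant beyond c and its cost remains convex.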
Fig.8  Same as the analysis weight functions in Fig. 6 but for the sensitivity to the contamination rate (ε) in different VarQCs. The parameter c corresponding to ε is based on Table 2. The other parameters are the same as those of Fig. 5.