Please wait a minute...
Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front Comput Sci Chin    0, Vol. Issue () : 119-125    https://doi.org/10.1007/s11704-011-9336-2
RESEARCH ARTICLE
A metric normalization of tree edit distance
Yujian LI1(), Chenguang ZHANG1,2
1. College of Computer Science and Technology, Beijing University of Technology, Beijing 100124, China; 2. College of Information Science and Technology, Hainan University, Haikou 570228, China
 Download: PDF(179 KB)   HTML
 Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks
Abstract

Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.

Keywords metric      normalization      tree edit distance      triangle inequality      approximating and eliminating search algorithm (AESA)     
Corresponding Author(s): LI Yujian,Email:liyujian@bjut.edu.cn   
Issue Date: 05 March 2011
 Cite this article:   
Yujian LI,Chenguang ZHANG. A metric normalization of tree edit distance[J]. Front Comput Sci Chin, 0, (): 119-125.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-011-9336-2
https://academic.hep.com.cn/fcs/EN/Y0/V/I/119
Fig.1  Three elementary edit operations. (a) : the node label is changed to ; (b) : all children of the node become children of its parent node ; (c) : a consecutive sequence of siblings among the children of the node become children of the node
Fig.1  Three elementary edit operations. (a) : the node label is changed to ; (b) : all children of the node become children of its parent node ; (c) : a consecutive sequence of siblings among the children of the node become children of the node
Fig.2  The string “” and its equivalent single tree
Fig.2  The string “” and its equivalent single tree
Fig.3  The codes of eight directions
Fig.3  The codes of eight directions
Fig.4  Two examples of ordered labeled trees generated from images of digit “0” and “5”. (a) Gray-scale images; (b) thinned binary images; (c) ordered labeled trees
Fig.4  Two examples of ordered labeled trees generated from images of digit “0” and “5”. (a) Gray-scale images; (b) thinned binary images; (c) ordered labeled trees
Fig.5  Weight matrix for relabeling operations with labels 1-8
Fig.5  Weight matrix for relabeling operations with labels 1-8
Fig.6  The histograms of negative (a) (, , ) and (b) (, , ) on the training set.
Fig.6  The histograms of negative (a) (, , ) and (b) (, , ) on the training set.
NTED-AESANTED1-AESANTED2-AESATED-AESA
Test 191.691.490.689.4
Test 292.892.891.689.9
Test 389.989.888.987.0
Test 490.590.489.788.1
Tab.1  Experiment results (accuracy/%) using AESA with four different tree edit distances
1 Zhang K, Shasha D, Wang J T L. Approximate tree matching in the presence of variable length don’t cares. Journal of Algorithms , 1994, 16(1): 33-66
doi: 10.1006/jagm.1994.1003
2 Tai K C. The tree-to-tree correction problem. Journal of the ACM , 1979, 26(3): 422-433
doi: 10.1145/322139.322143
3 Zhang K, Shasha D. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing , 1989, 18(6): 1245-1262
doi: 10.1137/0218082
4 Kilpel?inen P, Mannila H. Ordered and unordered tree inclusion. SIAM Journal on Computing , 1995, 24(2): 340-356
doi: 10.1137/S0097539791218202
5 Klein P, Tirthapura S, Sharvit D, Kimia B. A tree-edit-distance algorithm for comparing simple, closed shapes. In: Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms . 2000, 696-704
6 Hoffmann C M, O’Donnell M J. Pattern matching in trees. Journal of the ACM , 1982, 29(1): 68-95
doi: 10.1145/322290.322295
7 Ramesh R, Ramakrishnan I V. Nonlinear pattern matching in trees. Journal of the ACM , 1992, 39(2): 295-316
doi: 10.1145/128749.128752
8 Bille P. A survey on tree edit distance and related problems. Theoretical Computer Science , 2005, 337(1-3): 217-239
doi: 10.1016/j.tcs.2004.12.030
9 Levenshtein A. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady , 1966, 10(8): 707-710
10 Sellers P H. On the theory and computation of evolutionary distances. SIAM Journal of Applied Mathematics , 1974, 26(4): 787-793
doi: 10.1137/0126070
11 Wagner R A, Fischer M J. The string-to-string correction problem. Journal of the ACM , 1974, 21(1): 168-173
doi: 10.1145/321796.321811
12 Weigel A, Fein F. Normalizing the weighted edit distance. In: Proceedings of 12th International Conference on Pattern Recognition . 1994, 399-402
13 Yianilos P N. Normalized Forms for Two Common Metrics. Report 91–082–9027–1, revision 7. July 2002, NEC Research Inst . http://www.pnylab.com/pny/
14 Rico-Juan J R, Mico L. Comparison of AESA and LAESA search algorithms using string and tree-edit-distance. Pattern Recognition Letters , 2003, 24(9-10): 1417-1426
doi: 10.1016/S0167-8655(02)00382-3
15 Yujian L, Bo L. A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence , 2007, 29(6): 1091-1095
doi: 10.1109/TPAMI.2007.1078 pmid:17431306
16 Marzal A, Vidal E. Computation of normalized edit distance and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence , 1993, 15(9): 926-932
doi: 10.1109/34.232078
17 Schroeder M, Schweimeier R. Argument and misunderstandings: fuzzy unification for negotiating agents. Electronic Notes in Theoretical Computer Science , 2002, 70(5): 1-19
doi: 10.1016/S1571-0661(04)80585-1
18 Vidal E. New formulation and improvements of the nearest-neighbour approximating and eliminating search algorithm (AESA). Pattern Recognition Letters , 1994, 15(1): 1-7
doi: 10.1016/0167-8655(94)90094-9
19 Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE , 1998, 86(11): 2278-2324
doi: 10.1109/5.726791
20 Carrasco R C, Forcada M L. A note on the Nagen-draprasad-Wang-Gupta thining algorithm. Pattern Recognition Letters , 1995, 16(5): 539-541
doi: 10.1016/0167-8655(95)00121-V
21 Kong L B, Tang S W, Yang D Q, Wang T J, Gao J. Querying Techniques for XML Data. Journal of Software , 2007, 18(6): 1400-1418 (in Chinese)
doi: 10.1360/jos181400
[1] Lerina AVERSANO, Mario Luca BERNARDI, Marta CIMITILE, Martina IAMMARINO, Debora MONTANO. Forecasting technical debt evolution in software systems: an empirical study[J]. Front. Comput. Sci., 2023, 17(3): 173210-.
[2] Hongsheng XU, Zihan CHEN, Yu ZHANG, Xin GENG, Siya MI, Zhihong YANG. Weakly supervised temporal action localization with proxy metric modeling[J]. Front. Comput. Sci., 2023, 17(2): 172309-.
[3] Wei GAO, Mingwen SHAO, Jun SHU, Xinkai ZHUANG. Meta-BN Net for few-shot learning[J]. Front. Comput. Sci., 2023, 17(1): 171302-.
[4] Qingqing GAN, Joseph K. LIU, Xiaoming WANG, Xingliang YUAN, Shi-Feng SUN, Daxin HUANG, Cong ZUO, Jianfeng WANG. Verifiable searchable symmetric encryption for conjunctive keyword queries in cloud storage[J]. Front. Comput. Sci., 2022, 16(6): 166820-.
[5] Hongbin XU, Weili YANG, Qiuxia WU, Wenxiong KANG. Endowing rotation invariance for 3D finger shape and vein verification[J]. Front. Comput. Sci., 2022, 16(5): 165332-.
[6] Awais KHAN, Aun IRTAZA, Ali JAVED, Tahira NAZIR, Hafiz MALIK, Khalid Mahmood MALIK, Muhammad Ammar KHAN. Defocus blur detection using novel local directional mean patterns (LDMP) and segmentation via KNN matting[J]. Front. Comput. Sci., 2022, 16(2): 162702-.
[7] Yan-Ping SUN, Min-Ling ZHANG. Compositional metric learning for multi-label classification[J]. Front. Comput. Sci., 2021, 15(5): 155320-.
[8] Fan LIU, Xiaomin JI, Gang HU, Jing GAO. A computer aided design method for car form and its application based on shape parameters[J]. Front. Comput. Sci., 2020, 14(6): 146703-.
[9] Qianchen YU, Zhiwen YU, Zhu WANG, Xiaofeng WANG, Yongzhi WANG. Estimating posterior inference quality of the relational infinite latent feature model for overlapping community detection[J]. Front. Comput. Sci., 2020, 14(6): 146323-.
[10] Zhenghui HU, Wenjun WU, Jie LUO, Xin WANG, Boshu LI. Quality assessment in competition-based software crowdsourcing[J]. Front. Comput. Sci., 2020, 14(6): 146207-.
[11] Yixuan TANG, Zhilei REN, Weiqiang KONG, He JIANG. Compiler testing: a systematic literature analysis[J]. Front. Comput. Sci., 2020, 14(1): 1-20.
[12] Ashish Kumar DWIVEDI, Anand TIRKEY, Santanu Kumar RATH. Software design pattern mining using classification-based techniques[J]. Front. Comput. Sci., 2018, 12(5): 908-922.
[13] Yansheng DU, Zhihua CHEN, Changqing ZHANG, Xiaochun CAO. Research on axial bearing capacity of rectangular concrete-filled steel tubular columns based on artificial neural networks[J]. Front. Comput. Sci., 2017, 11(5): 863-873.
[14] Guoliang CHEN, Rui MAO, Kezhong LU. A parallel computing framework for big data[J]. Front. Comput. Sci., 2017, 11(4): 608-621.
[15] Xuzhou LI,Yilong YIN,Yanbin NING,Gongping YANG,Lei PAN. A hybrid biometric identification framework for high security applications[J]. Front. Comput. Sci., 2015, 9(3): 392-401.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed