|
|
A metric normalization of tree edit distance |
Yujian LI1( ), Chenguang ZHANG1,2 |
1. College of Computer Science and Technology, Beijing University of Technology, Beijing 100124, China; 2. College of Information Science and Technology, Hainan University, Haikou 570228, China |
|
|
Abstract Traditional normalized tree edit distances do not satisfy the triangle inequality. We present a metric normalization method for tree edit distance, which results in a new normalized tree edit distance fulfilling the triangle inequality, under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. We prove that the new distance, in the range [0, 1], is a genuine metric as a simple function of the sizes of two ordered labeled trees and the tree edit distance between them, which can be directly computed through tree edit distance with the same complexity. Based on an efficient algorithm to represent digits as ordered labeled trees, we show that the normalized tree edit metric can provide slightly better results than other existing methods in handwritten digit recognition experiments using the approximating and eliminating search algorithm (AESA) algorithm.
|
Keywords
metric
normalization
tree edit distance
triangle inequality
approximating and eliminating search algorithm (AESA)
|
Corresponding Author(s):
LI Yujian,Email:liyujian@bjut.edu.cn
|
Issue Date: 05 March 2011
|
|
1 |
Zhang K, Shasha D, Wang J T L. Approximate tree matching in the presence of variable length don’t cares. Journal of Algorithms , 1994, 16(1): 33-66 doi: 10.1006/jagm.1994.1003
|
2 |
Tai K C. The tree-to-tree correction problem. Journal of the ACM , 1979, 26(3): 422-433 doi: 10.1145/322139.322143
|
3 |
Zhang K, Shasha D. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing , 1989, 18(6): 1245-1262 doi: 10.1137/0218082
|
4 |
Kilpel?inen P, Mannila H. Ordered and unordered tree inclusion. SIAM Journal on Computing , 1995, 24(2): 340-356 doi: 10.1137/S0097539791218202
|
5 |
Klein P, Tirthapura S, Sharvit D, Kimia B. A tree-edit-distance algorithm for comparing simple, closed shapes. In: Proceedings of 11th Annual ACM-SIAM Symposium on Discrete Algorithms . 2000, 696-704
|
6 |
Hoffmann C M, O’Donnell M J. Pattern matching in trees. Journal of the ACM , 1982, 29(1): 68-95 doi: 10.1145/322290.322295
|
7 |
Ramesh R, Ramakrishnan I V. Nonlinear pattern matching in trees. Journal of the ACM , 1992, 39(2): 295-316 doi: 10.1145/128749.128752
|
8 |
Bille P. A survey on tree edit distance and related problems. Theoretical Computer Science , 2005, 337(1-3): 217-239 doi: 10.1016/j.tcs.2004.12.030
|
9 |
Levenshtein A. Binary codes capable of correcting deletions, insertions and reversals. Soviet Physics Doklady , 1966, 10(8): 707-710
|
10 |
Sellers P H. On the theory and computation of evolutionary distances. SIAM Journal of Applied Mathematics , 1974, 26(4): 787-793 doi: 10.1137/0126070
|
11 |
Wagner R A, Fischer M J. The string-to-string correction problem. Journal of the ACM , 1974, 21(1): 168-173 doi: 10.1145/321796.321811
|
12 |
Weigel A, Fein F. Normalizing the weighted edit distance. In: Proceedings of 12th International Conference on Pattern Recognition . 1994, 399-402
|
13 |
Yianilos P N. Normalized Forms for Two Common Metrics. Report 91–082–9027–1, revision 7. July 2002, NEC Research Inst . http://www.pnylab.com/pny/
|
14 |
Rico-Juan J R, Mico L. Comparison of AESA and LAESA search algorithms using string and tree-edit-distance. Pattern Recognition Letters , 2003, 24(9-10): 1417-1426 doi: 10.1016/S0167-8655(02)00382-3
|
15 |
Yujian L, Bo L. A normalized Levenshtein distance metric. IEEE Transactions on Pattern Analysis and Machine Intelligence , 2007, 29(6): 1091-1095 doi: 10.1109/TPAMI.2007.1078 pmid:17431306
|
16 |
Marzal A, Vidal E. Computation of normalized edit distance and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence , 1993, 15(9): 926-932 doi: 10.1109/34.232078
|
17 |
Schroeder M, Schweimeier R. Argument and misunderstandings: fuzzy unification for negotiating agents. Electronic Notes in Theoretical Computer Science , 2002, 70(5): 1-19 doi: 10.1016/S1571-0661(04)80585-1
|
18 |
Vidal E. New formulation and improvements of the nearest-neighbour approximating and eliminating search algorithm (AESA). Pattern Recognition Letters , 1994, 15(1): 1-7 doi: 10.1016/0167-8655(94)90094-9
|
19 |
Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE , 1998, 86(11): 2278-2324 doi: 10.1109/5.726791
|
20 |
Carrasco R C, Forcada M L. A note on the Nagen-draprasad-Wang-Gupta thining algorithm. Pattern Recognition Letters , 1995, 16(5): 539-541 doi: 10.1016/0167-8655(95)00121-V
|
21 |
Kong L B, Tang S W, Yang D Q, Wang T J, Gao J. Querying Techniques for XML Data. Journal of Software , 2007, 18(6): 1400-1418 (in Chinese) doi: 10.1360/jos181400
|
|
Viewed |
|
|
|
Full text
|
|
|
|
|
Abstract
|
|
|
|
|
Cited |
|
|
|
|
|
Shared |
|
|
|
|
|
Discussed |
|
|
|
|