Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Volume 6 Issue 5

RESEARCH ARTICLE
Measure oriented training: a targeted approach to imbalanced classification problems
Bo YUAN, Wenhuang LIU
Front Comput Sci. 2012, 6 (5): 489-497.  
https://doi.org/10.1007/s11704-012-2943-8

Since the overall prediction error of a classifier on imbalanced problems can be misleading and biased, alternative performance measures such as G-mean and F-measure have been widely adopted. Various techniques, including sampling and cost-sensitive learning, are often employed to improve the performance of classifiers in such situations. However, the training process of classifiers is still largely driven by traditional error-based objective functions. As a result, there is a clear gap between the measure by which the classifier is evaluated and the way the classifier is trained. This paper investigates the prospect of explicitly using the appropriate measure itself to search the hypothesis space to bridge this gap. In the case studies, a standard three-layer neural network is used as the classifier, which is evolved by genetic algorithms (GAs) with G-mean as the objective function. Experimental results on eight benchmark problems show that the proposed method can achieve consistently favorable outcomes in comparison with a commonly used sampling technique. The effectiveness of multi-objective optimization in handling imbalanced problems is also demonstrated.
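
As a rough illustration of why overall error can mislead on imbalanced data, the sketch below computes accuracy and G-mean for a binary problem. It assumes the common binary definition of G-mean (the geometric mean of sensitivity and specificity); the function names and the toy labels are illustrative and are not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    # Guard against a class that is absent from y_true.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


def accuracy(y_true, y_pred):
    """Fraction of correct predictions (one minus the overall error)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (assumed): two positives among ten examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10          # ignores the minority class entirely
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # finds one of the two positives
```

On this toy data the always-negative classifier scores 0.8 accuracy but a G-mean of 0, which is the kind of gap between training objective and evaluation measure that the abstract describes.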

Improving bagging performance through multi-algorithm ensembles
Kuo-Wei HSU, Jaideep SRIVASTAVA
Front Comput Sci. 2012, 6 (5): 498-512.  
https://doi.org/10.1007/s11704-012-1163-6

Working as an ensemble method that establishes a committee of classifiers first and then aggregates their outcomes through majority voting, bagging has attracted considerable research interest and been applied in various application domains. It has demonstrated several advantages, but in its present form, bagging has been found to be less accurate than some other ensemble methods. To unlock its power and expand its user base, we propose an approach that improves bagging through the use of multi-algorithm ensembles. In a multi-algorithm ensemble, multiple classification algorithms are employed. Starting from a study of the nature of diversity, we show that compared to using different training sets alone, using heterogeneous algorithms together with different training sets increases diversity in ensembles, and hence we provide a fundamental explanation for research utilizing heterogeneous algorithms. In addition, we partially address the problem of the relationship between diversity and accuracy by providing a non-linear function that describes the relationship between diversity and correlation. Furthermore, after realizing that the bootstrap procedure is the exclusive source of diversity in bagging, we use heterogeneity as another source of diversity and propose an approach utilizing heterogeneous algorithms in bagging. For evaluation, we consider several benchmark data sets from various application domains. The results indicate that, in terms of F1-measure, our approach outperforms most of the other state-of-the-art ensemble methods considered in experiments and, in terms of mean margin, our approach is superior to all the others considered in experiments.
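
The two sources of diversity mentioned above (bootstrap samples and heterogeneous algorithms) can be sketched in a few lines. This is a minimal sketch, not the authors' method: the two toy learners, the round-robin assignment of learners to members, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (the bagging step)."""
    return [rng.choice(data) for _ in data]


def train_majority_class(sample):
    """Toy learner A: always predict the most common label in the sample."""
    label = Counter(y for _, y in sample).most_common(1)[0][0]
    return lambda x: label


def train_one_nn(sample):
    """Toy learner B: 1-nearest-neighbour on a single numeric feature."""
    return lambda x: min(sample, key=lambda ex: abs(ex[0] - x))[1]


def build_ensemble(data, learners, n_members, seed=0):
    """Give each member its own bootstrap sample and cycle through the learners."""
    rng = random.Random(seed)
    return [
        learners[i % len(learners)](bootstrap_sample(data, rng))
        for i in range(n_members)
    ]


def majority_vote(labels):
    """Most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def predict(ensemble, x):
    """Aggregate the members' predictions by majority voting."""
    return majority_vote([member(x) for member in ensemble])
```

Passing a single learner to `build_ensemble` gives plain bagging, where the bootstrap is the only source of diversity; passing several learners adds algorithm heterogeneity on top of it.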

A probabilistic model with multi-dimensional features for object extraction
Jing WANG, Zhijing LIU, Hui ZHAO
Front Comput Sci. 2012, 6 (5): 513-526.  
https://doi.org/10.1007/s11704-012-1093-3

To identify recruitment information in different domains, we propose a novel model of hierarchical tree-structured conditional random fields (HT-CRFs). In our approach, first, the concept of a Web object (WOB) is discussed for the description of special Web information. Second, in contrast to traditional methods, the Boolean model and multi-rule are introduced to denote a one-dimensional text feature for a better representation of Web objects. Furthermore, a two-dimensional semantic texture feature is developed to discover the layout of a WOB, which can emphasize the structural attributes and the specific semantics term attributes of WOBs. Third, an optimal WOB information extraction (IE) based on HT-CRF is performed, addressing the problem of a model having an excessive dependence on the page structure and optimizing the efficiency of the model’s training. Finally, we compare the proposed model with existing decoupled approaches for WOB IE. The experimental results show that the accuracy rate of WOB IE is significantly improved and that time complexity is reduced.

A precise approach to tracking dim-small targets using spectral fingerprint features
Hao SHENG, Chao LI, Yuanxin OUYANG, Zhang XIONG
Front Comput Sci. 2012, 6 (5): 527-536.  
https://doi.org/10.1007/s11704-012-1106-2

A precise method for tracking dim-small targets, based on spectral fingerprint features, is proposed for cases where traditional full-color tracking seems impossible. A fingerprint model is presented to adequately extract spectral features. By creating a multidimensional feature space and extending the limited RGB information to hyperspectral information, an improved precise tracking model based on a nonparametric kernel density estimator is built using the probability histogram of spectral features. A layered particle filter algorithm for spectral tracking is presented to keep the tracked object from jumping abruptly. Finally, experiments show that the tracking algorithm with spectral fingerprint features is accurate, fast, and robust, and that it adequately meets the needs of dim-small target tracking.

Automatic object classification using motion blob based local feature fusion for traffic scene surveillance
Zhaoxiang ZHANG, Yunhong WANG
Front Comput Sci. 2012, 6 (5): 537-546.  
https://doi.org/10.1007/s11704-012-1296-7

Automatic object classification in traffic scene videos is an important issue for intelligent visual surveillance with great potential for all kinds of security applications. However, this problem is very challenging for the following reasons. Firstly, regions of interest in videos are of low resolution and limited size due to the capacity of conventional surveillance cameras. Secondly, the intra-class variations are very large due to changes of view angles, lighting conditions, and environments. Thirdly, real-time performance of algorithms is always required for real applications. In this paper, we evaluate the performance of local feature descriptors for automatic object classification in traffic scenes. Image intensity or gradient information is directly used to construct effective feature vectors from regions of interest extracted via motion detection. This strategy has great advantages of efficiency compared to various complicated texture features. We not only analyze and evaluate the performance of different feature descriptors, but also fuse different scales and features to achieve better performance. Numerous experiments are conducted and experimental results demonstrate the efficiency and effectiveness of this strategy with robustness to noise, variance of view angles, lighting conditions, and environments.

Real-time urban traffic information estimation with a limited number of surveillance cameras
Guangtao XUE, Ke ZHANG, Qi HE, Hongzi ZHU
Front Comput Sci. 2012, 6 (5): 547-559.  
https://doi.org/10.1007/s11704-012-2051-9

Constant traffic congestion consumes enormous amounts of energy and causes vastly increased journey times. Therefore, real-time traffic information is of great importance to the public because such information is invaluable to more efficient traffic control and travel planning. To obtain such information in metropolises like Shanghai, however, is very challenging due to the extraordinarily large scale and complexity of the underlying road network. In this paper, we propose a novel traffic estimation scheme utilizing surveillance cameras pervasively deployed in cities. With only a limited number of roads with cameras, we adopt a measurement-based traffic matrix (TM) estimation method to infer the traffic conditions on those roads with no cameras. Extensive trace-driven simulations as well as field study results show that our scheme can achieve high accuracy with a very limited number of measurements. The accuracy of our measurement-based algorithm outperforms the traditional speed-based and model-based approaches by up to 50%.

A group priority earliest deadline first scheduling algorithm
Qi LI, Wei BA
Front Comput Sci. 2012, 6 (5): 560-567.  
https://doi.org/10.1007/s11704-012-1104-4

In most priority scheduling algorithms, the number of priority levels is assumed to be unlimited. However, if a task set requires more priority levels than the system can support, several jobs must in practice be assigned the same priority level. To solve this problem, a novel group priority earliest deadline first (GPEDF) scheduling algorithm is presented. In this algorithm, a schedulability test is given to form a job group, in which the jobs can arbitrarily change their order without reducing the schedulability. We consider jobs in the group having the same priority level and use shortest job first (SJF) to schedule the jobs in the group to improve the performance of the system. Compared with earliest deadline first (EDF), best effort (BE), and group-EDF (gEDF), simulation results show that the new algorithm exhibits the least switching, the shortest average response time, and the fewest required priority levels. It also has a higher success ratio than both EDF and gEDF.
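
The idea of grouping jobs by a schedulability test and then running shortest job first inside the group can be sketched as follows. The abstract does not state the actual GPEDF test, so the test used here is a stand-in assumption: a prefix of the deadline-ordered ready queue forms a group if its total remaining execution time fits before the earliest deadline in the group, which is sufficient for any order within the group to meet every group deadline. The `Job` fields and function names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    remaining: float  # remaining execution time
    deadline: float   # absolute deadline


def form_group(ready, now):
    """Longest EDF-ordered prefix that passes the stand-in schedulability test.

    Stand-in test (an assumption, not the paper's): the group's total
    remaining time must fit before the earliest deadline in the group.
    """
    by_deadline = sorted(ready, key=lambda job: job.deadline)
    if not by_deadline:
        return []
    slack = by_deadline[0].deadline - now
    group, total = [], 0.0
    for job in by_deadline:
        # The earliest-deadline job always joins, so the group is never empty.
        if group and total + job.remaining > slack:
            break
        group.append(job)
        total += job.remaining
    return group


def pick_next(ready, now):
    """Treat the group as one priority level and run its shortest job first."""
    group = form_group(ready, now)
    return min(group, key=lambda job: job.remaining) if group else None
```

For example, with jobs A (3 units, deadline 10), B (1 unit, deadline 11), and C (5 units, deadline 12) at time 0, all three fit before time 10, so the sketch runs B first, whereas plain EDF would start with A.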

The ClasSi coefficient for the evaluation of ranking quality in the presence of class similarities
Anca Maria IVANESCU, Marc WICHTERICH, Christian BEECKS, Thomas SEIDL
Front Comput Sci. 2012, 6 (5): 568-580.  
https://doi.org/10.1007/s11704-012-1175-2

Evaluation measures play an important role in the design of new approaches, and often quality is measured by assessing the relevance of the obtained result set. While many precision/recall-based evaluation measures assume a binary relevance model, ranking correlation coefficients are better suited for multi-class problems. State-of-the-art ranking correlation coefficients like Kendall’s τ and Spearman’s ρ do not allow the user to specify similarities between differing object classes and thus treat the transposition of objects from similar classes the same way as that of objects from dissimilar classes. We propose ClasSi, a new ranking correlation coefficient which deals with class label rankings and employs a class distance function to model the similarities between the classes. We also introduce a graphical representation of ClasSi which describes how the correlation evolves throughout the ranking.
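
To make the limitation of Kendall’s τ concrete, the sketch below implements the standard tau-a coefficient for tie-free rankings and compares two different transpositions. It illustrates only the baseline coefficient named in the abstract, not ClasSi itself, whose definition is not given here; the example rankings are assumed.

```python
from itertools import combinations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a for two equal-length rankings without ties."""
    n = len(rank_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        # Positive product: the pair is ordered the same way in both rankings.
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


# Ideal positions of four objects, and two rankings with one swap each.
ideal = [1, 2, 3, 4]
swap_middle = [1, 3, 2, 4]  # objects 2 and 3 transposed
swap_tail = [1, 2, 4, 3]    # objects 3 and 4 transposed
```

Both swapped rankings receive the same score against `ideal`, whatever classes the swapped objects belong to, which is the behaviour a class distance function is meant to refine.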

An efficient approach for continuous density queries
Jie WEN, Xiaofeng MENG, Xing HAO, Jianliang XU
Front Comput Sci. 2012, 6 (5): 581-595.  
https://doi.org/10.1007/s11704-012-1120-4

In location-based services, a density query returns the regions with high concentrations of moving objects (MOs). The use of density queries can help users identify crowded regions so as to avoid congestion. Most of the existing methods focus on improving the accuracy of query results but ignore query efficiency. However, response time is also an important concern in query processing and may have an impact on user experience. In order to address this issue, we present a new definition of continuous density queries. Our approach for processing continuous density queries is based on the new notion of a safe interval, with which the states of both dense and sparse regions are dynamically maintained. Two indexing structures are also used to index candidate regions for accelerating query processing and improving the quality of results. The efficiency and accuracy of our approach are shown through an experimental comparison with snapshot density queries.
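
A snapshot density query, the baseline mentioned at the end of the abstract, can be sketched as a single pass over the current object positions. The fixed square grid, the count threshold, and the function name below are simplifying assumptions for illustration; the paper's region definition, safe intervals, and index structures are not reproduced.

```python
from collections import Counter


def dense_cells(positions, cell_size, min_count):
    """Return the grid cells holding at least min_count moving objects.

    positions: iterable of (x, y) coordinates at one time instant.
    Cells are identified by integer (column, row) indices.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size)) for x, y in positions
    )
    return {cell for cell, count in counts.items() if count >= min_count}
```

A continuous query would have to keep this answer up to date as objects move; recomputing the snapshot at every time step is the cost that an incremental approach aims to avoid.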

REVIEW ARTICLE
Rethinking the architecture design of data center networks
Kaishun WU, Jiang XIAO, Lionel M. NI
Front Comput Sci. 2012, 6 (5): 596-603.  
https://doi.org/10.1007/s11704-012-1155-6

In the rising tide of the Internet of Things, more and more things in the world are connected to the Internet. Recently, data have kept growing at a rate more than four times that expected under Moore’s law. This explosion of data comes from various sources such as mobile phones, video cameras, and sensor networks, which often present multidimensional characteristics. The huge amount of data poses many challenges for management, transportation, and processing IT infrastructures. To address these challenges, state-of-the-art large-scale data center networks have begun to provide cloud services that are increasingly prevalent. However, how to build a good data center remains an open challenge. Concurrently, the architecture design, which significantly affects the total performance, is of great research interest. This paper surveys advances in data center network design. We first introduce the upcoming trends in the data center industry. Then we review some popular design principles for today’s data center network architectures. In the third part, we present some up-to-date data center frameworks and make a comprehensive comparison of them. During the comparison, we observe that there is no so-called optimal data center and that the design should differ according to data placement, replication, processing, and query processing. After that, several existing challenges and limitations are discussed. Based on these observations, we point out some possible future research directions.

RESEARCH ARTICLE
Fostering artificial societies using social learning and social control in parallel emergency management systems
Wei DUAN, Xiaogang QIU
Front Comput Sci. 2012, 6 (5): 604-610.  
https://doi.org/10.1007/s11704-012-1166-3

How can we foster and grow artificial societies so that the social properties that emerge are logical, consistent with real societies, and expected by designers? We propose a framework for fostering artificial societies using social learning mechanisms and social control approaches. We present the application of fostering artificial societies in parallel emergency management systems. Then we discuss social learning mechanisms in artificial societies, including observational learning, reinforcement learning, imitation learning, and advice-based learning. Furthermore, we discuss social control approaches, including social norms, social policies, social reputations, social commitments, and sanctions.

REVIEW ARTICLE
Social influence and spread dynamics in social networks
Xiaolong ZHENG, Yongguang ZHONG, Daniel ZENG, Fei-Yue WANG
Front Comput Sci. 2012, 6 (5): 611-620.  
https://doi.org/10.1007/s11704-012-1176-1

Social networks often serve as a critical medium for information dissemination, diffusion of epidemics, and spread of behavior, by shared activities or similarities between individuals. Recently, we have witnessed an explosion of interest in studying social influence and spread dynamics in social networks. To date, few comprehensive reviews of this field have been available. This brief survey addresses this issue. We present the current significant empirical studies on real social systems, including network construction methods, network measures, and new empirical results. We then provide a concise description of some related social models from both macro- and micro-level perspectives. Due to the difficulties in combining real data and simulation data for verifying and validating real social systems, we further emphasize the current research results of computational experiments. We hope this paper can provide researchers with significant insights into the characteristics of personal influence and spread patterns in large-scale social systems.

RESEARCH ARTICLE
A co-evolving memetic wrapper for prediction of patient outcomes in TCM informatics
Dion DETTERER, Paul KWAN, Cedric GONDRO
Front Comput Sci. 2012, 6 (5): 621-629.  
https://doi.org/10.1007/s11704-012-2959-0

Traditional Chinese medicine (TCM) relies on the combined effects of herbs within prescribed formulae. However, given the combinatorial explosion due to the vast number of herbs available for treatment, the study of these combined effects can become computationally intractable. Thus feature selection has become increasingly crucial as a pre-processing step prior to the study of combined effects in TCM informatics. In accord with this goal, a new feature selection algorithm known as a co-evolving memetic wrapper (COW) is proposed in this paper. COW takes advantage of recent research in genetic algorithms (GAs) and memetic algorithms (MAs) by evolving appropriate feature subsets for a given domain. Our empirical experiments have demonstrated that COW is capable of selecting subsets of herbs from a TCM insomnia dataset that show signs of combined effects on the prediction of patient outcomes measured in terms of classification accuracy. We compare the proposed algorithm with results from statistical analysis including main effects and up to three-way interaction terms and show that COW is capable of correctly identifying the herbs and herb-by-herb effects that are significantly associated with patient outcome prediction.

13 articles