Improving bagging performance through multi-algorithm ensembles
Kuo-Wei HSU, Jaideep SRIVASTAVA
Front Comput Sci. 2012, 6 (5): 498-512.
https://doi.org/10.1007/s11704-012-1163-6
Working as an ensemble method that establishes a committee of classifiers first and then aggregates their outcomes through majority voting, bagging has attracted considerable research interest and been applied in various application domains. It has demonstrated several advantages, but in its present form, bagging has been found to be less accurate than some other ensemble methods. To unlock its power and expand its user base, we propose an approach that improves bagging through the use of multi-algorithm ensembles. In a multi-algorithm ensemble, multiple classification algorithms are employed. Starting from a study of the nature of diversity, we show that compared to using different training sets alone, using heterogeneous algorithms together with different training sets increases diversity in ensembles, and hence we provide a fundamental explanation for research utilizing heterogeneous algorithms. In addition, we partially address the problem of the relationship between diversity and accuracy by providing a non-linear function that describes the relationship between diversity and correlation. Furthermore, after realizing that the bootstrap procedure is the exclusive source of diversity in bagging, we use heterogeneity as another source of diversity and propose an approach utilizing heterogeneous algorithms in bagging. For evaluation, we consider several benchmark data sets from various application domains. The results indicate that, in terms of F1-measure, our approach outperforms most of the other state-of-the-art ensemble methods considered in experiments and, in terms of mean margin, our approach is superior to all the others considered in experiments.
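The general idea in this abstract (bootstrap samples plus heterogeneous base algorithms, combined by majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the two base learners (1-nearest-neighbour and nearest centroid), the round-robin assignment of algorithms to ensemble members, and all function names are assumptions chosen only to keep the example self-contained.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def train_1nn(X, y):
    """Illustrative base learner 1: predict the label of the closest training point."""
    def predict(x):
        dists = [sq_dist(x, xi) for xi in X]
        return y[dists.index(min(dists))]
    return predict


def train_centroid(X, y):
    """Illustrative base learner 2: predict the class whose mean point is closest."""
    groups = {}
    for xi, yi in zip(X, y):
        groups.setdefault(yi, []).append(xi)
    centroids = {c: [sum(col) / len(pts) for col in zip(*pts)]
                 for c, pts in groups.items()}

    def predict(x):
        return min(centroids, key=lambda c: sq_dist(x, centroids[c]))
    return predict


def multi_algorithm_bagging(X, y, trainers, n_members=9, seed=0):
    """Train each member on its own bootstrap sample, cycling through the algorithms."""
    rng = random.Random(seed)
    members = []
    for i in range(n_members):
        # Bootstrap: draw len(X) indices with replacement.
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        # Heterogeneity (assumed scheme): round-robin over the supplied algorithms.
        trainer = trainers[i % len(trainers)]
        members.append(trainer([X[j] for j in idx], [y[j] for j in idx]))

    def predict(x):
        # Aggregate member outputs by majority vote.
        votes = Counter(member(x) for member in members)
        return votes.most_common(1)[0][0]
    return predict


# Toy data: two well-separated clusters.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
ensemble = multi_algorithm_bagging(X, y, [train_1nn, train_centroid])
```

In this sketch the bootstrap resampling and the choice of algorithm are two independent sources of variation among members, which mirrors the abstract's point that heterogeneity adds a second source of diversity to bagging.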
A probabilistic model with multi-dimensional features for object extraction
Jing WANG, Zhijing LIU, Hui ZHAO
Front Comput Sci. 2012, 6 (5): 513-526.
https://doi.org/10.1007/s11704-012-1093-3
To identify recruitment information in different domains, we propose a novel model of hierarchical tree-structured conditional random fields (HT-CRFs). In our approach, first, the concept of a Web object (WOB) is discussed for the description of special Web information. Second, in contrast to traditional methods, the Boolean model and multirule are introduced to denote a one-dimensional text feature for a better representation of Web objects. Furthermore, a two-dimensional semantic texture feature is developed to discover the layout of a WOB, which can emphasize the structural attributes and the specific semantics term attributes of WOBs. Third, an optimal WOB information extraction (IE) based on HT-CRF is performed, addressing the problem of a model having an excessive dependence on the page structure and optimizing the efficiency of the model’s training. Finally, we compare the proposed model with existing decoupled approaches for WOB IE. The experimental results show that the accuracy rate of WOB IE is significantly improved and that time complexity is reduced.
Automatic object classification using motion blob based local feature fusion for traffic scene surveillance
Zhaoxiang ZHANG, Yunhong WANG
Front Comput Sci. 2012, 6 (5): 537-546.
https://doi.org/10.1007/s11704-012-1296-7
Automatic object classification in traffic scene videos is an important issue for intelligent visual surveillance with great potential for all kinds of security applications. However, this problem is very challenging for the following reasons. Firstly, regions of interest in videos are of low resolution and limited size due to the capacity of conventional surveillance cameras. Secondly, the intra-class variations are very large due to changes of view angles, lighting conditions, and environments. Thirdly, real-time performance of algorithms is always required for real applications. In this paper, we evaluate the performance of local feature descriptors for automatic object classification in traffic scenes. Image intensity or gradient information is directly used to construct effective feature vectors from regions of interest extracted via motion detection. This strategy has great advantages of efficiency compared to various complicated texture features. We not only analyze and evaluate the performance of different feature descriptors, but also fuse different scales and features to achieve better performance. Numerous experiments are conducted and experimental results demonstrate the efficiency and effectiveness of this strategy with robustness to noise, variance of view angles, lighting conditions, and environments.
A group priority earliest deadline first scheduling algorithm
Qi LI, Wei BA
Front Comput Sci. 2012, 6 (5): 560-567.
https://doi.org/10.1007/s11704-012-1104-4
In most priority scheduling algorithms, the number of priority levels is assumed to be unlimited. However, if a task set requires more priority levels than the system can support, several jobs must in practice be assigned the same priority level. To solve this problem, a novel group priority earliest deadline first (GPEDF) scheduling algorithm is presented. In this algorithm, a schedulability test is given to form a job group, in which the jobs can arbitrarily change their order without reducing the schedulability. We consider jobs in the group having the same priority level and use shortest job first (SJF) to schedule the jobs in the group to improve the performance of the system. Compared with earliest deadline first (EDF), best effort (BE), and group-EDF (gEDF), simulation results show that the new algorithm exhibits the least switching, the shortest average response time, and the fewest required priority levels. It also has a higher success ratio than both EDF and gEDF.
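The two-level ordering described in this abstract (earliest deadline first across groups, shortest job first inside a group) can be sketched as follows. The abstract does not state the schedulability test used to form a group, so this sketch substitutes a simple stand-in rule: a job joins the current group if its deadline lies within a fixed `tolerance` of the group's earliest deadline. That rule, the `Job` type, and the function name are all assumptions for illustration, not the GPEDF algorithm itself.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    deadline: float
    exec_time: float


def group_then_sjf(jobs, tolerance):
    """Return job names ordered by EDF across groups and SJF within each group.

    Grouping rule (an assumption, NOT the paper's schedulability test):
    a job joins the current group while its deadline is within `tolerance`
    of the earliest deadline in that group.
    """
    by_deadline = sorted(jobs, key=lambda j: j.deadline)
    schedule, group = [], []
    for job in by_deadline:
        if group and job.deadline - group[0].deadline > tolerance:
            # Close the current group: its jobs share one priority level,
            # so run them shortest job first.
            schedule.extend(sorted(group, key=lambda j: j.exec_time))
            group = []
        group.append(job)
    schedule.extend(sorted(group, key=lambda j: j.exec_time))
    return [j.name for j in schedule]


jobs = [
    Job("A", deadline=10, exec_time=4),
    Job("B", deadline=11, exec_time=1),
    Job("C", deadline=20, exec_time=3),
    Job("D", deadline=21, exec_time=2),
]
grouped_order = group_then_sjf(jobs, tolerance=2)
edf_order = group_then_sjf(jobs, tolerance=0)
```

With a tolerance of zero every job forms its own group and the ordering reduces to plain EDF; a larger tolerance needs fewer distinct priority levels and lets short jobs in a group finish earlier.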
Rethinking the architecture design of data center networks
Kaishun WU, Jiang XIAO, Lionel M. NI
Front Comput Sci. 2012, 6 (5): 596-603.
https://doi.org/10.1007/s11704-012-1155-6
In the rising tide of the Internet of things, more and more things in the world are connected to the Internet. Recently, data have kept growing at a rate more than four times that expected from Moore’s law. This explosion of data comes from various sources such as mobile phones, video cameras and sensor networks, which often present multidimensional characteristics. The huge amount of data poses many challenges for the IT infrastructures that manage, transport, and process it. To address these challenges, state-of-the-art large-scale data center networks have begun to provide cloud services that are increasingly prevalent. However, how to build a good data center remains an open challenge. Concurrently, the architecture design, which significantly affects the total performance, is of great research interest. This paper surveys advances in data center network design. We first introduce the upcoming trends in the data center industry. Then we review some popular design principles for today’s data center network architectures. In the third part, we present some up-to-date data center frameworks and make a comprehensive comparison of them. During the comparison, we observe that there is no so-called optimal data center and that the design should differ according to the data placement, replication, processing, and query processing. After that, several existing challenges and limitations are discussed. According to these observations, we point out some possible future research directions.
Fostering artificial societies using social learning and social control in parallel emergency management systems
Wei DUAN, Xiaogang QIU
Front Comput Sci. 2012, 6 (5): 604-610.
https://doi.org/10.1007/s11704-012-1166-3
How can we foster and grow artificial societies so as to cause social properties to emerge that are logical, consistent with real societies, and are expected by designers? We propose a framework for fostering artificial societies using social learning mechanisms and social control approaches. We present the application of fostering artificial societies in parallel emergency management systems. Then we discuss social learning mechanisms in artificial societies, including observational learning, reinforcement learning, imitation learning, and advice-based learning. Furthermore, we discuss social control approaches, including social norms, social policies, social reputations, social commitments, and sanctions.
Social influence and spread dynamics in social networks
Xiaolong ZHENG, Yongguang ZHONG, Daniel ZENG, Fei-Yue WANG
Front Comput Sci. 2012, 6 (5): 611-620.
https://doi.org/10.1007/s11704-012-1176-1
Social networks often serve as a critical medium for information dissemination, diffusion of epidemics, and spread of behavior, by shared activities or similarities between individuals. Recently, we have witnessed an explosion of interest in studying social influence and spread dynamics in social networks. To date, relatively little material has been provided on a comprehensive review in this field. This brief survey addresses this issue. We present the current significant empirical studies on real social systems, including network construction methods, measures of network, and new empirical results. We then provide a concise description of some related social models from both macro- and micro-level perspectives. Due to the difficulties in combining real data and simulation data for verifying and validating real social systems, we further emphasize the current research results of computational experiments. We hope this paper can provide researchers significant insights into better understanding the characteristics of personal influence and spread patterns in large-scale social systems.
13 articles