From similarity perspective: a robust collaborative filtering approach for service recommendations
Min GAO, Bin LING, Linda YANG, Junhao WEN, Qingyu XIONG, Shun LI
Front. Comput. Sci. 2019, 13(2): 231-246.
https://doi.org/10.1007/s11704-017-6566-y
Collaborative filtering (CF) is a technique commonly used for personalized recommendation and Web service quality-of-service (QoS) prediction. However, CF is vulnerable to shilling attackers who inject fake user profiles into the system. In this paper, we first present the shilling attack problem on CF-based QoS recommender systems for Web services. Then, a robust CF recommendation approach is proposed from a user similarity perspective to enhance the resistance of the recommender systems to the shilling attack. In the approach, the generally used similarity measures are analyzed, and the DegSim (the degree of similarity with top-k neighbors) with those measures is selected for grouping and weighting the users. The weights are then used to calculate the service similarities/differences and predictions. We analyzed and evaluated our algorithms using the WS-DREAM and MovieLens datasets. The experimental results demonstrate that shilling attacks influence the prediction of QoS values, and that our proposed features and algorithms achieve a higher degree of robustness against shilling attacks than typical CF algorithms.
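The DegSim feature named in the abstract can be sketched as follows: each user's DegSim is the average similarity with their top-k most similar users. This is a minimal illustration assuming a Pearson-correlation similarity over co-rated items; the paper's exact measure selection and weighting scheme are not reproduced here.

```python
import numpy as np

def pearson_sim(a, b):
    """Pearson correlation over co-rated items (np.nan marks missing ratings)."""
    mask = ~np.isnan(a) & ~np.isnan(b)
    if mask.sum() < 2:
        return 0.0
    x, y = a[mask], b[mask]
    denom = np.std(x) * np.std(y)
    if denom == 0:
        return 0.0
    return float(np.mean((x - np.mean(x)) * (y - np.mean(y))) / denom)

def degsim(ratings, k=2):
    """DegSim: each user's average similarity with their top-k neighbors."""
    n = ratings.shape[0]
    sims = np.array([[pearson_sim(ratings[i], ratings[j]) if i != j else -np.inf
                      for j in range(n)] for i in range(n)])
    # sort each row ascending and average the k largest similarities
    return np.array([np.mean(np.sort(sims[i])[-k:]) for i in range(n)])

# Toy user-item rating matrix (rows = users), np.nan = unrated.
R = np.array([[5, 4, np.nan, 1],
              [4, 5, 2, np.nan],
              [1, np.nan, 5, 4],
              [5, 5, 5, 5],        # flat profile, typical of naive shilling attacks
              [np.nan, 1, 4, 5]])
scores = degsim(R, k=2)
```

Users whose DegSim deviates sharply from the population's (for example, flat injected profiles) can then be down-weighted before computing service similarities and predictions.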
DFTracker: detecting double-fetch bugs by multi-taint parallel tracking
Pengfei WANG, Kai LU, Gen LI, Xu ZHOU
Front. Comput. Sci. 2019, 13(2): 247-263.
https://doi.org/10.1007/s11704-016-6383-8
A race condition is a common trigger for concurrency bugs. As a special case, a race condition can also occur across the kernel and user space, causing a double-fetch bug, an area that has received little research attention. In our work, we first analyzed real-world double-fetch bug cases and extracted two specific patterns for double-fetch bugs. Based on these patterns, we proposed a multi-taint parallel tracking approach to detect double-fetch bugs. We also implemented a prototype called DFTracker (double-fetch bug tracker) and evaluated it with our test suite. Our experiments demonstrated that it could effectively find all the double-fetch bugs in the test suite, including eight real-world cases, with no false negatives and minor false positives. In addition, we tested it on the Linux kernel and found a new double-fetch bug. The execution overhead is approximately 2x for single-file cases and approximately 9x for the whole-kernel test, which is acceptable. To the best of the authors' knowledge, this work is the first to introduce multi-taint parallel tracking to double-fetch bug detection, an innovative method that is specific to double-fetch bug features and has better path coverage as well as lower runtime overhead than the widely used dynamic approaches.
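The core pattern (the kernel fetching the same user-space region twice within one syscall, leaving a race window in which user space can change the value) can be illustrated with a toy trace scanner. The trace format and function names below are hypothetical and only sketch the pattern; DFTracker itself works by multi-taint tracking at the kernel/user boundary, not by scanning logs.

```python
from collections import defaultdict

def find_double_fetches(trace):
    """Flag user-space addresses fetched more than once within one syscall:
    the fetched value may change between fetches (a race with user space)."""
    fetches = defaultdict(list)          # (syscall_id, addr) -> fetch positions
    for pos, (sc, op, addr) in enumerate(trace):
        if op == "fetch":
            fetches[(sc, addr)].append(pos)
    return {key: v for key, v in fetches.items() if len(v) > 1}

# Hypothetical trace of (syscall_id, operation, user_addr) events.
trace = [
    (1, "fetch", 0x7f00),  # first fetch: read a header to learn a length field
    (1, "check", 0x7f00),  # sanity-check the copied length
    (1, "fetch", 0x7f00),  # second fetch of the same region -> race window
    (2, "fetch", 0x8000),  # different syscall, single fetch: benign
]
suspects = find_double_fetches(trace)    # {(1, 0x7f00): [0, 2]}
```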
An effective method for service components selection based on micro-canonical annealing considering dependability assurance
Shichen ZOU, Junyu LIN, Huiqiang WANG, Hongwu LV, Guangsheng FENG
Front. Comput. Sci. 2019, 13(2): 264-279.
https://doi.org/10.1007/s11704-017-6317-0
Distributed virtualization changes the pattern of building software systems. However, it brings problems for dependability assurance owing to the complex social relationships and interactions between service components. The best way to solve these problems in a distributed virtualized environment is dependable service component selection, which can be modeled as finding a dependable service path, a multi-constrained optimal path problem. In this paper, a service component selection method that searches for the dependable service path in a distributed virtualized environment is proposed from the perspective of dependability assurance. The concept of Quality of Dependability is introduced to describe and constrain software system dependability during dynamic composition. We then model dependable service component selection as a multi-constrained optimal path problem and apply the Adaptive Bonus-Penalty Microcanonical Annealing algorithm to find the optimal dependable service path. The experimental results show that the proposed algorithm has a high search success rate and converges quickly.
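For readers unfamiliar with microcanonical annealing, here is a minimal sketch of the generic Creutz "demon" scheme it builds on: instead of a temperature and probabilistic acceptance, a finite energy reservoir pays for cost increases, so total energy is conserved. This illustrates only the base scheme, not the paper's Adaptive Bonus-Penalty variant; the toy cost model and numbers are illustrative.

```python
import random

def microcanonical_anneal(cost, neighbor, x0, demon0=0.05, steps=2000, seed=0):
    """Creutz-style microcanonical annealing: accept a move only if the demon
    reservoir can absorb any cost increase (cost + demon stays constant)."""
    rng = random.Random(seed)
    x, demon = x0, demon0
    best, best_cost = x0, cost(x0)
    for _ in range(steps):
        cand = neighbor(x, rng)
        delta = cost(cand) - cost(x)
        if delta <= demon:               # reservoir pays for uphill moves
            x, demon = cand, demon - delta
            if cost(x) < best_cost:
                best, best_cost = x, cost(x)
    return best, best_cost

# Toy service-path selection: pick one component per tier, minimizing the
# summed unreliability along the path (a single-constraint stand-in for the
# multi-constrained problem).
tiers = [[0.10, 0.05, 0.30], [0.20, 0.02], [0.15, 0.07, 0.01]]

def cost(path):
    return sum(tiers[i][c] for i, c in enumerate(path))

def neighbor(path, rng):
    i = rng.randrange(len(path))
    p = list(path)
    p[i] = rng.randrange(len(tiers[i]))
    return tuple(p)

best_path, best_cost = microcanonical_anneal(cost, neighbor, (0, 0, 0))
```

Because the demon gains energy on downhill moves, the walk can later climb out of shallow local minima without ever violating the conserved energy budget.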
Scene word recognition from pieces to whole
Anna ZHU, Seiichi UCHIDA
Front. Comput. Sci. 2019, 13(2): 292-301.
https://doi.org/10.1007/s11704-017-6420-2
Convolutional neural networks (CNNs) have had great success with regard to the object classification problem. For character classification, we found that training and testing CNNs with accurately segmented character regions resulted in higher accuracy than when roughly segmented regions were used. Therefore, we aim to extract complete character regions from scene images. Text in natural scene images has an obvious contrast with its surroundings. Many methods attempt to extract characters through different segmentation techniques. However, for blurred, occluded, and complex-background cases, those methods may produce adjoined or over-segmented characters. In this paper, we propose a scene word recognition model that integrates words from small pieces into the whole after cluster-based segmentation. The segmented connected components are classified into four types: background, individual character proposals, adjoined characters, and stroke proposals. Individual character proposals are input directly to a CNN that is trained on accurately segmented character images. A sliding-window strategy is applied to adjoined character regions. Stroke proposals are considered fragments of entire characters whose locations are estimated by a stroke spatial distribution system. Then, the characters estimated from adjoined characters and stroke proposals are classified by a CNN that is trained on roughly segmented character images. Finally, a lexicon-driven integration method is performed to obtain the final word recognition results. Our method achieves performance comparable to other word recognition methods on the Street View Text, ICDAR 2003, and ICDAR 2013 benchmark databases. Moreover, our method can handle occluded and improperly segmented text images.
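The sliding-window step over an adjoined-character region can be sketched as window generation alone; the character width and stride below are illustrative placeholders, and in the full pipeline each window would be scored by the CNN trained on roughly segmented characters.

```python
def sliding_windows(region_width, char_width, stride):
    """Candidate character windows (x_start, x_end) over an adjoined-character
    region; the final window is clipped to the region boundary."""
    return [(x, min(x + char_width, region_width))
            for x in range(0, max(region_width - char_width, 0) + 1, stride)]

# A 90-pixel-wide adjoined region scanned with 30-pixel windows, 15-pixel stride.
windows = sliding_windows(region_width=90, char_width=30, stride=15)
# -> [(0, 30), (15, 45), (30, 60), (45, 75), (60, 90)]
```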
CLS-Miner: efficient and effective closed high-utility itemset mining
Thu-Lan DAM, Kenli LI, Philippe FOURNIER-VIGER, Quang-Huy DUONG
Front. Comput. Sci. 2019, 13(2): 357-381.
https://doi.org/10.1007/s11704-016-6245-4
High-utility itemset mining (HUIM) is a popular data mining task with applications in numerous domains. However, traditional HUIM algorithms often produce a very large set of high-utility itemsets (HUIs). As a result, analyzing HUIs can be very time consuming for users. Moreover, a large set of HUIs also makes HUIM algorithms less efficient in terms of execution time and memory consumption. To address this problem, closed high-utility itemsets (CHUIs), concise and lossless representations of all HUIs, were proposed recently. Although mining CHUIs is useful and desirable, it remains a computationally expensive task. This is because current algorithms often generate a huge number of candidate itemsets and are unable to prune the search space effectively. In this paper, we address these issues by proposing a novel algorithm called CLS-Miner. The proposed algorithm utilizes the utility-list structure to directly compute the utilities of itemsets without producing candidates. It also introduces three novel strategies to reduce the search space, namely chain-estimated utility co-occurrence pruning, lower branch pruning, and pruning by coverage. Moreover, an effective method for checking whether an itemset is a subset of another itemset is introduced to further reduce the time required for discovering CHUIs. To evaluate the performance of the proposed algorithm and its novel strategies, extensive experiments have been conducted on six benchmark datasets having various characteristics. Results show that the proposed strategies are highly efficient and effective, that the proposed CLS-Miner algorithm outperforms the current state-of-the-art CHUD and CHUI-Miner algorithms, and that CLS-Miner scales linearly.
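The two definitions at play here, utility and closure, can be made concrete with a toy transaction database. This is a minimal sketch of the measures CLS-Miner optimizes, not of its utility-list structure or pruning strategies; the data values are illustrative.

```python
def utility(itemset, db):
    """Sum an itemset's per-transaction utilities over all transactions
    containing every item of the set (each tx maps item -> utility)."""
    return sum(sum(tx[i] for i in itemset)
               for tx in db
               if all(i in tx for i in itemset))

def support(itemset, db):
    """Number of transactions containing the whole itemset."""
    return sum(all(i in tx for i in itemset) for tx in db)

db = [{"a": 5, "b": 2, "c": 1},
      {"a": 5, "b": 4},
      {"b": 2, "c": 3}]

u_ab = utility({"a", "b"}, db)           # (5 + 2) + (5 + 4) = 16

# {a} is NOT closed: its superset {a, b} appears in exactly the same
# transactions (support 2), so reporting {a, b} alone loses no information.
assert support({"a"}, db) == support({"a", "b"}, db) == 2
```

Restricting output to closed high-utility itemsets is what makes the representation concise yet lossless: every HUI can be recovered from some closed superset with equal support.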
Differentially private high-dimensional data publication via grouping and truncating techniques
Ning WANG, Yu GU, Jia XU, Fangfang LI, Ge YU
Front. Comput. Sci. 2019, 13(2): 382-395.
https://doi.org/10.1007/s11704-017-6591-x
The count of one column in a high-dimensional dataset, i.e., the number of records containing that column, has been widely used in numerous applications, such as analyzing popular spots based on check-in location information and mining valuable items from shopping records. However, directly publishing this information poses a privacy threat. Differential privacy (DP), as a notable paradigm for strong privacy guarantees, is thereby adopted to publish all column counts. Prior studies have verified that truncating records or grouping columns can effectively improve the accuracy of the published results. To leverage the advantages of both techniques, we combine them to further boost accuracy. However, the traditional penalty function, which measures the error introduced by a given pair of parameters (truncating length and group size), is so sensitive that the derived parameters deviate significantly from the optimal ones. To output preferable parameters, we first design a smart penalty function that is less sensitive than the traditional one. Moreover, a two-phase selection method is proposed to compute these parameters efficiently while further improving accuracy. Extensive experiments on a broad spectrum of real-world datasets validate the effectiveness of our proposals.
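Why truncation helps can be sketched directly: capping each record at ℓ columns bounds any one record's influence on the counts, so Laplace noise of scale ℓ/ε suffices for ε-DP instead of noise scaled to the full record length. The simple prefix truncation below is illustrative only; the paper's contribution is choosing the truncating length and group size well, which this sketch does not reproduce.

```python
import numpy as np

def truncated_counts(records, columns, ell):
    """Truncate each record to at most ell columns, then count per column.
    Truncation caps each record's contribution, dropping sensitivity to ell."""
    counts = {c: 0 for c in columns}
    for rec in records:
        for c in rec[:ell]:              # naive prefix truncation (illustrative)
            counts[c] += 1
    return counts

def publish_dp_counts(records, columns, ell, epsilon, seed=0):
    """Publish counts with Laplace noise of scale ell / epsilon per column."""
    rng = np.random.default_rng(seed)
    counts = truncated_counts(records, columns, ell)
    return {c: counts[c] + rng.laplace(scale=ell / epsilon) for c in columns}

records = [["a", "b", "c"], ["a", "c"], ["b", "c", "d"], ["a"]]
noisy = publish_dp_counts(records, ["a", "b", "c", "d"], ell=2, epsilon=1.0)
```

The trade-off is visible even in this toy: truncation biases counts downward (the first record's "c" is dropped), but the smaller noise scale can more than compensate, which is exactly the balance the penalty function is meant to score.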
Cursor momentum for fascination measurement
Yu HONG, Kai WANG, Weiyi GE, Yingying QIU, Guodong ZHOU
Front. Comput. Sci. 2019, 13(2): 396-412.
https://doi.org/10.1007/s11704-017-6607-6
We present a very different cause of search engine user behavior: fascination. It is generally identified as the initial effect of a product attribute on users' interest and purchase intentions. Considering that in most cases the cursor is driven directly by a hand moving a mouse (or touchpad), we use cursor movement as the critical feature to analyze a user's reaction to fascinating search results. This paper provides a deep insight into the goal-directed cursor movement that occurs within a remarkably short period of time (<30 milliseconds), which is the interval between a user's click-through and decision-making behaviors. Instead of the fundamentals, we focus on revealing the characteristics of this split-second cursor movement. Our empirical findings show that a user may push or pull the mouse with slightly greater strength when fascinated by a search result. As a result, the cursor slides toward the search result with increased momentum. We model the momentum through a combination of translational and angular kinetic energy calculations, and, based on Fitts' law, we implement goal-directed cursor movement identification. Supported by the momentum, together with other physical features, we build different fascination-based search result re-ranking systems. Our experiments show that goal-directed cursor momentum is an effective feature for detecting fascination; in particular, it is feasible in both personalized and cross-media cases. In addition, we detail the advantages and disadvantages of both click-through rate and cursor momentum for re-ranking search results.
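The combined translational and angular kinetic energy of a cursor trace can be sketched as E = ½mv² + ½Iω² accumulated over consecutive samples. The unit mass m and moment of inertia I below are illustrative constants, not values from the paper, and the trace format is a hypothetical list of timestamped positions.

```python
import math

def cursor_energy(samples, m=1.0, inertia=1.0):
    """Sum 0.5*m*v^2 + 0.5*I*w^2 over consecutive cursor samples.
    samples: list of (t, x, y); v is segment speed, w the heading change rate."""
    energy = 0.0
    prev_angle = None
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue
        v = math.hypot(x1 - x0, y1 - y0) / dt        # translational speed
        angle = math.atan2(y1 - y0, x1 - x0)         # heading of this segment
        w = 0.0 if prev_angle is None else (angle - prev_angle) / dt
        prev_angle = angle
        energy += 0.5 * m * v**2 + 0.5 * inertia * w**2
    return energy

# A faster straight slide toward a result carries more kinetic energy
# (and hence momentum) than a slower one over the same interval.
slow = [(0.00, 0, 0), (0.01, 2, 0), (0.02, 4, 0)]
fast = [(0.00, 0, 0), (0.01, 6, 0), (0.02, 12, 0)]
assert cursor_energy(fast) > cursor_energy(slow)
```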
Physical-barrier detection based collective motion analysis
Gaoqi HE, Qi CHEN, Dongxu JIANG, Yubo YUAN, Xingjian LU
Front. Comput. Sci. 2019, 13(2): 426-436.
https://doi.org/10.1007/s11704-018-7165-2
Collective motion is one of the most fascinating phenomena and is mainly caused by the interactions between individuals. Physical barriers, the particular facilities that divide a crowd into different lanes, greatly affect the measurement of such interactions. In this paper, we propose the physical-barrier detection based collective motion analysis (PDCMA) approach. The main idea is that the interaction between spatially adjacent pedestrians does not actually exist if they are separated by a physical barrier. First, the physical barriers are extracted by two-stage clustering, and the scene is automatically divided into several motion regions. Second, local region collectiveness is calculated to represent the interactions between pedestrians in each region. Finally, extensive evaluations using three typical methods, i.e., PDCMA, Collectiveness, and average normalized Velocity, show the efficiency and efficacy of our approach in scenes with and without physical barriers. Moreover, several escalator scenes are selected as typical physical-barrier test scenes to demonstrate the performance of our approach. Compared with current collective motion analysis methods, our approach adapts better to scenes with physical barriers.
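Why per-region scoring matters can be sketched with a simple proxy for collectiveness: the mean pairwise cosine similarity of pedestrian velocities. This is an illustrative stand-in, not the paper's collectiveness definition; the velocity values model two opposing lanes separated by a barrier, such as paired escalators.

```python
import numpy as np

def region_collectiveness(velocities):
    """Mean pairwise cosine similarity of velocities within one motion region;
    a simple proxy for interaction-based collectiveness."""
    V = np.asarray(velocities, dtype=float)
    n = len(V)
    if n < 2:
        return 1.0
    U = V / np.clip(np.linalg.norm(V, axis=1, keepdims=True), 1e-12, None)
    S = U @ U.T                          # cosine similarity matrix
    return float((S.sum() - n) / (n * (n - 1)))  # off-diagonal average

# Two lanes separated by a physical barrier: score each region separately so
# the opposing flows across the barrier do not cancel each other out.
lane_up   = [(0.0, 1.0), (0.1, 1.0), (-0.1, 0.9)]
lane_down = [(0.0, -1.0), (0.05, -1.0)]
c_up   = region_collectiveness(lane_up)
c_down = region_collectiveness(lane_down)
mixed  = region_collectiveness(lane_up + lane_down)   # barrier ignored
assert min(c_up, c_down) > mixed
```

Ignoring the barrier mixes the opposing lanes into one region and drags the score down, even though each lane on its own moves highly collectively; this is the measurement error PDCMA's barrier detection removes.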
18 articles