Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Volume 17 Issue 3

RESEARCH ARTICLE
Complexity of adaptive testing in scenarios defined extensionally
Ismael RODRÍGUEZ, David RUBIO, Fernando RUBIO
Front. Comput. Sci. 2023, 17 (3): 173206.
https://doi.org/10.1007/s11704-022-1673-9

In this paper, we consider a testing setting where the set of possible definitions of the Implementation Under Test (IUT), as well as the behavior of each of these definitions in all possible interactions, are extensionally defined, i.e., on an element-by-element and case-by-case basis. Under this setting, the problem of finding the minimum testing strategy such that collected observations will necessarily let us decide whether the IUT is correct or not (i.e., whether it necessarily belongs to the set of possible correct definitions or not) is studied in four possible problem variants: with or without non-determinism; and with or without more than one possible definition in the sets of possible correct and incorrect definitions. The computational complexity of these variants is studied, and properties such as PSPACE-completeness and Log-APX-hardness are identified.

RESEARCH ARTICLE
A search-based identification of variable microservices for enterprise SaaS
Sedigheh KHOSHNEVIS
Front. Comput. Sci. 2023, 17 (3): 173208.
https://doi.org/10.1007/s11704-022-1390-4

Recently, SaaS applications are developed as a composition of microservices that serve diverse tenants having similar but different requirements, and hence, can be developed as variability-intensive microservices. Manual identification of these microservices is difficult, time-consuming, and costly, since they have to satisfy a set of quality metrics for several SaaS architecture configurations at the same time. In this paper, we tackle the multi-objective optimization problem of identifying variable microservices, aiming at optimal granularity (a newly proposed metric), commonality, and data convergence, with a search-based approach employing the MOEA/D algorithm. We empirically and experimentally evaluated the proposed method following the Goal-Question-Metric approach. The results show that the method is promising in identifying fully consistent, highly reusable, variable microservices with an acceptable multi-tenancy degree. Moreover, the identified microservices, although not structurally very similar to those identified by the expert architects, provide design quality measures (granularity, etc.) close to (and even better than) those of the experts.

Multi-dimensional information-driven many-objective software remodularization approach
Amarjeet PRAJAPATI, Anshu PARASHAR, Amit RATHEE
Front. Comput. Sci. 2023, 17 (3): 173209.
https://doi.org/10.1007/s11704-022-1449-2

Most of the search-based software remodularization (SBSR) approaches designed to address the software remodularization problem (SRP) utilize only structural information-based coupling and cohesion quality criteria. However, in practice, apart from these quality criteria, other aspects of coupling and cohesion, such as lexical and change-history information, are also required when designing the modules of software systems. Therefore, considering only limited aspects of software information in SBSR may generate a sub-optimal modularization solution. Additionally, such a modularization can be good from the quality metrics perspective but may not be acceptable to the developers. To produce a remodularization solution acceptable from both quality metrics and developers’ perspectives, this paper exploits more dimensions of software information to define the quality criteria as modularization objectives. Further, these objectives are simultaneously optimized using a tailored many-objective artificial bee colony (MaABC) to produce a remodularization solution. To assess the effectiveness of the proposed approach, we applied it to five software projects. The obtained remodularization solutions are evaluated with the software quality metrics and the developers’ view of remodularization. Results demonstrate that the proposed software remodularization is an effective approach for generating good quality modularization solutions.

Forecasting technical debt evolution in software systems: an empirical study
Lerina AVERSANO, Mario Luca BERNARDI, Marta CIMITILE, Martina IAMMARINO, Debora MONTANO
Front. Comput. Sci. 2023, 17 (3): 173210.
https://doi.org/10.1007/s11704-022-1541-7

Technical debt is considered detrimental to the long-term success of software development, but despite the numerous studies in the literature, there are still many aspects that need to be investigated for a better understanding of it. In particular, the main problems that hinder its complete understanding are the absence of a clear definition and a model for its identification, management, and forecasting. Focusing on forecasting technical debt, there is a growing notion that preventing technical debt build-up allows you to identify and address the riskiest debt items for the project before they can permanently compromise it. However, despite this high relevance, the forecast of technical debt is still little explored. To this end, this study aims to evaluate whether the quality metrics of a software system can be useful for the correct prediction of the technical debt. Therefore, the data related to the quality metrics of 8 different open-source software systems were analyzed and supplied as input to multiple machine learning algorithms to perform the prediction of the technical debt. In addition, several partitions of the initial dataset were evaluated to assess whether prediction performance could be improved by performing a data selection. The results obtained show good forecasting performance and the proposed document provides a useful approach to understanding the overall phenomenon of technical debt for practical purposes.

Exploring the tidal effect of urban business district with large-scale human mobility data
Hongting NIU, Ying SUN, Hengshu ZHU, Cong GENG, Jiuchun YANG, Hui XIONG, Bo LANG
Front. Comput. Sci. 2023, 17 (3): 173319.
https://doi.org/10.1007/s11704-022-1623-6

Business districts are urban areas that have various functions for gathering people, such as work, consumption, leisure and entertainment. Due to the dynamic nature of business activities, there exists significant tidal effect on the boundary and functionality of business districts. Indeed, effectively analyzing the tidal patterns of business districts can benefit the economic and social development of a city. However, with the implicit and complex nature of business district evolution, it is non-trivial for existing works to support the fine-grained and timely analysis on the tidal effect of business districts. To this end, we propose a data-driven and multi-dimensional framework for dynamic business district analysis. Specifically, we use the large-scale human trajectory data in urban areas to dynamically detect and forecast the boundary changes of business districts in different time periods. Then, we detect and forecast the functional changes in business districts. Experimental results on real-world trajectory data clearly demonstrate the effectiveness of our framework on detecting and predicting the boundary and functionality change of business districts. Moreover, the analysis on practical business districts shows that our method can discover meaningful patterns and provide interesting insights into the dynamics of business districts. For example, the major functions of business districts will significantly change in different time periods in a day and the rate and magnitude of boundaries varies with the functional distribution of business districts.

APPCorp: a corpus for Android privacy policy document structure analysis
Shuang LIU, Fan ZHANG, Baiyang ZHAO, Renjie GUO, Tao CHEN, Meishan ZHANG
Front. Comput. Sci. 2023, 17 (3): 173320.
https://doi.org/10.1007/s11704-022-1627-2

With the increasing popularity of mobile devices and the wide adoption of mobile Apps, an increasing concern of privacy issues is raised. Privacy policy is identified as a proper medium to indicate the legal terms, such as the general data protection regulation (GDPR), and to bind legal agreement between service providers and users. However, privacy policies are usually long and vague for end users to read and understand. It is thus important to be able to automatically analyze the document structures of privacy policies to assist user understanding. In this work we create a manually labelled corpus containing 231 privacy policies (of more than 566,000 words and 7,748 annotated paragraphs). We benchmark our data corpus with 3 document classification models and achieve more than 82% on F1-score.

Vehicle color recognition based on smooth modulation neural network with multi-scale feature fusion
Mingdi HU, Long BAI, Jiulun FAN, Sirui ZHAO, Enhong CHEN
Front. Comput. Sci. 2023, 17 (3): 173321.
https://doi.org/10.1007/s11704-022-1389-x

Vehicle Color Recognition (VCR) plays a vital role in intelligent traffic management and criminal investigation assistance. However, the existing vehicle color datasets only cover 13 classes, which cannot meet the current actual demand. Besides, although much effort has been devoted to VCR, existing methods suffer from the problem of class imbalance in datasets. To address these challenges, in this paper, we propose a novel VCR method based on Smooth Modulation Neural Network with Multi-Scale Feature Fusion (SMNN-MSFF). Specifically, to construct the benchmark of model training and evaluation, we first present a new VCR dataset with 24 vehicle classes, Vehicle Color-24, consisting of 10091 vehicle images from a 100-hour urban road surveillance video. Then, to tackle the problem of long-tail distribution and improve the recognition performance, we propose the SMNN-MSFF model with multi-scale feature fusion and smooth modulation. The former aims to extract feature information from local to global, and the latter can increase the loss of tail-class instances when training under class imbalance. Finally, comprehensive experimental evaluation on Vehicle Color-24 and three previous representative datasets demonstrates that our proposed SMNN-MSFF outperforms state-of-the-art VCR methods. Extensive ablation studies also demonstrate that each module of our method is effective; in particular, the smooth modulation efficiently helps feature learning of the minority or tail classes. Vehicle Color-24 and the code of SMNN-MSFF are publicly available and can be obtained by contacting the author.

Heterogeneous clustering via adversarial deep Bayesian generative model
Xulun YE, Jieyu ZHAO
Front. Comput. Sci. 2023, 17 (3): 173322.
https://doi.org/10.1007/s11704-022-1376-2

This paper aims to study the deep clustering problem with heterogeneous features and unknown cluster number. To address this issue, a novel deep Bayesian clustering framework is proposed. In particular, a heterogeneous feature metric is first constructed to measure the similarity between different types of features. Then, a feature metric-restricted hierarchical sample generation process is established, in which sample with heterogeneous features is clustered by generating it from a similarity constraint hidden space. When estimating the model parameters and posterior probability, the corresponding variational inference algorithm is derived and implemented. To verify our model capability, we demonstrate our model on the synthetic dataset and show the superiority of the proposed method on some real datasets. Our source code is released on the website: Github.com/yexlwh/Heterogeneousclustering.

RESEARCH ARTICLE
Gauze: enabling communication-friendly block synchronization with cuckoo filter
Xiaoqiang DING, Liushun ZHAO, Lailong LUO, Junjie XIE, Deke GUO, Jinxi LI
Front. Comput. Sci. 2023, 17 (3): 173403.
https://doi.org/10.1007/s11704-022-1685-5

Block synchronization is an essential component of blockchain systems. Traditionally, blockchain systems tend to send all the transactions from one node to another for synchronization. However, such a method may lead to an extremely high network bandwidth overhead and significant transmission latency. It is crucial to speed up such a block synchronization process and save bandwidth consumption. A feasible solution is to reduce the amount of data transmission in the block synchronization process between any pair of peers. However, existing methods based on the Bloom filter or its variants still suffer from multiple roundtrips of communications and significant synchronization delay. In this paper, we propose a novel protocol named Gauze for fast block synchronization. It utilizes the Cuckoo filter (CF) to discern the transactions in the receiver’s mempool and the block to verify, providing an efficient solution to the problem of set reconciliation in the peer-to-peer (P2P) network. With at most two rounds of exchanging and querying the CFs, the sending node can determine whether the transactions in a block are contained in the receiver’s mempool or not. Based on this information, the sender only needs to transfer the missing transactions to the receiver, which speeds up the block synchronization and saves precious bandwidth resources. The evaluation results show that Gauze outperforms existing methods in terms of the average processing latency (about 10× lower than Graphene) and the total synchronization space cost (about 10× lower than Compact Blocks) in different scenarios.
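
The set-reconciliation idea behind this abstract can be illustrated with a minimal sketch. It is not the Gauze protocol: a plain Python set of truncated hashes stands in for the Cuckoo filter, and the names fingerprint, summarize and missing_transactions are hypothetical. The receiver sends a compact summary of its mempool, and the sender transfers only the block transactions that the summary does not contain.

```python
import hashlib


def fingerprint(tx: bytes) -> int:
    """Short fingerprint of a transaction (assumption: 32 bits of SHA-256)."""
    return int.from_bytes(hashlib.sha256(tx).digest()[:4], "big")


def summarize(mempool):
    """Receiver side: compact summary of the mempool.

    A set of fingerprints is used here as a stand-in for a Cuckoo filter.
    """
    return {fingerprint(tx) for tx in mempool}


def missing_transactions(block, receiver_summary):
    """Sender side: block transactions the receiver does not appear to hold."""
    return [tx for tx in block if fingerprint(tx) not in receiver_summary]


mempool = [b"tx-a", b"tx-b", b"tx-c"]   # held by the receiver
block = [b"tx-a", b"tx-c", b"tx-d"]     # block being synchronized

to_send = missing_transactions(block, summarize(mempool))
print(to_send)
```

Because fingerprints are short, a transaction the receiver lacks can occasionally collide with one it holds and be skipped by mistake; any filter-based scheme needs a follow-up step to recover such transactions.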

A primal-dual approximation algorithm for the k-prize-collecting minimum vertex cover problem with submodular penalties
Xiaofei LIU, Weidong LI, Jinhua YANG
Front. Comput. Sci. 2023, 17 (3): 173404.
https://doi.org/10.1007/s11704-022-1665-9

In this paper, we consider the k-prize-collecting minimum vertex cover problem with submodular penalties, which generalizes the well-known minimum vertex cover problem, minimum partial vertex cover problem and minimum vertex cover problem with submodular penalties. We are given a cost graph G=(V,E;c) and an integer k. The problem asks for a vertex set S ⊆ V such that S covers at least k edges. The objective is to minimize the total cost of the vertices in S plus the penalty of the uncovered edge set, where the penalty is determined by a submodular function. We design a two-phase combinatorial algorithm based on the guessing technique and the primal-dual framework to address the problem. When the submodular penalty cost function is normalized and nondecreasing, the proposed algorithm has an approximation factor of 3. When the submodular penalty cost function is linear, the approximation factor of the proposed algorithm is reduced to 2, which is the best factor if the unique games conjecture holds.
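
The objective described in this abstract can be written out as follows. This is a sketch in notation that is partly assumed: the symbols \pi (the penalty function) and E(S) (the set of edges with at least one endpoint in S) do not appear in the abstract.

```latex
\min_{S \subseteq V} \; \sum_{v \in S} c(v) + \pi\bigl(E \setminus E(S)\bigr)
\quad \text{subject to} \quad |E(S)| \ge k,
\qquad E(S) = \{\, e \in E : e \cap S \neq \emptyset \,\}.

% \pi : 2^{E} \to \mathbb{R}_{\ge 0} is submodular:
\pi(A) + \pi(B) \ge \pi(A \cup B) + \pi(A \cap B) \quad \text{for all } A, B \subseteq E.

% Normalized and nondecreasing:
\pi(\emptyset) = 0, \qquad \pi(A) \le \pi(B) \ \text{ whenever } A \subseteq B \subseteq E.
```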

LETTER
Group relational privacy protection on time-constrained point of interests
Bo NING, Xiaonan LI, Fan YANG, Yunhao SUN, Guanyu LI, George Y. YUAN
Front. Comput. Sci. 2023, 17 (3): 173607.
https://doi.org/10.1007/s11704-022-2090-9

RESEARCH ARTICLE
Joint user profiling with hierarchical attention networks
Xiaojian LIU, Yi ZHU, Xindong WU
Front. Comput. Sci. 2023, 17 (3): 173608.
https://doi.org/10.1007/s11704-022-1437-6

User profiling by inferring user personality traits, such as age and gender, plays an increasingly important role in many real-world applications. Most existing methods for user profiling either use only one type of data or ignore handling the noisy information of data. Moreover, they usually consider this problem from only one perspective. In this paper, we propose a joint user profiling model with hierarchical attention networks (JUHA) to learn informative user representations for user profiling. Our JUHA method does user profiling based on both inner-user and inter-user features. We explore inner-user features from user behaviors (e.g., purchased items and posted blogs), and inter-user features from a user-user graph (where similar users could be connected to each other). JUHA learns basic sentence and bag representations from multiple separate sources of data (user behaviors) as the first round of data preparation. In this module, convolutional neural networks (CNNs) are introduced to capture word and sentence features of age and gender while the self-attention mechanism is exploited to weaken the noisy data. Following this, we build another bag which contains a user-user graph. Inter-user features are learned from this bag using propagation information between linked users in the graph. To acquire more robust data, inter-user features and other inner-user bag representations are joined into each sentence in the current bag to learn the final bag representation. Subsequently, all of the bag representations are integrated to learn a comprehensive user representation by the self-attention mechanism. Our experimental results demonstrate that our approach outperforms several state-of-the-art methods and improves prediction performance.

Jointly beam stealing attackers detection and localization without training: an image processing viewpoint
Yaoqi YANG, Xianglin WEI, Renhui XU, Weizheng WANG, Laixian PENG, Yangang WANG
Front. Comput. Sci. 2023, 17 (3): 173704.
https://doi.org/10.1007/s11704-022-1550-6

Recently revealed beam stealing attacks could greatly threaten the security and privacy of IEEE 802.11ad communications. The premise to restore normal network service is detecting and locating beam stealing attackers without their cooperation. Current consistency-based methods are only valid for one single attacker and are parameter-sensitive. From the viewpoint of image processing, this paper proposes an algorithm to jointly detect and locate multiple beam stealing attackers based on RSSI (Received Signal Strength Indicator) map without the training process involved in deep learning-based solutions. Firstly, an RSSI map is constructed based on interpolating the raw RSSI data for enabling high-resolution localization while reducing monitoring cost. Secondly, three image processing steps, including edge detection and segmentation, are conducted on the constructed RSSI map to detect and locate multiple attackers without any prior knowledge about the attackers. To evaluate our proposal’s performance, a series of experiments are conducted based on the collected data. Experimental results have shown that in typical parameter settings, our algorithm’s positioning error does not exceed 0.41 m with a detection rate no less than 91%.

Meaningful image encryption algorithm based on compressive sensing and integer wavelet transform
Xiaoling HUANG, Youxia DONG, Guodong YE, Yang SHI
Front. Comput. Sci. 2023, 17 (3): 173804.
https://doi.org/10.1007/s11704-022-1419-8

A new meaningful image encryption algorithm based on compressive sensing (CS) and integer wavelet transformation (IWT) is proposed in this study. First of all, the initial values of the chaotic system are encrypted by the RSA algorithm, and then they are made public as public keys. To make the chaotic sequence more random, a mathematical model is constructed to improve the random performance. Then, the plain image is compressed and encrypted to obtain the secret image. Secondly, the secret image is padded with zeros to extend it to the same size as the plain image. After applying IWT to the carrier image and discrete wavelet transformation (DWT) to the padded image, the secret image is embedded into the carrier image. Finally, a meaningful carrier image embedded with the secret plain image can be obtained by inverse IWT. Here, the measurement matrix is built by both the chaotic system and the Hadamard matrix, which not only retains the characteristics of the Hadamard matrix, but also has the property of control and synchronization of the chaotic system. In particular, the information entropy of the plain image is employed to produce the initial conditions of the chaotic system. As a result, the proposed algorithm can resist known-plaintext attack (KPA) and chosen-plaintext attack (CPA). With the help of the asymmetric cipher algorithm RSA, no extra transmission is needed in the communication. Experimental simulations show that the normalized correlation (NC) values between the host image and the cipher image are high. That is to say, the proposed encryption algorithm is imperceptible and has a good hiding effect.

DBST: a lightweight block cipher based on dynamic S-box
Liuyan YAN, Lang LI, Ying GUO
Front. Comput. Sci. 2023, 17 (3): 173805.
https://doi.org/10.1007/s11704-022-1677-5

IoT devices have been widely used with the advent of 5G. These devices carry a large amount of private data during transmission, so ensuring their security is of prime importance. Therefore, we propose a lightweight block cipher based on a dynamic S-box, named DBST. It is intended for devices with limited hardware resources and high throughput requirements. DBST is a 128-bit block cipher supporting a 64-bit key, based on a new generalized Feistel variant structure. It retains the consistency and significantly boosts the diffusion of the traditional Feistel structure. The SubColumns operation of the round function is implemented by combining bit-slice technology with subkeys. The S-box is dynamically associated with the key. It has been demonstrated that DBST has a good avalanche effect, low hardware area, and high throughput. Our S-box has been proven to have fewer differential features than the RECTANGLE S-box. The security analysis of DBST reveals that it can resist impossible differential attacks, differential attacks, linear attacks, and other types of attacks.

A fine-grained privacy protection data aggregation scheme for outsourcing smart grid
Hongyang LI, Xinghua LI, Qingfeng CHENG
Front. Comput. Sci. 2023, 17 (3): 173806.
https://doi.org/10.1007/s11704-022-2003-y

Compared with the traditional power grid, smart grid involves many advanced technologies and applications. However, due to the rapid development of various network technologies, smart grid is facing the challenges of balancing privacy, security, efficiency, and functionality. In the proposed scheme, we design a privacy protection scheme for outsourcing smart grid aided by fog computing, which supports fine-grained privacy-protected data aggregation based on user characteristics. The fog server matches the encrypted characteristics in the received message with the encrypted aggregation rules issued by the service provider. Therefore, the service provider can get more fine-grained analysis data based on user characteristics. Different from the existing outsourcing smart grid schemes, the proposed scheme can achieve real-time pricing on the premise of protecting user privacy and achieving system fault tolerance. Finally, experiment analyses demonstrate that the proposed scheme has less computation overhead and lower transmission delay than existing schemes.

REVIEW ARTICLE
In silico prediction methods of self-interacting proteins: an empirical and academic survey
Zhanheng CHEN, Zhuhong YOU, Qinhu ZHANG, Zhenhao GUO, Siguo WANG, Yanbin WANG
Front. Comput. Sci. 2023, 17 (3): 173901.
https://doi.org/10.1007/s11704-022-1563-1

In silico prediction of self-interacting proteins (SIPs) has become an important part of proteomics. There is an urgent need to develop effective and reliable prediction methods to overcome the high cost and labor intensity of traditional biological wet-lab experiments. The goal of our survey is to provide a comprehensive overview of the recent literature on computational SIP prediction, as a reference for future work. In this review, we first describe the data required for the task of SIP prediction. Then, some interesting feature extraction methods and computational models on this topic are presented. Afterwards, an empirical comparison is performed to demonstrate the prediction performance of some classifiers under different feature extraction and encoding schemes. Overall, we conclude and highlight potential methods for further enhancement of SIP prediction performance as well as related research directions.

RESEARCH ARTICLE
AE-TPGG: a novel autoencoder-based approach for single-cell RNA-seq data imputation and dimensionality reduction
Shuchang ZHAO, Li ZHANG, Xuejun LIU
Front. Comput. Sci. 2023, 17 (3): 173902.
https://doi.org/10.1007/s11704-022-2011-y

Single-cell RNA sequencing (scRNA-seq) technology has become an effective tool for high-throughput transcriptomic study, which circumvents the averaging artifacts corresponding to bulk RNA-seq technology, yielding new perspectives on the cellular diversity of potential superficially homogeneous populations. Although various sequencing techniques have decreased the amplification bias and improved capture efficiency caused by the low amount of starting material, the technical noise and biological variation are inevitably introduced into the experimental process, resulting in high dropout events, which greatly hinder the downstream analysis. Considering the bimodal expression pattern and the right-skewed characteristic present in normalized scRNA-seq data, we propose a customized autoencoder based on a two-part-generalized-gamma distribution (AE-TPGG) for scRNA-seq data analysis, which takes mixed discrete-continuous random variables of scRNA-seq data into account using a two-part model and utilizes the generalized gamma (GG) distribution for fitting the positive and right-skewed continuous data. The adopted autoencoder enables AE-TPGG to capture the inherent relationship between genes. In addition to achieving a low-dimensional representation, the AE-TPGG model also provides a denoised imputation according to the statistical characteristics of gene expression. Results on real datasets demonstrate that our proposed model is competitive with current imputation methods and ameliorates a diverse set of typical scRNA-seq data analyses.
