Clustered Reinforcement Learning

doi:10.1007/s11704-024-3194-1

Front. Comput. Sci.

2025, Vol. 19

Issue (4) : 194313 https://doi.org/10.1007/s11704-024-3194-1

Artificial Intelligence

Clustered Reinforcement Learning

Xiao MA¹, Shen-Yi ZHAO¹, Zhao-Heng YIN², Wu-Jun LI¹(

)

¹. National Key Laboratory for Novel Software Technology, Department of Computer Science and Technology, Nanjing University, Nanjing 210023, China
². Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA 94720-1770, USA

Download: PDF(2856 KB) HTML
Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks

Abstract

Exploration strategy design is a challenging problem in reinforcement learning (RL), especially when the environment contains a large state space or sparse rewards. During exploration, the agent tries to discover unexplored (novel) areas or high reward (quality) areas. Most existing methods perform exploration by only utilizing the novelty of states. The novelty and quality in the neighboring area of the current state have not been well utilized to simultaneously guide the agent’s exploration. To address this problem, this paper proposes a novel RL framework, called clustered reinforcement learning (CRL), for efficient exploration in RL. CRL adopts clustering to divide the collected states into several clusters, based on which a bonus reward reflecting both novelty and quality in the neighboring area (cluster) of the current state is given to the agent. CRL leverages these bonus rewards to guide the agent to perform efficient exploration. Moreover, CRL can be combined with existing exploration strategies to improve their performance, as the bonus rewards employed by these existing exploration strategies solely capture the novelty of states. Experiments on four continuous control tasks and six hard-exploration Atari-2600 games show that our method can outperform other state-of-the-art methods to achieve the best performance.

Keywords deep reinforcement learning exploration count-based method clustering K-means

Corresponding Author(s): Wu-Jun LI

About author:

Just Accepted Date: 28 February 2024 Issue Date: 20 May 2024

Cite this article:

Xiao MA,Shen-Yi ZHAO,Zhao-Heng YIN, et al. Clustered Reinforcement Learning[J]. Front. Comput. Sci., 2025, 19(4): 194313.

URL:

https://academic.hep.com.cn/fcs/EN/10.1007/s11704-024-3194-1
https://academic.hep.com.cn/fcs/EN/Y2025/V19/I4/194313

Fig.1 (a) Using clustering to divide the collected states (blue dots) into 5 clusters. The agent is rewarded with 1 in the orange area and receives no reward in other areas; (b) the clustering-based bonus rewards with novelty alone (

η = 1.0

); (c) the clustering-based bonus rewards (

η = 0.5

). The blue bar represents the portion of bonus rewards reflecting the novelty of states, and the orange bar represents the portion reflecting the quality of states

Fig.2 A snapshot of Mountain Car

Fig.3 (a) The total number of novel states

N Novelty

(y-axis) of Hash on Mountain Car with different lengths of hash codes

d

(x-axis) when the bonus coefficient is set to

0.1

; (b) the mean average return (y-axis) of Hash on Mountain Car with different lengths of hash codes (x-axis) when the bonus coefficient is set to

0.1

. The results are averaged over 100 random seeds

Fig.4 The training curve of

CRL

on coordinate-based Mountain Car when

{K, β, η} = {32, 0.01, 0.1}

Fig.5 Coordinate-based Mountain Car. (a) All dots represent states collected in iteration 0. Green dots represent states allocated to cluster

1 (0)

and blue dots represent states allocated to cluster

2 (0)

; (b) all dots represent states collected in iteration 5. Green dots represent states allocated to cluster

1 (5)

and blue dots represent states allocated to cluster

2 (5)

. The x-coordinate is the agent’s horizontal position, and the y-coordinate is the agent’s horizontal velocity

Fig.6 The training curve of

CRL

on pixel-based Mountain Car when

{K, β, η} = {32, 0.01, 0.1}

Fig.7 Pixel-based Mountain Car. (a) All dots represent states collected in iteration 10. Green dots represent states allocated to cluster

1 (10)

and blue dots represent states allocated to cluster

2 (10)

; (b) all dots represent states collected in iteration 14. Green dots represent states allocated to cluster

1 (14)

and blue dots represent states allocated to cluster

2 (14)

. The x-coordinate is the agent’s horizontal position, and the y-coordinate is the agent’s horizontal velocity

Tab.1 The results of our method and all baselines on four continuous control tasks over 5 random seeds. For our method, the numbers in parentheses indicate the values of

η

. Boldface numbers are the best results among all methods

Fig.8 Training curves of TRPO, Hash, VIME, and

CRL

on Mountain Car, Cart-Pole Swing up, Double Pendulum and Half-Cheetah. The results are averaged over 5 random seeds. The solid line represents the mean average return, and the shaded area represents one standard deviation. On Mountain Car and Half-Cheetah, the training curves of TRPO coincide with the x-axis. (a) Mountain Car; (b) Cart-Pole Swing up; (c) Double Pendulum; (d) Half-Cheetah

Layer	Configuration
conv 1	filter $32 × 8 × 8$ , stride 4, Leaky RELU
conv 2	filter $64 × 4 × 4$ , stride 2, Leaky RELU
conv 3	filter $64 × 3 × 3$ , stride 1, Leaky RELU
full 4	256 units

Tab.2 Configuration of the network for the random feature on Atari-2600 games

Method	Freeway	Frostbite	Gravitar	Montezuma	Solaris	Venture
TRPO [58]	17.55	1229.66	500.33	0	2110.22	283.48
$CRL$	30.80 (0.75)	4337.98 (0.1)	552.46 (0.1)	0 (0.75)	3672.55 (0.5)	312.40 (0.1)
Hash [24]	22.29	2954.10	577.47	0	2619.32	299.61
$CRL-Hash$	28.38 (0.75)	4148.90 (0.1)	585.79 (0.1)	0 (0.75)	2741.48 (0.5)	328.50 (0.1)
RND [41]	21.52	2837.70	867.30	2188.80	765.47	966.00
$CRL-RND$	20.85 (0.9)	4076.60 (0.9)	1002.40 (0.75)	2453.30 (0.5)	1021.60 (0.5)	981.20 (0.9)
NovelD [42]	21.39	3476.46	677.90	1744.80	975.52	283.60
$CRL-NovelD$	19.97 (0.9)	3520.06 (0.9)	971.50 (0.5)	2323.40 (0.5)	980.16 (0.5)	498.60 (0.9)

Tab.3 The mean average return of our method and baselines after training for 200M frames on six hard-exploration Atari-2600 games over 5 random seeds. For our method, the numbers in parentheses indicate the values of

η

. The Boldface numbers are the best results among all methods

Method	Freeway	Frostbite	Gravitar	Montezuma	Solaris	Venture
Hash [24]	22.29	2954.10	577.47	0	2619.32	299.61
$CRL$	30.80	4337.98	552.46	0	3672.55	312.40
$Hash R F$ [24]	27.28	5530.79	520.67	0	2470.54	72.30
$CRL R F$	28.60	4444.63	572.74	0	2891.14	190.18
$Hash B A S S$ [24]	32.18	2958.44	524.28	265.16	2372.05	401.08
$CRL B A S S$	31.60	6173.75	602.60	379.68	3397.51	582.69

Tab.4 The mean average return of CRL and Hash when adopting different features after training for 200M frames on six hard-exploration Atari-2600 games over 5 random seeds

Fig.9 Training curves of TRPO,

Hash B A S S

and

CRL B A S S

on six hard-exploration Atari-2600 games. The results are averaged over 5 random seeds. The solid line represents the mean average return and the shaded area represents one standard deviation. (a) Freeway; (b) Frostbite; (c) Gravitar; (d) Montezuma’s Revenge; (e) Solaris; (f) Venture

Fig.10 Sensitivity to

K

(x-axis) of

CRL

and

CRL B A S S

on Freeway when

β = 0.1

η = 0.25

and

β = 0.1

η = 0.75

. The results are averaged over 5 random seeds. The y-axis represents mean average return, which is averaged over 5 random seeds after training for 500 iterations. (a) CRL. β=0.1, η=0.25; (b) CRL. β=0.1, η=0.75; (c) CRL_BASS. β=0.1, η=0.25; (d) CRL_BASS. β=0.1, η=0.75

Fig.11 Visualization of states collected during iteration 0 on Freeway, along with their corresponding cluster centers for

K = 16

MAR	$CRL$					MAR	$CRL B A S S$
MAR	$η$					MAR	$η$
$β$	0.1	0.25	0.5	0.75	0.9	$β$	0.1	0.25	0.5	0.75	0.9
0	17.55	17.55	17.55	17.55	17.55	0	17.55	17.55	17.55	17.55	17.55
0.01	25.60	27.67	24.21	25.09	24.38	0.01	23.52	24.42	27.15	24.92	23.54
0.1	30.10	30.72	28.43	30.80	29.66	0.1	30.07	31.35	29.89	31.60	23.28
1	25.19	28.52	28.75	23.79	24.35	1	23.53	29.15	22.85	22.82	24.22

Tab.5 Effect of

β

and

η

using

CRL

and

CRL B A S S

on Freeway when

β

is chosen from

{0, 0.01, 0.1, 1}

η

is chosen from

{0.1, 0.25, 0.5, 0.75, 0.9}

, and

K

is set to 16. The results are averaged over 5 random seeds after 500 iterations

MAR	$β = 0.01$		$β = 0.1$
$K$	$η = 0.25$	$η = 0.75$	$η = 0.25$	$η = 0.75$
16	1.0	1.0	1.0	1.0
24	1.0	1.0	1.0	1.0
32	1.0	1.0	1.0	1.0
40	1.0	1.0	1.0	1.0

Tab.6 The mean average return of

CRL

on Mountain Car with different settings. The results are averaged over 5 random seeds

Fig.12 Mean average return of CRL and Hash with respect to wall-clock time on Freeway during the first 50 iterations

1	R S, Sutton A G Barto . Reinforcement Learning: an Introduction. Cambridge: MIT Press, 1998
2	V, Mnih K, Kavukcuoglu D, Silver A A, Rusu J, Veness M G, Bellemare A, Graves M, Riedmiller A K, Fidjeland G, Ostrovski S, Petersen C, Beattie A, Sadik I, Antonoglou H, King D, Kumaran D, Wierstra S, Legg D Hassabis . Human-level control through deep reinforcement learning. Nature, 2015, 518( 7540): 529–533
3	D, Silver A, Huang C J, Maddison A, Guez L, Sifre Den Driessche G, Van J, Schrittwieser I, Antonoglou V, Panneershelvam M, Lanctot S, Dieleman D, Grewe J, Nham N, Kalchbrenner I, Sutskever T P, Lillicrap M, Leach K, Kavukcuoglu T, Graepel D Hassabis . Mastering the game of go with deep neural networks and tree search. Nature, 2016, 529( 7587): 484–489
4	G, Lample D S Chaplot . Playing FPS games with deep reinforcement learning. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. 2017, 2140−2146
5	A P, Badia B, Piot S, Kapturowski P, Sprechmann A, Vitvitskyi D, Guo C Blundell . Agent57: Outperforming the Atari human benchmark. In: Proceedings of the 37th International Conference on Machine Learning. 2020, 48
6	X, Ma W J Li . State-based episodic memory for multi-agent reinforcement learning. Machine Learning, 2023, 112( 12): 5163–5190
7	B, Singh R, Kumar V P Singh . Reinforcement learning in robotic applications: a comprehensive survey. Artificial Intelligence Review, 2022, 55( 2): 945–990
8	Y, Wen J, Si A, Brandt X, Gao H H Huang . Online reinforcement learning control for the personalization of a robotic knee prosthesis. IEEE Transactions on Cybernetics, 2020, 50( 6): 2346–2356
9	T P, Lillicrap J J, Hunt A, Pritzel N, Heess T, Erez Y, Tassa D, Silver D Wierstra . Continuous control with deep reinforcement learning. In: Proceedings of the 4th International Conference on Learning Representations. 2016
10	Y, Duan X, Chen R, Houthooft J, Schulman P Abbeel . Benchmarking deep reinforcement learning for continuous control. In: Proceedings of the 33rd International Conference on Machine Learning. 2016, 1329−1338
11	H, Modares I, Ranatunga F L, Lewis D O Popa . Optimized assistive human-robot interaction using reinforcement learning. IEEE Transactions on Cybernetics, 2016, 46( 3): 655–667
12	S Amarjyoti . Deep reinforcement learning for robotic manipulation-the state of the art. 2017, arXiv preprint arXiv: 1701.08878
13	Y, Xu M, Fang L, Chen Y, Du J, Zhou C Zhang . Perceiving the world: Question-guided reinforcement learning for text-based games. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 2022, 538−560
14	D, Ghalandari C, Hokamp G Ifrim . Efficient unsupervised sentence compression by fine-tuning transformers with reinforcement learning. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. 2022, 1267−1280
15	H, Li Y, Hu Y, Cao G, Zhou P Luo . Rich-text document styling restoration via reinforcement learning. Frontiers of Computer Science, 2021, 15( 4): 154328
16	K L A, Yau K H, Kwong C Shen . Reinforcement learning models for scheduling in wireless networks. Frontiers of Computer Science, 2013, 7( 5): 754–766
17	Y, Qin H, Wang S, Yi X, Li L Zhai . A multi-objective reinforcement learning algorithm for deadline constrained scientific workflow scheduling in clouds. Frontiers of Computer Science, 2021, 15( 5): 155105
18	Y C, Lin C T, Chen C Y, Sang S H Huang . Multiagent-based deep reinforcement learning for risk-shifting portfolio management. Applied Soft Computing, 2022, 123: 108894
19	Y, Zhang P, Zhao Q, Wu B, Li J, Huang M Tan . Cost-sensitive portfolio selection via deep reinforcement learning. IEEE Transactions on Knowledge and Data Engineering, 2022, 34( 1): 236–248
20	X, Li C, Cui D, Cao J, Du C Zhang . Hypergraph-based reinforcement learning for stock portfolio selection. In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing. 2022, 4028−4032
21	K, Xu Y, Zhang D, Ye P, Zhao M Tan . Relation-aware transformer for portfolio policy learning. In: Proceedings of the 29th International Joint Conference on Artificial Intelligence. 2020, 641
22	Z, Wang B, Huang S, Tu K, Zhang L Xu . DeepTrader: A deep reinforcement learning approach for risk-return balanced portfolio management with market conditions embedding. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. 2021, 643−650
23	L, Ouyang J, Wu X, Jiang D, Almeida C L, Wainwright P, Mishkin C, Zhang S, Agarwal K, Slama A, Ray J, Schulman J, Hilton F, Kelton L, Miller M, Simens A, Askell P, Welinder P F, Christiano J, Leike R Lowe . Training language models to follow instructions with human feedback. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. 2022
24	H R, Tang R, Houthooft D, Foote A, Stooke X, Chen Y, Duan J, Schulman Turck F, De P Abbeel . #exploration: A study of count-based exploration for deep reinforcement learning. In: Proceedings of the 31th International Conference on Neural Information Processing Systems. 2017, 2753−2762
25	H, Qian Y Yu . Derivative-free reinforcement learning: a review. Frontiers of Computer Science, 2021, 15( 6): 156336
26	O, Chapelle L Li . An empirical evaluation of Thompson sampling. In: Proceedings of the 24th International Conference on Neural Information Processing Systems. 2011, 2249−2257
27	V, Mnih A P, Badia M, Mirza A, Graves T, Harley T P, Lillicrap D, Silver K Kavukcuoglu . Asynchronous methods for deep reinforcement learning. In: Proceedings of the 33rd International Conference on Machine Learning. 2016, 1928−1937
28	M, Fortunato M G, Azar B, Piot J, Menick M, Hessel I, Osband A, Graves V, Mnih R, Munos D, Hassabis O, Pietquin C, Blundell S Legg . Noisy networks for exploration. In: Proceedings of the 6th International Conference on Learning Representations. 2018
29	M, Plappert R, Houthooft P, Dhariwal S, Sidor R Y, Chen X, Chen T, Asfour P, Abbeel M Andrychowicz . Parameter space noise for exploration. In: Proceedings of the 6th International Conference on Learning Representations. 2018
30	I, Osband C, Blundell A, Pritzel Roy B Van . Deep exploration via bootstrapped DQN. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 4033−4041
31	I, Osband Roy B, Van D J, Russo Z Wen . Deep exploration via randomized value functions. Journal of Machine Learning Research, 2019, 20( 124): 1–62
32	M, Kearns S Singh . Near-optimal reinforcement learning in polynomial time. Machine Learning, 2002, 49(2−3): 209−232
33	R I, Brafman M Tennenholtz . R-MAX - a general polynomial time algorithm for near-optimal reinforcement learning. Journal of Machine Learning Research, 2003, 3: 213–231
34	M G, Bellemare S, Srinivasan G, Ostrovski T, Schaul D, Saxton R Munos . Unifying count-based exploration and intrinsic motivation. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 1479−1487
35	G, Ostrovski M G, Bellemare Den Oord A, Van R Munos . Count-based exploration with neural density models. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 2721−2730
36	R, Houthooft X, Chen Y, Duan J, Schulman Turck F, De P Abbeel . VIME: variational information maximizing exploration. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. 2016, 1117−1125
37	B C, Stadie S, Levine P Abbeel . Incentivizing exploration in reinforcement learning with deep predictive models. 2015, arXiv preprint arXiv: 1507.00814
38	D, Pathak P, Agrawal A A, Efros T Darrell . Curiosity-driven exploration by self-supervised prediction. In: Proceedings of the 34th International Conference on Machine Learning. 2017, 2778−2787
39	A S, Klyubin D, Polani C L Nehaniv . Empowerment: a universal agent-centric measure of control. In: Proceedings of the IEEE Congress on Evolutionary Computation. 2005, 128−135
40	J, Fu J D, Co-Reyes S Levine . EX2: exploration with exemplar models for deep reinforcement learning. In: Proceedings of the 31th International Conference on Neural Information Processing Systems. 2017, 2577−2587
41	Y, Burda H, Edwards A J, Storkey O Klimov . Exploration by random network distillation. In: Proceedings of the 7th International Conference on Learning Representations. 2019
42	T, Zhang H, Xu X, Wang Y, Wu K, Keutzer J E, Gonzalez Y Tian . NovelD: A simple yet effective exploration criterion. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. 2021, 25217−25230
43	P, Auer R Ortner . Logarithmic online regret bounds for undiscounted reinforcement learning. In: Proceedings of the 19th International Conference on Neural Information Processing Systems. 2006, 49−56
44	I, Osband D, Russo Roy B Van . (More) efficient reinforcement learning via posterior sampling. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. 2013, 3003−3011
45	A, Ecoffet J, Huizinga J, Lehman K O, Stanley J Clune . Go-explore: a new approach for hard-exploration problems. 2019, arXiv preprint arXiv: 1901.10995
46	A, Ecoffet J, Huizinga J, Lehman K O, Stanley J Clune . First return, then explore. Nature, 2021, 590( 7847): 580–586
47	M G, Bellemare Y, Naddaf J, Veness M Bowling . The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 2013, 47: 253–279
48	A L, Strehl M L Littman . An analysis of model-based interval estimation for Markov decision processes. Journal of Computer and System Sciences, 2008, 74( 8): 1309–1331
49	R Ortner . Adaptive aggregation for reinforcement learning in average reward Markov decision processes. Annals of Operations Research, 2013, 208( 1): 321–336
50	A G Barto . Intrinsic motivation and reinforcement learning. In: Baldassarre G, Mirolli M, eds. Intrinsically Motivated Learning in Natural and Artificial Systems. Berlin: Springer, 2013, 17−47
51	D E Berlyne . Structure and Direction in Thinking. Hoboken: Wiley, 1965
52	S, Mannor I, Menache A, Hoze U Klein . Dynamic abstraction in reinforcement learning via clustering. In: Proceedings of the 21st International Conference on Machine Learning. 2004
53	N, Tziortziotis K Blekas . A model based reinforcement learning approach using on-line clustering. In: Proceedings of the IEEE International Conference on Tools with Artificial Intelligence. 2012, 712−718
54	T, Wang T, Gupta A, Mahajan B, Peng S, Whiteson C J Zhang . RODE: learning roles to decompose multi-agent tasks. In: Proceedings of the 9th International Conference on Learning Representations. 2021
55	F, Christianos G, Papoudakis A, Rahman S V Albrecht . Scaling multi-agent reinforcement learning with selective parameter sharing. In: Proceedings of the 38th International Conference on Machine Learning. 2021, 1989−1998
56	T, Mandel Y E, Liu E, Brunskill Z Popovic . Efficient Bayesian clustering for reinforcement learning. In: Proceedings of the 25th International Joint Conference on Artificial Intelligence. 2016, 1830−1838
57	Coates A, Ng A Y. Learning feature representations with K-means. In: Montavon G, Orr G B, Müller K R, eds. Neural Networks: Tricks of the Trade. 2nd ed. Berlin: Springer, 2012, 561−580
58	J, Schulman S, Levine P, Moritz M, Jordan P Abbeel . Trust region policy optimization. In: Proceedings of the 32nd International Conference on Machine Learning. 2015, 1889−1897
59	Y, Burda H, Edwards D, Pathak A J, Storkey T, Darrell A A Efros . Large-scale study of curiosity-driven learning. In: Proceedings of the 7th International Conference on Learning Representations. 2019
60	K, Wang K, Zhou B, Kang J, Feng S Yan . Revisiting intrinsic reward for exploration in procedurally generated environments. In: Proceedings of the 11th International Conference on Learning Representations. 2023
61	M S Charikar . Similarity estimation techniques from rounding algorithms. In: Proceedings of the 34th Annual ACM Symposium on Theory of Computing. 2002, 380−388
62	C, Voloshin H M, Le N, Jiang Y Yue . Empirical study of off-policy policy evaluation for reinforcement learning. In: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks. 2021
63	V, Nair G E Hinton . Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning. 2010, 807−814
64	A L, Maas A Y, Hannun A Y Ng . Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the 30th International Conference on Machine Learning. 2013
65	Der Maaten L, Van G Hinton . Visualizing data using t-SNE. Journal of Machine Learning Research, 2008, 9( 86): 2579–2605

[1]

FCS-23194-OF-XM_suppl_1

Download

[1]	Jingyu LIU, Shi CHEN, Li SHEN. A comprehensive survey on graph neural network accelerators[J]. Front. Comput. Sci., 2025, 19(2): 192104-.
[2]	Yuya CUI, Degan ZHANG, Jie ZHANG, Ting ZHANG, Lixiang CAO, Lu CHEN. Multi-user reinforcement learning based task migration in mobile edge computing[J]. Front. Comput. Sci., 2024, 18(4): 184504-.
[3]	Zhe YUAN, Zhewei WEI, Fangrui LV, Ji-Rong WEN. Index-free triangle-based graph local clustering[J]. Front. Comput. Sci., 2024, 18(3): 183404-.
[4]	Jing ZHANG, Ruidong FAN, Hong TAO, Jiacheng JIANG, Chenping HOU. Constrained clustering with weak label prior[J]. Front. Comput. Sci., 2024, 18(3): 183338-.
[5]	Bo YANG, Xiuyin MA, Chunhui WANG, Haoran GUO, Huai LIU, Zhi JIN. User story clustering in agile development: a framework and an empirical study[J]. Front. Comput. Sci., 2023, 17(6): 176213-.
[6]	Mingzhao WANG, Henry HAN, Zhao HUANG, Juanying XIE. Unsupervised spectral feature selection algorithms for high dimensional data[J]. Front. Comput. Sci., 2023, 17(5): 175330-.
[7]	Xulun YE, Jieyu ZHAO. Heterogeneous clustering via adversarial deep Bayesian generative model[J]. Front. Comput. Sci., 2023, 17(3): 173322-.
[8]	Momo MATSUDA, Yasunori FUTAMURA, Xiucai YE, Tetsuya SAKURAI. Distortion-free PCA on sample space for highly variable gene detection from single-cell RNA-seq data[J]. Front. Comput. Sci., 2023, 17(1): 171310-.
[9]	Arpita BISWAS, Abhishek MAJUMDAR, Soumyabrata DAS, Krishna Lal BAISHNAB. OCSO-CA: opposition based competitive swarm optimizer in energy efficient IoT clustering[J]. Front. Comput. Sci., 2022, 16(1): 161501-.
[10]	Suyu MEI. A framework combines supervised learning and dense subgraphs discovery to predict protein complexes[J]. Front. Comput. Sci., 2022, 16(1): 161901-.
[11]	Peng YANG, Qi YANG, Ke TANG, Xin YAO. Parallel exploration via negatively correlated search[J]. Front. Comput. Sci., 2021, 15(5): 155333-.
[12]	Zheng HUO, Ping HE, Lisha HU, Huanyu ZHAO. DP-UserPro: differentially private user profile construction and publication[J]. Front. Comput. Sci., 2021, 15(5): 155811-.
[13]	Panthadeep BHATTACHARJEE, Pinaki MITRA. A survey of density based clustering algorithms[J]. Front. Comput. Sci., 2021, 15(1): 151308-.
[14]	Zhihan JIANG, Yan LIU, Xiaoliang FAN, Cheng WANG, Jonathan LI, Longbiao CHEN. Understanding urban structures and crowd dynamics leveraging large-scale vehicle mobility data[J]. Front. Comput. Sci., 2020, 14(5): 145310-.
[15]	Xibin DONG, Zhiwen YU, Wenming CAO, Yifan SHI, Qianli MA. A survey on ensemble learning[J]. Front. Comput. Sci., 2020, 14(2): 241-258.

Viewed

Full text

Abstract

Cited

Shared

Discussed