The rapid growth of Internet technology and media has generated large volumes of social media data. Social sensing analyses based on user reviews have consequently become a research hotspot and are increasingly applied to the study of urban park usage and perception. However, most existing studies adopt a single model for text data processing. To fill this gap, this study compares social media text analysis methods and assesses their advantages, disadvantages, and applicability in park perception research. Two models widely used in relevant research were selected: the lexicon-based classification analysis model (lexicon model) and the Latent Dirichlet Allocation (LDA) model. Based on text data from public reviews of 10 urban parks in Beijing on Dianping, this study explored the distribution of perception topics for each park and for all parks combined, and compared the topic classification results of the two models. The results show that the lexicon model is conducive to parallel comparison of perception frequency between parks, while the LDA model directly reflects each park's characteristics and visitors' perception preferences; combining the two models can optimize park perception assessment. Both methods reveal that visitors to urban parks in Beijing focused mainly on social recreation needs and the visual aesthetics of the natural landscape, as well as on transportation facilities and consumption in the parks. This research offers suggestions for selecting and using social media text analysis methods, and provides a basis and guidance for improving park construction and management.
● Exploring the advantages, disadvantages, and applicability of two text analysis models
● The lexicon model is more suitable for parallel comparison between the objects perceived by users
● The Latent Dirichlet Allocation (LDA) model can better capture the characteristics of each individual perceived object
● Taking advantage of the two models’ strengths is vital for optimizing landscape perception assessment
Zhenyu SHANG, Kexin CHENG, Yuqing JIAN, Zhifang WANG. Comparison and Applicability Study of Analysis Methods for Social Media Text Data: Taking Perception of Urban Parks in Beijing as an Example. Landsc. Archit. Front., 2023, 11(5): 8-21.
| Perception topic | Description | Example words |
| --- | --- | --- |
| Environmental improvement | Air quality improvement, microclimate regulation, noise | Humidity, exposure, freshness, wind and sunshine |
| Biodiversity | Animals, plants | Swan, holly, dead wood, birdsong and floral fragrance |
| History and culture | Cultural and historical values, cultural heritage, historical sites | Qing Dynasty, relics, art, Dragon Boat Festival |
| Aesthetic appreciation | Scenery, beauty, inspiration | Flowers, photography, attractive, unpleasant |
| Education | Popularization of science, education | Knowledge, learned, knowledgeable, ignorant |
| Religion | Religious worship, belief, refuge | Rituals, Buddha beads, Taoism, enlightenment, marriage seeking |
| Physical and mental recovery | Relaxation, stress release, mind restoration | Soothing, beautiful, downcast, cheerful |
| Recreational activities | Outdoor activities, sports | Walking, boating, hiking, ball games |
| Social interaction | Social integration, interaction between individuals | Mom, dad, friends and relatives, gatherings |
Tab.2
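As a rough illustration of how a lexicon such as the one in Tab.2 can be applied, the sketch below marks, for each review, every perception topic with at least one lexicon word present, and then reports the share of reviews mentioning each topic. The lexicon subset, the sample reviews, and the function names `classify_review` and `topic_frequency` are illustrative assumptions, not the study's implementation; real Dianping reviews are in Chinese and would need word segmentation first.

```python
from collections import Counter

# Illustrative subset of a perception lexicon (example words taken from Tab.2).
LEXICON = {
    "Biodiversity": {"swan", "holly", "birdsong"},
    "Aesthetic appreciation": {"flowers", "photography", "attractive"},
    "Recreational activities": {"walking", "boating", "hiking"},
    "Social interaction": {"mom", "dad", "friends", "gatherings"},
}


def classify_review(tokens, lexicon=LEXICON):
    """Return the topics whose lexicon shares at least one word with the review."""
    words = {t.lower() for t in tokens}
    return {topic for topic, vocab in lexicon.items() if words & vocab}


def topic_frequency(reviews, lexicon=LEXICON):
    """Share of reviews (0.0 to 1.0) that mention each lexicon topic."""
    counts = Counter()
    for tokens in reviews:
        counts.update(classify_review(tokens, lexicon))
    n = len(reviews)
    return {topic: (counts[topic] / n if n else 0.0) for topic in lexicon}


# Hypothetical, pre-tokenized reviews (assumed for illustration only).
reviews = [
    ["Went", "boating", "with", "friends"],
    ["The", "flowers", "are", "attractive"],
    ["Hiking", "with", "mom", "and", "dad"],
    ["Nothing", "special"],
]
freq = topic_frequency(reviews)
```

Because every park is scored against the same fixed topic list, frequencies computed this way are directly comparable between parks, which is the property the abstract attributes to the lexicon model.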
Fig.1
Fig.2
Fig.3
| Park | Number of topics | Coherence score |
| --- | --- | --- |
| Yuanmingyuan Park | 10 | 0.5611 |
| Yuyuantan Park | 10 | 0.5216 |
| Fragrant Hills Park | 9 | 0.6049 |
| Jingshan Park | 9 | 0.6334 |
| Beijing Shiyuan Park | 9 | 0.5829 |
| Chaoyang Park | 10 | 0.5845 |
| Beijing World Park | 6 | 0.6577 |
| Olympic Forest Park | 10 | 0.5046 |
| Badachu Park | 9 | 0.6385 |
| Beijing Garden Expo Park | 8 | 0.5411 |
Tab.3
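Tab.3 pairs each park's number of LDA topics with a coherence score. A common way to arrive at such a choice is to fit models over a range of topic numbers and keep the one with the highest coherence; this section does not spell out the study's procedure, so the following is only a sketch under that assumption. It uses the UMass coherence measure because it can be computed from document co-occurrence counts alone; the measure behind the values in Tab.3 may well differ. The toy corpus, the candidate scores, and the helper names `umass_coherence` and `pick_topic_number` are hypothetical.

```python
import math


def umass_coherence(top_words, documents):
    """UMass coherence of one topic: sum of log((D(w_m, w_l) + 1) / D(w_l)) over word pairs."""
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for m in range(1, len(top_words)):
        for l in range(m):
            d_l = doc_freq(top_words[l])
            if d_l == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((doc_freq(top_words[m], top_words[l]) + 1) / d_l)
    return score


def pick_topic_number(coherence_by_k):
    """Return the candidate topic number with the highest coherence score."""
    return max(coherence_by_k, key=coherence_by_k.get)


# Hypothetical tokenized reviews.
docs = [
    ["cherry", "blossom", "spring"],
    ["cherry", "blossom", "photo"],
    ["ticket", "subway", "queue"],
    ["ticket", "subway", "price"],
]
coherent = umass_coherence(["cherry", "blossom"], docs)  # words that co-occur
mixed = umass_coherence(["cherry", "ticket"], docs)      # words that never co-occur

# Hypothetical mean coherence per candidate topic number (not values from Tab.3).
best_k = pick_topic_number({6: 0.52, 8: 0.58, 9: 0.61, 10: 0.55})
```

Topics whose top words tend to appear in the same reviews score higher than topics that mix unrelated words, so a higher average coherence indicates a more interpretable set of topics.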
| Topic | Yuanmingyuan Park | Yuyuantan Park | Fragrant Hills Park | Jingshan Park | Beijing Shiyuan Park |
| --- | --- | --- | --- | --- | --- |
| Topic 1 | Transportation and tickets | Epidemic | Hiking activities | Cultural heritage | Natural landscape |
| Topic 2 | Cultural relics | Cherry blossom festival | Park introduction | Historical change | Service facilities |
| Topic 3 | Garden landscape | Lupine view | Transportation and tickets | Park introduction | Social activities |
| Topic 4 | Natural landscape | Cherry blossom view | Experiential perception | Featured flowers | Pavilion experience |
| Topic 5 | Historical perception | Tickets and consumption | Cultural landscape | Featured constructions | Park introduction |
| Topic 6 | Patriotic education | Natural landscape | Education and learning | Surrounding landscape | Music festival |
| Topic 7 | Park introduction | Transportation facilities | Natural landscape | Epidemic | Service experience |
Topic 8
Lotus flowers
Park introduction
The Forbidden City vision
Transportation and tickets
Topic 9
Featured ice-cream
Leisure and entertainment
Topic 10
| Topic | Chaoyang Park | Beijing World Park | Olympic Forest Park | Badachu Park | Beijing Garden Expo Park |
| --- | --- | --- | --- | --- | --- |
| Topic 1 | Transportation and tickets | Transportation and tickets | Leisure sports | Worship activities | Featured gardens |
| Topic 2 | Service experience | Collective memory and perception | Transportation facilities | Leisure facilities | Aesthetic experience |
| Topic 3 | Parent-child activities | Park introduction | Social activities | Buddhist landscape | Activity experience |
| Topic 4 | Leisure and entertainment | Performances | Night show | Gatherings | Transportation facilities |
| Topic 5 | Book market | Featured landscapes | Sporting activities | Hiking activities | Cultural activities |
Topic 6
Park introduction
Summer and Autumn view
Park introduction
Leisure activities
Topic 7
Temple fair
Spring view
Collective memory and perception
Park introduction
Topic 8
Spring activities
Park introduction
Transportation facilities
Topic 9
Epidemic
Topic 10
Gatherings
Tab.4
Fig.4
Fig.5
| Lexicon \ LDA | Transportation and tickets | Spring view | Featured constructions | Religious culture | Hiking activities | Autumn view | Social activities | Collective memory and perception | Cultural history | Gatherings and performances |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Environmental improvement | 0.012 | 0.172 | -0.088 | -0.019 | 0.045 | 0.095 | 0.011 | -0.054 | -0.030 | -0.059 |
| Biodiversity | —** | 0.200 | -0.049 | —** | -0.126 | 0.039 | -0.088 | 0.020 | 0.041 | 0.008 |
| History and culture | -0.076 | -0.066 | 0.148 | 0.126 | -0.050 | -0.053 | -0.131 | 0.049 | 0.248 | —** |
| Aesthetic appreciation | -0.063 | 0.121 | 0.024 | 0.024 | -0.078 | 0.125 | -0.123 | 0.031 | 0.153 | -0.057 |
| Education | -0.026 | 0.069 | 0.141 | 0.045 | -0.067 | -0.050 | -0.044 | 0.104 | 0.101 | 0.009 |
| Religion | 0.023 | -0.070 | 0.010 | 0.182 | 0.183 | -0.020 | -0.038 | -0.049 | -0.051 | 0.046 |
| Physical and mental recovery | —** | 0.067 | -0.063 | -0.036 | 0.082 | 0.032 | -0.002 | 0.017 | -0.024 | -0.050 |
| Recreational activities | 0.082 | 0.130 | -0.100 | -0.030 | 0.034 | 0.010 | 0.113 | -0.081 | -0.062 | 0.009 |
| Social interaction | 0.063 | 0.045 | -0.043 | -0.073 | -0.025 | -0.092 | 0.259 | 0.028 | -0.157 | 0.050 |
Tab.5
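Tab.5 relates each lexicon topic to each LDA topic through values that read like correlation coefficients. How these values were computed is not stated in this section. Assuming, purely for illustration, that they are Pearson correlations between per-review indicators (1 if a review is assigned to the topic, 0 otherwise), one cell of such a matrix could be computed as below. The indicator vectors and the function name `pearson` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(var_x * var_y)


# Hypothetical indicators for six reviews: does the review fall under the topic?
lexicon_social = [1, 1, 0, 0, 1, 0]  # lexicon topic, e.g. "Social interaction"
lda_social = [1, 1, 0, 0, 0, 0]      # LDA topic, e.g. "Social activities"

r = pearson(lexicon_social, lda_social)
```

A positive coefficient means the two topics tend to be assigned to the same reviews, while a negative one means they tend to exclude each other.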
Fig.6
| Research subject | | Lexicon model | LDA model |
| --- | --- | --- | --- |
| Perception topic classification | Advantages | 1) Clear topic classification that keeps topics distinct from each other, each with specific contents 2) Effective identification of perception contents with limited attention 3) Analysis results available for parallel comparison between parks | More comprehensive, real-time reflection of visitors' perception contents |
| Perception topic classification | Disadvantages | Little consideration of the actual use of the park, resulting in a possible lack of perception contents | 1) Lack of sensitivity to low-frequency perception topics, as less frequently mentioned perception contents are not extracted 2) Incapability of making parallel comparison between parks |
| Perception content identification | Advantages | Precise identification of the perception contents, covering topics with low perception frequency | 1) Emphasis on the park characteristics, through a detailed classification of the perception topics with high awareness level 2) Clear perception contents to ensure quick identification of the topics |
| Perception content identification | Disadvantages | 1) Words extracted based on the lexicon, so the analysis results rely heavily on the completeness of the lexicon 2) Further manual interpretation of specific perception contents required | Relatively blurred boundaries between perception topics |
| Scope of application | Advantages | Suitable for regional-scale, multi-park perception analysis and inter-park comparison, with a comprehensive lexicon that can be adapted to different research objects | More effective perception analysis for individual parks |
| Scope of application | Disadvantages | Higher requirements for the lexicon to make it adjustable | Ambiguous results for regional-scale, multi-park perception analysis, making manual interpretation difficult |
Tab.6