Landscape Architecture Frontiers

ISSN 2096-336X

ISSN 2095-5413 (Online)

CN 10-1105/TU

Postal distribution code: 80-985

Landscape Architecture Frontiers  2023, Vol. 11 Issue (5): 8-21   https://doi.org/10.15302/J-LAF-1-020083
Comparison and Applicability Study of Analysis Methods for Social Media Text Data: Taking Perception of Urban Parks in Beijing as an Example
Zhenyu SHANG, Kexin CHENG, Yuqing JIAN, Zhifang WANG
College of Architecture and Landscape, Peking University, Beijing 100080, China
Abstract

The rapid growth of Internet technology and media has generated large sets of social media data. Social sensing analyses based on users' reviews have consequently become a research hotspot and are increasingly applied to the study of urban park usage and perception. However, most existing studies adopt a single model for text data processing. To fill this gap, this study compares social media text data analysis methods and assesses their advantages, disadvantages, and applicability in park perception research. Two models widely used in relevant research were selected: the lexicon-based classification analysis model (lexicon model) and the Latent Dirichlet Allocation (LDA) model. Based on text data from public reviews of 10 urban parks in Beijing on Dianping, this study explored the distribution of perception topics for each park and for all parks together, and compared the perception topic classifications produced by the two models. Results show that the lexicon model is conducive to the parallel comparison of perception frequency between parks, while the LDA model can directly reflect each park's characteristics and visitors' perception preferences; combining the two models can optimize park perception assessment. Results from both methods reveal that visitors to urban parks in Beijing focused more on their social recreation needs and the visual aesthetics of the natural landscape, as well as on the condition of transportation facilities and on consumption in the parks. This research can inform the selection and use of social media text analysis methods and provide a basis and guidance for improving park construction and management.

● Exploring the advantages, disadvantages, and applicability of two text analysis models

● The lexicon model is more suitable for parallel comparison between perceived objects by users

● The Latent Dirichlet Allocation (LDA) model can better capture the characteristics of each individual perceived object

● Taking advantage of the two models’ strengths is vital for optimizing landscape perception assessment

Keywords: Social Sensing; Text Analysis; Lexicon; Latent Dirichlet Allocation (LDA); Urban Park; Landscape Perception
Received: 2023-02-13      Published: 2024-01-16
Corresponding Author(s): Zhifang WANG   
Cite this article:
Zhenyu SHANG, Kexin CHENG, Yuqing JIAN, Zhifang WANG. Comparison and Applicability Study of Analysis Methods for Social Media Text Data: Taking Perception of Urban Parks in Beijing as an Example. Landsc. Archit. Front., 2023, 11(5): 8-21.
Link to this article:
https://academic.hep.com.cn/laf/CN/10.15302/J-LAF-1-020083
https://academic.hep.com.cn/laf/CN/Y2023/V11/I5/8
| No. | Name | Area (hm²) | Number of reviews |
| --- | --- | --- | --- |
| 1 | Yuanmingyuan Park | 350.0 | 17,805 |
| 2 | Yuyuantan Park | 129.4 | 17,698 |
| 3 | Fragrant Hills Park | 188.0 | 13,825 |
| 4 | Jingshan Park | 23.0 | 13,628 |
| 5 | Beijing Shiyuan Park | 503.0 | 11,923 |
| 6 | Chaoyang Park | 288.7 | 11,750 |
| 7 | Beijing World Park | 53.3 | 11,338 |
| 8 | Olympic Forest Park | 680.0 | 10,673 |
| 9 | Badachu Park | 332.0 | 9,889 |
| 10 | Beijing Garden Expo Park | 513.0 | 8,736 |
Tab.1  
| Topic | Contents | Example words from the lexicon |
| --- | --- | --- |
| Environmental improvement | Air quality improvement, microclimate regulation, noise | Humidity, exposure, freshness, wind and sunshine |
| Biodiversity | Animals, plants | Swan, holly, dead wood, birdsong and floral fragrance |
| History and culture | Cultural and historical values, cultural heritage, historical sites | Qing Dynasty, relics, art, Dragon Boat Festival |
| Aesthetic appreciation | Scenery, beauty, inspiration | Flowers, photography, attractive, unpleasant |
| Education | Popularization of science, education | Knowledge, learned, knowledgeable, ignorant |
| Religion | Religious worship, belief, refuge | Rituals, Buddha beads, Taoism, enlightenment, marriage seeking |
| Physical and mental recovery | Relaxation, stress release, mind restoration | Soothing, beautiful, downcast, cheerful |
| Recreational activities | Outdoor activities, sports | Walking, boating, hiking, ball games |
| Social interaction | Social integration, interaction between individuals | Mom, dad, friends and relatives, gatherings |
Tab.2  
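The lexicon model summarized in Tab.2 can be illustrated with a minimal sketch. It assumes a tiny English lexicon built from a few of the example words above, simple word-level matching, and made-up sample reviews; the study itself worked on Chinese-language Dianping reviews with a much larger lexicon, so the function names, the tokenization, and the data here are all illustrative assumptions rather than the authors' implementation.

```python
import re
from collections import Counter

# Illustrative lexicon: topic -> a few example words from Tab.2.
# Assumption: the study's real lexicon is far larger and in Chinese.
LEXICON = {
    "Biodiversity": {"swan", "holly", "birdsong"},
    "History and culture": {"relics", "art"},
    "Recreational activities": {"walking", "boating", "hiking"},
    "Social interaction": {"mom", "dad", "friends", "gatherings"},
}


def topics_in_review(review, lexicon=LEXICON):
    """Return the set of topics with at least one lexicon word in the review."""
    # Naive English tokenization; Chinese text would need a word segmenter.
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    return {topic for topic, words in lexicon.items() if tokens & words}


def perception_frequency(reviews, lexicon=LEXICON):
    """Share of reviews that mention each topic at least once."""
    if not reviews:
        return {topic: 0.0 for topic in lexicon}
    counts = Counter()
    for review in reviews:
        counts.update(topics_in_review(review, lexicon))
    return {topic: counts[topic] / len(reviews) for topic in lexicon}


# Hypothetical reviews, invented purely for illustration.
sample_reviews = [
    "Went boating with mom and dad, saw a swan.",
    "The relics are impressive.",
    "Nice hiking trail, lots of birdsong.",
    "Crowded on weekends.",
]
print(perception_frequency(sample_reviews))
```

Because every park is scored against the same fixed set of topics, the resulting frequencies are directly comparable between parks, which is the property the abstract attributes to the lexicon model.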
Fig.1  
Fig.2  
Fig.3  
| Park | Number of topics | Coherence score |
| --- | --- | --- |
| Yuanmingyuan Park | 10 | 0.5611 |
| Yuyuantan Park | 10 | 0.5216 |
| Fragrant Hills Park | 9 | 0.6049 |
| Jingshan Park | 9 | 0.6334 |
| Beijing Shiyuan Park | 9 | 0.5829 |
| Chaoyang Park | 10 | 0.5845 |
| Beijing World Park | 6 | 0.6577 |
| Olympic Forest Park | 10 | 0.5046 |
| Badachu Park | 9 | 0.6385 |
| Beijing Garden Expo Park | 8 | 0.5411 |
Tab.3  
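Tab.3 pairs each park's number of LDA topics with a coherence score. A common practice, which the study may or may not have followed in exactly this form, is to fit LDA for several candidate topic counts and keep the count with the highest coherence. The sketch below shows only that selection step; the function name and the example scores are hypothetical and are not taken from the study.

```python
def best_topic_count(coherence_by_k):
    """Return the candidate topic count with the highest coherence score.

    Ties are broken in favour of the smaller topic count (a simpler model).
    """
    return min(coherence_by_k, key=lambda k: (-coherence_by_k[k], k))


# Hypothetical coherence scores for one park; NOT values from the study.
example_scores = {6: 0.48, 8: 0.52, 10: 0.56, 12: 0.53}
print(best_topic_count(example_scores))  # 10
```

In practice, the scores for each candidate count would come from a topic-modelling library's coherence routine rather than being typed in by hand.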
Yuanmingyuan Park Yuyuantan Park Fragrant Hills Park Jingshan Park Beijing Shiyuan Park
Topic 1 Transportation and tickets Epidemic Hiking activities Cultural heritage Natural landscape
Topic 2 Cultural relics Cherry blossom festival Park introduction Historical change Service facilities
Topic 3 Garden landscape Lupine view Transportation and tickets Park introduction Social activities
Topic 4 Natural landscape Cherry blossom view Experiential perception Featured flowers Pavilion experience
Topic 5 Historical perception Tickets and consumption Cultural landscape Featured constructions Park introduction
Topic 6 Patriotic education Natural landscape Education and learning Surrounding landscape Music festival
Topic 7 Park introduction Transportation facilities Natural landscape Epidemic Service experience
Topic 8 Lotus flowers Park introduction The Forbidden City vision Transportation and tickets
Topic 9 Featured ice-cream Leisure and entertainment
Topic 10
Chaoyang Park Beijing World Park Olympic Forest Park Badachu Park Beijing Garden Expo Park
Topic 1 Transportation and tickets Transportation and tickets Leisure sports Worship activities Featured gardens
Topic 2 Service experience Collective memory and perception Transportation facilities Leisure facilities Aesthetic experience
Topic 3 Parent-child activities Park introduction Social activities Buddhist landscape Activity experience
Topic 4 Leisure and entertainment Performances Night show Gatherings Transportation facilities
Topic 5 Book market Featured landscapes Sporting activities Hiking activities Cultural activities
Topic 6 Park introduction Summer and Autumn view Park introduction Leisure activities
Topic 7 Temple fair Spring view Collective memory and perception Park introduction
Topic 8 Spring activities Park introduction Transportation facilities
Topic 9 Epidemic
Topic 10 Gatherings
Tab.4  
Fig.4  
Fig.5  
| Lexicon topic \ LDA topic | Transportation and tickets | Spring view | Featured constructions | Religious culture | Hiking activities | Autumn view | Social activities | Collective memory and perception | Cultural history | Gatherings and performances |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Environmental improvement | 0.012 | 0.172 | −0.088 | −0.019 | 0.045 | 0.095 | 0.011 | −0.054 | −0.030 | −0.059 |
| Biodiversity | ** | 0.200 | −0.049 | ** | −0.126 | 0.039 | −0.088 | 0.020 | 0.041 | 0.008 |
| History and culture | −0.076 | −0.066 | 0.148 | 0.126 | −0.050 | −0.053 | −0.131 | 0.049 | 0.248 | ** |
| Aesthetic appreciation | −0.063 | 0.121 | 0.024 | 0.024 | −0.078 | 0.125 | −0.123 | 0.031 | 0.153 | −0.057 |
| Education | −0.026 | 0.069 | 0.141 | 0.045 | −0.067 | −0.050 | −0.044 | 0.104 | 0.101 | 0.009 |
| Religion | 0.023 | −0.070 | 0.010 | 0.182 | 0.183 | −0.020 | −0.038 | −0.049 | −0.051 | 0.046 |
| Physical and mental recovery | ** | 0.067 | −0.063 | −0.036 | 0.082 | 0.032 | −0.002 | 0.017 | −0.024 | −0.050 |
| Recreational activities | 0.082 | 0.130 | −0.100 | −0.030 | 0.034 | 0.010 | 0.113 | −0.081 | −0.062 | 0.009 |
| Social interaction | 0.063 | 0.045 | −0.043 | −0.073 | −0.025 | −0.092 | 0.259 | 0.028 | −0.157 | 0.050 |
Tab.5  
Fig.6  
| Research subject | | Lexicon model | LDA model |
| --- | --- | --- | --- |
| Perception topic classification | Advantages | 1) Clear topic classification that keeps topics distinct from each other and gives them specific contents; 2) Effective identification of perception contents with limited attention; 3) Analysis results available for parallel comparison between parks | More comprehensive, real-time reflection of visitors' perception contents |
| | Disadvantages | Little consideration of the actual use of the park, resulting in a possible lack of perception contents | 1) Lack of sensitivity to low-frequency perception topics, because perception contents mentioned less frequently are not extracted; 2) Inability to make parallel comparisons between parks |
| Perception content identification | Advantages | Precise identification of the perception contents, covering topics with low perception frequency | 1) Emphasis on park characteristics, through a detailed classification of the perception topics with a high awareness level; 2) Clear perception contents that allow quick identification of the topics |
| | Disadvantages | 1) Words are extracted based on the lexicon, so the analysis results rely heavily on the completeness of the lexicon; 2) Further manual interpretation of specific perception contents is required | Relatively blurred boundaries between perception topics |
| Scope of application | Advantages | Suitable for regional-scale, multi-park perception analysis and inter-park comparison, with a comprehensive lexicon that can be adapted to different research objects | More effective perception analysis for individual parks |
| | Disadvantages | Higher requirements for the lexicon to make it adjustable | Ambiguous results for regional-scale, multi-park perception analysis, making manual interpretation difficult |
Tab.6  