UPSI Digital Repository (UDRep)
Abstract : Universiti Pendidikan Sultan Idris
In this paper, we surveyed recent publications on topic modeling and analyzed the forms of visualization and the tools they used. We expect this information to help Natural Language Processing (NLP) researchers decide which types of visualization suit their work and which tools can produce them. It could also spark further development of existing visualizations, or the emergence of new ones where a gap exists. Topic modeling is an NLP technique used to identify topics hidden in a collection of documents. Visualizing these topics permits a faster understanding of the underlying subject matter in terms of its domain. This survey covered publications from 2017 to early 2022. The PRISMA methodology was used to review the publications. One hundred articles were collected, and 42 were found eligible for this study after filtering. Two research questions were formulated. The first asks, "What are the different forms of visualizations used to display the result of topic modeling?" and the second asks, "What visualization software or API is used?" Our results show that different forms of visualization serve different display purposes. We categorized them as maps, networks, evolution-based charts, and others. We also found that LDAvis is the most frequently used software/API, followed by the R language packages and D3.js. The primary limitation of this survey is that it is not exhaustive; hence, some eligible publications may not be included. © 2023, Politeknik Negeri Padang. All rights reserved.
This material may be protected under the Copyright Act, which governs the making of photocopies or reproductions of copyrighted materials. You may use the digitized material for private study, scholarship, or research.