2023 Text Analysis Platform Hands-On Workshop III: Event Record – Text Mining and Digital Transformation Service Industry-University Alliance

Event Date: October 13, 2023, Friday, 13:30-17:00

Event Venue: Classroom CM3022, College of Management, National Sun Yat-sen University

The 2023 Text Analysis and Digital Transformation Industry-Academia Alliance is hosting its third workshop, which is also the first event since academic members were recruited. Twenty-five participants, drawn from both corporate and academic members, have been invited to attend in person or by online video.

The primary purpose of this event is to teach alliance members the basic functions of the text analysis platform. With the instructor’s guidance, members should become proficient with the platform’s features and be able to use its resources for various types of analysis. To accommodate members who cannot attend in person, the course runs in person and online at the same time, so that remote members can take an active part in the learning process.

In this workshop, the instruction is led by Yihang Tsai, a doctoral student on Professor Huang Sanyi’s team. Tsai is supported by one online teaching assistant and two on-site teaching assistants, who are on hand to answer questions in real time. The session covers “Data Collection,” “Data Preprocessing,” and “Text Content Analysis,” and is meant to help participants, whether they are new to the platform or returning to it, become familiar with its operation and application.

The course begins with the fundamentals of data retrieval, teaching participants how to locate the documents they need. It then progresses to data preprocessing, which involves tasks such as standardizing formats, sentence segmentation, filtering important phrases, and removing stop words. Throughout the instruction, practical exercises are integrated with real-world examples, enabling participants to gain a deeper understanding of the operational workflow of the text analysis platform by actively using it. This hands-on experience equips them to apply the platform’s capabilities to their future research topics or data analysis projects.
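The preprocessing steps named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the platform's own implementation: the function names, the tiny stop-word list, and the punctuation-based sentence splitter are all hypothetical, and the phrase-filtering step is left out because the record does not describe how the platform performs it.

```python
import re
import unicodedata

# Illustrative stop-word list (an assumption; the platform's own list is not described).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "is", "in"}


def normalize(text: str) -> str:
    """Standardize format: unify Unicode forms, collapse whitespace, lowercase."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def split_sentences(text: str) -> list[str]:
    """Naive sentence segmentation: split after ., !, ? or their CJK forms."""
    parts = re.split(r"(?<=[.!?。！？])\s*", text)
    return [p for p in parts if p]


def remove_stop_words(sentence: str) -> list[str]:
    """Tokenize on runs of word characters and drop stop words."""
    tokens = re.findall(r"\w+", sentence)
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text: str) -> list[list[str]]:
    """Run normalization, sentence segmentation, and stop-word removal in order."""
    return [remove_stop_words(s) for s in split_sentences(normalize(text))]


if __name__ == "__main__":
    print(preprocess("The  platform is useful. Stop words are removed!"))
```

The splitter is deliberately simple and would break a sentence at a decimal point or an abbreviation. Chinese text would also need a word segmenter, since it does not separate words with spaces.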

Next, the concept of text content analysis will be introduced, with a focus on regular expressions, a syntax for describing patterns in strings. Regular expressions support searching, matching, and extracting text that follows a specific pattern, such as URLs or email addresses. The instructional scenarios will involve locating specific target content within unstructured text and extracting the relevant information. By the end of the course, participants should understand how to retrieve data, process documents, and present the results of the text analysis steps as visualizations such as word clouds.
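As a rough illustration of this kind of pattern-based extraction, the sketch below uses Python's standard `re` module to pull URLs and email addresses out of free text, and `collections.Counter` to produce the word frequencies that a word cloud is typically drawn from. The patterns are simplified assumptions rather than the platform's rules, and the function names are hypothetical.

```python
import re
from collections import Counter

# Simplified patterns for illustration; they do not cover every valid URL or address.
# The URL pattern also keeps trailing punctuation if it directly follows the URL.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_urls(text: str) -> list[str]:
    """Return every substring that looks like an http(s) URL."""
    return URL_PATTERN.findall(text)


def extract_emails(text: str) -> list[str]:
    """Return every substring that looks like an email address."""
    return EMAIL_PATTERN.findall(text)


def word_frequencies(text: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Count lowercase word occurrences; a word cloud sizes words by these counts."""
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common(top_n)


if __name__ == "__main__":
    sample = "Contact alice@example.org or visit https://example.org/docs for details"
    print(extract_urls(sample))
    print(extract_emails(sample))
    print(word_frequencies("text mining text analysis text", top_n=1))
```

Drawing the word cloud itself would need a plotting library, which is left out here. The frequency table is the input such a tool would consume.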

Through the workshop’s lectures and hands-on exercises, participants will progressively learn and become familiar with how to utilize the text analysis platform for data collection and processing. The interactive nature of the workshop encourages participants to engage in discussions with one another and seek guidance from the instructor and teaching assistants when encountering challenges during the operational process. This collaborative learning environment allows for the exchange of ideas and the resolution of any issues that may arise.

Finally, Professor Huang Sanyi, the host of the alliance, will conduct a Q&A session and wrap up the event. Participants will also be asked to complete a satisfaction survey to give the alliance feedback and suggestions. The alliance thanks participants for their support and encouragement of its activities. The next workshop is scheduled for December, and the alliance looks forward to the enthusiastic participation of its members. Through these workshops, the alliance aims to enhance members’ digital skills and foster cross-disciplinary experience in text analysis.