Paper Title

WordStream Maker: A Lightweight End-to-end Visualization Platform for Qualitative Time-series Data

Authors

Huyen N. Nguyen, Tommy Dang, Kathleen A. Bowe

Abstract

Whether it takes the form of transcribed conversations, blog posts, or tweets, qualitative data gives a reader rich insight into both the overarching trends and the diversity of human ideas expressed through text. Handling and analyzing large amounts of qualitative data, however, is difficult, often requiring multiple time-intensive perusals to identify patterns. This difficulty is multiplied with each additional question or time point present in a data set. A primary challenge, then, is creating visualizations that support the interpretation of qualitative data by making it easier to identify and explore trends of interest. By combining the affordances of both text and visualizations, WordStream has previously eased information retrieval and processing of time-series text data, but the data wrangling necessary to produce a WordStream remains a significant barrier for non-technical users. In response, this paper presents WordStream Maker: an end-to-end platform with a pipeline that uses natural language processing (NLP) to help non-technical users process raw text data and generate a customizable visualization without programming. Lessons learned from integrating NLP into visualization and scaling to large data sets are discussed, along with use cases that demonstrate the usefulness of the platform.
