Research

underlined indicates students/postdocs I directly supervised;
† indicates equal contribution.

Recent Publications (Complete List...)

2025

[J41]

Reuben Luera, Ryan Rossi, Alexa Siu, Franck Dernoncourt, Tong Yu, Sungchul Kim, Ruiyi Zhang, Xiang Chen, Hanieh Salehy, Nedim Lipka, Samyadeep Basu, Puneet Mathur, Jian Zhao. Survey on User Interface Design and Interactions for Generative AI Applications. Foundations and Trends in Human-Computer Interaction, 19(3), pp. 213-289, 2025.
FnTs

Abstract: The applications of generative AI are diverse and impressive, and the interplay between users and AI in shaping these applications' impact is crucial. Current human-AI interaction literature has taken a broad look at how humans interact with generative AI, but it merits a deeper look into the user interface designs and patterns used to create these applications. Therefore, we present a survey that comprehensively presents taxonomies of how a human interacts with AI and the user interaction patterns designed to meet the needs of a variety of relevant use cases. We focus on explicit, user-initiated interactions and implicit, system-driven engagements, addressing how these approaches tackle critical issues, such as how generative AI applications can best be designed to meet user agency and control needs. With this survey, we aim to create a compendium of different user-interaction patterns that can be used as a reference for designers and developers alike. In doing so, we also strive to lower the entry barrier for those attempting to learn more about the design of generative AI applications.

[J40]

Grace Guo, Subhajit Das, Jian Zhao, Alex Endert. More Like Vis, Less Like Vis: Comparing Interactions for Integrating User Preferences into Partial Specification Recommenders. IEEE Transactions on Visualization and Computer Graphics, 2025 (In Press).
TVCG

Abstract: Visualization recommendation systems make data exploration less tedious by automating the process of visualization generation. They are particularly helpful for non-expert users who may not be familiar with a data set or the process of visualization specification. These systems allow users to input their preferences in the form of partial specifications to steer the recommendations made. However, the interaction approaches for partial specification input and their trade-offs have not been explored in prior work. In this paper, we compare three different combinations of interaction approaches and granularities for users to indicate a preferred partial specification: 1) manual input, 2) inferring preferred partial specifications from binary like/dislike ratings for a visualization as a whole, or 3) inferring preferred partial specifications from binary like/dislike ratings for granular components of a visualization specification. In a between-subjects study, participants were assigned to one of three conditions and asked to complete a data exploration task. Our results indicate that manual input led to a greater coverage of data dimensions, while like/dislike ratings led to a greater diversity of marks and channels used. Qualitative participant feedback also reveals differences in user strategy and visualization comprehension across the three interaction conditions. Finally, we conclude with a discussion on implications for multiplicity and visualization comprehension during visual data exploration.

[J39]

Di Liu†, Jingwen Bai†, Zhuoyi Zhang, Yilin Zhang, Zhenhao Zhang, Jian Zhao, Pengcheng An. TherAIssist: Assisting Art Therapy Homework and Client-Practitioner Collaboration through Human-AI Interaction. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 9(3), pp. 113:1-113:38, 2025.
IMWUT

Abstract: Art therapy homework is essential for fostering clients' reflection on daily experiences between sessions. However, current practices present challenges: clients often lack guidance for completing tasks that combine art-making and verbal expression, while therapists find it difficult to track and tailor homework. How HCI systems might support art therapy homework remains underexplored. To address this, we present TherAIssist, comprising a client-facing application leveraging human-AI co-creative art-making and conversational agents to facilitate homework, and a therapist-facing application enabling customization of homework agents and AI-compiled homework history. A 30-day field study with 24 clients and 5 therapists showed how TherAIssist supported clients' homework and reflection in their everyday settings. Results also revealed how therapists infused their practice principles and personal touch into the agents to offer tailored homework, and how AI-compiled homework history became a meaningful resource for in-session interactions. Implications for designing human-AI systems to facilitate asynchronous client-practitioner collaboration are discussed.

[J38]

Yuzhe You, Jarvis Tse, Jian Zhao. Panda or not Panda? Understanding Adversarial Attacks with Interactive Visualization. ACM Transactions on Interactive Intelligent Systems, 15(2), pp. 11:1-11:31, 2025.
TIIS

Abstract: Adversarial machine learning (AML) studies attacks that can fool machine learning algorithms into generating incorrect outcomes as well as the defenses against worst-case attacks to strengthen model robustness. Specifically for image classification, it is challenging to understand adversarial attacks due to their use of subtle perturbations that are not human-interpretable, as well as the variability of attack impacts influenced by diverse methodologies, instance differences, and model architectures. Through a design study with AML learners and teachers, we introduce AdvEx, a multi-level interactive visualization system that comprehensively presents the properties and impacts of evasion attacks on different image classifiers for novice AML learners. We quantitatively and qualitatively assessed AdvEx in a two-part evaluation including user studies and expert interviews. Our results show that AdvEx is not only highly effective as a visualization tool for understanding AML mechanisms, but also provides an engaging and enjoyable learning experience, thus demonstrating its overall benefits for AML learners.

[J37]

Jiaqi Jiang, Shanghao Li, Xian Li, Yingxin Xu, Jian Zhao, Pengcheng An. Playful Antisedentary Interactions for Online Meeting Scenarios: A Research Through Design Approach. JMIR Serious Games, 13(e62778), pp. 1-20, 2025.
JMIR

Abstract: Background: Online meetings have become part of many people's everyday lives. People spend a longer time sitting still in front of screens. The extended periods of uninterrupted sedentary behavior bring irreversible health damage in the long term. Previous studies have demonstrated many interventions to change sedentary lifestyle. However, few of them targeted sedentary behavior during online meetings. The design opportunities in online meeting contexts are not well explored yet. Objective: This study aims to understand users' experiences with gamified bodily interaction as an anti-sedentary measure during online meetings, as well as to explore how to design appropriate anti-sedentary interactions for online meeting scenarios. Methods: This study adopted a 'research through design' approach to develop gamified bodily interactions as interventions against sedentary behavior during online meetings and to gather users' experiences of them. In collaboration with 11 users, we co-designed and iterated three prototypes, which led to the development of the BIG-AOME (Bodily Interaction Gamification towards Anti-sedentary Online Meeting Environments) framework. User studies were conducted with three groups totaling 15 participants, utilizing these prototypes. During co-design and evaluation, all group semi-structured interviews were transcribed into written format and analyzed using Hsieh's conventional qualitative content analysis method. Results: We developed three prototypes as design instances of anti-sedentary gamified bodily interactions for online meetings. Empirical findings were gathered to understand user experiences with these prototypes. Additionally, we have established and detailed a preliminary design framework for crafting gamified bodily interactions for online meeting environments.
Conclusions: Our research findings indicate that designing anti-sedentary bodily interactions for online meetings has the potential to alter sedentary behaviors while enhancing social connections. Furthermore, the BIG-AOME framework that we propose explores the design space for anti-sedentary physical interactions in the context of online meetings. This framework details pertinent design choices and considerations.

[C57]

Yue Lyu, Xizi Wang, Hanlu Ma, Yalong Yang, Jian Zhao. ATCion: Exploring the Design of Icon-based Visual Aids for Enhancing In-cockpit Air Traffic Control Communication. Proceedings of ACM Symposium on User Interface Software and Technology, pp. 73:1-73:21, 2025.
UIST'25

Abstract: Effective communication between pilots and air traffic control (ATC) is essential for aviation safety, but verbal exchanges over radios are prone to miscommunication, especially under high workload conditions. While cockpit-embedded visual aids offer the potential to enhance ATC communication, little is known about how to design and integrate such aids. We present an exploratory, user-centered investigation into the design and integration of icon-based visual aids, named ATCion, to support in-cockpit ATC communication, through four phases involving 22 pilots and 1 ATC controller. This study contributes a validated set of design principles and visual icon components for ATC messages. In a comparative study of ATCion, text-based visual aids, and no visual aids, we found that our design improved readback accuracy and reduced memory workload, without negatively impacting flight operations; most participants preferred ATCion over text-based aids, citing their clarity, low cognitive cost, and fast interpretability. Further, we point to implications and opportunities for integrating icon-based aids into future multimodal ATC communication systems to improve both safety and efficiency.

[C56]

Xuye Liu, Yuzhe You, Tengfei Ma, Jian Zhao. MACEDON: Supporting Programmers with Real-Time Multi-Dimensional Code Evaluation and Optimization. Proceedings of ACM Symposium on User Interface Software and Technology, pp. 31:1-31:17, 2025.
UIST'25

Abstract: Recent advancements in Large Language Models (LLMs) have led programmers to increasingly turn to them for code optimization and evaluation. However, programmers need to frequently switch between code evaluation and prompt authoring because there is a lack of understanding of the underlying code. Yet, current LLM-driven code assistants do not provide sufficient transparency to help programmers track their code based on the intended evaluation metrics, a crucial step before aligning these evaluations with their optimization goals. To address this gap, we adopted an iterative, user-centered design process by first conducting a formative study and a large-scale code analysis. Based on the findings, we then developed MACEDON, a system that supports multi-dimensional code evaluation in real time, direct code segment optimization, as well as shareable report displays. We evaluated MACEDON through a controlled lab study with 24 novice programmers and two real-world case studies. The results show that MACEDON significantly improved users' ability to identify code issues, apply effective optimizations, and understand their code's evolving state. Our findings suggest that multi-dimensional evaluation, combined with interactive, segment-specific guidance, empowers users to perform more structured and confident code optimization.

[C55]

Wenshuo Zhang, Leixian Shen, Shuchang Xu, Jindu Wang, Jian Zhao, Huamin Qu, Linping Yuan. NeuroSync: Intent-Aware Code-Based Problem Solving via Direct LLM Understanding Modification. Proceedings of ACM Symposium on User Interface Software and Technology, pp. 30:1-30:19, 2025.
Best Paper Honorable Mention UIST'25

Abstract: Conversational LLMs have been widely adopted by domain users with limited programming experience to solve domain problems. However, these users often face misalignment between their intent and generated code, resulting in frustration and rounds of clarification. This work first investigates the cause of this misalignment, which is due to bidirectional ambiguity: both user intents and coding tasks are inherently nonlinear, yet must be expressed and interpreted through linear prompts and code sequences. To address this, we propose direct intent-task matching, a new human-LLM interaction paradigm that externalizes and enables direct manipulation of the LLM understanding, i.e., the coding tasks and their relationships inferred by the LLM prior to code generation. As a proof-of-concept, this paradigm is then implemented in NeuroSync, which employs a knowledge distillation pipeline to extract LLM understanding, user intents, and their mappings, and enhances the alignment by allowing users to intuitively inspect and edit them via visualizations. We evaluate the algorithmic components of NeuroSync via technical experiments, and assess its overall usability and effectiveness via a user study (N=12). The results show that it enhances intent-task alignment, lowers cognitive effort, and improves coding efficiency.

[C54]

Xuye Liu, Tengfei Ma, Yimu Wang, Fengjie Wang, Jian Zhao. NBDESCRIB: A Dataset for Text Description Generation from Tables and Code in Jupyter Notebooks with Guidelines. Findings of the Association for Computational Linguistics, pp. 26584–26606, 2025.
ACL'25

Abstract: Generating cell-level descriptions for Jupyter Notebooks, which is a major resource consisting of codes, tables, and descriptions, has been attracting increasing research attention. However, existing methods for Jupyter Notebooks mostly focus on generating descriptions from code snippets or table outputs independently. On the other side, descriptions should be personalized as users have different purposes in different scenarios while previous work ignored this situation during description generation. In this work, we formulate a new task, personalized description generation with code, tables, and user-written guidelines in Jupyter Notebooks. To evaluate this new task, we collect and propose a benchmark, namely NBDESCRIB, containing code, tables, and user-written guidelines as inputs and personalized descriptions as targets. Extensive experiments show that while existing models of text generation are able to generate fluent and readable descriptions, they still struggle to produce factually correct descriptions without user-written guidelines. CodeT5 achieved the highest scores in Orientation (1.27) and Correctness (-0.43) among foundation models in human evaluation, while the ground truth scored higher in Orientation (1.45) and Correctness (1.19). Common error patterns involve misalignment with guidelines, incorrect variable values, omission of important code information, and reasoning errors. Moreover, ablation studies show that adding guidelines significantly enhances performance, both qualitatively and quantitatively.

[C53]

Ryan Yen, Yimeng Xie, Nicole Sultanum, Jian Zhao. To Search or To Gen? Design Dimensions Integrating Web Search and Generative AI in Programmers' Information-Seeking Process. Proceedings of the ACM Designing Interactive Systems Conference, pp. 1084-1106, 2025.
DIS'25

Abstract: Programmers now use both generative AI (GenAI) and traditional web search for information-seeking, yet how these tools are used individually or in combination remains unclear. To answer this, we conducted a multi-phase investigation, including retrospective interviews to identify foraging behaviours and challenges and an observational study with a technology probe to analyze how contextual information flows across tools. Our findings reveal that effective information-seeking requires adaptable strategies and varying levels of contextual detail. Building on these insights, we propose five design dimensions for developing tools that integrate web search, GenAI, and code editors. We further demonstrated the generative power of these design dimensions with a proof-of-concept prototype, validated through a user study, offering actionable design implications for enhancing integrated information-seeking workflows across web search and GenAI in programming.

[C52]

Zipeng Ji, Pengcheng An, Jian Zhao. ClassComet: Exploring and Designing AI-generated Danmaku in Educational Videos to Enhance Online Learning. Proceedings of the ACM Designing Interactive Systems Conference, pp. 552-575, 2025.
DIS'25

Abstract: Danmaku, users' live comments synchronized with, and overlaying on videos, has recently shown potential in promoting online video-based learning. However, user-generated danmaku can be scarce—especially in newer or less viewed videos—and its quality is unpredictable, limiting its educational impact. This paper explores how large multimodal models (LMM) can be leveraged to automatically generate effective, high-quality danmaku. We first conducted a formative study to identify the desirable characteristics of content- and emotion-related danmaku in educational videos. Based on the obtained insights, we developed ClassComet, an educational video platform with novel LMM-driven techniques for generating relevant types of danmaku to enhance video-based learning. Through user studies, we examined the quality of generated danmaku and their influence on learning experiences. The results indicate that our generated danmaku is comparable to human-created ones, and videos with both content- and emotion-related danmaku showed significant improvement in viewers' engagement and learning outcome.

[C51]

Yuzhe You, Jian Zhao. Exploring Comparative Visual Approaches for Understanding Model Trade-offs in Adversarial Machine Learning. Proceedings of the Graphics Interface Conference, 2025 (In Press).
Best Student Paper GI'25

Abstract: Despite the effectiveness of adversarial training (AT) in enhancing model robustness, it suffers from the accuracy-robustness trade-off and the "robust fairness" problem. To strategize effectively, practitioners have the need to explore and compare model performance in both standard and adversarial settings concurrently. This work presents a design study with 11 experts to explore effective comparative visual techniques for multi-level trade-off analysis. We first collaborated with five adversarial machine learning (AML) experts in an iterative design process, based on which we developed a visual analytics design probe, VATRA, that employs an augmented hybrid comparative design to support concurrent accuracy and robustness evaluations for assessing model trade-offs. Further, we conducted user studies with six domain experts and derived two in-depth use cases of VATRA, providing empirical knowledge about how ML practitioners can leverage comparative visualizations for AML trade-off analysis.

[C50]

Jiawen Stefanie Zhu, Jian Zhao. What is Jiaozi: Exploring User Control of Language Style in Multilingual Conversational Agents. Proceedings of the Graphics Interface Conference, 2025 (In Press).
GI'25

Abstract: Recent advances in language models have significantly expanded the capabilities of AI-powered conversational agents. Nonetheless, current technology is still primarily designed with monolingual English speakers in mind, overlooking the need of more personalized agents by multilingual users. Particularly, prior work showed that multilingual individuals preferred conversational agents that accommodate their desired multilingual style. However, these approaches rely on probabilistic methods to automatically determine the agent's multilingual style, which often fails to align with the needs of multilingual users, as their preferences are nuanced, ad hoc, and difficult to predict. In our work, we explore user control of multilingual style as a step toward developing a mixed-initiative multilingual conversational agent tailored to the needs of multilingual users. We first derived design considerations and dimensions of user control from a formative study with 10 participants. Next, we implemented Mirrios, a prototypical conversational system with multilingual style control, and used it as a probe to conduct a user study with 12 participants. We identified preferred designs for multilingual style control and found that this control reduced the need to constrain language habits, accommodated ad hoc language needs, and enabled more personalized interactions with conversational agents. Based on our findings, we propose design implications to inform the design of multilingual style control and future mixed-initiative multilingual conversational agents.

[C49]

Futian Zhang, Jiawen Stefanie Zhu, Edward Lank, Keiko Katsuragawa, Jian Zhao. Fly the Moon to Me: Bimanual 3D Locomotion in Virtual Reality By Manipulating the Position of the Destination Object. Proceedings of the Graphics Interface Conference, 2025 (In Press).
GI'25

Abstract: Teleportation - changing the point of view in 3D space by specifying a position - is one of the most common locomotion solutions in VR. However, it currently lacks a mechanism to adjust the height in 3D space, and it is difficult for users to predict the exact final view after the teleportation. Users are relocated to a place without knowing what the final view will look like. As a result, they often need to perform remedial interactions to achieve their ideal position, which can be time-consuming and effort-intensive. In this paper, we present Fly the Moon to Me (Locomoontion), a novel technique that enables users to bring their destination to themselves through object manipulation. Users first create a copy of the object they want to approach as a preview by selecting it, then bring it to an ideal position and direction using existing object manipulation techniques, and then snap the original object to the preview together with the rest of the world. A controlled experiment with 18 participants via a teleportation task reveals that Locomoontion is more effective than the traditional Point&Teleport technique with grabbing the world as a remedy to adjust the final positioning.

[C48]

Abdul Rahman Shaikh, Maoyuan Sun, Xingcheng Liu, Hamed Alhoori, Jian Zhao, David Koop. iTrace: Interactive tracing of Cross-View Data Relationships. Proceedings of the Graphics Interface Conference, 2025 (In Press).
GI'25

Abstract: Exploring data relations across multiple views has been a common task in many domains such as bioinformatics, cybersecurity, and healthcare. To support this, various techniques (e.g., visual links and brushing & linking) are used to show related visual elements across views via lines and highlights. However, understanding the relations using these techniques, when many related elements are scattered, can be difficult due to spatial distance and complexity. To address this, we present iTrace, an interactive visualization technique to effectively trace cross-view data relationships. iTrace leverages the concept of interactive focus transitions, which allows users to see and directly manipulate their focus as they navigate between views. By directing the user's attention through smooth transitions between related elements, iTrace makes it easier to follow data relationships. We demonstrate the effectiveness of iTrace with a user study, and we conclude with a discussion of how iTrace can be broadly used to enhance data exploration in various types of visualizations.

[C47]

Mohammad Hasan Payandeh, Jian Zhao. SenseSync: Supporting Collaborative Information-Seeking with the Involvement of Large Language Models. Proceedings of the Graphics Interface Conference, 2025 (In Press).
GI'25

Abstract: Recently, tools driven by Large Language Models (LLMs), such as ChatGPT, have been extensively used for gathering information. While LLMs improve efficiency in individual tasks, new challenges emerge in collaborative information-seeking when user groups collect data from their conversations with AI that have various contexts. To fill this knowledge gap, we investigate these challenges and reflect on them via the design, development, and evaluation of SenseSync. SenseSync supports collaborative work involving LLMs from different perspectives, featuring a dynamic graph to display individual and shared conversations with LLMs and a visual timeline for exploring collaborative activities over different periods. Moreover, SenseSync is enriched with contextual information and specific support for LLM-assisted information-seeking. A summative study was conducted to explore how pairs of participants used the tool, enriching our understanding of LLM-assisted collaborative information-seeking tasks.

[C46]

Futian Zhang, Paul Kokhanov, Edward Lank, Keiko Katsuragawa, Jian Zhao. Drum Menu: Bimanual Controller Command Access Techniques in Virtual Reality. Proceedings of the Graphics Interface Conference, 2025 (In Press).
GI'25

Abstract: Current Virtual Reality (VR) Head-Mounted Displays (HMDs) offer limited shortcuts for rapid command access, which often requires users to navigate menus through precise visual targeting at multiple depths. This process can be slow and distracting, particularly during immersive gaming or productivity tasks. While marking menus have shown effectiveness as a shortcut command access mechanism, their performance in VR has not been adequately studied. Moreover, their potential integration with 6-degree-of-freedom (6-DoF) controllers and 2-DoF joysticks in VR environments remains largely unexplored. In this paper, we introduce the Drum Menu, a bimanual shortcut command access technique derived from the idea of traditional pie menus, featuring three input methods, designed for 4-item and 8-item layouts specifically for VR controller command access. Users can select commands by rotating the joystick, drawing a stroke, or pointing in different directions. Bimanual input enables simultaneous access to two menu levels. A controlled user study reveals that drum menus are faster than the unimanual versions for the 4-item layout. Additionally, users prefer the bimanual joystick drum menu with the 4-item layout given its short task time, low error rate and low physical movement. For the 8-item layout, stroke drum menus are found to be less error-prone for expert users compared to the other techniques.

[C45]

Xinyu Shi, Yinghou Wang, Ryan Rossi, Jian Zhao. Brickify: Enabling Expressive Design Intent Specification through Direct Manipulation on Design Tokens. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 424:1-424:20, 2025.
CHI'25

Abstract: Expressing design intent using natural language prompts requires designers to verbalize the ambiguous visual details concisely, which can be challenging or even impossible. To address this, we introduce Brickify, a visual-centric interaction paradigm — expressing design intent through direct manipulation on design tokens. Brickify extracts visual elements (e.g., subject, style, and color) from reference images and converts them into interactive and reusable design tokens that can be directly manipulated (e.g., resize, group, link, etc.) to form the visual lexicon. The lexicon reflects users' intent for both what visual elements are desired and how to construct them into a whole. We developed Brickify to demonstrate how AI models can interpret and execute the visual lexicon through an end-to-end pipeline. In a user study, experienced designers found Brickify more efficient and intuitive than text-based prompts, allowing them to describe visual details, explore alternatives, and refine complex designs with greater ease and control.

[C44]

Ryan Yen, Jian Zhao, Daniel Vogel. Code Shaping: Iterative Code Editing with Free-form AI-Interpreted Sketching. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 872:1-872:17, 2025.
Best Paper CHI'25

Abstract: We introduce the concept of code shaping, an interaction paradigm for editing code using free-form sketch annotations directly on top of the code and console output. To evaluate this concept, we conducted a three-stage design study with 18 different programmers to investigate how sketches can communicate intended code edits to an AI model for interpretation and execution. The results show how different sketches are used, the strategies programmers employ during iterative interactions with AI interpretations, and interaction design principles that support the reconciliation between the code editor and sketches. Finally, we demonstrate the practical application of the code shaping concept with two use case scenarios, illustrating design implications from the study.

[C43]

Ce Zhong, Xiang Li, Xizi Wang, Junwei Sun, Jian Zhao. Investigating Composite Relation with a Data-Physicalized Thing through the Deployment of the WavData Lamp. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 763:1-763:21, 2025.
CHI'25

Abstract: This paper reports on a field study of the WavData Lamp: an interactive lamp that can physically visualize people's music listening data by changing light colors and outstretching its form enclosure. We deployed five WavData Lamps to five participants' homes for two months to investigate their composite relation with a data-physicalized thing. Findings reveal that their music-listening norms were determined by the instantiated materiality of the Lamp in the early days. With a tilted form enclosure, the WavData Lamp successfully engendered rich actions and meanings of the cohabiting participants and their family members. In the end, the participants described their experiences of entangling with and living with the Lamp as a form of collaboration. Reflecting on these empirical insights explicitly extends the intrinsic meaning of the composite relation and offers rich implications to promote further HCI explorations and practices.

[C42]

Xuye Liu, Annie Sun, Pengcheng An, Tengfei Ma, Jian Zhao. Influencer: Empowering Everyday Users in Creating Promotional Posts via AI-infused Exploration and Customization. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 1004:1-1004:19, 2025.
CHI'25

Abstract: Creating promotional posts on social platforms enables everyday users to disseminate their creative outcomes, engage in community exchanges, or generate additional income from micro-businesses. However, crafting eye-catching posts with appealing images and effective captions can be challenging and time-consuming for everyday users since they are mostly design novices. We propose Influencer, an interactive tool that helps novice creators quickly generate ideas and create high-quality promotional post designs through AI. Influencer offers a multi-dimensional recommendation system for ideation through example-based image and caption suggestions. Further, Influencer implements a holistic promotional post-design system supporting context-aware exploration considering brand messages and user-specified design constraints, flexible fusion of content, and a mind-map-like layout for idea tracking. Our user study, comparing the system with industry-standard tools, along with two real-life case studies, indicates that Influencer is effective in assisting design novices to generate ideas as well as creative and diverse promotional posts with user-friendly interaction.

[C41]

Linping Yuan, Feilin Han, Liwenhan Xie, Junjie Zhang, Jian Zhao, Huamin Qu. You'll Be Alice Adventuring in Wonderland! Processes, Challenges, and Opportunities of Creating Animated Virtual Reality Stories. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 193:1-193:21, 2025.
CHI'25

Abstract: Animated virtual reality (VR) stories, combining the presence of VR and the artistry of computer animation, offer a compelling way to deliver messages and evoke emotions. Motivated by the growing demand for immersive narrative experiences, more creators are creating animated VR stories. However, a holistic understanding of their creation processes and challenges involved in crafting these stories is still limited. Based on semi-structured interviews with 21 animated VR story creators, we identify ten common stages in their end-to-end creation processes, ranging from idea generation to evaluation, which form diverse workflows that are story-driven or visual-driven. Additionally, we highlight nine unique issues that arise during the creation process, such as a lack of reference material for multi-element plots, the absence of specific functionalities for story integration, and inadequate support for audience evaluation. We compare the creation of animated VR stories to general XR applications and distill several future research opportunities.

[C40]

Boda Li, Minghao Li, Jian Zhao, Wei Cai. PartFlow: A Visualization Tool for Application Partitioning and Workload Offloading in Mobile Edge Computing. Proceedings of the IEEE Pacific Visualization Conference, pp. 108-117, 2025.
PacificVis'25

Abstract: In mobile edge computing (MEC), one optimization strategy for mobile applications is to offload heavy computing tasks to cloud and edge servers. Constructing partitioning algorithms involves modeling individual methods through static code profilers, but exploiting dynamic user-driven execution patterns is also crucial. This paper introduces PartFlow, an interactive visualization system that supports comprehensive analysis of mobile application components and aids researchers in developing partitioning and offloading algorithms using real human behavioral data. PartFlow collects application component data remotely through binary instrumentation of mobile applications. Interactive diagrams are designed to evaluate component performance and illustrate transition patterns using the collected data. Additionally, PartFlow integrates a deep learning (DL)-based approach for multi-step forecasting of component states to improve accuracy and user experience in algorithm design. A case study and user feedback demonstrate PartFlow's effectiveness in assisting researchers and engineers in creating offloading strategies.

[W23]

Jiawen Stefanie Zhu, Jian Zhao. Understanding Remote Communication between Grandparents and Grandchildren in Distributed Immigrant Families. Proceedings of the Graphics Interface Conference (Poster), 2025.
Best Poster GI'25

Abstract: Grandparent-grandchild bonds are crucial for both parties. Many immigrant families are geographically dispersed, and the grandparents and grandchildren need to rely on remote communication to maintain their relationships. In addition to geographical separation, grandparents and grandchildren in such families also face language and culture barriers during remote communication. The associated challenges and needs remain understudied as existing research primarily focuses on non-immigrant families or co-located immigrant families. To address this gap, we conducted interviews with six Chinese immigrant families in Canada. Our findings highlight unique challenges faced by immigrant families during remote communication, such as amplified language and cultural barriers due to geographic separation, and provide insights into how technology can better support remote communication. This work offers empirical knowledge about the communication needs of distributed immigrant families and provides directions for future research and design to support grandparent-grandchild remote communication in these families.

[W22]

Xuye Liu, Yimu Wang, Jian Zhao. ELIOT: Zero-Shot Video-Text Retrieval through Relevance-Boosted Captioning and Structural Information Extraction. Proceedings of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (Student Research Workshop), 4, pp. 381-391, 2025.
NAACL'25

Abstract: Recent advances in video-text retrieval (VTR) have largely relied on supervised learning and fine-tuning. In this paper, we introduce ELIOT, a novel zero-shot VTR framework that leverages off-the-shelf video captioners, large language models (LLMs), and text retrieval methods, entirely without additional training or annotated data. Due to the limited power of captioning methods, the captions often miss important content in the video, resulting in unsatisfactory retrieval performance. To translate more information into video captions, we first generate initial captions for videos, then enhance them using a relevance-boosted captioning strategy powered by LLMs, enriching video descriptions with salient details. To further emphasize key content, we propose structural information extraction, organizing visual elements such as objects, events, and attributes into structured templates, further boosting the retrieval performance. Benefiting from the enriched captions and structuralized information, extensive experiments on several video-text retrieval benchmarks demonstrate the superiority of ELIOT over existing fine-tuned and pretraining methods without any data. They also show that the enriched captions capture key details from the video with minimal noise. Code and data will be released to facilitate future research.

[W21]

Xuye Liu, Tengfei Ma, Yimu Wang, Fengjie Wang, Jian Zhao. SENTIENT: A Dataset for Describing Code and Table Outputs under User-Specified Guidelines in Computational Notebooks. Proceedings of the Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (Student Research Workshop), 2025.
NAACL'25

Abstract: Generating cell-level descriptions for Jupyter Notebooks, a major resource consisting of code, tables, and descriptions, has been attracting increasing research attention. However, existing methods for Jupyter Notebooks mostly focus on generating descriptions from code snippets or table outputs independently. On the other hand, descriptions should be personalized, as users have different purposes in different scenarios, yet previous work has ignored this during description generation. In this work, we formulate a new task, personalized description generation with code, tables, and user-written guidelines in Jupyter Notebooks. To evaluate this new task, we collect and propose a benchmark, namely SENTIENT, containing code, tables, and user-written guidelines as inputs and personalized descriptions as targets. Extensive experiments show that while existing models of text generation are able to generate fluent and readable descriptions, they struggle to produce factually correct descriptions without user-written guidelines based on automatic and human evaluation. Common error patterns include guideline misalignment, incorrect variables, missing code details, and faulty reasoning. Ablations show that implementing guidelines substantially improves performance. Our downstream application examining human-AI collaboration in documenting code and tables demonstrated enhanced usability and effectiveness.

[W20]

Xinyu Shi, Shunan Guo, Jane Hoffswell, Gromit Yeuk-Yin Chan, Victor S. Bursztyn, Jian Zhao, Eunyee Koh. Comprehensive Sketching: Exploring Infographic Design Alternatives in Parallel. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pp. 145:1-145:8, 2025.
CHI'25

Abstract: Designing effective and memorable infographics requires both aesthetic creativity and strategic data binding decisions, demanding intensive exploration and iterative trial and error. Although existing sketch-based tools automate the data binding process to support rapid prototyping, they typically rely on serial workflows that limit freeform exploration. To address this, we introduce the concept of comprehensive sketching, which reimagines sketches as interactive objects for expressing design intent: defining what visuals to use, how to bind data, and where to arrange elements. We implement this idea in a tool named CompSketch. CompSketch features a freeform canvas that allows designers to sketch and organize multiple disjoint ideas without assuming every stroke contributes to the final design. An on-demand preview lets users control when and how data bindings are applied, facilitating seamless transitions between exploration and refinement. CompSketch encourages divergent thinking and empowers designers to explore infographic design alternatives in parallel.

[W19]

Yuzhe You, Helen Weixu Chen, Jian Zhao. Enhancing AI Explainability for Non-technical Users with LLM-Driven Narrative Gamification. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pp. 221:1-221:7, 2025.
CHI'25

Abstract: Artificial intelligence (AI) is tightly integrated into modern technology, yet existing exploratory XAI visualizations are primarily designed for users with technical expertise. This leaves everyday users, who now also rely on AI systems for work and tasks, with limited resources to explore or understand AI. In this work, we explored the use of LLM-driven narrative gamification to enhance the learning and engagement of exploratory XAI visualizations. Specifically, we designed a design probe that enables non-experts to collect insights from an embedding projection by conversing directly with visualization elements similar to game NPCs. We conducted a preliminary comparative study to assess the effectiveness and usability of our design probe. Our study shows that while the tool enhances non-technical users' AI knowledge and is perceived as beneficial, the impact of gamification alone on understanding remains inconclusive. Participant opinions on engagement are mixed: some find it enriching, while others see it as disruptive.

2024

[J36]

Yue Lyu, Di Liu, Pengcheng An, Xin Tong, Huan Zhang, Keiko Katsuragawa, Jian Zhao. EMooly: Supporting Autistic Children in Collaborative Social-Emotional Learning with Caregiver Participation through Interactive AI-infused and AR Activities. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 8(4), pp. 203:1-203:36, 2024.
IMWUT

Abstract: Children with autism spectrum disorder (ASD) have social-emotional deficits that lead to difficulties in recognizing emotions as well as understanding and responding to social interactions. This study presents EMooly, a tablet game that actively involves caregivers and leverages augmented reality (AR) and generative AI (GenAI) to enhance social-emotional learning for autistic children. Through a year of collaborative effort with five domain experts, we developed EMooly that engages children through personalized social stories, interactive and fun activities, and enhanced caregiver participation, focusing on emotion understanding and facial expression recognition. Compared with a baseline, a controlled study with 24 autistic children and their caregivers showed EMooly significantly improved children's emotion recognition skills and its novel features were preferred and appreciated. EMooly demonstrates the potential of AI and AR in enhancing social-emotional development for autistic children via prompt personalizing and engagement, and highlights the importance of caregiver involvement for optimal learning outcomes.

[J35]

Pengcheng An, Chaoyu Zhang, Haichen Gao, Ziqi Zhou, Yage Xiao, Jian Zhao. AniBalloons: Animated Chat Balloons as Affective Augmentation for Social Messaging and Chatbot Interaction. International Journal of Human-Computer Studies, 194, pp. 103365:1-103365:16, 2025 (Accepted in 2024).
IJHCS

Abstract: Despite being prominent and ubiquitous, message-based communication is limited in nonverbally conveying emotions. Besides emoticons or stickers, messaging users continue seeking richer options for affective communication. Recent research explored using chat-balloons' shape and color to communicate emotional states. However, little work explored whether and how chat-balloon animations could be designed to convey emotions. We present the design of AniBalloons, 30 chat-balloon animations conveying Joy, Anger, Sadness, Surprise, Fear, and Calmness. Using AniBalloons as a research means, we conducted three studies to assess the animations' affect recognizability and emotional properties (N = 40), and probe how animated chat-balloons would influence communication experience in typical scenarios including instant messaging (N = 72) and chatbot service (N = 70). Our exploration contributes a set of chat-balloon animations to complement nonverbal affective communication for a range of text-message interfaces, and empirical insights into how animated chat-balloons might mediate particular conversation experiences (e.g., perceived interpersonal closeness, or chatbot personality).

[J34]

Shaikh Shawon Arefin Shimon, Ali Neshati, Junwei Sun, Qiang Xu, Jian Zhao. Exploring Uni-manual Around Ear Off-Device Gestures for Earables. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 8(1), pp. 3:1-3:29, 2024.
IMWUT

Abstract: Small form factor limits physical input space in earable (i.e., ear-mounted wearable) devices. Off-device earable inputs in alternate mid-air and on-skin around-ear interaction spaces using uni-manual gestures can address this input space limitation. Segmenting these alternate interaction spaces to create multiple gesture regions for reusing off-device gestures can expand earable input vocabulary by a large margin. Although prior earable interaction research has explored off-device gesture preferences and recognition techniques in such interaction spaces, supporting gesture reuse over multiple gesture regions needs further exploration. We collected and analyzed 7560 uni-manual gesture motion data from 18 participants to explore earable gesture reuse by segmentation of on-skin and mid-air spaces around the ear. Our results show that gesture performance degrades significantly beyond 3 mid-air and 5 on-skin around-ear gesture regions for different uni-manual gesture classes (e.g., swipe, pinch, tap). We also present qualitative findings on most and least preferred regions (and associated boundaries) by end-users for different uni-manual gesture shapes across both interaction spaces for earable devices. Our results complement earlier elicitation studies and interaction technologies for earables to help expand the gestural input vocabulary and potentially drive future commercialization of such devices.

[C39]

Liwei Wu, Yilin Zhang, Justin Leung, Jingyi Gao, April Li, Jian Zhao. Planar or Spatial: Exploring Design Aspects and Challenges for Presentations in Virtual Reality with No-coding Interface. Proceedings of the ACM Interactive Surfaces and Spaces Conference, pp. 528:1-528:23, 2024.
ISS'24

Abstract: The proliferation of virtual reality (VR) has led to its increasing adoption as an immersive medium for delivering presentations, distinct from other VR experiences like games and 360-degree videos by sharing information in richly interactive environments. However, creating engaging VR presentations remains a challenging and time-consuming task for users, hindering the full realization of VR presentation's capabilities. This research aims to explore the potential of VR presentation, analyze users' opinions, and investigate these via providing a user-friendly no-coding authoring tool. Through an examination of popular presentation software and interviews with seven professionals, we identified five design aspects and four design challenges for VR presentations. Based on the findings, we developed VRStory, a prototype for presentation authoring without coding to explore the design aspects and strategies for addressing the challenges. VRStory offers a variety of predefined and customizable VR elements, as well as modules for layout design, navigation control, and asset generation. A user study was then conducted with 12 participants to investigate their opinions and authoring experience with VRStory. Our results demonstrated that, while acknowledging the advantages of immersive and spatial features in VR, users often have a consistent mental model for traditional 2D presentations and may still prefer planar and static formats in VR for better accessibility and efficient communication. We finally shared our learned design considerations for future development of VR presentation tools, emphasizing the importance of balancing the promotion of immersive features with ensuring accessibility.

[C38]

Temiloluwa Paul Femi-Gege, Matthew Brehmer, Jian Zhao. VisConductor: Affect-Varying Widgets for Animated Data Storytelling in Gesture-Aware Augmented Video Presentation. Proceedings of the ACM Interactive Surfaces and Spaces Conference, pp. 531:1-531:22, 2024.
ISS'24

Abstract: Augmented video presentation tools provide a natural way for presenters to interact with their content, resulting in engaging experiences for remote audiences, such as when a presenter uses hand gestures to manipulate and direct attention to visual aids overlaid on their webcam feed. However, authoring and customizing these presentations can be challenging, particularly when presenting dynamic data visualization (i.e., animated charts). To this end, we introduce VisConductor, an authoring and presentation tool that equips presenters with the ability to configure gestures that control affect-varying visualization animation, foreshadow visualization transitions, direct attention to notable data points, and animate the disclosure of annotations. These gestures are integrated into configurable widgets, allowing presenters to trigger content transformations by executing gestures within widget boundaries, with feedback visible only to them. Altogether, our palette of widgets provides a level of flexibility appropriate for improvisational presentations and ad-hoc content transformations, such as when responding to audience engagement. To evaluate VisConductor, we conducted two studies focusing on presenters (N = 11) and audience members (N = 11). Our findings indicate that the approach taken with VisConductor can facilitate interactive and engaging remote presentations with dynamic visual aids. Reflecting on our findings, we also offer insights to inform the future of augmented video presentation tools.

[C37]

Ryan Yen, Jian Zhao. Reifying the Reuse of User-AI Conversational Memories. Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 58:1-58:22, 2024.
UIST'24

Abstract: As users engage more frequently with AI conversational agents, conversations may exceed their 'memory' capacity, leading to failures in correctly leveraging certain memories for tailored responses. However, in finding past memories that can be reused or referenced, users need to retrieve relevant information in various conversations and articulate to the AI their intention to reuse these memories. To support this process, we introduce Memolet, an interactive object that reifies memory reuse. Users can directly manipulate Memolet to specify which memories to reuse and how to use them. We developed a system demonstrating Memolet's interaction across various memory reuse stages, including memory extraction, organization, prompt articulation, and generation refinement. We examine the system's usefulness with an N=12 within-subject study and provide design implications for future systems that support user-AI conversational memory reusing.

[C36]

Ryan Yen, Jiawen Stefanie Zhu, Sangho Suh, Haijun Xia, Jian Zhao. CoLadder: Supporting Programmers with Hierarchical Code Generation in Multi-Level Abstraction. Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 11:1-11:20, 2024.
UIST'24

Abstract: This paper adopted an iterative design process to gain insights into programmers' strategies when using LLMs for programming. We proposed CoLadder, a novel system that supports programmers by facilitating hierarchical task decomposition, direct code segment manipulation, and result evaluation during prompt authoring. A user study with 12 experienced programmers showed that CoLadder is effective in helping programmers externalize their problem-solving intentions flexibly, improving their ability to evaluate and modify code across various abstraction levels, from their task's goal to final code implementation.

[C35]

Maoyuan Sun, Yuanxin Wang, Courtney Bolton, Yue Ma, Tianyi Li, Jian Zhao. Investigating User Estimation of Missing Data in Visual Analysis. Proceedings of the Graphics Interface Conference, pp. 30:1-30:13, 2024.
GI'24

Abstract: Missing data is a pervasive issue in real-world analytics, stemming from a multitude of factors (e.g., device malfunctions and network disruptions), making it a ubiquitous challenge in many domains. Misperception of missing data impacts decision-making and causes severe consequences. To mitigate risks from missing data and facilitate proper handling, computing methods (e.g., imputation) have been studied, which often culminate in the visual representation of data for analysts to further check. Yet, the influence of these computed representations on user judgment regarding missing data remains unclear. To study potential influencing factors and their impact on user judgment, we conducted a crowdsourcing study. We controlled 4 factors: the distribution, imputation, and visualization of missing data, and the prior knowledge of data. We compared users' estimations of missing data with computed imputations under different combinations of these factors. Our results offer useful guidance for visualizing missing data and their imputations, which informs future studies on developing trustworthy computing methods for visual analysis of missing data.

[C34]

Xinyu Shi, Mingyu Liu, Ziqi Zhou, Ali Neshati, Ryan Rossi, Jian Zhao. Exploring Interactive Color Palettes for Abstraction-Driven Exploratory Image Colorization. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 146:1-146:16, 2024.
CHI'24

Abstract: Color design is essential in areas such as product, graphic, and fashion design. However, current tools like Photoshop, with their concrete-driven color manipulation approach, often stumble during early ideation, favoring polished end results over initial exploration. We introduced Mondrian as a test-bed for an abstraction-driven approach using interactive color palettes for image colorization. Through a formative study with six design experts, we selected three design options for visual abstractions in color design and developed Mondrian, where humans work with abstractions and AI manages the concrete aspects. We carried out a user study to understand the benefits and challenges of each abstraction format and compare Mondrian with Photoshop. A survey involving 100 participants further examined the influence of each abstraction format on color composition perceptions. Findings suggest that interactive visual abstractions encourage a non-linear exploration workflow and an open mindset during ideation, thus providing better creative affordance.

[C33]

Xinyu Shi, Yinghou Wang, Yun Wang, Jian Zhao. Piet: Facilitating Color Authoring for Motion Graphics Video. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 148:1-148:17, 2024.
Best Paper CHI'24

Abstract: Motion graphic (MG) videos are effective and compelling for presenting complex concepts through animated visuals; and colors are important to convey desired emotions, maintain visual continuity, and signal narrative transitions. However, current video color authoring workflows are fragmented, lacking contextual previews, hindering rapid theme adjustments, and not aligning with designers' progressive authoring flows. To bridge this gap, we introduce Piet, the first tool tailored for MG video color authoring. Piet features an interactive palette to visually represent color distributions, support controllable focus levels, and enable quick theme probing via grouped color shifts. We interviewed 6 domain experts to identify the frustrations in current tools and inform the design of Piet. An in-lab user study with 13 expert designers showed that Piet effectively simplified the MG video color authoring and reduced the friction in creative color theme exploration.

[C32]

Li Feng†, Ryan Yen†, Yuzhe You, Mingming Fan, Jian Zhao, Zhicong Lu. CoPrompt: Supporting Prompt Sharing and Referring in Collaborative Natural Language Programming. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 934:1-934:21, 2024.
CHI'24

Abstract: Natural language (NL) programming has become more approachable due to the powerful code-generation capability of large language models (LLMs). This shift to using NL to program enhances collaborative programming by reducing communication barriers and context-switching among programmers from varying backgrounds. However, programmers may face challenges during prompt engineering in a collaborative setting as they need to actively keep aware of their collaborators' progress and intents. In this paper, we aim to investigate ways to assist programmers' prompt engineering in a collaborative context. We first conducted a formative study to understand the workflows and challenges of programmers when using NL for collaborative programming. Based on our findings, we implemented a prototype, CoPrompt, to support collaborative prompt engineering by providing referring, requesting, sharing, and linking mechanisms. Our user study indicates that CoPrompt assists programmers in comprehending collaborators' prompts and building on their collaborators' work, reducing repetitive updates and communication costs.

[C31]

Pengcheng An, Jiawen Stefanie Zhu, Zibo Zhang, Yifei Yin, Qingyuan Ma, Che Yan, Linghao Du, Jian Zhao. EmoWear: Exploring Emotional Teasers for Voice Message Interaction on Smartwatches. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 279:1-279:16, 2024.
CHI'24

Abstract: Voice messages, by nature, prevent users from gauging the emotional tone without fully diving into the audio content. This hinders the shared emotional experience at the pre-retrieval stage. Research scarcely explored "Emotional Teasers"—pre-retrieval cues offering a glimpse into an awaiting message's emotional tone without disclosing its content. We introduce EmoWear, a smartwatch voice messaging system enabling users to apply 30 animation teasers on message bubbles to reflect emotions. EmoWear eases senders' choice by prioritizing emotions based on semantic and acoustic processing. EmoWear was evaluated in comparison with a mirroring system using color-coded message bubbles as emotional cues (N=24). Results showed EmoWear significantly enhanced emotional communication experience in both receiving and sending messages. The animated teasers were considered intuitive and valued for diverse expressions. Desirable interaction qualities and practical implications are distilled for future design. We thereby contribute both a novel system and empirical knowledge concerning emotional teasers for voice messaging.

[C30]

Xizi Wang, Ben Lafreniere, Jian Zhao. Exploring Visualizations for Precisely Guiding Bare Hand Gestures in Virtual Reality. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 636:1-636:19, 2024.
CHI'24

Abstract: Bare hand interaction in augmented or virtual reality (AR/VR) systems, while intuitive, often results in errors and frustration. However, existing methods, such as a static icon or a dynamic tutorial, can only inform simple and coarse hand gestures and lack corrective feedback. This paper explores various visualizations for enhancing precise hand interaction in VR. Through a comprehensive two-part formative study with 11 participants, we identified four types of essential information for visual guidance and designed different visualizations that manifest these information types. We further distilled four visual designs and conducted a controlled lab study with 15 participants to assess their effectiveness for various single- and double-handed gestures. Our results demonstrate that visual guidance significantly improved users' gesture performance, reducing time and workload while increasing confidence. Moreover, we found that the visualization did not disrupt most users' immersive VR experience or their perceptions of hand tracking and gesture recognition reliability.

[W18]

Ryan Yen, Jian Zhao, Daniel Vogel. Code Shaping: Iterative Code Editing with Free-form Sketching. Adjunct Proceedings of the ACM Symposium on User Interface Software and Technology (Poster), pp. 101:1-101:3, 2024.
Jury Best Poster Honorable Mention UIST'24

Abstract: We present an initial step towards building a system for programmers to edit code using free-form sketch annotations drawn directly onto editor and output windows. Using a working prototype system as a technical probe, an exploratory study (N = 6) examines how programmers sketch to annotate Python code to communicate edits for an AI model to perform. The results reveal personalized workflow strategies and how similar annotations vary in abstractness and intention across different scenarios and users.

[W17]

Ryan Yen, Yelizaveta Brus, Leyi Yan, Jimmy Lin, Jian Zhao. Scholarly Exploration via Conversations with Scholars-Papers Embedding. Proceedings of the IEEE Conference Visualization and Visual Analytics (Poster), 2024.
VIS'24

Abstract: The rapid expansion of academic publications across various sub-domains necessitates advanced visual analytics systems to help researchers efficiently navigate and explore the academic landscape. Recent advancements in retrieval augmented generation enable users to engage with data through complex, context-driven question-answering capabilities. However, existing approaches fail to provide adequate user control over the retrieval and generation process and do not reconcile visualizations with question-answering mechanisms. To address these limitations, we propose a system that supports contextually aware, controllable, and interactive exploration of academic publications and scholars. By enabling bidirectional interaction between question-answering components and Scholets, the 2D projections of scholarly works' embeddings, our system enables users to textually and visually interact with large amounts of publications. We report the system design and demonstrate its utility through an exploratory study with graduate researchers.

[W16]

Ryan Yen, Nicole Sultanum, Jian Zhao. To Search or To Gen? Exploring the Synergy between Generative AI and Web Search in Programming. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, pp. 327:1-327:8, 2024.
CHI EA'24

Abstract: The convergence of generative AI and web search is reshaping problem-solving for programmers. However, the lack of understanding regarding their interplay in the information-seeking process often leads programmers to perceive them as alternatives rather than complementary tools. To analyze this interaction and explore their synergy, we conducted an interview study with eight experienced programmers. Drawing from the results and literature, we have identified three major challenges and proposed three decision-making stages, each with its own relevant factors. Additionally, we present a comprehensive process model that captures programmers' interaction patterns. This model encompasses decision-making stages, the information-foraging loop, and cognitive activities during system interaction, offering a holistic framework to comprehend and optimize the use of these convergent tools in programming.

[W15]

Jiawen Stefanie Zhu†, Zibo Zhang†, Jian Zhao. Facilitating Mixed-Methods Analysis with Computational Notebooks. Proceedings of the First Workshop on Human-Notebook Interactions, 2024.
CHI'24

Abstract: Data exploration is an important aspect of the workflow of mixed-methods researchers, who conduct both qualitative and quantitative analysis. However, there currently exist few tools that adequately support both types of analysis simultaneously, forcing researchers to context-switch between different tools and increasing their mental burden when integrating the results. To address this gap, we propose a unified environment that facilitates mixed-methods analysis in a computational notebook-based setting. We conduct a scenario study with three HCI mixed-methods researchers to gather feedback on our design concept and to understand our users' needs and requirements.

[W14]

Yue Lyu, Pengcheng An, Huan Zhang, Keiko Katsuragawa, Jian Zhao. Designing AI-Enabled Games to Support Social-Emotional Learning for Children with Autism Spectrum Disorders. Proceedings of the Second Workshop on Child-Centred AI, 2024.
CHI'24

Abstract: Children with autism spectrum disorder (ASD) experience challenges in grasping social-emotional cues, which can result in difficulties in recognizing emotions and understanding and responding to social interactions. Social-emotional intervention is an effective method to improve emotional understanding and facial expression recognition among individuals with ASD. Existing work emphasizes the importance of personalizing interventions to meet individual needs and motivate engagement for optimal outcomes in daily settings. We design a social-emotional game for ASD children, which generates personalized stories by leveraging the current advancement of artificial intelligence. Via a co-design process with five domain experts, this work offers several design insights into developing future AI-enabled gamified systems for families with autistic children. We also propose a fine-tuned AI model and a dataset of social stories for different basic emotions.

[W13]

Negar Arabzadeh, Kiarash Golzadeh, Christopher Risi, Charles Clarke, Jian Zhao. KnowFIRES: a Knowledge-graph Framework for Interpreting Retrieved Entities from Search. Advances in Information Retrieval (Proceedings of ECIR'24 (Demo)), pp. 182-188, 2024.
ECIR'24

Abstract: Entity retrieval is essential in information access domains where people search for specific entities, such as individuals, organizations, and places. While entity retrieval is an active research topic in Information Retrieval, it is necessary to explore the explainability and interpretability of them more extensively. KnowFIRES addresses this by offering a knowledge graph-based visual representation of entity retrieval results, focusing on contrasting different retrieval methods. KnowFIRES allows users to better understand these differences through the juxtaposition and superposition of retrieved sub-graphs.

2023

[J33]

Xuejun Du^†, Pengcheng An^†, Justin Leung, April Li, Linda Chapman, Jian Zhao. DeepThInk: Designing and Probing Human-AI Co-Creation in Digital Art Therapy. International Journal of Human-Computer Studies, 181, pp. 103139:1-103139:17, 2024 (Accepted in 2023).
Best Research Paper IJHCS

Abstract: Art therapy has been an essential form of psychotherapy to facilitate psychological well-being, which has been promoted and transformed by recent technological advances into digital art therapy. However, the potential of digital technologies has not been fully leveraged; especially, applying AI technologies in digital art therapy is still under-explored. In this paper, we propose an AI-infused art-making system, DeepThInk, to investigate the potential of introducing a human-AI co-creative process into art therapy, by collaborating with five experienced registered art therapists over ten months. DeepThInk offers a range of tools which can lower the expertise threshold for art-making while improving users' creativity and expressivity. We gathered the insights of DeepThInk through expert reviews and a two-part user evaluation with both synchronous and asynchronous therapy setups. This longitudinal iterative design process helped us derive and contextualize design principles of human-AI co-creation for art therapy, shedding light on future design in relevant domains.

[J32]

Yue Lyu, Pengcheng An, Yage Xiao, Zibo Zhang, Huan Zhang, Keiko Katsuragawa, Jian Zhao. Eggly: Designing Mobile Augmented Reality Neurofeedback Training Games for Children with Autism Spectrum Disorder. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 7(2), pp. 67:1-67:29, 2023.
IMWUT

Abstract: Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder that affects how children communicate and relate to other people and the world around them. Emerging studies have shown that neurofeedback training (NFT) games are an effective and playful intervention to enhance social and attentional capabilities for autistic children. However, NFT is primarily available in a clinical setting that is hard to scale. Also, the intervention demands deliberately-designed gamified feedback with fun and enjoyment, where little knowledge has been acquired in the HCI community. Through a ten-month iterative design process with four domain experts, we developed Eggly, a mobile NFT game based on a consumer-grade EEG headband and a tablet. Eggly uses novel augmented reality (AR) techniques to offer engagement and personalization, enhancing their training experience. We conducted two field studies (a single-session study and a three-week multi-session study) with a total of five autistic children to assess Eggly in practice at a special education center. Both quantitative and qualitative results indicate the effectiveness of the approach as well as contribute to the design knowledge of creating mobile AR NFT games.

[J31]

Andrea Batch, Yipeng Ji, Mingming Fan, Jian Zhao, Niklas Elmqvist. uxSense: Supporting User Experience Analysis with Visualization and Computer Vision. IEEE Transactions on Visualization and Computer Graphics, 30(7), pp. 3841-3856, 2024 (Accepted in 2023).
TVCG

Abstract: Analyzing user behavior from usability evaluation can be a challenging and time-consuming task, especially as the number of participants and the scale and complexity of the evaluation grow. We propose uxSense, a visual analytics system using machine learning methods to extract user behavior from audio and video recordings as parallel time-stamped data streams. Our implementation draws on pattern recognition, computer vision, natural language processing, and machine learning to extract user sentiment, actions, posture, spoken words, and other features from such recordings. These streams are visualized as parallel timelines in a web-based front-end, enabling the researcher to search, filter, and annotate data across time and space. We present the results of a user study involving professional UX researchers evaluating user data using uxSense. In fact, we used uxSense itself to evaluate their sessions.

[C29]

Liwei Wu, Qing Liu, Jian Zhao, Edward Lank. Interactions across Displays and Space: A Study of Virtual Reality Streaming Practices on Twitch. Proceedings of the ACM Interactive Surfaces and Spaces Conference, pp. 437:1-437:24, 2023.
Best Paper Honorable Mention ISS'23

Abstract: The growing live streaming economy and virtual reality (VR) technologies have sparked interest in VR streaming among streamers and viewers. However, limited research has been conducted to understand this emerging streaming practice. To address this gap, we conducted an in-depth thematic analysis of 34 streaming videos from 12 VR streamers with varying levels of experience, to explore the current practices, interaction styles, and strategies, as well as to investigate the challenges and opportunities for VR streaming. Our findings indicate that VR streamers face challenges in building emotional connections and maintaining streaming flow due to technical problems, lack of fluid transitions between physical and virtual environments, and not intentionally designed game scenes. As a response, we propose six design implications to encourage collaboration between game designers and streaming app developers, facilitating fluid, rich, and broad interactions for an enhanced streaming experience. In addition, we discuss the use of streaming videos as user-generated data for research, highlighting the lessons learned and emphasizing the need for tools to support streaming video analysis. Our research sheds light on the unique aspects of VR streaming, which combines interactions across displays and space.

[C28]

Qing Liu, Gustavo Alves, Jian Zhao. Challenges and Opportunities for Software Testing in Virtual Reality Application Development. Proceedings of the Graphics Interface Conference, 2023 (In Press).
GI'23

Abstract: Testing is a core process for the development of Virtual Reality (VR) software, which could ensure the delivery of high-quality VR products and experiences. As VR applications have become more popular in different fields, more challenges and difficulties have been raised during the testing phase. However, few studies have explored the challenges of software testing in VR development in detail. This paper aims to fill in the gap through a qualitative interview study composed of 14 professional VR developers and a survey study with 33 additional participants. As a result, we derived 10 key challenges that are often confronted by VR developers during software testing. Our study also sheds light on potential design directions for VR development tools based on the identified challenges and needs of the VR developers to alleviate existing issues in testing.

[C27]

Xinyu Shi, Ziqi Zhou, Jingwen Zhang, Ali Neshati, Anjul Tyagi, Ryan Rossi, Shunan Guo, Fan Du, Jian Zhao. De-Stijl: Facilitating Graphics Design with Interactive 2D Color Palette Recommendation. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 122:1-122:19, 2023.
CHI'23

Abstract: Selecting a proper color palette is critical in crafting a high-quality graphic design to gain visibility and communicate ideas effectively. To facilitate this process, we propose De-Stijl, an intelligent and interactive color authoring tool to assist novice designers in crafting harmonic color palettes, achieving quick design iterations, and fulfilling design constraints. Through De-Stijl, we contribute a novel 2D color palette concept that allows users to intuitively perceive color designs in context with their proportions and proximities. Further, De-Stijl implements a holistic color authoring system that supports 2D palette extraction, theme-aware and spatial-sensitive color recommendation, and automatic graphical elements (re)colorization. We evaluated De-Stijl through an in-lab user study by comparing the system with existing industry standard tools, followed by in-depth user interviews. Quantitative and qualitative results demonstrate that De-Stijl is effective in assisting novice design practitioners to quickly colorize graphic designs and easily deliver several alternatives.

[C26]

Fengjie Wang^†, Xuye Liu^†, Oujing Liu, Ali Neshati, Tengfei Ma, Min Zhu, Jian Zhao. Slide4N: Creating Presentation Slides from Computational Notebooks with Human-AI Collaboration. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 364:1-364:18, 2023.
CHI'23

Abstract: Data scientists often have to use other presentation tools (e.g., Microsoft PowerPoint) to create slides to communicate their analysis obtained using computational notebooks. Much tedious and repetitive work is needed to transfer the routines of notebooks (e.g., code, plots) to the presentable contents on slides (e.g., bullet points, figures). We propose a human-AI collaborative approach and operationalize it within Slide4N, an interactive AI assistant for data scientists to create slides from computational notebooks. Slide4N leverages advanced natural language processing techniques to distill key information from user-selected notebook cells and then renders them in appropriate slide layouts. The tool also provides intuitive interactions that allow further refinement and customization of the generated slides. We evaluated Slide4N with a two-part user study, where participants appreciated this human-AI collaborative approach compared to fully-manual or fully-automatic methods. The results also indicate the usefulness and effectiveness of Slide4N in slide creation tasks from notebooks.

[C25]

Chang Liu, Arif Usta, Jian Zhao, Semih Salihoglu. Governor: Turning Open Government Data Portals into Interactive Databases. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 415:1-415:16, 2023.
CHI'23

Abstract: The launch of open governmental data portals (OGDPs) has popularized the open data movement of the last decade. Although the amount of data in OGDPs is increasing, their functionalities are limited to finding datasets with titles/descriptions and downloading the actual files. This hinders end users, especially those without technical skills, from finding the open data tables and making use of them. We present Governor, an open-sourced web application developed to make OGDPs more accessible to end users by facilitating searching actual records in the tables, previewing them directly without downloading, and suggesting joinable and unionable tables to users based on their latest working tables. Governor also manages the provenance of integrated tables, allowing users and their collaborators to easily trace back to the original tables in OGDPs. We evaluate Governor with a two-part user study, and the results demonstrate its value and effectiveness in finding and integrating tables in OGDPs.

[C24]

Emily Kuang, Ehsan Jahangirzadeh Soure, Mingming Fan, Jian Zhao, Kristen Shinohara. Collaboration with Conversational AI Assistants for UX Evaluation: Questions and How to Ask them (Voice vs. Text). Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 116:1-116:15, 2023.
CHI'23

Abstract: AI is promising in assisting UX evaluators with analyzing usability tests, but its judgments are typically presented as non-interactive visualizations. Evaluators may have questions about test recordings, but have no way of asking them. Interactive conversational assistants provide a Q&A dynamic that may improve analysis efficiency and evaluator autonomy. To understand the full range of analysis-related questions, we conducted a Wizard-of-Oz design probe study with 20 participants who interacted with simulated AI assistants via text or voice. We found that participants asked for five categories of information: user actions, user mental model, help from the AI assistant, product and task information, and user demographics. Those who used the text assistant asked more questions, but the question lengths were similar. The text assistant was perceived as significantly more efficient, but both were rated equally in satisfaction and trust. We also provide design considerations for future conversational AI assistants for UX evaluation.

[W12]

Catherine Thomas, Xuejun Du, Kai Wang, Jayant Rai, Kenichi Okamoto, Miles Li, Jian Zhao. A Novel Data Analysis Pipeline for Fiber-based in Vivo Calcium Imaging. Neuroscience Reports, 15(1), pp. S342-S343, 2023.

Abstract: Examining in vivo neural circuit dynamics in relation to behaviour is crucial to advances in understanding how the brain works. Two techniques that are often used to examine these dynamics are one photon calcium imaging and optogenetics. Fiber-based micro-endoscopy provides a versatile, modular, and lightweight option for combining in vivo calcium imaging and optogenetics in freely behaving animals. One challenge with this technique is that the data collected from such an approach are often complex and dense. Extraction of meaningful conclusions from these data can be computationally challenging and often requires coding experience. While numerous powerful analysis pipelines exist for detection and extraction of one photon calcium imaging data from head-mounted mini microscopes, few options are available for data using fiber-based imaging techniques. Further, available options for fiber-based imaging are not optimized, often requiring significant troubleshooting, and providing limited results. Lastly, the existing pipelines cannot combine in vivo calcium imaging data with optogenetics and behavioural parameters collected in the same experimental system (hardware and software). As such, as a collaborative endeavour between behavioural neuroscientists, optical engineers, and computer science visual processing experts, we have developed a novel pipeline for extraction, examination, and visualization of calcium imaging data for fiber-based approaches. This pipeline offers a user friendly, code-free interface with customizable features and parameters, capable of integrating imaging, optogenetics, and behavioural measures for holistic experimental visualization and analysis. This pipeline significantly expands the opportunities afforded to behavioural neuroscience researchers and shifts forward the possible research opportunities when examining circuit dynamics in freely behaving animals.

[W11]

Pengcheng An, Chaoyu Zhang, Haicheng Gao, Ziqi Zhou, Linghao Du, Che Yan, Yage Xiao, Jian Zhao. Affective Affordance of Message Balloon Animations: An Early Exploration of AniBalloons. Companion Publication of the ACM Conference on Computer-Supported Cooperative Work and Social Computing, pp. 138-143, 2023.
CSCW'23

Abstract: We introduce the preliminary exploration of AniBalloons, a novel form of chat balloon animations aimed at enriching nonverbal affective expression in text-based communications. AniBalloons were designed using extracted motion patterns from affective animations and mapped to six commonly communicated emotions. An evaluation study with 40 participants assessed their effectiveness in conveying intended emotions and their perceived emotional properties. The results showed that 80% of the animations effectively conveyed the intended emotions. AniBalloons covered a broad range of emotional parameters, comparable to frequently used emojis, offering potential for a wide array of affective expression in daily communication. The findings suggest AniBalloons' promise for enhancing emotional expressiveness in text-based communication and provide early insights for future affective design.

[W10]

Pengcheng An, Chaoyu Zhang, Haicheng Gao, Ziqi Zhou, Zibo Zhang, Jian Zhao. Animating Chat Balloons to Convey Emotions: the Design Exploration of AniBalloons. Proceedings of the Graphics Interface Conference (Poster), 2023.
GI'23

Abstract: Text message-based communication has limitations in conveying nonverbal emotional expressions, resulting in less sense of connectedness and increased likelihood of miscommunication. While emoticons may partially compensate for this limitation, we argue that chat balloon animations could be a new and unique channel to further complement affective cues in text messages. In this paper, we present the design of AniBalloons, a set of 30 chat-balloon animations conveying six types of emotions, and evaluate their affect recognizability and emotional properties. Our results show that animated chat balloons, as independent from the message content, are effective in communicating intended emotions and cover a variety of valence-arousal parameters for daily communication. Our results thereby suggest the potential of chat-balloon animations as a unique affective channel for text messages.

2022

[J30]

Xingjun Li^†, Yizhi Zhang^†, Justin Leung^†, Chengnian Sun, Jian Zhao. EDAssistant: Supporting Exploratory Data Analysis in Computational Notebooks with In-Situ Code Search and Recommendation. ACM Transactions on Interactive Intelligent Systems, 13(1), pp. 1:1-1:27, 2023 (Accepted in 2022).
TIIS

Abstract: Using computational notebooks (e.g., Jupyter Notebook), data scientists rationalize their exploratory data analysis (EDA) based on their prior experience and external knowledge such as online examples. For novices or data scientists who lack specific knowledge about the dataset or problem to investigate, effectively obtaining and understanding the external information is critical to carrying out EDA. This paper presents EDAssistant, a JupyterLab extension that supports EDA with in-situ search of example notebooks and recommendation of useful APIs, powered by novel interactive visualization of search results. The code search and recommendation are enabled by advanced machine learning models, trained on a large corpus of EDA notebooks collected online. A user study is conducted to investigate both EDAssistant and data scientists' current practice (i.e., using external search engines). The results demonstrate the effectiveness and usefulness of EDAssistant, and participants appreciated its smooth and in-context support of EDA. We also report several design implications regarding code recommendation tools.

[J29]

Mingliang Xue, Yunhai Wang, Chang Han, Jian Zhang, Zheng Wang, Kaiyi Zhang, Christophe Hurter, Jian Zhao, Oliver Deussen. Target Netgrams: An Annulus-constrained Stress Model for Radial Graph Visualization. IEEE Transactions on Visualization and Computer Graphics, 29(10), pp. 4256-4268, 2023 (Accepted in 2022).
TVCG

Abstract: We present Target Netgrams as a visualization technique for radial layouts of graphs. Inspired by manually created target sociograms, we propose an annulus-constrained stress model that aims to position nodes onto the annuli between adjacent circles for indicating their radial hierarchy, while maintaining the network structure (clusters and neighborhoods) and improving readability as much as possible. This is achieved by having more space on the annuli than traditional layout techniques. By adapting stress majorization to this model, the layout is computed as a constrained least square optimization problem. Additional constraints (e.g., parent-child preservation, attribute-based clusters and structure-aware radii) are provided for exploring nodes, edges, and levels of interest. We demonstrate the effectiveness of our method through a comprehensive evaluation, a user study, and a case study.

[J28]

Anjul Tyagi, Jian Zhao, Pushkar Patel, Swasti Khurana, Klaus Mueller. Infographics Wizard: Flexible Infographics Authoring and Design Exploration. Computer Graphics Forum (Proceedings of EuroVis 2022), 41(3), pp. 121-132, 2022.
CGF EuroVis'22

Abstract: Infographics are an aesthetic visual representation of information following specific design principles of human perception. Designing infographics can be a tedious process for non-experts and time-consuming, even for professional designers. With the help of designers, we propose a semi-automated infographic framework for general structured and flow-based infographic design generation. For novice designers, our framework automatically creates and ranks infographic designs for a user-provided text with no requirement for design input. However, expert designers can still provide custom design inputs to customize the infographics. We will also contribute an individual visual group (VG) designs dataset (in SVG), along with a 1k complete infographic image dataset with segmented VGs in this work. Evaluation results confirm that by using our framework, designers from all expertise levels can generate generic infographic designs faster than existing methods while maintaining the same quality as hand-designed infographics templates.

[J27]

Takanori Fujiwara, Jian Zhao, Francine Chen, Yaoliang Yu, Kwan-Liu Ma. Network Comparison with Interpretable Contrastive Network Representation Learning. Journal of Data Science, Statistics, and Visualization, 2(5), pp. 1-35, 2022.
JDSSV

Abstract: Identifying unique characteristics in a network through comparison with another network is an essential network analysis task. For example, with networks of protein interactions obtained from normal and cancer tissues, we can discover unique types of interactions in cancer tissues. This analysis task could be greatly assisted by contrastive learning, which is an emerging analysis approach to discover salient patterns in one dataset relative to another. However, existing contrastive learning methods cannot be directly applied to networks as they are designed only for high-dimensional data analysis. To address this problem, we introduce a new analysis approach called contrastive network representation learning (cNRL). By integrating two machine learning schemes, network representation learning and contrastive learning, cNRL enables embedding of network nodes into a low-dimensional representation that reveals the uniqueness of one network compared to another. Within this approach, we also design a method, named i-cNRL, which offers interpretability in the learned results, allowing for understanding which specific patterns are only found in one network. We demonstrate the effectiveness of i-cNRL for network comparison with multiple network models and real-world datasets. Furthermore, we compare i-cNRL and other potential cNRL algorithm designs through quantitative and qualitative evaluations.

[S6]

Maoyuan Sun, Yue Ma, Yuanxin Wang, Tianyi Li, Jian Zhao, Yujun Liu, Ping-Shou Zhong. Toward Systematic Considerations of Missingness in Visual Analytics. Proceedings of the IEEE Visualization and Visual Analytics Conference, pp. 110-114, 2022.
Best Paper Honorable Mention VIS'22

Abstract: Data-driven decision making has been a common task in today's big data era, from simple choices such as finding a fast way to drive home, to complex decisions on medical treatment. It is often supported by visual analytics. For various reasons (e.g., system failure, interrupted network, intentional information hiding, or bias), visual analytics for sensemaking of data involves missingness (e.g., data loss and incomplete analysis), which impacts human decisions. For example, missing data can cost a business millions of dollars, and failing to recognize key evidence can put an innocent person in jail. Being aware of missingness is critical to avoid such catastrophes. To fulfill this, as an initial step, we consider missingness in visual analytics from two aspects: data-centric and human-centric. The former emphasizes missingness in three data-related categories: data composition, data relationship, and data usage. The latter focuses on the human-perceived missingness at three levels: observed-level, inferred-level, and ignored-level. Based on them, we discuss possible roles of visualizations for handling missingness, and conclude our discussion with future research opportunities.

[C23]

Sangho Suh, Jian Zhao, Edith Law. CodeToon: Story Ideation, Auto Comic Generation, and Structure Mapping for Code-Driven Storytelling. Proceedings of the ACM Symposium on User Interface Software and Technology, pp. 13:1-13:16, 2022.
UIST'22

Abstract: Recent work demonstrated how we can design and use coding strips, a form of comic strips with corresponding code, to enhance teaching and learning in programming. However, creating coding strips is a creative, time-consuming process. Creators have to generate stories from code (code→story) and design comics from stories (story→comic). We contribute CodeToon, a comic authoring tool that facilitates this code-driven storytelling process with two mechanisms: (1) story ideation from code using metaphor and (2) automatic comic generation from the story. We conducted a two-part user study that evaluates the tool and the comics generated by participants to test whether CodeToon facilitates the authoring process and helps generate quality comics. Our results show that CodeToon helps users create accurate, informative, and useful coding strips in a significantly shorter time. Overall, this work contributes methods and design guidelines for code-driven storytelling and opens up opportunities for using art to support computer science education.

[C22]

Nikhita Joshi^†, Matthew Lakier^†, Daniel Vogel, Jian Zhao. A Design Framework for Contextual and Embedded Information Visualizations in Spatial Augmented Reality. Proceedings of the Graphics Interface Conference, pp. 24:1-24:12, 2022.
GI'22

Abstract: Spatial augmented reality (SAR) displays digital content in a real environment in ways that are situated, peripheral, and viewable by multiple people. These capabilities change how embedded information visualizations are used, designed, and experienced. But a comprehensive design framework that captures the specific characteristics and parameters relevant to SAR is missing. We contribute a new design framework for leveraging context and surfaces in the environment for SAR visualizations. An accompanying design process shows how designers can apply the framework to generate and describe SAR visualizations. The framework captures how the user's intent, interaction, and six environmental and visualization factors can influence SAR visualizations. The potential of this design framework is illustrated through eighteen exemplar application scenarios and accompanying envisionment videos.

[C21]

Gloria Fernandez-Nieto, Pengcheng An, Jian Zhao, Simon Buckingham Shum, Roberto Martinez-Maldonado. Classroom Dandelions: Visualising Participants' Position, Trajectories and Body Orientation Augments Teachers' Sensemaking. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 564:1-564:17, 2022.
CHI'22

Abstract: Despite the digital revolution, physical space remains the site for teaching and learning embodied knowledge and skills. Both teachers and students must develop spatial competencies to effectively use classroom spaces, enabling fluid verbal and non-verbal interaction. While video permits rich activity capture, it provides no support for quickly seeing activity patterns that can assist learning. In contrast, position tracking systems permit the automated modelling of spatial behaviour, opening new possibilities for feedback. This paper introduces the design rationale for Dandelion Diagrams that integrate location, trajectory and body orientation over a variable period. Applied in two authentic teaching contexts (a science laboratory, and a nursing simulation) we show how heatmaps showing only teacher/student location led to misinterpretations that were resolved by overlaying Dandelion Diagrams. Teachers also identified a variety of ways they could aid professional development. We conclude Dandelion Diagrams assisted sensemaking, but discuss the ethical risks of over-interpretation.

[C20]

Pengcheng An^†, Ziqi Zhou^†, Qing Liu^†, Yifei Yin, Linghao Du, Da-Yuan Huang, Jian Zhao. VibEmoji: Exploring User-authoring Multi-modal Emoticons in Social Communication. Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems, pp. 493:1-493:17, 2022.
CHI'22

Abstract: Emoticons are indispensable in online communications. With users' growing needs for more customized and expressive emoticons, recent messaging applications begin to support (limited) multi-modal emoticons: e.g., enhancing emoticons with animations or vibrotactile feedback. However, little empirical knowledge has been accumulated concerning how people create, share and experience multi-modal emoticons in everyday communication, and how to better support them through design. To tackle this, we developed VibEmoji, a user-authoring multi-modal emoticon interface for mobile messaging. Extending existing designs, VibEmoji grants users greater flexibility to combine various emoticons, vibrations, and animations on-the-fly, and offers non-aggressive recommendations based on these components' emotional relevance. Using VibEmoji as a probe, we conducted a four-week field study with 20 participants, to gain new understandings from in-the-wild usage and experience, and extract implications for design. We thereby contribute to both a novel system and various insights for supporting users' creation and communication of multi-modal emoticons.

[C19]

Mingming Fan, Xianyou Yang†, Tsz Tung Yu†, Vera Q. Liao, Jian Zhao. Human-AI Collaboration for UX Evaluation: Effects of Explanation and Synchronization. Proceedings of the ACM on Human-Computer Interaction, 6(CSCW1), pp. 96:1-96:32, 2022.
CSCW'22

Abstract: Analyzing usability test videos is arduous. Although recent research showed the promise of AI in assisting with such tasks, it remains largely unknown how AI should be designed to facilitate effective collaboration between user experience (UX) evaluators and AI. Inspired by the concepts of agency and work context in human and AI collaboration literature, we studied two corresponding design factors for AI-assisted UX evaluation: explanations and synchronization. Explanations allow AI to further inform humans how it identifies UX problems from a usability test session; synchronization refers to the two ways humans and AI collaborate: synchronously and asynchronously. We iteratively designed a tool, AI Assistant, with four versions of UIs corresponding to the two levels of explanations (with/without) and synchronization (sync/async). By adopting a hybrid wizard-of-oz approach to simulating an AI with reasonable performance, we conducted a mixed-method study with 24 UX evaluators identifying UX problems from usability test videos using AI Assistant. Our quantitative and qualitative results show that AI with explanations, regardless of being presented synchronously or asynchronously, provided better support for UX evaluators' analysis and was perceived more positively; when without explanations, synchronous AI better improved UX evaluators' performance and engagement compared to the asynchronous AI. Lastly, we present the design implications for AI-assisted UX evaluation and facilitating more effective human-AI collaboration.

[W9]

Zejiang Shen, Weining Li, Jian Zhao, Melissa Dell, Yaoliang Yu. OLALA: Object-Level Active Learning Based Layout Annotation. Proceedings of the EMNLP 5th Workshop on Natural Language Processing and Computational Social Science, 2022.
EMNLP

Abstract: Document images often have intricate layout structures, with numerous content regions (e.g. texts, figures, tables) densely arranged on each page. This makes the manual annotation of layout datasets expensive and inefficient. These characteristics also challenge existing active learning methods, as image-level scoring and selection suffer from the overexposure of common objects. Inspired by recent progresses in semi-supervised learning and self-training, we propose an Object-Level Active Learning framework for efficient document layout Annotation, OLALA. In this framework, only regions with the most ambiguous object predictions within an image are selected for annotators to label, optimizing the use of the annotation budget. For unselected predictions, the semi-automatic correction algorithm is proposed to identify certain errors based on prior knowledge of layout structures and rectifies them with minor supervision. Additionally, we carefully design a perturbation-based object scoring function for document images. It governs the object selection process via evaluating prediction ambiguities, and considers both the positions and categories of predicted layout objects. Extensive experiments show that OLALA can significantly boost model performance and improve annotation efficiency, given the same labeling budget.

See the complete publication list...

Thesis

[T1]

Jian Zhao. Interactive Visual Data Exploration: A Multi-Focus Approach. Department of Computer Science, University of Toronto, 2015.

Abstract: Recently, the amount of digital information available in the world has been growing at a tremendous rate. This huge, heterogeneous, and complicated data that we are continuously generating could be an incredible resource for us to seek insights and make informed decisions. For this knowledge extraction to be efficient, visual exploration of data is demanded in addition to fully automatic methods, because visual exploration can integrate the creativity, flexibility, and general experience of the human user into the sense-making process through interaction and visualization techniques.

Due to the scale and complexity of data, robust conclusions are usually formed by coordinating many sub-regions in an information space, which leads to the approach of multi-focus visual exploration that allows browsing different data segments with multiple views and perspectives simultaneously. While prior research has proposed a myriad of information visualization techniques, there still lacks comprehensive understanding about how visual exploration can be facilitated by multi-focus interactive visualizations. This dissertation investigates issues and techniques of multi-focus visual exploration through five design studies, touching various types of data in a range of application domains.

The first two design studies address the exploration of numerical data values. KronoMiner presents a multi-purpose visual tool for exploring time-series based on a dynamic radial hierarchy; and the ChronoLenses system supports exploratory visual analysis of time-series by allowing users to progressively construct advanced analytical pipelines. The third design study focuses on the exploration of logical data structures, and presents DAViewer that facilitates computational linguistics researchers to explore and compare rhetorical trees. The last two design studies consider the exploration of heterogeneous data attributes (or facets). TimeSlice facilitates the browsing of multi-faceted events timelines by organizing visual queries in a tree structure; and PivotSlice aids the mining of relationships in multi-attributed networks through a dynamic subdivision of data with customized semantics.

This dissertation ends with critical reflections and generalizations of the experiences obtained from the case studies. High-level design considerations, conceptual models, and visualization theories are distilled to inform researchers and practitioners in information visualization for devising effective multi-focus visual interfaces.

Patents

[P22]

Wei Zhou, Mona Loorak, Ghazaleh Saniee-Monfared, Sachi Mizobuchi, Pourang Irani, Jian Zhao, Wei Li. Methods, Devices, Media for Input/Output Space Mapping in Head-Based Human-Computer Interactions. US11797081B2, Filed in 2021, Granted in 2023.

[P21]

Takanori Fujiwara, Jian Zhao, Francine Chen. System and Method for Contrastive Network Analysis and Visualization. US11538552B2, Filed in 2020, Granted in 2022.

[P20]

Jian Zhao. System and Method for Summarizing and Steering Multi-User Collaborative Data Analysis. US10937213B2, Filed in 2019, Granted in 2021.

[P19]

Jian Zhao, Francine Chen. System and Method for Automatically Sorting Ranked Items and Generating a Visual Representation of Ranked Results. US11010411B2, Filed in 2019, Granted in 2021.

[P18]

Hideto Oda, Chidansh Bhatt, Jian Zhao. Optimized Parts Pickup List and Routes for Efficient Manufacturing using Frequent Pattern Mining and Visualization. US20200226505A1, Filed in 2018, Granted in 2021.

[P17]

Patrick Chiu, Chelhwon Kim, Hajime Ueno, Yulius Tjahjadi, Anthony Dunnigan, Francine Chen, Jian Zhao, Bee-Yian Liew, Scott Carter. System for Searching Documents and People based on Detecting Documents and People around a Table. US10810457B2, Filed in 2018, Granted in 2020.

[P16]

Jian Zhao, Francine Chen, Patrick Chiu. A Visual Analysis Framework for Understanding Missing Links in Bipartite Networks. US11176460B2, Filed in 2018, Granted in 2021.

[P15]

John Wenskovitch, Jian Zhao, Matthew Cooper, Scott Carter. System and Method for Computational Notebook Interface. US10768904B2, Filed in 2018, Granted in 2020.

[P14]

Francine Chen, Jian Zhao, Yan-Ying Chen. System and Method for Generating Titles for Summarizing Conversational Documents. US20200026767A1, Filed in 2018, Abandoned.

[P13]

Jian Zhao, Yan-Ying Chen, Francine Chen. System and Method for Creating Visual Representation of Data based on Generated Glyphs. US10649618B2, Filed in 2018, Granted in 2020.

[P12]

Jian Zhao, Chidansh Bhatt, Matthew Cooper, Ayman Shamma. System and Method for Visualizing and Recommending Media Content Based on Sequential Context. US10776415B2, Filed in 2018, Granted in 2020.

[P11]

Jian Zhao, Siwei Fu. System and Method for Analyzing and Visualizing Team Conversational Data. US11086916B2, Filed in 2017, Granted in 2021.

[P10]

Jian Zhao, Francine Chen, Patrick Chiu. System for Visually Exploring Coordinated Relationships in Data. US10521445B2, Filed in 2017, Granted in 2019.

[P9]

Jian Zhao, Francine Chen, Patrick Chiu. System and Method for Visual Exploration of Sub-Network Patterns in Two-Mode Networks. US11068121B2, Filed in 2017, Granted in 2021.

[P8]

Jian Zhao, Francine Chen, Patrick Chiu. System and Method for Visually Exploring Search Results in Two-Mode Networks. US10521445B2, Filed in 2017, Granted in 2021.

[P7]

Francine Chen, Jian Zhao, Yan-Ying Chen. System and Method for User-Oriented Topic Selection and Browsing. US11080348B2, Filed in 2017, Granted in 2021.

[P6]

Michael Glueck, Azam Khan, Jian Zhao. Handoff Support in Asynchronous Analysis Tasks using Knowledge Transfer Graphs. US20180081885A1, 2017.

[P5]

Jian Zhao, Michael Glueck, Azam Khan, Simon Breslav. Techniques For Mixed-Initiative Visualization of Data. US11663235B2, Filed in 2017, Granted in 2023.

[P4]

Jian Zhao, Michael Glueck, Azam Khan. Node-Centric Analysis of Dynamic Networks. US10142198B2, Filed in 2017, Granted in 2018.

[P3]

Mira Dontcheva, Jian Zhao, Aaron Hertzmann, Allan Wilson, Zhicheng Liu. Providing Visualizations of Event Sequence Data. US9577897B2, Filed in 2015, Granted in 2017.

[P2]

Liang Gou, Fei Wang, Jian Zhao, Michelle Zhou. Personal Emotion State Monitoring from Social Media. US20150213002A1, Filed in 2014, Abandoned.

[P1]

Jian Zhao, Steven Drucker, Danyel Fisher, Donald Brinkman. Relational Rendering of Multi-Faceted Data. US8872849B2, Filed in 2011, Granted in 2014.

Other Cool Stuff

See my arXiv author page.