Authors: Kevin Bönisch, Manuel Stoeckel, Alexander Mehler
HT '24: Proceedings of the 35th ACM Conference on Hypertext and Social Media
Pages 330 - 336
Published: 10 September 2024
Abstract
We present HyperCausal, a 3D hypertext visualization framework for exploring causal inference in generative Large Language Models (LLMs). HyperCausal maps the generative processes of LLMs into spatial hypertexts, where tokens are represented as nodes connected by probability-weighted edges. The edges are weighted by the prediction scores of next tokens, depending on the underlying language model. HyperCausal facilitates navigation through the causal space of the underlying LLM, allowing users to explore predicted word sequences and their branching. Through comparative analysis of LLM parameters such as token probabilities and search algorithms, HyperCausal provides insight into model behavior and performance. Implemented using the Hugging Face transformers library and Three.js, HyperCausal ensures cross-platform accessibility to advance research in natural language processing using concepts from hypertext research. We demonstrate several use cases of HyperCausal and highlight the potential for detecting hallucinations generated by LLMs using this framework. The connection with hypertext research arises from the fact that HyperCausal relies on user interaction to unfold graphs with hierarchically appearing branching alternatives in 3D space. This approach refers to spatial hypertexts and early concepts of hierarchical hypertext structures. A third connection concerns hypertext fiction, since the branching alternatives mediated by HyperCausal manifest non-linearly organized reading threads along artificially generated texts that the user decides to follow optionally depending on the reading context.
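The graph structure described in the abstract (tokens as nodes, edges weighted by next-token prediction scores, branching alternatives unfolded step by step) can be illustrated with a minimal sketch. This is not the authors' implementation: `TokenNode`, `build_tree`, `greedy_path`, and the hand-written `TOY_MODEL` table are hypothetical names introduced here for illustration. In a real setting the distribution would presumably come from a language model's softmaxed output scores rather than a lookup table.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Assumption: a "model" is any function mapping a token prefix to a
# next-token probability distribution. A real LLM would supply this.
NextTokenFn = Callable[[Tuple[str, ...]], Dict[str, float]]


@dataclass
class TokenNode:
    token: str
    prob: float        # weight of the edge from the parent node
    path_prob: float   # product of edge weights from the root
    children: List["TokenNode"] = field(default_factory=list)


# Hypothetical toy distribution standing in for an LLM.
TOY_MODEL: Dict[Tuple[str, ...], Dict[str, float]] = {
    ("The",): {"cat": 0.5, "dog": 0.3, "sky": 0.2},
    ("The", "cat"): {"sat": 0.6, "ran": 0.4},
    ("The", "dog"): {"barked": 0.7, "sat": 0.3},
}


def toy_next_token(prefix: Tuple[str, ...]) -> Dict[str, float]:
    # Unknown prefixes yield no continuations, which ends the branch.
    return TOY_MODEL.get(prefix, {})


def expand(node: TokenNode, prefix: Tuple[str, ...],
           next_token_fn: NextTokenFn, top_k: int, depth: int) -> None:
    """Attach the top_k most probable next tokens as children, recursively."""
    if depth == 0:
        return
    dist = next_token_fn(prefix)
    best = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    for token, p in best:
        child = TokenNode(token, p, node.path_prob * p)
        node.children.append(child)
        expand(child, prefix + (token,), next_token_fn, top_k, depth - 1)


def build_tree(prompt: Tuple[str, ...], next_token_fn: NextTokenFn,
               top_k: int = 2, depth: int = 2) -> TokenNode:
    """Root the tree at the last prompt token and unfold `depth` levels."""
    root = TokenNode(prompt[-1], 1.0, 1.0)
    expand(root, prompt, next_token_fn, top_k, depth)
    return root


def greedy_path(root: TokenNode) -> List[str]:
    """Follow the highest-weighted edge at every branching point."""
    path, node = [root.token], root
    while node.children:
        node = max(node.children, key=lambda c: c.prob)
        path.append(node.token)
    return path


if __name__ == "__main__":
    tree = build_tree(("The",), toy_next_token, top_k=2, depth=2)
    print(greedy_path(tree))
```

Keeping both the edge weight and the cumulative path probability on each node makes it straightforward to compare search strategies on the same tree: a greedy walk looks only at `prob`, whereas a beam-style ranking would compare `path_prob` across branches.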
Index Terms
HyperCausal: Visualizing Causal Inference in 3D Hypertext
Human-centered computing
Visualization
Visualization systems and tools
Information systems
Information systems applications
Information & Contributors
Information
Published In
HT '24: Proceedings of the 35th ACM Conference on Hypertext and Social Media
September 2024
415 pages
ISBN:9798400705953
DOI:10.1145/3648188
Copyright © 2024 ACM.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [emailprotected].
Sponsors
- SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
- 3D hypertext
- large language models
- visualization
Qualifiers
- Research-article
- Research
- Refereed limited
Conference
HT '24
Sponsor:
- SIGWEB
Acceptance Rates
Overall Acceptance Rate 378 of 1,158 submissions, 33%