版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、ThemeRiver: Visualizi ng Theme Cha nges over TimeSusa n Havre, Beth Hetzler, and Lucy NowellBattelle Pacific Northwest Divisio nRichla nd, Washi ngton 99352 USA1+509+375-6948susa n.havre | beth.hetzler | lucy .no wellp AbstractProceed ings of theIEEE Symposium on In formatio n Visualizati on 2
2、000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEThemeRiver? is a prototype system that visualizes thematic variati ons over time with in a large collecti on of documents. The “ river ” flows from left to right through time, cha nging width to depict cha nges in thematic stre ngth of tempora
3、lly associated docume nts. Colored “ currents ” flowing within the river narrow or wide n to in dicate decreases or in creases in the stre ngth of an in dividual topic or a group of topics in the associated docume nts. The river is show n with in the con text of a timeli ne and a corresp onding text
4、ual prese ntati on of exter nal eve nts.Keywords : visualization metaphors, trend analysis, timeli ne1. IntroductionIn exploratory information visualization, one goal is to present information so that users can easily discern patter ns. Patter ns reveal tre nds, relati on ships, anomalies, and struc
5、ture in the data, and may help usersProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEBay of PigsCastr
6、o confiscatesAmerican refineriesCuba and Sovist U.S- imposes relations resume embargo on Cuba(63)weapons(62)Proceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEProceed ings of theIEEE Symposium on In formatio n Visualizati on 20
7、00 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEE合®),3ovi&t (4 9)ch(13)Figure 1: ThemeRiver? uses a river metaphor to represent theme changes over time.i - - :-Proceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEE
8、confirm knowledge or hypotheses. Perhaps more importantly, they also raise unexpected questions leading users to new insights. The challenge is to create visualizations that enable users to find patterns quickly and easily. ThemeRiver, shown in Figure 1, is a prototype system designed to reveal temp
9、oral patterns in text collections.Information visualization systems such as Envision 13, BEAD 1, LyberWorld 3, 4 and SPIRE 18 represent each document or group of documents with a glyph or icon, portraying various document attributes. Various methods have been explored for showing change over time in
10、 document-centric visualizations. See Section 3 below.However, a user may be less interested in documents themselves than in theme changes within the whole collection over time. For example, how did Shakespeare themes change during various periods of his life or in relation to contemporary events? S
11、uch information is difficult, if not impossible, to glean from most visualizations. A visualization that focuses on themes, rather than documents, could be more useful for such exploration.ThemeRiver provides users with a macro-view of thematic changes in a corpus of documents over a serial dimensio
12、n. It is designed to facilitate the identification of trends, patterns, and unexpected occurrence or nonoccurrence of themes or topics. In our prototype, we use time as the serial dimension. We provide contextual information through a timeline and markers for cooccurring events of interest. Figure 1
13、 shows a sample ThemeRiver visualization. This paper describes the design of ThemeRiver, walks through a sample information exploration session, and discusses results of formative usability testing.2. DesignOur major design goal was to provide a visualization of theme change over time. Consider usin
14、g a histogram to visualize these changes. In a histogram (such as the one shown in Figure 2), each bar represents a time slice, and color variations and size within the bar represent the relative strength of themes specific to that slice. However, understanding the histogram requires users to work a
15、t integrating the themes across time because the bars are anchored to a baseline and the position of a particular theme within the bars may vary considerably.Like a histogram, ThemeRiver uses variations in width to represent variations in strength or degree ofrepresentation. However, it connects the
16、 strength values in adjacent time slices with smooth and continuous curves. The horizontal flow of the river represents the flow of time. Colored currents that run horizontally within the river represent themes. Each vertical section of the river corresponds to an ordered time slice.The width of eac
17、h current changes to reflect the thematic strength for each time slice. For example, in Figure 1 the theme “ soviet ” increases in relative strength in June 1960 as indicated by the widening of the upper bright orange current. “ Soviet ” loses relative strength in July and August; thus the same curr
18、ent narrows in the next two time slices.“ Soviet ” then increasessignificantly in relative strength in September; the current widens proportionately.Currents maintain their integrity as a single entity over tim'e.sIf a theme ceases to occur in the documents for a period of time and then recurs,
19、the current likewise disappears and then reappears. Consistent color and relative position to other themes make theme currents easy to recognize. In Figure 1, the lower purple band depicts the changes in relative strength of the theme “ cane. ” The “ cane ” current occurs grows and shrinks over time
20、; “ cane ” occurs most strongly in March 1961.We believe that ThemeRiver' s continuous curveshave much to do with its usability. The Gestalt School of Psychology 8, founded in 1919 in Germany, theorized that with perception,“ the whole is greaterthan the sum of the parts.” Simply put, during the
21、perception process humans do not organize individual, low-level, sensed elements, but sense more complete “ packages ” that represent objects or patterns. In his recent book 6, Hoffman presents a compelling discussion of how our perceptual processes identify curves and silhouettes, recognize parts,
22、and group them together into objects. Numerous aspects of the image influence our ability to perceive these parts and objects, including similarity, continuity, symmetry, proximity, and closure. For example, it is easier to perceive objects that are bounded by continuous curves than those that conta
23、in abrupt changes 17.The vertical proximity of the river currents makes it easy for users to judge the relative width of currents and thus the relative strength of the themes. Similarly, symmetry around the horizontal axis of the river, a current, or group of currents makes it easier for users to pe
24、rceive flow patterns and changes. Widths of currents combine to show cumulative widening and narrowing, representing changing strength for the selected set of themes as a whole.Proceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEE
25、Ethe ability to show parent-child relationships with lines between related time bars 10. The DIVA system 12 uses animation to show how particular measured values change in relation to the temporal flow of a video. To help groups collaborating to create a document or other artifact, the Timewarp syst
26、em developed at Xerox PARC 2 lets users view and edit multiple timelines of the changing state of that artifact. The metaphor used is similar to a state diagram, with lines connecting state nodes and branches. Additional work on timelines includes Karam ' s 7 and Kullberg' s 9.We know of no
27、other systems that use the river metaphor to depict the passage of time. However, Tufte 16 presents a similar idea in an artist' s illustration showingtrends in music. In that illustration, width represents sales and proximity indicates influence of preceding styles. Our work differs in several
28、aspects, such as the use of color, the inclusion of contextual events, and the ability to generate the visualization automatically from a potentially very large collection of documents.Values for theme strength can be calculated various ways. For example, they might represent the number of documents
29、 containing the word. Because the river loses its continuity and structure if there are too few or too many themes, we created several theme subsets for exploration.We have implemented a proof-of-principle prototype and used it to explore data from multiple sources. Figure 1 portrays data from a col
30、lection of speeches, interviews, articles, and other text associated with Fidel Castro. The visualization includes the river, a timeline below the river, and markers for related historical events along the top. With ThemeRiver, users may? display topic and event labels? display time and event grid l
31、ines? display the raw data points? choose among drawing algorithms for the currents and river.Users may also display the associated time or theme name by simply moving the mouse across the image. In addition, users may pan and zoom to see other time periods or parts of the river and to see more deta
32、il or broader context. In this sample data set, we found several interesting correspondences between themes and events, such as the expansion of the“ oilbefore Castro confiscated American oil refineries (see Figure 1).3. Related Work4. Usability EvaluationEarly in ThemeRiver' s development, we c
33、arried outthaemsiemjpulset formative usability evaluation with two users.Questions we wanted to answer with this evaluation includedDo users understand the metaphor?Can they identify themes that are more often discussed?Proceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis
34、'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEMany systems include features for viewing time. One common method is to show discrete time slices. For example, in the Spatial Paradigm for Information Retrieval and Exploration (SPIRE) Galaxy visualization 18, users may choose to progressively step throug
35、h time, showing only the icons for documents originating within each specified time period. Another common approach is to show time as an attribute of documents, as done in the Virginia Tech' s Envision system, which lets usersmap various metadata values, including date, to x-axis, y-axis, or co
36、lor, shape, or size graphical encodings 13.More similar to ThemeRiver in intent are systems that focus directly on time. The LifeLines system, developed jointly by the University of Maryland and IBM, has been used to visualize medical records and juvenile criminal records 14, 15. The visualization d
37、isplays time along the x-axis and uses the y-axis to categorize events. Bars depict duration for a given event, and graphical attributes such as color show event attributes. TmViewer uses a similar approach, adding? Does the visualization help them raise new questions about the data? Do they interpr
38、et details of the visualization in ways we had not expected? How does their interpretation of the visualization differ from that of a histogram showing the same data?The data were the Castro collection described above, focusing on the years 1960-1963. We represented the same data both in ThemeRiver
39、and in a histogram that we created using a spreadsheet. (See Figure 2.) We made the content of the histogram as similar as possible to ThemeRiver ' s. For example, the histogram depicted thematic content by months, using the same values that drive ThemeRiver. The month timeline was shown along t
40、he bottom and we added an event line to the histogram like the one in ThemeRiver.Usability evaluation began with a brief explanation of the purpose of the session, followed by an introduction to the data. Both participants viewed the data in both visualizations; one participant started first with th
41、eProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEMvlnidmpvi-ah-Proceed ings of theIEEE Symposium on
42、In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEfMf rM 叶阿iWMl* Ost teycn+5 alfi-vteT-gR yx=如呻boo 2TB WliW cwlor dr drtcn町席etedbon*fdrgorf
43、eodmhqived Fvn rtpBThrfkdb IrdhdtH M网些 nuMddi m «4却忡 n-.i-in OK :fl platen HWtWlwwirqporv ki|4 f.? '石".十 JTMFigure 2: Like ThemeRiverin Figure 1, this histogram uses the Castro collection data andProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-
44、7695- 0 804-9 /00 $10.00 2000 IEEEdepicts changes in thematic content over time.Proceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695-
45、0 804-9 /00 $10.00 2000 IEEEmost discussedhemeRiver easy to un dersta nd. They also foundthat the barshistogram and one with ThemeRiver. We asked each participa nt questi ons about what they observed in each display.Examples of specific questi ons in clude? In July ' 62, what are the three theme
46、s? Where is a new theme introduced?Examples of more gen eral questi ons in clude? What looks interesting here want to explore? How would you like to change or manipulate the view?We captured verbal protocol duri ng this discussi on. At the end, we asked participants to complete a short questi onn ai
47、re, with feedback about the visualizati on and possible enhan ceme nts.From the verbal protocol and from user behavior, we observed that the users had no difficulty in un dersta nd- ing the metaphor. They were able to ide ntify themes that were stro ngly represe nted and able to un dersta nd the rel
48、ati on ship betwee n the width of the curre nt and theme strength. The visualization also triggered questions about the reasons behind certain theme strengths and patter ns. For exploratory visualizati ons, this is a good result; we believe that a visualizati on should help the user ide ntify questi
49、 ons of in terest to explore.Questi onn aire resp on ses showed that users foundThemeRiver useful, particularly for identifying macro tren ds. They told us that it was less useful for ide nti- fying minor trends because the curves tend to de- whae mphasiize very small values. We asked about the valu
50、e of the river metaphor, and users rated it highly as well. They observed that the conn ected ness of the river helped them follow a trend more easily over time tha n in the histogram; this result is compatible with the perception principles described by Ware 17.Users liked some features of the hist
51、ogram and rec- omme nded addi ng them to ThemeRiver. One such feature is the ability to see n umeric values that drive the histogram and river curre nts. One user expressed more trust in the histogram, because she“ knew”were exactly the data values, whereas she was not sure exactly what the data val
52、ues were in ThemeRiver. Her point is a valid one, especially because the curved linesProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7695- 0 804-9 /00 $10.00 2000 IEEEProceed ings of theIEEE Symposium on In formatio n Visualizati on 2000 (In foVis'00)0-7
53、695- 0 804-9 /00 $10.00 2000 IEEEsents a sample usage scenario, illustrating the capabilities of the curre nt vers ion.We used ThemeRiver to explore the 1990 Associated Press (AP) n ewswire data from the TREC5 distri- buti on disks, a set of over 100,000 docume nts (see Figure 3). To explore the sel
54、ected themes in this collection, a user might beg in with a high-level survey of the visualization by panning along the course of the river.The user might look for wider curre nts that sig nal heavy use of a topic, such as the one for“ baghdad ” in Figure3. Changes in the color distribution of the r
55、iver signal cha nges in themes. We see such a cha nge in August 1990, when the“ kuwait ” current, which had vanished inlate July, sudde nly appears and rapidly wide ns. The user could also look for narrow currents in the river thatsig nal relatively light use of particular themes.In an earlier paper
56、, Hetzler et al. 5 explored the AP data set with a variety of our visual an alysis tools, focus ing on large theme cha nges surrou nding the Iraqi in vasi on of Kuwait on August 2. ThemeRiver also reflects these large theme cha nges. Near the right side of Figure 3, we see several curre nts that exp
57、a nd dramatic-of ThemeRiver do require that we in terpolate betwee n data points to produce the curves. We have added the capability for users to see the exact data points on dema nd.Although users liked the abstracti on to the whole collecti on and thus away from in dividual docume nts, both users suggested add ing features to access documents if desired. They wan ted the ability to see the total number of documents during any time period and to get the text of each docume nt on dema nd. They wan ted to select a curre nt and see the docume nts that con tributed to it.Users also wan ted
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 跌倒预防措施、预防病人跌倒坠床管理制度
- 2026浙江嘉兴市秀洲区招聘社区工作者33人笔试参考试题及答案详解
- 2026中智集团招聘招投标专员1人笔试参考题库及答案详解
- 服装厂规章制度如何与企业文化相结合
- 3D打印建筑材料科技公司生产车间防尘防毒管理制度
- 林下生态养蜂管护技师考试试卷及答案
- 2025年中信重工社会招聘1751人笔试历年参考题库附带答案详解
- 2025山东泰安市泰山城建投资集团有限公司带职级人员招聘8人笔试历年参考题库附带答案详解
- 2025安徽芜湖凤鸣控股集团及其子公司选调10人笔试历年参考题库附带答案详解
- 2025威海市环翠区国有资本运营有限公司公开招聘工作人员(15名)笔试历年参考题库附带答案详解
- 《社区老年人营养管理服务规范》
- 国家公路网交通标志调整工作技术指南
- 行政复议法-形考作业2-国开(ZJ)-参考资料
- 手术室交接制度
- (正式版)YBT 6328-2024 冶金工业建构筑物安全运维技术规范
- 丰田车系卡罗拉(双擎)轿车用户使用手册【含书签】
- 2023年武汉市教师招聘考试真题
- 江苏双金纺织品有限公司新建年产2万锭纺纱、3188吨纱染生产项目验收监测报告
- YY/T 0681.3-2010无菌医疗器械包装试验方法第3部分:无约束包装抗内压破坏
- 拉线的制作详细课件
- 走向精确勘探的道路
评论
0/150
提交评论