English to Chinese: The value of visualizing data | |
Source text - English What do the paths that millions of visitors take through a web site look like? How do the 3.1 billion A, C, G, and T letters of the human genome compare to those of the chimp or the mouse? Out of a few hundred thousand files on your computer’s hard disk, which ones are taking up the most space, and how often do you use them? By applying methods from the fields of computer science, statistics, data mining, graphic design, and visualization, we can begin to answer these questions in a meaningful way that also makes the answers accessible to others.
All of the previous questions involve a large quantity of data, which makes it extremely difficult to gain a “big picture” understanding of its meaning. The problem is further compounded by the data’s continually changing nature, which can result from new information being added or older information continuously being refined. This deluge of data necessitates new software-based tools, and its complexity requires extra consideration. Whenever we analyze data, our goal is to highlight its features in order of their importance, reveal patterns, and simultaneously show features that exist across multiple dimensions.
This book shows you how to make use of data as a resource that you might otherwise never tap. You’ll learn basic visualization principles, how to choose the right kind of display for your purposes, and how to provide interactive features that will bring users to your site over and over again. You’ll also learn to program in Processing, a simple but powerful environment that lets you quickly carry out the techniques in this book. You’ll find Processing a good basis for designing interfaces around large data sets, but even if you move to other visualization tools, the ways of thinking presented here will serve you as long as human beings continue to process information the same way they’ve always done.
| Translation - Chinese 上百万浏览者访问网页时,这些网络通路是什么样的?将人类基因组里31亿嘌呤,嘧啶和黑猩猩,老鼠作对比会如何?电脑硬盘上成千上万的文件里,哪一个所占空间最大,使用频率有多高?通过借助计算机科学,统计学,数据挖掘,图形设计以及可视化的理论方法,我们就能通过有意义的方式来获取答案,或者,更接近答案。
上文中提出的问题,涉及到巨大的数据量,很难能得以从“全貌”来窥其意义。而且,由于新的数据会不断的递增,而旧的数据会不断被精化处理,这种频繁变化性更增加了理解的难度。 数据的泛滥使得一种新的软件工具的出现成为必要,而且数据的复杂性需要特别的考虑到。只要分析数据,目的就是要按这些数据的重要性,显示形式的特点进行突出,同时也展示数据在于多维空间的各种特性。
这本书将向你展示怎样用一种可能你从未使用过的方法来利用数据。你能了解到可视化的基本原理;怎样选择最适合自己想法的展示方式;怎样利用交互式特性来吸引客户经常性访问你的网站。并且,你能学到怎样去编程,在这个简洁、强大的环境下,你能很快的运用此书中的技术。只要人类继续按常规来处理信息,即使你使用其它可视化工具时,你会发现用Processing作为基础设计大型数据集界面基础的这种思路会让你受益无穷。
|
English to Chinese: Why Data Display Requires Planning General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English Each set of data has particular display needs, and the purpose for which you’re using the data set has just as much of an effect on those needs as the data itself. There are dozens of quick tools for developing graphics in a cookie-cutter fashion in office programs, on the Web, and elsewhere, but complex data sets used for specialized applications require unique treatment. Throughout this book, we’ll discuss how the characteristics of a data set help determine what kind of visualization you’ll use. | Translation - Chinese 每一个数据集,都有其特定的展示需要。显示需求除了由数据集的用途决定以外,还被数据集本身所影响。Office 软件,互联网等地方存在着许多毫无创意的快速图形图象处理工具,由于作为用途特殊的复杂数据集需要特殊的对待,此书中,我们将论述怎样利用数据集的特性来决定你所需的可视化方法。 |
English to Chinese: Too Much Information General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English When you hear the term “information overload,” you probably know exactly what it means because it’s something you deal with daily. In Richard Saul Wurman’s book Information Anxiety (Doubleday), he describes how the New York Times on an average Sunday contains more information than a Renaissance-era person had access to in his entire lifetime.
But this is an exciting time. For $300, you can purchase a commodity PC that has thousands of times more computing power than the first computers used to tabulate the U.S. Census. The capability of modern machines is astounding. Performing sophisticated data analysis no longer requires a research laboratory, just a cheap machine and some code. Complex data sets can be accessed, explored, and analyzed by the public in a way that simply was not possible in the past.
The past 10 years have also brought about significant changes in the graphic capabilities of average machines. Driven by the gaming industry, high-end 2D and 3D graphics hardware no longer requires dedicated machines from specific vendors, but can instead be purchased as a $100 add-on card and is standard equipment for any machine costing $700 or more. When not used for gaming, these cards can render extremely sophisticated models with thousands of shapes, and can do so quickly enough to provide smooth, interactive animation. And these prices will only decrease—within a few years’ time, accelerated graphics will be standard equipment on the aforementioned commodity PC.
| Translation - Chinese 当你听到“信息过载”这个词的时候,你大概已经知道它的意义,因为这可能是你每天都需要去应对的东西。在理查德•索尔•乌尔曼的《信息焦虑》(Doubleday出版社出版)一书中,他记述了纽约时报一个普通的周日版所包含信息量远比文艺复兴时期一个人一生的所经历的还多。
显然,这是一个让人兴奋的时代。花300美元,你就能买台比世界上第一台用于美国人口普查制表的计算机运算能力强大千倍的电脑。现代机器的能力是令人惊叹的。对数据精细进行分析不再需要研究室,一个廉价的机器加上一些代码就能搞定。繁复的数据集被大众所获取,研究,以及分析,这在以前是无法实现的。
过去的10年中,普通计算机的图像显示能力也得到了大幅度的提高。在游戏产业的驱动下,想要使用高端的二维,三维图形硬件,你将不再需要从特约供应商那里购买专业的机器,只需花上100美元买个相当于700美元标准配置机器使用的扩展卡就行了。如果不被用于游戏,这些卡还能为上形态各异极其复杂的模型进行渲染,而且,还能快速提供流畅的交互式动画。 在几年的时间内,图形加速器将会普遍存在于计算机里,当然,价格只会越来越低。
|
English to Chinese: Data Collection General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English We’re getting better and better at collecting data, but we lag in what we can do with it. Most of the examples in this book come from freely available data sources on the Internet. Lots of data is out there, but it’s not being used to its greatest potential because it’s not being visualized as well as it could be. (More about this can be found in Chapter 9, which covers places to find data and how to retrieve it.)
With all the data we’ve collected, we still don’t have many satisfactory answers to the sort of questions that we started with. This is the greatest challenge of our information-rich era: how can these questions be answered quickly, if not instantaneously? We’re getting so good at measuring and recording things, why haven’t we kept up with the methods to understand and communicate this information?
| Translation - Chinese 现在,我们越来越擅长于对数据进行采集,可是对怎样运用它却相对滞后。这本书中提到的大多数例子都是从互联网上自由下载下来的。有大量的数据囤积着,其潜力却没有被充分地挖掘和利用起来,这正是由于数据没有被尽可能的形象化。(第九章会进行详尽的叙述,包括怎样寻找数据以及怎样重新获取数据。)
数据在手,却无法为提出的问题找到令人满意的答案,在这个信息膨胀的时代里,这是我们所面临的最大挑战。如果这些问题不能马上找到答案,那么又谈何迅速的回答呢?我们精通于记录和测量,同样的,为什么不能去理解,传达这些信息呢?
|
English to Chinese: Thinking About Data General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English We also do very little sophisticated thinking about information itself. When AOL released a data set containing the search queries of millions of users that had been “randomized” to protect the innocent, articles soon appeared about how people could be identified by—and embarrassed by—information regarding their search habits. Even though we can collect this kind of information, we often don’t know quite what it means. Was this a major issue or did it simply embarrass a few AOL users? Similarly, when millions of records of personal data are lost or accessed illegally, what does that mean? With so few people addressing data, our understanding remains quite narrow, boiling down to things like, “My credit card number might be stolen” or “Do I care if anyone sees what I search?”
| Translation - Chinese 对于信息本身,我们也很少对其进行深入的思考。当美国在线服务公司发布了一个对百万名用户进行的调查的数据集,并宣称为了维持真实性,调查是“随机”进行的消息后,马上有文章指出:有人可能会因此被泄露个人信息,从而造成尴尬。即使我们能收集到此类信息,也不大明白其意义所在。这是主要的问题吗,或者,不过是一部分美国在线服务的用户陷入了窘境中?同样的,当数百万的个人信息丢失或被非法获取,这意味着什么?个人信息太少,我们无法进行全面的理解,归根结底,事情成了:“我的信用卡好像被盗了”,或者是“我是不是在意别人看到我在搜索什么?” |
English to Chinese: Data Never Stays the Same General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English We might be accustomed to thinking about data as fixed values to be analyzed, but data is a moving target. How do we build representations of data that adjust to new values every second, hour, or week? This is a necessity because most data comes from the real world, where there are no absolutes. The temperature changes, the train runs late, or a product launch causes the traffic pattern on a web site to change drastically.
What happens when things start moving? How do we interact with “live” data? How do we unravel data as it changes over time? We might use animation to play back the evolution of a data set, or interaction to control what time span we’re looking at. How can we write code for these situations?
| Translation - Chinese 我们可能习惯于把数据看作不变值来分析,事实上,数据却是在不断的活动。怎样才能将数据每秒,每时,每周的变化表现出来?因为大部分数据都来自于瞬息万变的现实世界,所以,这是必须要实现的。温度的变化,火车延误,或者商品上线导致整个网站发生剧烈的变化。
事物变化会引起什么?怎样才能对“动态”的数据进行交互访问?怎样在数据变化后对其进行拆分?运用动画来表现数据集的变化,或者通过交互法来控制观察的时间段。该怎样来为这些情况编写代码?
|
English to Chinese: What Is the Question? General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English As machines have enormously increased the capacity with which we can create (through measurements and sampling) and store data, it becomes easier to disassociate the data from the original reason for collecting it. This leads to an all-too frequent situation: approaching visualization problems with the question, “How can we possibly understand so much data?”
As a contrast, think about subway maps, which are abstracted from the complex shape of the city and are focused on the rider’s goal: to get from one place to the next. Limiting the detail of each shape, turn, and geographical formation reduces this complex data set to answering the rider’s question: “How do I get from point A to point B?”
Harry Beck invented the format now commonly used for subway maps in the 1930s, when he redesigned the map of the London Underground. Inspired by the layout of circuit boards, the map simplified the complicated Tube system to a series of vertical, horizontal, and 45° diagonal lines. While attempting to preserve as much of the relative physical layout as possible, the map shows only the connections between stations, as that is the only information that riders use to decide their paths.
When beginning a visualization project, it’s common to focus on all the data that has been collected so far. The amounts of information might be enormous—people like to brag about how many gigabytes of data they’ve collected and how difficult their visualization problem is. But great information visualization never starts from the standpoint of the data set; it starts with questions. Why was the data collected, what’s interesting about it, and what stories can it tell?
The most important part of understanding data is identifying the question that you want to answer. Rather than thinking about the data that was collected, think about how it will be used and work backward to what was collected. You collect data because you want to know something about it. If you don’t really know why you’re collecting it, you’re just hoarding it. It’s easy to say things like, “I want to know what’s in it,” or “I want to know what it means.” Sure, but what’s meaningful?
The more specific you can make your question, the more specific and clear the visual result will be. When questions have a broad scope, as in “exploratory data analysis” tasks, the answers themselves will be broad and often geared toward those who are themselves versed in the data. John Tukey, who coined the term Exploratory Data Analysis, said “...pictures based on exploration of data should force their messages upon us.”* Too many data problems are labeled “exploratory” because the data collected is overwhelming, even though the original purpose was to answer a specific question or achieve specific results.
One of the most important (and least technical) skills in understanding data is asking good questions. An appropriate question shares an interest you have in the data, tries to convey it to others, and is curiosity-oriented rather than math-oriented. Visualizing data is just like any other type of communication: success is defined by your audience’s ability to pick up on, and be excited about, your insight.
Admittedly, you may have a rich set of data to which you want to provide flexible access by not defining your question too narrowly. Even then, your goal should be to highlight key findings. There is a tendency in the visualization field to borrow from the statistics field and separate problems into exploratory and expository, but for the purposes of this book, this distinction is not useful. The same methods and process are used for both.
In short, a proper visualization is a kind of narrative, providing a clear answer to a question without extraneous details. By focusing on the original intent of the question, you can eliminate such details because the question provides a benchmark for what is and is not necessary.
| Translation - Chinese 因为计算机性能得到极大的提升,所以我们可以(通过测量和采样)创建和存储数据,这样就能更容易的将信息同最初收集信息的目的剥离开来。这将会导致一种情况非常频繁发生,那就是实现可视化的问题,“如何去理解如此多的数据?”
例如,试想下地铁路线图吧,其旨在跳出城市错综复杂的地形,只专注于乘客的目的:从一个地方到达另一个地方。限制地理形态的细节,简化复杂的数据能使乘客的问题得到答案:“我从A点到B点该怎么走?”
1930年,哈里•贝克为伦敦地铁进行重新设计时所发明的这种图示法现在被广泛的用于地铁线路图。这个由线路设计图所启发出的灵感将线路图由复杂的管状系统简化为垂直的,水平的,以及45°角的线条。在保持相对尽量多的物理布局的前提下,地图只呈现出站与站之间的连接,这可是乘客选择路线唯一能依据的信息。
开始一项可视化项目时,常常需要专注于之前收集的所有信息。信息总量可能会非常巨大,人们喜欢抱怨将收集的几十亿字节的信息可视化起来会多困难。海量信息可视化绝不会从数据集自身的立场来出发,是从问题而出发。为什么收集信息,信息有什么意义,能反应出什么东西。
认识数据最重要的一点是明确你想要知道得出什么答案。思考怎样利用数据来为工作服务更胜于去思考这些数据本身。收集数据的目的是你想知道其有用之处。如果你无法明确目的,那就只是在做囤积的工作。说句“我想知道这里头到底有什么,”或者“我想知道这是什么意思”是很简单的,但是,怎么做才有意义的?
问题越详尽,视觉结果就会越精确清晰。就如同“探索性数据分析”课题,精通数据的人常常很重视广泛涉及的问题。提出探索性数据分析理论的约翰•图奇说到:建立在数据上的图片应该给予我们绝对性的信息。因为数据采集得实在是太多,所以太多问题被打上“探索性”的标签,即使采集初始目的是为了给出特定的答案或结果。
理解数据的最重要的技巧(至少是技术上的)之一就是问对问题。一个恰到好处的问题能引出你对数据的兴趣,试着将它传达给其它人,以好奇心为驱使总是好过以数学为驱使的。数据可视化就如同其它类型的沟通一样:成功与否取决于你的听众的理解力,能否为你的洞察力而感到兴奋。
无可否认的是,你可能想要扩大自己的问题,而将丰富的数据集灵活的展现出来。尽管如此,你的目的应该着重于主要的结论。可视化领域里有一个统计学领域里借鉴的思想,就是把问题分为探索性的和解释性的。就本书的宗旨来说,同样的方法和理论都适用这两者,所谓的差别是无用的。
简而言之,恰当的可视化是一种叙述方式,排除无关细节,直接给出清楚的答案。只要专注于最初想要知道的答案,你就能排除这些细节,因为提出的问题本身就是界定什么有用,什么没用的基准。
|
English to Chinese: A Combination of Many Disciplines General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English Given the complexity of data, using it to provide a meaningful solution requires insights from diverse fields: statistics, data mining, graphic design, and information visualization. However, each field has evolved in isolation from the others.
Thus, visual design—the field of mapping data to a visual form—typically does not address how to handle thousands or tens of thousands of items of data. Data mining techniques have such capabilities, but they are disconnected from the means to interact with the data. Software-based information visualization adds building blocks for interacting with and representing various kinds of abstract data, but typically these methods undervalue the aesthetic principles of visual design rather than embrace their strength as a necessary aid to effective communication. Someone approaching a data representation problem (such as a scientist trying to visualize the results of a study involving a few thousand pieces of genetic data) often finds it difficult to choose a representation and wouldn’t even know what tools to use or books to read to begin.
| Translation - Chinese 鉴于数据的复杂性,想要得出有意义的解决方案需要对不同的领域具有洞察力:统计学,数据挖掘,图形设计,以及信息可视化。然而,每个领域都是在各自领域里独立发展的。
因此,从数据映射领域转为视觉构成领域的视觉设计一般不能给出怎样处理成千上万的数据项目的方法。而数据挖掘领域就有这样的能力,可是,它却不能建立数据交互。将以软件为基础的信息模块用于数据交互,就能把各种抽象化的数据呈现出来,但是,这些方法更倾向于用强大的力量来传达信息,却以牺牲视觉设计的最基本的法则-美感为代价。当有人碰到数据展示方面的困难时(比如一位科学家想要将研究中的几千块遗传基因数据可视化出来),总是很难选择用那种展示方式,甚至于不知道从那个工具,那本书下手。
|
English to Chinese: Data Process General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English We must reconcile these fields as parts of a single process. Graphic designers can learn the computer science necessary for visualization, and statisticians can communicate their data more effectively by understanding the visual design principles behind data representation. The methods themselves are not new, but their isolation within individual fields has prevented them from being used together. In this book, we use a process that bridges the individual disciplines, placing the focus and consideration on how data is understood rather than on the viewpoint and tools of each individual field. | Translation - Chinese 我们必须将这些领域相结合起来成一种处理方式。图形设计者可以学习能帮助进行可视化的计算机科学,统计学者可以通过学习视觉设计理论中的数据展示来提高读取数据的效率。这种理论本身并不是新兴的,只是他们在各自领域的独立发展制约了他们协同工作的可能。在这本书里,我们用了一种方法在这些独立的学科之间架起桥梁,对怎样去认识,理解数据给予了更多的重视和思考,而非把重点放在不同领域的观点和运用什么工具上。 |
English to Chinese: Data Interact General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English Add methods for manipulating the data or controlling what features are visible. Of course, these steps can’t be followed slavishly. You can expect that they’ll be involved at one time or another in projects you develop, but sometimes it will be four of the seven, and at other times all of them.
Part of the problem with the individual approaches to dealing with data is that the separation of fields leads to different people each solving an isolated part of the problem. When this occurs, something is lost at each transition—like a “telephone game” in which each step of the process diminishes aspects of the initial question under consideration. The initial format of the data (determined by how it is acquired and parsed) will often drive how it is considered for filtering or mining. The statistical method used to glean useful information from the data might drive the initial presentation. In other words, the final representation reflects the results of the statistical method rather than a response to the initial question.
Similarly, a graphic designer brought in at the next stage will most often respond to specific problems with the representation provided by the previous steps, rather than focus on the initial question. The visualization step might add a compelling and interactive means to look at the data filtered from the earlier steps, but the display is inflexible because the earlier stages of the process are hidden. Furthermore, practitioners of each of the fields that commonly deal with data problems are often unclear about how to traverse the wider set of methods and arrive at an answer.
This book covers the whole path from data to understanding: the transformation of a jumble of raw numbers into something coherent and useful. The data under consideration might be numbers, lists, or relationships between multiple entities.
It should be kept in mind that the term visualization is often used to describe the art of conveying a physical relationship, such as the subway map mentioned near the start of this chapter. That’s a different kind of analysis and skill from information visualization, where the data is primarily numeric or symbolic (e.g., A, C, G, and T—the letters of genetic code—and additional annotations about them). The primary focus of this book is information visualization: for instance, a series of numbers that describes temperatures in a weather forecast rather than the shape of the cloud cover contributing to them.
| Translation - Chinese 运用其它方法来更好的处理数据,或者控制其可视特征。当然,这些步骤不能盲目的来遵循。某次做研究项目的时候你可能会用到几成,下次就可能让它们全部上阵。用单独的方法去处理数据会出现的问题之一就是:由于领域的之间的互不相通,需要不同的人解决整个问题的单个不同部分。结果就像玩儿“传话游戏”一样,信息会发生缺失,最初的问题在一步步的处理过程中会遗失掉一部分。数据的初始形式的形成(取决于怎样采集和解析)往往由过滤和挖掘方式而驱使。用统计学方法将有用的信息从数据中分离出来决定了数据初步的呈现模式。换而言之,最后的呈现效果更能反映统计结果,而非只是对初始问题的回答。
同样的,在下一个阶段,图形设计者更多的是对前一阶段中出现的特定的问题进行回答,也非只是集中在初始问题上。可视化过程中,在过滤数据时,令人信服,交互式的意义会增添。因为进程中前几个阶段是隐藏起来的,工作难度也有了增加。
此外,每一个领域的工作人员在处理数据的问题上,往往不清楚怎样将方法贯穿使用来得到答案。
这本书内涵盖了从数据到对数据的理解的全过程:怎样将一系列混乱的原始数字转换成为连贯,有用的东西。这些待处理的数据,可以是数字,表单,或多重实体之间的关系。
需要时刻牢记的是,可视化这个词是常用来描述传达物理关系的一种艺术,如,本章前面提到的地铁线路图。这是和信息可视化不同的分析方法和技术,信息可视化的数据是数字形式的(比如遗传密码里的碱基及对其的注解)。本书将着重于信息可视化:例如,在天气预报里用一系列数字来表述天气,而不是展示出云量来。
|
English to Chinese: What Is the Question? General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English All data problems begin with a question and end with a narrative construct that provides a clear answer. The Zipdecode project (described further in Chapter 6) was developed out of a personal interest in the relationship of the zip code numbering system to geographic areas. Living in Boston, I knew that numbers starting with a zero denoted places on the East Coast. Having spent time in San Francisco, I knew the initial numbers for the West Coast were all nines. I grew up in Michigan, where all our codes were four-prefixed. But what sort of area does the second digit specify? Or the third?
The finished application was initially constructed in a few hours as a quick way to take what might be considered a boring data set (a long list of zip codes, towns, and their latitudes and longitudes) and create something engaging for a web audience that explained how the codes related to their geography.
| Translation - Chinese 所有的数据难题都以一个问题的提出而开始,以建立陈述性结构,给予一个详尽的答案而终止。这个邮编项目(第六章会做更深入的介绍)将以个人对邮编形成与地理区域之间关系所产生出的兴趣而开展。在波士顿居住时,我知道这里的邮编是0打头的,表示这里是东海岸地区。在旧金山的时候,我知道9字头的邮编是表示西海岸地区。我渡过童年的密歇根,邮编则是4字开头的。那么,邮编的第二,第三个字母又是指定什么区域?完整的应用是建立在刚开始时,我们花一两个小时来快速提出关于这一大堆枯燥的数据集需要注意些什么的问题(一长串邮编的列表,城镇,以及城镇的经纬度),然后以一人为例,做出一个显示他所在的地区地理位置和邮编有何关联的展示。 |
English to Chinese: Data Filter | |
Source text - English The next step involves filtering the data to remove portions not relevant to our use. In this example, for the sake of keeping it simple, we’ll be focusing on the contiguous 48 states, so the records for cities and towns that are not part of those states— Alaska, Hawaii, and territories such as Puerto Rico—are removed. Another project could require significant mathematical work to place the data into a mathematical model or normalize it (convert it to an acceptable range of numbers). | Translation - Chinese 本步骤将对数据进行过滤,去掉无用的部分。在这个例子里,为了让其更简单易懂,我们着眼在互相毗邻的48个州上。因此,阿拉斯加,夏威夷,波多黎各等地的信息就会被去除。另外的项目就需要用算法工作将数据放入数学模型或将数据规格化。(将数据转换成易于理解的数字行) |
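The filter step described above can be sketched in a few lines. This is a minimal illustration in Python rather than Processing; the record layout, the sample rows (with approximate coordinates), and the names `filter_contiguous` and `normalize` are all assumptions made for the example, since the source does not specify a file format or an implementation.

```python
# Hypothetical record layout: (zip code, place name, state code, latitude, longitude).
# Coordinates are approximate and included only for illustration.
SAMPLE_RECORDS = [
    ("02139", "Cambridge", "MA", 42.36, -71.10),
    ("48104", "Ann Arbor", "MI", 42.27, -83.73),
    ("94110", "San Francisco", "CA", 37.75, -122.42),
    ("99501", "Anchorage", "AK", 61.22, -149.89),
    ("96813", "Honolulu", "HI", 21.31, -157.86),
    ("00901", "San Juan", "PR", 18.46, -66.11),
]

# Assumed exclusion list: Alaska, Hawaii, and Puerto Rico, as named in the text.
# Other territories would be added here in the same way.
EXCLUDED = {"AK", "HI", "PR"}


def filter_contiguous(records, excluded=EXCLUDED):
    """Keep only records whose state code is not in the excluded set."""
    return [record for record in records if record[2] not in excluded]


def normalize(value, low, high):
    """Linearly rescale value from the range [low, high] to [0, 1]."""
    if high == low:
        raise ValueError("low and high must differ")
    return (value - low) / (high - low)


kept = filter_contiguous(SAMPLE_RECORDS)
print([record[0] for record in kept])
```

The `normalize` helper shows one simple reading of "convert it to an acceptable range of numbers"; a project that needs a real mathematical model would replace it with something more involved.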
English to Chinese: Data Mine General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English This step involves math, statistics, and data mining. The data in this case receives only a simple treatment: the program must figure out the minimum and maximum values for latitude and longitude by running through the data (as shown in Figure 1-3) so that it can be presented on a screen at a proper scale. Most of the time, this step will be far more complicated than a pair of simple math operations. | Translation - Chinese 这个步骤会用到数学,统计学,以及数据挖掘。本例中的数据只需要被简单的处理:用程序计算出数据里所有经纬度的最大和最小值,这样就能在屏幕被准确的表现出来。很多时候,这一步比仅仅几个简单的数学计算要复杂的多。 |
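As a rough sketch of the "mine" step in this example, the following Python function makes a single pass over the points and records the smallest and largest latitude and longitude. The `(zip, lat, lon)` tuple layout, the sample values, and the name `find_bounds` are assumptions for illustration, not the book's code.

```python
# Hypothetical points: (zip code, latitude, longitude), approximate values.
SAMPLE_POINTS = [
    ("02139", 42.36, -71.10),
    ("48104", 42.27, -83.73),
    ("94110", 37.75, -122.42),
]


def find_bounds(points):
    """Return (min_lat, max_lat, min_lon, max_lon) in one pass over the data."""
    if not points:
        raise ValueError("at least one point is required")
    min_lat = min_lon = float("inf")
    max_lat = max_lon = float("-inf")
    for _zip_code, lat, lon in points:
        min_lat = min(min_lat, lat)
        max_lat = max(max_lat, lat)
        min_lon = min(min_lon, lon)
        max_lon = max(max_lon, lon)
    return min_lat, max_lat, min_lon, max_lon


print(find_bounds(SAMPLE_POINTS))
```

These four numbers are all the later steps need in order to scale the points to the screen.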
English to Chinese: Data Represent General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English This step determines the basic form that a set of data will take. Some data sets are shown as lists, others are structured like trees, and so forth. In this case, each zip code has a latitude and longitude, so the codes can be mapped as a two-dimensional plot, with the minimum and maximum values for the latitude and longitude used for the start and end of the scale in each dimension. This is illustrated in Figure 1-4.
The Represent stage is a linchpin that informs the single most important decision in a visualization project and can make you rethink earlier stages. How you choose to represent the data can influence the very first step (what data you acquire) and the third step (what particular pieces you extract).
| Translation - Chinese 这一步决定了数据集的基本形式。有的数据集是按表格列出来,有的是由树状图表示,或者以其它的图形来呈现。这样的话,由于每一个邮编都有自己的经纬度,就可以用二维图像来进行映射,并将标识出的经纬度最大最小值作为每个图形刻度的顶端和末端。(见图1-4)呈现这一步骤,是非常重要的,它将显示出视觉化项目里唯一的,最重要的结果,而且,也能让你反思之前进行的那些步骤。选择怎样呈现数据能影响到第一步(收集什么数据)以及第三步(筛选出什么数据) |
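The two-dimensional plot described in this step amounts to a linear rescaling of each coordinate. Below is a minimal Python sketch; `remap` and `to_screen` are hypothetical names, the bounds tuple is assumed to hold the minimum and maximum latitude and longitude found earlier, and flipping the vertical axis assumes a screen whose y coordinate grows downward.

```python
def remap(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly rescale value from [in_lo, in_hi] to [out_lo, out_hi]."""
    if in_hi == in_lo:
        raise ValueError("input range must not be empty")
    return out_lo + (value - in_lo) * (out_hi - out_lo) / (in_hi - in_lo)


def to_screen(lat, lon, bounds, width, height):
    """Map a latitude/longitude pair to (x, y) pixel coordinates.

    bounds is (min_lat, max_lat, min_lon, max_lon). Longitude runs left to
    right; latitude is flipped so that north appears at the top of the screen.
    """
    min_lat, max_lat, min_lon, max_lon = bounds
    x = remap(lon, min_lon, max_lon, 0, width)
    y = remap(lat, min_lat, max_lat, height, 0)
    return x, y


# Illustrative bounds and canvas size, not taken from the source.
example_bounds = (20.0, 50.0, -130.0, -60.0)
print(to_screen(35.0, -95.0, example_bounds, 700, 300))
```

Because the mapping depends only on the bounds, changing which records survive the filter step changes the picture, which is one concrete way the representation feeds back into earlier stages.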
English to Chinese: Data Refine General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English In this step, graphic design methods are used to further clarify the representation by calling more attention to particular data (establishing hierarchy) or by changing attributes (such as color) that contribute to readability.
Hierarchy is established in Figure 1-5, for instance, by coloring the background deep gray and displaying the selected points (all codes beginning with four) in white and the deselected points in medium yellow.
| Translation - Chinese 本步骤中,图形设计将更专注于细节性数据(划分层次),或者调整属性(例如颜色)进一步使得呈现变得精确化,易于理解化。图1-5表示出了层次,例如,将背景色设定为深灰,选定点(所有编码都由4个开始)为白色,未选定点为黄色。 |
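The color hierarchy described for Figure 1-5 could be expressed as a small lookup. This Python sketch is only illustrative: the RGB values are guesses at "deep gray", "white", and "medium yellow", and the function name `point_color` and the handling of an empty prefix are assumptions.

```python
# Assumed RGB values; the source names the colors but gives no numbers.
BACKGROUND = (51, 51, 51)      # deep gray
SELECTED = (255, 255, 255)     # white
DESELECTED = (204, 187, 68)    # medium yellow


def point_color(zip_code, prefix):
    """Pick a point's color from whether its zip code starts with the prefix.

    An empty prefix is treated here as "nothing selected", which is an
    assumption rather than behavior described in the source.
    """
    if prefix and zip_code.startswith(prefix):
        return SELECTED
    return DESELECTED


for code in ("48104", "02139"):
    print(code, point_color(code, "4"))
```

Keeping the selected points brightest against a dark background is what makes them read as the top level of the hierarchy.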
English to Chinese: Data Interact General field: Tech/Engineering Detailed field: Computers (general) | |
Source text - English The next stage of the process adds interaction, letting the user control or explore the data. Interaction might cover things like selecting a subset of the data or changing the viewpoint. As another example of a stage affecting an earlier part of the process, this stage can also affect the refinement step, as a change in viewpoint might require the data to be designed differently.
In the Zipdecode project, typing a number selects all zip codes that begin with that number. Figures 1-6 and 1-7 show all the zip codes beginning with zero and nine, respectively.
Another enhancement to user interaction (not shown here) enables the users to traverse the display laterally and run through several of the prefixes. After typing part or all of a zip code, holding down the Shift key allows users to replace the last number typed without having to hit the Delete key to back up.
Typing is a very simple form of interaction, but it allows the user to rapidly gain an understanding of the zip code system’s layout. Just contrast this sample application with the difficulty of deducing the same information from a table of zip codes and city names.
The viewer can continue to type digits to see the area covered by each subsequent set of prefixes. Figure 1-8 shows the region highlighted by the two digits 02, Figure 1-9 shows the three digits 021, and Figure 1-10 shows the four digits 0213. Finally, Figure 1-11 shows what you get by entering a full zip code, 02139—a city name pops up on the display.
In addition, users can enable a “zoom” feature that draws them closer to each subsequent digit, revealing more detail around the area and showing a constant rate of detail at each level. Because we’ve chosen a map as a representation, we could add more details of state and county boundaries or other geographic features to help viewers associate the “data” space of zip code points with what they know about the local environment.
| Translation - Chinese 这一步的处理用到了交互法,让用户自己来控制或者探索数据。交互法涵盖了包括选择数据的子集,着眼点的变化等方面。作为另一个影响初始步骤的阶段,交互阶段对精炼阶段也能产生影响。比如,着眼点的变化会导致数据必须被重新设计。在邮编项目中,键入一个字母,选出所有以这个字母打头的邮编。图1-6和图1-7分别显示了以0和9开头的所有邮编。用户交互法的另一种延伸(这里没有显示出)能让用户横向观摩,略看下前面字母的部分。键入部分或全部的邮编后,单击Shift键就能替换掉最后一个输入的数字,而不需要按Delete键后退。
键入是一种简单的交互形式,虽然如此,用户却能快速的认识、理解邮编系统,对比下用简单的图形呈现和把所有邮编和城市名称用一张表列出来的难易程度吧。用户能继续键入数字来观看以其它字母打头的各地邮编。图1-8显示了以02标明的地区,图1-9显示了021,图1-10显示了0213,最后,图1-11显示了当键入完整的邮编-02139,城市的名字出现在了屏幕上。
此外,用户还能用放大缩小功能来更接近这些数字,获得地区的更多信息,而每一级详细资料的比率的都会显示出来。因为我们选择地图作为呈现载体,更多的州,国家,或者其它地理特征信息可以被加进来,以此帮助用户通过邮编对当地环境进行了解。 |
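The typing interaction described in this section can be modeled as a small piece of state: the digits typed so far, the zip codes that match them, and a label once a full code is entered. The Python sketch below is a hypothetical model, not the Zipdecode source; the class name, the `replace_last` flag standing in for the Shift key, and the sample places are all assumptions.

```python
class PrefixSelection:
    """Track typed digits and report which zip codes they select."""

    MAX_DIGITS = 5

    def __init__(self, places):
        # places: dict mapping a five-digit zip code string to a place name.
        self.places = places
        self.typed = ""

    def type_digit(self, digit, replace_last=False):
        """Append a digit, or overwrite the last one when replace_last is set."""
        if len(digit) != 1 or digit not in "0123456789":
            raise ValueError("expected a single digit")
        if replace_last and self.typed:
            self.typed = self.typed[:-1] + digit
        elif len(self.typed) < self.MAX_DIGITS:
            self.typed += digit

    def selected(self):
        """Return the zip codes that begin with the digits typed so far."""
        if not self.typed:
            return []
        return sorted(z for z in self.places if z.startswith(self.typed))

    def label(self):
        """Return the place name once a complete, known zip code is typed."""
        if len(self.typed) == self.MAX_DIGITS:
            return self.places.get(self.typed)
        return None


selection = PrefixSelection(
    {"02139": "Cambridge", "02215": "Boston", "94110": "San Francisco"}
)
for digit in "02139":
    selection.type_digit(digit)
    print(selection.typed, selection.selected(), selection.label())
```

In a drawing loop, the result of `selected()` would decide which points are highlighted on each frame, and `label()` would supply the name that pops up for a full code.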