# Data Visualization and Transformation


# Grade 2 (ages 5-7): expected level of mastery

Data can be displayed for communication in many ways. People use computers to transform data into new forms, such as graphs and charts.

Examples of displays include picture graphs, bar charts, or histograms. A data table that records a tally of students’ favorite colors can be displayed as a chart on a computer.

Crosscutting Concept: Abstraction. Connection Within Framework: K–2.Impacts of Computing.Social Interactions

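Data can be displayed in many ways. We can use a computer to transform data into other forms, such as picture graphs, bar charts, and histograms. For example, a tally table that records students' favorite colors can be turned into a chart on a computer.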
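The favorite-colors example can be sketched in a few lines of Python. This is only an illustration: the survey answers and the helper names `tally` and `bar_chart` are made up here, and a text chart stands in for whatever charting tool a class would actually use.

```python
from collections import Counter

# Hypothetical survey answers; the colors are made up for illustration.
favorite_colors = ["red", "blue", "red", "green", "blue", "red"]


def tally(responses):
    """Count how many times each answer appears (the data table)."""
    return Counter(responses)


def bar_chart(counts):
    """Turn a tally into a text bar chart, one row per color, largest first."""
    rows = []
    for color, count in counts.most_common():
        rows.append(f"{color:<6}{'#' * count} {count}")
    return "\n".join(rows)


counts = tally(favorite_colors)
print(bar_chart(counts))
```

The tally and the chart hold the same information; only the form changes, which is the point of the transformation.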


# Grade 5 (ages 8-11): expected level of mastery

People select aspects and subsets of data to be transformed, organized, clustered, and categorized to provide different views and communicate insights gained from the data.

Data is often sorted or grouped to provide additional clarity. Data points can be clustered by a number of commonalities without a category label. For example, a series of days might be grouped by temperature, air pressure, and humidity and later categorized as fair, mild, or extreme weather. The same data could be manipulated in different ways to emphasize particular aspects or parts of the data set. For example, when working with a data set of popular songs, data could be shown by genre or artist. Simple data visualizations include graphs and charts, infographics, and ratios that represent statistical characteristics of the data.

Crosscutting Concepts: Abstraction; Human–Computer Interaction. Connection Within Framework: 6–8.Impacts of Computing.Social Interactions

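We can transform, organize, cluster, and categorize data, or subsets of it, from different angles in order to surface different views and insights.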

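Data is usually sorted or grouped to make it easier to interpret. Data points can be clustered by what they have in common, without needing a category label up front. For example, daily records can be grouped by temperature, air pressure, and humidity, and the resulting groups can later be categorized as fair, mild, or extreme weather.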
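A minimal sketch of that two-step idea, grouping first and naming the groups later, might look like the following. The readings, the band widths in `bucket`, and the thresholds in `label` are all assumptions chosen for this made-up data, not values from the framework.

```python
from collections import defaultdict

# Hypothetical daily readings:
# (day, temperature in °C, air pressure in hPa, relative humidity in %).
days = [
    ("Mon", 22, 1015, 40),
    ("Tue", 24, 1013, 45),
    ("Wed", 35, 1002, 85),
    ("Thu", 15, 1008, 70),
    ("Fri", 16, 1009, 72),
]


def bucket(day):
    """Coarse key: days whose readings fall in the same bands share a group."""
    _, temp, pressure, humidity = day
    return (temp // 10, pressure // 10, humidity // 25)


# Step 1: cluster days that share the same bands. No labels are involved yet.
clusters = defaultdict(list)
for day in days:
    clusters[bucket(day)].append(day[0])


# Step 2: later, a person inspects each cluster and gives it a name.
def label(key):
    temp_band, _, humidity_band = key
    if temp_band >= 3 and humidity_band >= 3:
        return "extreme"
    if temp_band == 2:
        return "fair"
    return "mild"


labeled = defaultdict(list)
for key, names in clusters.items():
    labeled[label(key)].extend(names)

print(dict(labeled))
```

Keeping the two steps separate mirrors the text: the clusters come from the data, while the category names are a human decision added afterwards.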

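The same data can be processed in different ways to emphasize a particular aspect of the data set. For example, a data set of popular songs can be shown by genre or by artist.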
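As a rough illustration, the same list of songs can be regrouped by whichever field is of interest. The song records and the helper `group_by` below are placeholders invented for this sketch.

```python
from collections import defaultdict

# Hypothetical song records; titles, artists, and genres are placeholders.
songs = [
    {"title": "Song A", "artist": "Artist 1", "genre": "pop"},
    {"title": "Song B", "artist": "Artist 2", "genre": "rock"},
    {"title": "Song C", "artist": "Artist 1", "genre": "rock"},
    {"title": "Song D", "artist": "Artist 3", "genre": "pop"},
]


def group_by(records, field):
    """Collect song titles under each distinct value of the chosen field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[field]].append(record["title"])
    return dict(groups)


by_genre = group_by(songs, "genre")    # one view of the data
by_artist = group_by(songs, "artist")  # another view of the same data

print(by_genre)
print(by_artist)
```

Nothing about the underlying records changes between the two calls; only the field used for grouping does.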

Simple data visualizations include graphs, charts, infographics, and ratios that represent statistical characteristics of the data.


# Grade 8 (ages 11-14): expected level of mastery

Data can be transformed to remove errors, highlight or expose relationships, and/or make it easier for computers to process.

The cleaning of data is an important transformation for reducing noise and errors. An example of noise would be the first few seconds of a sample in which an audio sensor collects extraneous sound created by the user positioning the sensor. Errors in survey data are cleaned up to remove spurious or inappropriate responses. An example of a transformation that highlights a relationship is representing two groups (such as males and females) as percentages of a whole instead of as individual counts. Computational biologists use compression algorithms to make extremely large data sets of genetic information more manageable and the analysis more efficient.

Crosscutting Concept: Abstraction. Connection Within Framework: 6–8.Algorithms and Programming.Algorithms

Data can be cleaned and transformed to remove errors, to highlight or expose relationships in the data, and to make it easier for computers to process.

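  • Removing errors: data cleaning is an important transformation for reducing noise and errors
    • Noise: a typical example is the first few seconds of an audio sample, which often contain extraneous sound made while the user positions the sensor
    • Errors: spurious or inappropriate responses in survey data are cleaned out
  • Highlighting a relationship: for example, representing two groups (such as males and females) as percentages of the whole, rather than as individual counts, makes the proportion between them visible
  • Easier processing: computational biologists use compression algorithms[1] to make extremely large sets of genetic data more manageable and more efficient to analyze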
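The cleaning and percentage transformations in this list can be sketched as three small functions. Everything specific here is an assumption made for illustration: the sample values, the number of noisy samples to skip, the accepted age range, and the group counts.

```python
# Hypothetical audio readings; the first few are noise from positioning the sensor.
samples = [0.9, -0.8, 0.7, 0.1, 0.2, 0.1, 0.3]
NOISE_SAMPLES = 3  # assumed length of the noisy lead-in


def trim_leading_noise(readings, skip):
    """Drop the first `skip` readings, which are assumed to be noise."""
    return readings[skip:]


# Hypothetical free-text survey answers to "How old are you?"
raw_ages = ["12", "13", "abc", "-4", "12", "250", "14"]


def clean_ages(answers, lowest=5, highest=18):
    """Keep only answers that are whole numbers inside a plausible range."""
    cleaned = []
    for answer in answers:
        if answer.isdigit() and lowest <= int(answer) <= highest:
            cleaned.append(int(answer))
    return cleaned


def as_percentages(counts):
    """Express each group's count as a percentage of the whole."""
    total = sum(counts.values())
    return {group: round(100 * n / total, 1) for group, n in counts.items()}


clean_samples = trim_leading_noise(samples, NOISE_SAMPLES)
ages = clean_ages(raw_ages)
shares = as_percentages({"female": 12, "male": 8})

print(clean_samples, ages, shares)
```

Real data cleaning usually needs judgment about what counts as noise or a spurious answer; the fixed cut-offs above are only there to keep the sketch short.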


# Grade 12 (ages 14-18): expected level of mastery

People transform, generalize, simplify, and present large data sets in different ways to influence how other people interpret and understand the underlying information. Examples include visualization, aggregation, rearrangement, and application of mathematical operations.

Visualizations, such as infographics, can obscure data and positively or negatively influence people’s views of the data. People use software tools or programming to create powerful, interactive data visualizations and perform a range of mathematical operations to transform and analyze data. Examples of mathematical operations include those related to aggregation, such as summing and averaging. The same data set can be visualized or transformed to support multiple sides of an issue.

Crosscutting Concepts: Abstraction; Human–Computer Interaction. Connection Within Framework: 6–8.Impacts of Computing.Social Interactions

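Large data sets can be transformed, generalized, simplified, and presented in different ways, and the choice of method affects how people interpret the underlying information. These methods include visualization, aggregation, rearrangement, and the application of mathematical operations.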

Visualizations such as infographics can obscure data and can influence, positively or negatively, how people view it.

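People typically use software tools or programming to create powerful, interactive data visualizations and to perform a range of mathematical operations that transform and analyze data.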

Examples of mathematical operations on data include those related to aggregation, such as summing and averaging.
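Summing and averaging can be shown with Python's built-in `sum` and the standard library's `statistics.mean`. The step counts below are invented for the example.

```python
from statistics import mean

# Hypothetical daily step counts for one week.
steps = [4000, 5500, 6000, 3000, 7000, 8000, 1500]

total = sum(steps)     # aggregation by summing
average = mean(steps)  # aggregation by averaging

print(total, average)
```

Both operations reduce seven numbers to one, which is what makes a large data set easier to talk about and also what hides the day-to-day variation.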

The same data set can be visualized or transformed in different ways to support multiple sides of an issue.
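One way to see this effect is to summarize a single data set with two different aggregations. The salary figures below are hypothetical, and the choice of mean versus median is just one illustration of the idea.

```python
from statistics import mean, median

# Hypothetical yearly salaries at a small company, in thousands.
salaries = [30, 32, 35, 38, 200]

average_salary = mean(salaries)    # pulled upward by the single large value
typical_salary = median(salaries)  # the middle value of the sorted list

print(f"Average salary: {average_salary}k")
print(f"Median salary: {typical_salary}k")
```

Someone arguing that pay is high could quote the average, while someone arguing that pay is low could quote the median. Both figures come from the same five numbers.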


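  1. Compression algorithm: an algorithm that reduces the size of data without changing what the data represents. Everyday archive tools, for example, compress a file to make it easier to transfer and manage, and decompressing it yields the original file again.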
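The compress-then-decompress round trip in this note can be tried with Python's standard `zlib` module. The repeated `ACGT` string is a made-up stand-in for genetic data, chosen because repetitive input compresses well; real sequences would compress less dramatically.

```python
import zlib

# Hypothetical DNA-like sequence, used only to illustrate the round trip.
sequence = ("ACGT" * 1000).encode("ascii")

compressed = zlib.compress(sequence)    # smaller representation
restored = zlib.decompress(compressed)  # original bytes recovered

print(len(sequence), len(compressed), restored == sequence)
```

This is lossless compression: the restored bytes are identical to the input, so no information is lost by storing or sending the smaller form.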