April 26, 2016

Intro to Sound and Data Sonification


Acoustics: (1) The branch of physics concerned with sound. (2) The properties of a concert hall with respect to the way sound interacts with it.

Psychoacoustics: The branch of psychophysics that studies the sense of hearing. Psychoacoustics defines, qualifies and quantifies sensations in relation to the stimuli (sounds) that cause them.

Electroacoustics: The intersection of acoustics and electronics. Electroacoustics studies the conversion of sound into an electronic signal (called transduction), the manipulation of the electronic signal, and the conversion of the signal back into sound (also transduction).

Sound: a mechanical vibration transmitted through a medium (usually air) to the ear, with an amplitude and frequency capable of being perceived by the auditory system.

IF A TREE FALLS IN THE WOODS with no one around, it does make a sound.

IF A TREE FALLS ON THE MOON, even with someone around, it does not make a sound. (Sound does not travel in a vacuum.)

Bats produce ultrasound (sound too high for humans to hear). Elephants produce infrasound (sound too low for humans to hear).

Signal: any other vibration or energy variation that does not fit the definition of sound, even if the vibration or variation represents a sound. We commonly refer to electric and digital signals.

Analog Signal: A smoothly varying signal. In other words, a direct “analog” for sound. An electric signal is an analog signal. The grooves on an LP are also a type of analog signal. A cassette tape stores an analog signal magnetically.

Digital Signal: A signal that varies in discrete steps. A digital signal is created by sampling an analog signal at regular intervals; the number of samples taken per second is called the sampling rate. A digital signal is like a rasterized image: it is a series of numbers, each number (or “sample”) representing the intensity of a signal at a given moment in time. (Whereas a raster image is a series of numbers each representing the color of an image at a given point on the screen or page.) A digital signal can be stored in a variety of ways, including magnetically (on a digital audio tape, or DAT), and optically (on a CD). A digital signal can be transmitted electrically (in a computer chip, or on a specially built electric cable) or optically (fiber optics).

Analog-to-Digital Conversion (ADC): The process of sampling an analog signal in order to create a digital signal.


Digital-to-Analog Conversion (DAC): The process of converting a series of samples to a continuously varying (analog) electric signal.


(NOTE: technically, in the diagrams above, the transition from electric to sound is “transduction.”)

Sample: (1) An individual number in a digital signal. The sample represents the intensity of a signal at a given time. The sample is to a digital signal as a pixel is to a digital image. (2) The length of time it takes for one sample to go by (depends on the sampling rate, but is usually a small fraction of a millisecond). (3) An entire bit of digitally recorded sound, stored as a series of numbers (A.K.A. a stored digital signal). This is also the popular usage of the term.

Sampling Rate (in Samples per Second): The rate at which an analog signal is sampled in order to create a digital signal. The most common sampling rate is 44,100 samples per second. This is the rate used by CD players. Other common sampling rates are 22,050 samples per second and 48,000 samples per second. Sampling rate is analogous to the resolution of a raster image.
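To make the definitions of sample and sampling rate concrete, here is a minimal Python sketch that "samples" a sine wave at regular intervals. The function name sample_sine and the 440 Hz test tone are illustrative assumptions; a real ADC also quantizes each sample to a fixed number of bits, which this sketch leaves out.

```python
import math

def sample_sine(frequency_hz, duration_s, sampling_rate=44100):
    """Return samples of a unit-amplitude sine wave taken at regular intervals.

    Hypothetical helper for illustration: each list entry is one "sample",
    i.e. the signal intensity at one moment in time.
    """
    n_samples = int(round(duration_s * sampling_rate))
    return [
        math.sin(2 * math.pi * frequency_hz * n / sampling_rate)
        for n in range(n_samples)
    ]

# 10 ms of a 440 Hz tone at the CD sampling rate.
samples = sample_sine(440.0, 0.01)

# Duration of a single sample in milliseconds (definition 2 of "sample").
sample_period_ms = 1000.0 / 44100

print(len(samples), round(sample_period_ms, 4))
```

At 44,100 samples per second, ten milliseconds of sound is 441 numbers, and each sample lasts roughly 0.023 ms, which is the "small fraction of a millisecond" mentioned above.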

How Acoustic Parameters of Sound Map to Perception (and possible data types)

Physical parameter | Perceptual parameter | Possible data mapping (Q = Quantitative, O = Ordinal, N = Nominal)
Frequency (Hz) | Pitch, or “height” | Q O N
Intensity | Loudness | (Q) O N
Waveform (spectrum) | Tone Color | (Q) (O) N
Intensity + Frequency + Waveform in Time | Timbre | (Q) (O) N

Notice, the term “volume” is not used. “Loudness” and “Intensity” are more precise. “Volume” is used in psychoacoustics to refer to an esoteric characteristic of sound, which could be described as its “fullness.” We will avoid the term “volume” for now.

More Definitions


Noise:

  1. Any undesirable, uncomfortable or dangerous sound. Sound pollution refers to this meaning. This is the popular meaning.
  2. The opposite of signal. Parasitic vibrations accompanying a signal that interfere with its clear transmission. The “signal-to-noise ratio” refers to this meaning.
  3. An aperiodic sound (a sound without a definable frequency, hence without a definite pitch). This is the opposite of “Musical Sound.”

Musical sound or pitched sound: a periodic sound (a sound with a definable frequency, hence with a definite pitch).

Unpitched sound: Noise (definition 3.)

Auditory Perception

Just as graphic perception must be taken into account when designing scales for visualization, auditory perception must be taken into account when designing scales for sonification of data.

One notable example of how auditory perception should influence scale design is in the construction of pitch scales.

For a discussion, see: http://blog.ericmarty.com/7/perceptually-uniform-pitch-loudness-scales-for-data-sonification
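As a rough illustration of why pitch scales need care, the sketch below maps a data value to a frequency by interpolating in log-frequency rather than in raw Hz, so that equal steps in the data produce equal musical intervals. This rests on the common assumption that perceived pitch is approximately logarithmic in frequency; it is not necessarily the scale proposed in the post linked above, and the function name and the 220-880 Hz range are made up for the example.

```python
def data_to_frequency(value, data_min, data_max, low_hz=220.0, high_hz=880.0):
    """Map a data value to a frequency in Hz using log-frequency interpolation.

    Hypothetical helper: equal data steps give equal pitch intervals
    (equal frequency ratios), not equal differences in Hz.
    """
    if data_max == data_min:
        raise ValueError("data_min and data_max must differ")
    t = (value - data_min) / (data_max - data_min)  # normalize to 0..1
    t = min(max(t, 0.0), 1.0)                       # clamp out-of-range data
    return low_hz * (high_hz / low_hz) ** t

for v in (0, 5, 10):
    print(v, round(data_to_frequency(v, 0, 10), 1))
```

With these settings the midpoint of the data lands on 440 Hz, one octave above the bottom of the range and one octave below the top. A linear mapping in Hz would put the midpoint at 550 Hz instead, which sounds much closer to the top of the range than to the bottom.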

Sonification Examples

General Interest


Simple Mapping of Single Variables

Nick Bearman: temperature to pitch (map mouseover)

altitude to pitch (map mouseover)

integers mapped to integer frequency bins
Sorting algorithms (computer science) - scanned to frequency (integers mapped to integer frequencies)

Redundant Mapping of Single Variables

price to speed+pitch+loudness

Multiple 1:1 Mappings

Listen to Wikipedia: Hatnote

Arctic Ice

Sonification of LHC data

Complex Mappings of Single Variables

Brian Foo http://brianfoo.com

smog levels to granular synthesis parameters http://www.theatlantic.com/technology/archive/2012/09/soundscapes-of-smog-researchers-let-you-hear-the-pollution-of-cities-literally/262152/

February 25, 2016

Visualization Tools Built on D3

The “D3 / Vega stack” (from the creators of D3):

The in-house family of higher-level tools built on top of Mike Bostock’s D3. Mike Bostock developed D3 at the Stanford Visualization Group, led by Jeff Heer. The lab moved to the University of Washington and became the Interactive Data Lab (IDL). IDL/Stanford Vis Group built the Vega declarative visualization language on top of D3, Vega-Lite (a simplified declarative language) on top of that, and is building a small suite of exploratory data analysis and design tools on top of Vega and Vega-Lite. (Members of the group also co-founded the spinoff company Trifacta; Tableau has separate Stanford roots in the earlier Polaris project.)

My introduction

The Vega family on GitHub

Third-Party Tools built on D3

Here are a number of third-party languages, environments and tools built on top of D3.

• Mid-level (js)

nvd3.js — many standard chart types

c3.js — many standard chart types

dimple.js — for business

xcharts.js — simple charts, few options

• Specialized Mid-level (js)

Crossfilter — large, cross-linked multivariate datasets in the browser

cubism.js — scalable, realtime animated, time series visualisations

JSNetworkX — networks

• High-level data exploration

raw by Density Design. Drag and drop editor outputs d3 code.

• Visual Programming Environments

vvvv.js — in-browser version of the VVVV visual programming environment (built on D3).

• Full GUI Web Apps

Plotly - web interface for D3 (see below). Free (public charts only); private charts require a paid subscription. Can export to SVG. Powerful. Tutorials available.

Layout multiple Plotly charts on a single page and publish.

Compare to:
Tableau - desktop + online drag-and-drop visualization editor. Publish to web. Pro version is $1000, with a free student license; Tableau Public is also free. Tutorials available. Tableau is NOT built on top of D3 but, like D3, it grew out of visualization research at Stanford, and it is based on the Grammar of Graphics. So, it is conceptually similar to Plotly (and Vega). Originally a research system called Polaris, it was commercialized as Tableau.

D3 wrappers in other languages

rCharts — extensible R wrapper. Supports many charting libraries, including NVD3, Polychart, Morris, Rickshaw, xCharts, HighCharts, and Leaflet for mapping.

Shiny - extensible web application framework for R

D3.py - Python library for generating D3-based plots, using the pandas module. See also: vincent, a Python to Vega translator

Plotly also has APIs for the major scientific computing languages (Matlab, R, Python), so it offers a roundabout way to leverage D3 without actually using it.

• Other people’s lists of D3 based tools

Tony Hirst: Climbing the d3.js Visualisation Stack

Marielle Lange: D3lib

Mike McDearmon: Data Visualization Libraries Based on D3.JS

February 24, 2016

Vega Visualization Grammar

Vega is a visualization grammar. You can read about Vega, its relationship to D3, and the family of tools built on top of Vega in my last post: “The D3 / Vega Stack”. This post is an introduction to Vega 2.5.

Vega is a declarative format for creating, saving, and sharing interactive visualization designs. A designer declares the elements of a visualization (using the Vega grammar) in a visualization specification written in JSON.
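A specification of this kind can be sketched as follows. The example builds a small bar chart spec as a Python dictionary and serializes it to JSON; it follows my understanding of the Vega 2.x conventions described in this post (data, scales, axes, marks with property sets), and the data set name "table", the field names, and the values are invented for illustration.

```python
import json

# Hypothetical bar-chart specification in the Vega 2.x style.
spec = {
    "width": 400,
    "height": 200,
    "data": [
        {
            "name": "table",  # made-up data set name
            "values": [
                {"category": "A", "amount": 28},
                {"category": "B", "amount": 55},
                {"category": "C", "amount": 43},
            ],
        }
    ],
    "scales": [
        {"name": "x", "type": "ordinal", "range": "width",
         "domain": {"data": "table", "field": "category"}},
        {"name": "y", "type": "linear", "range": "height", "nice": True,
         "domain": {"data": "table", "field": "amount"}},
    ],
    "axes": [
        {"type": "x", "scale": "x"},
        {"type": "y", "scale": "y"},
    ],
    "marks": [
        {
            "type": "rect",
            "from": {"data": "table"},
            "properties": {
                "enter": {
                    "x": {"scale": "x", "field": "category"},
                    "width": {"scale": "x", "band": True, "offset": -1},
                    "y": {"scale": "y", "field": "amount"},
                    "y2": {"scale": "y", "value": 0},
                },
                "update": {"fill": {"value": "steelblue"}},
            },
        }
    ],
}

# The JSON text is what a Vega runtime or editor would consume.
vega_json = json.dumps(spec, indent=2)
print(vega_json)
```

Note that nothing in the spec says how to draw anything: it only declares the data, the scales that map data to pixels, the guides, and one rectangle mark that is repeated per data item.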


Vega does the rest. Vega Runtime can interpret the specification and render it directly in the browser using either SVG or HTML Canvas. (Or, a simple command line application can convert it directly to an SVG file.) An online Vega Editor that shows the spec and the visualization it produces side-by-side makes it easy to write Vega. Check out the examples in the Vega Editor to see some real Vega specs and the visualizations they produce. (Like this bar chart)

Conceptually, the Vega grammar separates the elements of the visualization into these semantic areas:

DATA - The data to visualize
DATA TRANSFORMS - Grouping, stats, projections, etc.
SCALES - Mappings of data to visual parameters
GUIDES - Axes & Legends to visualize Scales
MARKS - Graphic elements representing actual data

In addition, SIGNALS are dynamic variables that drive interactive behaviours.

The full Vega grammar is described in the wiki. Here is my basic Vega 2.5 grammar cheat sheet:

“Top Level Visualization Properties” (container properties)

  • name (optional) - name for this visualization
  • width - width of chart
  • height - height of chart
  • viewport (optional) - [width, height] of scrollable window onto chart
  • padding (optional) - margins
  • background (optional) - background color
  • scene (optional) - stroke and fill the entire scene

Other Top Level Properties (chart properties)

  • data - data to visualize. See Data

  • scales (optional) - Scale transform definitions. See Scales

  • axes (optional) - Axis definitions. Axes are the labels (tick marks, etc.) that show the scales on the visualization

  • legends (optional) - Legend definitions. See Legends

  • marks - Graphical mark definitions. Marks are the main graphical and text elements of the visualization.

  • signals (optional) - Signals are dynamic variables or interactive events

data - properties

  • name - unique name for the data set
  • format (optional)
  • values, source, or url - The data (manually entered values, named source or URL)
  • transform (optional) - transforms (analysis, filters, etc.) to perform on the data. See Data-Transforms
    • Data Manipulation Transforms:
      • aggregate - perform basic stats
      • bin - sort into quantitative bins
      • countpattern - find and count occurrences of a text pattern
      • cross - cross-product of two data sets
      • facet - organize data into groups
      • filter - filter data to remove unwanted items
      • fold - collapse multiple data fields into key/value pairs
      • formula - extend the data set using formulas
      • impute - perform imputation of missing values
      • lookup - extend the data set using a lookup table
      • rank - rank data
      • sort - sort data
      • treeify - compute a tree structure from table data
    • Visual Encoding Transforms:
      • force - Performs force-directed layout for network data
      • geo - Performs cartographic projection
      • geopath - Creates paths for geographic regions
      • hierarchy - Computes tidy, cluster, and partition layouts
      • linkpath - Computes path definition for connecting nodes in a node-link network or tree diagram
      • pie - Computes a pie chart layout
      • stack - Computes layout values for stacked graphs, as in stacked bar charts or stream graphs
      • treemap - Computes a squarified treemap layout for hierarchical or “faceted” data.
      • voronoi - Computes a Voronoi diagram for a set of x,y coordinates.
      • wordcloud - Builds a word cloud from text data
  • modify (optional) - streaming operators to respond to signals. See Streaming-Data
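The data properties above can be combined into a single data set definition. The sketch below chains three of the listed manipulation transforms (filter, formula, aggregate) using what I understand to be the Vega 2.x parameter names; the parameter names should be checked against the Data-Transforms wiki page, and the data set name "sales" and its fields are invented.

```python
import json

# Hypothetical data definition: transforms run in order, top to bottom.
data = [
    {
        "name": "sales",  # made-up data set name
        "values": [
            {"region": "east", "units": 12, "price": 2.5},
            {"region": "east", "units": 0, "price": 2.5},
            {"region": "west", "units": 7, "price": 3.0},
        ],
        "transform": [
            # Drop rows with no units sold.
            {"type": "filter", "test": "datum.units > 0"},
            # Add a derived field to every remaining row.
            {"type": "formula", "field": "revenue",
             "expr": "datum.units * datum.price"},
            # Summarize the derived field per region.
            {"type": "aggregate", "groupby": ["region"],
             "summarize": {"revenue": ["sum"]}},
        ],
    }
]

data_json = json.dumps(data, indent=2)
print(data_json)
```

Because the transforms form a pipeline, their order matters: the aggregate step here can only sum "revenue" because the formula step created that field first.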

scales - properties

  • name - unique name for the scale
  • type - type of scale
  • domain - The domain of the scale, representing the set of data values
  • domainMin - Min value for scale domain (quantitative scales only)
  • domainMax - Max value for scale domain (quantitative scales only)
  • range - The range of the scale, representing the set of visual values
  • rangeMin - Min value for scale range (quantitative scales only)
  • rangeMax - Max value for scale range (quantitative scales only)
  • reverse - flip scale range
  • round - round scale range to integers

other properties whose usage varies according to scale type:

  • points - distribute ordinal values uniformly
  • padding - apply spacing around ordinal points
  • clamp - clamp out-of-range data to the ends of the scale domain
  • nice - force scale to use human-friendly values (whole numbers, minutes, hours, etc.)
  • exponent - set exponent (for power scales only)
  • zero - force scale to include zero (quantitative scales only)
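To show what a quantitative scale actually computes, here is a simplified Python model of a linear scale with the domain, range, clamp, reverse and round properties listed above. It is a sketch of the concept, not Vega's implementation, and the function and parameter names are my own.

```python
def linear_scale(domain, range_, clamp=False, reverse=False, round_=False):
    """Build a function mapping data values (domain) to visual values (range).

    Simplified, hypothetical model of a linear scale for illustration.
    """
    d0, d1 = domain
    if d0 == d1:
        raise ValueError("domain must span more than one value")
    # "reverse" flips the scale range.
    r0, r1 = (range_[1], range_[0]) if reverse else range_

    def scale(value):
        t = (value - d0) / (d1 - d0)     # position within the domain
        if clamp:
            t = min(max(t, 0.0), 1.0)    # clamp out-of-range data to the ends
        out = r0 + t * (r1 - r0)         # same position within the range
        return int(round(out)) if round_ else out

    return scale

# Data values 0..100 mapped to 0..200 pixels, with clamping.
y = linear_scale(domain=(0, 100), range_=(0, 200), clamp=True)
print(y(25), y(150))
```

With clamping on, the out-of-domain value 150 is drawn at the end of the range instead of off the chart; without it, the scale would extrapolate to 300.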

axes - properties

  • type - type of axis: x or y
  • scale - name of the scale for this axis
  • orient - axis orientation: top, bottom, left or right (e.g. right to put a y axis on the right side.)
  • title (optional) - title text for the axis
  • titleOffset - offset (in pixels) from the axis at which to place the title
  • format (optional) - formatting pattern for axis labels (number formats, etc.)
  • formatType (optional) - (time, utc, string or number)
  • ticks - number of ticks, for axes showing quantitative scales
  • values - instead of specifying number of ticks, explicitly set each tick value
  • subdivide - number of minor ticks between main ticks (e.g. 9 = decimal subdivision)
  • tickPadding - padding between ticks and text labels
  • tickSize - size of all ticks
  • tickSizeMajor - size of only the major ticks
  • tickSizeMinor - size of only the minor ticks
  • tickSizeEnd - size of only the end ticks
  • offset - offset between axis and edge of the main data rectangle
  • layer - draw axes in front (default) or back of the data
  • grid - draw grid lines (true or false)
  • properties - use for custom axis styling

legends - properties

Legends link to named scales. At least one of the size, shape, fill or stroke parameters must be specified

  • size, shape, fill and/or stroke — scale name determining size, shape, fill or stroke of a data item in the visualization (at least one must be specified)
  • orient — position of legend within the scene: right (default) or left
  • offset - horizontal offset of legend (in pixels) from the data rectangle
  • title (optional) — legend title
  • format (optional) - formatting pattern for legend labels (number formats, etc.)
  • values (optional) - Explicitly set the visible legend values
  • properties - use for custom legend styling

marks - properties

Marks are the basic visual building blocks of a visualization. A “mark” is a prototype graphic object duplicated and varied for each data point (e.g. one single rectangle mark generates an entire bar graph).

  • type - type of mark: rect, symbol, path, arc, area, line, rule, image, text or group.
    Also, the special group type can contain other marks, plus local scales, axes and legends. See Group Marks
  • name (optional) - unique name for the mark instance (can be used for css styling)
  • description (optional) - description of mark or comment
  • from — data this mark set should visualize
    • data — name of the data set to use.
    • transform (optional) — array of data transformations to apply
  • properties - object containing sets of mark properties
    • enter — set of properties to apply when data is processed for the first time and a mark instance is newly added to a scene
    • exit (optional) — set of properties to apply when the data linked to a mark instance is removed, and so the mark instance is disappearing. Seldom used.
    • update (optional) — set of properties to apply to already existing mark instances, when needed (such as when data changes or after a hover).
    • hover (optional) — set of properties evaluated when pointer hovers over a mark instance. At the end of the hover, the update property set is triggered.

Within each set, properties are defined in "name":"value" pairs, where "name" is the property and "value" is either a value reference or a production rule. See Marks for full documentation.

  • key (optional) — data field to use as a unique key for data binding for dynamic data
  • delay (optional) — transition delay (milliseconds) for mark updates. Used for animation.
  • ease (optional) — transition ease function: linear, quad, cubic, sin, exp, circle, and bounce. See here for documentation. (default = cubic-in-out)
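The value references used inside property sets can be pictured with a small Python model. The sketch below resolves a simplified value reference for one data item: take a constant value or a data field, pass it through a named scale if one is given, then apply an optional multiplier and offset. This is an illustration of the idea under those assumptions, not Vega's own code; the function name, the stand-in scale and the sample datum are hypothetical.

```python
def resolve_value_ref(ref, datum, scales):
    """Resolve a simplified Vega-style value reference for one data item."""
    # Start from a data field if given, otherwise from a constant value.
    if "field" in ref:
        value = datum[ref["field"]]
    else:
        value = ref.get("value")
    # Apply the named scale, then the multiplier, then the offset.
    if "scale" in ref:
        value = scales[ref["scale"]](value)
    if "mult" in ref:
        value = value * ref["mult"]
    if "offset" in ref:
        value = value + ref["offset"]
    return value

# Hypothetical stand-in for a named scale: data 0..100 -> pixels 200..0.
scales = {"y": lambda v: 200 - 2 * v}

# An "enter" property set in the "name": value-reference form.
enter = {
    "y": {"scale": "y", "field": "amount"},
    "y2": {"scale": "y", "value": 0},
    "fill": {"value": "steelblue"},
}

datum = {"category": "A", "amount": 28}
resolved = {name: resolve_value_ref(ref, datum, scales)
            for name, ref in enter.items()}
print(resolved)
```

Running the same property set against every item in the data set is what turns one rectangle mark definition into a whole bar chart.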

signals - properties

Signals are dynamic variables that drive interactions.

For a description of how Signals work in Vega, see https://github.com/vega/vega/wiki/Signals

February 23, 2016

The “D3 / Vega stack”

D3, written by Michael Bostock with Vadim Ogievetsky and Jeffrey Heer at the Stanford Visualization Group (now the University of Washington Interactive Data Lab), is a visualization library now widely used as the basis for many of the most powerful online visualizations. According to Heer, it is intentionally designed as a low-level system.

During the early design of D3, we even referred to it as a “visualization kernel” rather than a “toolkit” or “framework.” In addition to custom design, D3 is intended as a supporting layer for higher-level visualization tools. [1] - Jeff Heer

There are numerous higher-level (i.e. easier, more conceptual) third-party tools built on top of D3, including pretty sophisticated high-level GUIs like Plot.ly, simpler drag-and-drop editors like Raw and tools leveraging D3 in computing environments like R and Python.

But the creators of D3 have built their own in-house “stack” of increasingly higher-level languages and tools, built on D3, known as the “Vega” family of tools.

low level: “Data Visualization Kernel”

D3 (“Data Driven Documents”) - a JavaScript library for manipulating documents based on data. Users write JavaScript code to produce SVG graphics driven by data.

low-mid level: Visualization Grammar

Vega by the creators of D3 (Interactive Data Lab). A visualization grammar. User language: JSON. User writes a “visualization specification” as a JSON file. The Vega renderer draws the visualization from the spec in SVG or HTML Canvas in the browser. Online editor: http://vega.github.io/vega-editor/

(You can run Vega locally with Vegaserver)

mid level: Visual Analysis Grammar

Vega-Lite by the creators of D3. This simplified version of Vega serves as a visual analysis grammar. User writes a “visualization specification” with minimal styling options as a JSON file. Vega-Lite then generates a full Vega spec. Online editor: https://vega.github.io/vega-editor/?mode=vega-lite
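To give a sense of how much shorter a Vega-Lite specification is, here is a minimal sketch of a bar chart spec, built as a Python dictionary and serialized to JSON. It uses the data / mark / encoding structure of Vega-Lite as I understand it; the field names and values are invented for the example.

```python
import json

# Hypothetical Vega-Lite bar chart: no scales, axes or mark properties
# are declared; Vega-Lite derives them from the encoding.
vega_lite_spec = {
    "data": {
        "values": [
            {"category": "A", "amount": 28},
            {"category": "B", "amount": 55},
            {"category": "C", "amount": 43},
        ]
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "ordinal"},
        "y": {"field": "amount", "type": "quantitative"},
    },
}

vega_lite_json = json.dumps(vega_lite_spec, indent=2)
print(vega_lite_json)
```

The spec only says which field goes on which channel and what kind of data each field holds. Sensible scales, axes and mark sizing are filled in when the full Vega spec is generated.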

High level data exploration (GUI)

Voyager built on top of Vega-Lite. This web-based GUI automatically builds a set of recommended visualizations from your dataset. Use Voyager to explore your data and possible visualizations.

Pole✭ (Polestar) built on top of Vega-Lite. Web-based drag-and-drop GUI for quickly building visualizations from your dataset. Online editor: http://uwdata.github.io/polestar/#
Compare to Tableau

Full GUIs (no coding)

Lyra A full GUI design environment built on Vega. Currently in Beta, and buggy. Try it here
See also: http://idl.cs.washington.edu/projects/lyra/

State of the Stack

Pole✭ (Polestar), Voyager and Lyra are not yet bug free and ready for prime time, but they represent important pieces of the design stack. Voyager is a visualization recommendation engine that helps organize data exploration, saving loads of time in the process. Polestar is a welcome addition to the list of drag-and-drop visualization builders.

Lyra, in particular, represents an important category of high-level design tool that just hasn’t existed yet: a full GUI design environment for data-driven graphics. It still has a long way to go to become the Illustrator of data graphics, but it’s heading in that direction. Once Lyra (or something like it) reaches maturity, we will see another surge in the prevalence of high-quality data visualizations that parallels the surge that the introduction of D3 brought us.

The basis for these advances is Vega, the declarative visualization grammar. Compared to coding in D3, Vega can be learned by non-coders quite easily. No background in JavaScript, or any programming language, is needed. At the same time, Vega is rich enough to be expressive, and not just easy to use.

Vega is well documented on the wiki. Mostly for those like me who want a bird’s-eye view of Vega, I’ve assembled a Vega 2.2 cheat sheet, which you can read in my next post.

  1. Jeff Heer : https://github.com/vega/vega/wiki/Vega-and-D3, edited May 1, 2014, accessed Aug 26, 2015

November 9, 2015

European Premiere of Party Pieces

Back in 2013, in honor of the 100th birthday of John Cage, the Forum of Contemporary Music Leipzig [FZML] asked me and 124 other composers to collectively write an exquisite corpse composition based on an idea of John Cage’s. Each of us wrote one page of manuscript, passing on only the last bar to the next composer. The sequence of composers was chosen by a random process involving tossing coins and consulting the I Ching, one of Cage’s favorite tools.

125 Party Pieces was premiered in New York by Ensemble Either/Or in October 2013 for the finale of the international Cage 100 Festival. The 125 manuscripts were exhibited at Galerie für Zeitgenössische Kunst, Leipzig, Germany in August-September 2013.

Party Pieces will get its European premiere in Leipzig on January 20, 2016 (tickets).

Thanks to Pierre Boulez for his patronage of this project and the German Federal Cultural Foundation for the funding.

© Copyright 2015 Éric Marty