Vega Visualization Grammar
Vega is a visualization grammar. You can read about Vega, its relationship to D3, and the family of tools built on top of Vega in my last post: The D3 - Vega “Stack”. This post is an introduction to Vega 2.5.
Vega is a declarative format for creating, saving, and sharing interactive visualization designs. A designer declares the elements of a visualization (using the Vega grammar) in a visualization specification, in JSON format, something like this:
{
"data":[...],
"scales":[...],
"axes":[...],
"marks":[...]
}
Vega does the rest. Vega Runtime can interpret the specification and render it directly in the browser using either SVG or HTML Canvas. (Or, a simple command line application can convert it directly to an SVG file.) An online Vega Editor that shows the spec and the visualization it produces side-by-side makes it easy to write Vega. Check out the examples in the Vega Editor to see some real Vega specs and the visualizations they produce. (Like this bar chart)
Conceptually, the Vega grammar separates the elements of the visualization into these semantic areas:
DATA | The data to visualize |
DATA TRANSFORMS | Grouping, stats, projections, etc. |
SCALES | Mappings of data to visual parameters |
GUIDES | Axes & Legends to visualize Scales |
MARKS | Graphic elements representing actual data |
In addition, SIGNALS are dynamic variables that drive interactive behaviours.
The full Vega grammar is described in the wiki. Here is my basic Vega 2.5 grammar cheat sheet:
Top Level “Visualization Properties” (container properties)
name (optional) - name for this visualization
width - width of chart
height - height of chart
viewport (optional) - [width, height] of scrollable window onto chart
padding (optional) - margins
background (optional) - background color
scene (optional) - stroke and fill the entire scene
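As a rough sketch of how these container properties look at the top of a spec (the name, sizes, padding values and color here are illustrative assumptions, not defaults):

```json
{
  "name": "example-chart",
  "width": 400,
  "height": 200,
  "padding": {"top": 10, "left": 40, "bottom": 30, "right": 10},
  "background": "white"
}
```

The chart properties described next (data, scales, axes, marks and so on) sit alongside these in the same top-level object.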
Other Top Level Properties (chart properties)
data - data to visualize. See Data
scales (optional) - Scale transform definitions. See Scales
axes (optional) - Axis definitions. Axes are the labels (tick marks, etc.) that show the scales on the visualization
legends (optional) - Legend definitions. See Legends
marks - Graphical mark definitions. Marks are the main graphical and text elements of the visualization.
signals (optional) - Signals are dynamic variables or interactive events
data - properties
name - unique name for the data set
format (optional)
values, source, or url - The data (manually entered values, named source or URL)
transform (optional) - transforms (analysis, filters, etc.) to perform on the data. See Data-Transforms
- Data Manipulation Transforms:
  aggregate - perform basic stats
  bin - sort into quantitative bins
  countpattern - find and count occurrences of a text pattern
  cross - cross-product of two data sets
  facet - organize data into groups
  filter - filter data to remove unwanted items
  fold
  formula - extend the data set using formulas
  impute - perform imputation of missing values
  lookup - extend the data set using a lookup table
  rank - rank data
  sort - sort data
  treeify - compute a tree structure from table data
- Visual Encoding Transforms:
  force - Performs force-directed layout for network data
  geo - Performs cartographic projection
  geopath - Creates paths for geographic regions
  hierarchy - Computes tidy, cluster, and partition layouts
  linkpath - Computes path definition for connecting nodes in a node-link network or tree diagram
  pie - Computes a pie chart layout
  stack - Computes layout values for stacked graphs, as in stacked bar charts or stream graphs
  treemap - Computes a squarified treemap layout for hierarchical or “faceted” data.
  voronoi - Computes voronoi diagram for a set of x,y coordinates.
  wordcloud - Builds a word cloud from text data
modify (optional) - streaming operators to respond to signals. See Streaming-Data
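To make the data properties more concrete, here is a small sketch of a data block with inline values and a couple of transforms. The data set names, the category and amount fields, and the thresholds are made up for illustration. The transform parameter names (test, field, expr, by) are as I understand them from the Vega 2 wiki, so check the Data-Transforms page before relying on them.

```json
{
  "data": [
    {
      "name": "table",
      "values": [
        {"category": "A", "amount": 28},
        {"category": "B", "amount": 55},
        {"category": "C", "amount": 43}
      ],
      "transform": [
        {"type": "filter", "test": "datum.amount > 30"},
        {"type": "formula", "field": "doubled", "expr": "datum.amount * 2"}
      ]
    },
    {
      "name": "sorted",
      "source": "table",
      "transform": [
        {"type": "sort", "by": "-amount"}
      ]
    }
  ]
}
```

The second data set uses source rather than values, so it derives from the first and then applies its own transform.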
scales - properties
name - unique name for the scale
type - type of scale:
  - ordinal (ordinal),
  - date/time (time or utc), or
  - quantitative (linear, log, pow, sqrt, quantile, quantize, or threshold)
domain - The domain of the scale, representing the set of data values
domainMin - Min value for scale domain (quantitative scales only)
domainMax - Max value for scale domain (quantitative scales only)
range - The range of the scale, representing the set of visual values
rangeMin - Min value for scale range (quantitative scales only)
rangeMax - Max value for scale range (quantitative scales only)
reverse - flip scale range
round - round scale range to integers
Other properties, whose usage varies according to scale type:
points - distribute ordinal values uniformly
padding - apply spacing around ordinal points
clamp - clamp out-of-range data to the ends of the scale domain
nice - force scale to use human-friendly values (whole numbers, minutes, hours, etc.)
exponent - set exponent (for exponential scales only)
zero - force scale to include zero (quantitative scales only)
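A minimal sketch of two scale definitions, assuming a data set named table with category and amount fields (both names are hypothetical). The ordinal scale spreads categories across the chart width, and the linear scale maps amounts onto the chart height:

```json
{
  "scales": [
    {
      "name": "x",
      "type": "ordinal",
      "range": "width",
      "domain": {"data": "table", "field": "category"}
    },
    {
      "name": "y",
      "type": "linear",
      "range": "height",
      "domain": {"data": "table", "field": "amount"},
      "nice": true,
      "zero": true
    }
  ]
}
```

Here the domain is pulled from a data field rather than listed by hand, and "width" and "height" stand for the chart dimensions.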
axes - properties
type - type of axis: x or y
scale - name of the scale for this axis
orient - axis orientation: top, bottom, left or right (e.g. right to put a y axis on the right side.)
title (optional) - title text for the axis
titleOffset - offset (in pixels) from the axis at which to place the title
format (optional) - formatting pattern for axis labels (number formats, etc.)
formatType (optional) - (time, utc, string or number)
ticks - number of ticks, for axes showing quantitative scales
values - instead of specifying number of ticks, explicitly set each tick value
subdivide - number of minor ticks between main ticks (e.g. 9 = decimal subdivision)
tickPadding - padding between ticks and text labels
tickSize - size of all ticks
tickSizeMajor - size of only the major ticks
tickSizeMinor - size of only the minor ticks
tickSizeEnd - size of only the end ticks
offset - offset between axis and edge of the main data rectangle
layer - draw axes in front (default) or back of the data
grid - draw grid lines (true or false)
properties - use for custom axis styling
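For example, a pair of axes might be declared roughly like this. It assumes scales named x and y already exist in the spec, and the titles and tick count are illustrative:

```json
{
  "axes": [
    {"type": "x", "scale": "x", "title": "Category"},
    {
      "type": "y",
      "scale": "y",
      "orient": "right",
      "title": "Amount",
      "ticks": 5,
      "grid": true,
      "layer": "back"
    }
  ]
}
```

The y axis here is moved to the right side and drawn behind the data, so its grid lines do not cover the marks.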
legends - properties
Legends link to named scales. At least one of the size, shape, fill or stroke parameters must be specified.
size, shape, fill and/or stroke - scale name determining size, shape, fill or stroke of a data item in the visualization (at least one must be specified)
orient - position of legend within the scene: right (default) or left
offset - horizontal offset of legend (in pixels) from the data rectangle
title (optional) - legend title
format (optional) - formatting pattern for legend labels (number formats, etc.)
values (optional) - Explicitly set the visible legend values
properties - use for custom legend styling
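Since a legend only points at a scale, a sketch needs both pieces. The color scale, the table data set and its category field below are assumptions for illustration:

```json
{
  "scales": [
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "table", "field": "category"},
      "range": "category10"
    }
  ],
  "legends": [
    {"fill": "color", "title": "Category", "orient": "right", "offset": 10}
  ]
}
```

The legend's fill parameter names the scale, so the legend swatches use the same colors the marks would get from that scale.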
marks - properties
Marks are the basic visual building blocks of a visualization. A “mark” is a prototype graphic object duplicated and varied for each data point. For example, a single rectangle mark generates an entire bar graph.
type - type of mark: rect, symbol, path, arc, area, line, rule, image, text or group.
Also, the special group type can contain other marks, plus local scales, axes and legends. See Group Marks
name (optional) - unique name for the mark instance (can be used for css styling)
description (optional) - description of mark or comment
from - data this mark set should visualize
  data - name of the data set to use.
  transform (optional) - array of data transformations to apply
properties - object containing sets of mark properties
  enter - set of properties to apply when data is processed for the first time and a mark instance is newly added to a scene
  exit (optional) - set of properties to apply when the data linked to a mark instance is removed, and so the mark instance is disappearing. Seldom used.
  update (optional) - set of properties to apply to already existing mark instances, when needed (such as when data changes or after a hover).
  hover (optional) - set of properties evaluated when pointer hovers over a mark instance. At the end of the hover, the update property set is triggered.
Within each set, properties are defined in "name":"value" pairs, where "name" is the property and "value" is either a value reference or a production rule. See Marks for full documentation.
key (optional) - data field to use as a unique key for data binding for dynamic data
delay (optional) - transition delay (milliseconds) for mark updates. Used for animation.
ease (optional) - transition ease function: linear, quad, cubic, sin, exp, circle, and bounce. See here for documentation. (default = cubic-in-out)
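Putting the mark properties together, a bar chart's single rect mark could be sketched like this. It assumes a data set named table with category and amount fields, plus an ordinal scale named x and a linear scale named y. All of those names, and the colors, are illustrative.

```json
{
  "marks": [
    {
      "type": "rect",
      "name": "bars",
      "from": {"data": "table"},
      "properties": {
        "enter": {
          "x": {"scale": "x", "field": "category"},
          "width": {"scale": "x", "band": true, "offset": -1},
          "y": {"scale": "y", "field": "amount"},
          "y2": {"scale": "y", "value": 0}
        },
        "update": {
          "fill": {"value": "steelblue"}
        },
        "hover": {
          "fill": {"value": "red"}
        }
      }
    }
  ]
}
```

Position and size are set once in enter. The fill lives in update so that it is restored when a hover ends, since the update set is re-applied at that point.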
signals - properties
Signals are dynamic variables that drive interactions.
For a description of how Signals work in Vega, see https://github.com/vega/vega/wiki/Signals
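As a hedged sketch of what a signal can look like, the fragment below is modeled on the tooltip pattern used in the Vega 2 bar chart example. The signal name, the amount field and the label position are assumptions. A real spec would probably also hide the label while nothing is hovered.

```json
{
  "signals": [
    {
      "name": "tooltip",
      "init": {},
      "streams": [
        {"type": "rect:mouseover", "expr": "datum"},
        {"type": "rect:mouseout", "expr": "{}"}
      ]
    }
  ],
  "marks": [
    {
      "type": "text",
      "properties": {
        "enter": {
          "x": {"value": 10},
          "y": {"value": 10},
          "fill": {"value": "#333"}
        },
        "update": {
          "text": {"signal": "tooltip.amount"}
        }
      }
    }
  ]
}
```

The signal holds the datum of whichever rect the pointer is over and resets to an empty object on mouseout. The text mark reads a field from that signal in its update set.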