Data Description, Inc.

Specialized Plots Templates


 Density Plot

Creator: Matthew C. Hutcheson

Posting date: 12/08/95

The density plot assigns a grid to the scatterplot and counts the number of points in each cell of that grid, then displays them three-dimensionally with the count as the third dimension. Adding lines to this three-dimensional plot creates a fabric-like mesh across the plot revealing the density of "overstrikes" (points piled on top of one another) and clusters in the data.

The user has control over the underlying scatterplot grid. Use finer grids on larger datasets to separate out the smallest clusters within the data. The plot is also colored so peaks turn red while valleys are green or blue. Combined with the ability to rotate, this graphical display is wonderful for presentations and revealing patterns in the data.
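The grid-counting step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the template's Data Desk implementation; the function name `grid_counts` and the parameters `nx` and `ny` are hypothetical, and the grid is assumed to span the range of the data.

```python
def grid_counts(xs, ys, nx=10, ny=10):
    """Count the points falling in each cell of an nx-by-ny grid.

    Assumption: the grid spans the observed range of xs and ys.
    Returns a list of ny rows, each holding nx counts.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Guard against a zero-width range so the division below is safe.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0

    counts = [[0] * nx for _ in range(ny)]
    for x, y in zip(xs, ys):
        # Map each coordinate to a cell index; the maximum lands in the last cell.
        col = min(int((x - x_min) / x_span * nx), nx - 1)
        row = min(int((y - y_min) / y_span * ny), ny - 1)
        counts[row][col] += 1
    return counts
```

The counts would then serve as the third (height) dimension of the plot. Larger values of `nx` and `ny` correspond to the finer grids suggested for larger datasets.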



 Gabriel's Biplot

Creator: Matthew C. Hutcheson and Robert Gittins

Posting date: 10/08/96

A biplot (Gabriel, 1971; 1981) is a display of an approximation of a matrix which is useful for studying multivariate data. The name biplot serves to indicate that this is a joint display of the rows and columns of a matrix; rows (samples) are generally represented by points and columns (variables) by arrows or vectors. This template allows you to view both at the same time.


 Globe 3D

Creator: unknown; Modified by: Matthew C. Hutcheson

Posting date: 12/20/95

The Globe 3D template takes latitude and longitude (in degrees) as input and creates a three-dimensional rotating globe. This plot provides an exciting look at geographic data. Furthermore, you can add colors by variables of interest to reveal interesting patterns in the globe.
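As a rough illustration of how latitude and longitude in degrees can be turned into three-dimensional coordinates, the Python sketch below uses the standard spherical-to-Cartesian conversion. The function name `latlon_to_xyz`, the unit radius, and the axis orientation are assumptions made for this example; the template's own conventions may differ.

```python
import math


def latlon_to_xyz(lat_deg, lon_deg, radius=1.0):
    """Convert latitude/longitude in degrees to Cartesian coordinates.

    Assumed orientation: z points to the north pole, and x points to
    latitude 0, longitude 0.
    """
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z
```

Rotating the resulting (x, y, z) points in a three-dimensional plot gives the globe effect.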



 Index Plots

Creator: Matthew C. Hutcheson

Posting date: 12/08/95

The index plot is often used when plotting distance measures such as Mahalanobis distance or Hadi's distance. Each observation has a line drawn from its value to zero, which creates a series of vertical lines. When a time series is used as the x-axis, dips, gaps, or peaks in the sequence of vertical lines reveal places where something interesting may be happening in that time series.

Experiment with this plot using the residuals and the predicted values from a regression or linear model as y and x. You may be surprised by the patterns revealed.
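To make the idea concrete, here is a minimal Python sketch that computes Mahalanobis distances for two variables and builds the vertical line segments of an index plot. The names `mahalanobis_2d` and `index_segments` are hypothetical, and the sketch assumes the sample covariance (divisor n - 1) of exactly two variables; it is not the template's code.

```python
def mahalanobis_2d(xs, ys):
    """Mahalanobis distance of each (x, y) case from the sample mean."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det == 0:
        raise ValueError("covariance matrix is singular")

    distances = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # Quadratic form d' S^-1 d written out for the 2-by-2 case.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(d2 ** 0.5)
    return distances


def index_segments(values):
    """One vertical segment per case, from (index, 0) to (index, value)."""
    return [((i, 0.0), (i, v)) for i, v in enumerate(values, start=1)]
```

Passing the distances to `index_segments` gives the endpoints of the lines to draw; unusually tall lines flag cases far from the centre of the data.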


 Multiple Time Series Plot

Creator: Chris Noble

Posting date: 04/05/96

This template plots up to six dependent variables as functions of a single independent variable in one graph, and allows for easy rescaling of the vertical axis for each variable. This extends the multiple line plots in several ways. Relationships between variables measured in entirely different units or scales can be examined without creating derived variables. The independent variable can have uneven intervals (it is not just the case number) and, in fact, need not even be sorted. The template is meant for biological time-series data, which often have measurements unevenly spaced in time.


 Parallel Coordinate Plot

Creator: Matthew C. Hutcheson

Posting date: 12/08/95

The parallel coordinate plot is designed to view multidimensional data. Each variable (up to 8 here) is standardized to [0,1] and then plotted as several side-by-side dotplots. Lines connect each case in one variable (or dotplot) to its corresponding case in every other variable.

In the past few years, this plot has become quite popular. Fascinating spirograph-like patterns appear if you plot two sorted random normal variables (one sorted ascending and one descending) or two Cauchy variables. Note: You can generate a Cauchy variable by dividing a Normal(0, 1) by another, independent Normal(0, 1).
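The two steps above, rescaling each variable to [0, 1] and generating Cauchy values as a ratio of normals, can be sketched in Python as follows. The function names are hypothetical, and min-max rescaling is an assumption about how the standardization to [0, 1] is done.

```python
import random


def rescale_unit(values):
    """Min-max rescale a list of numbers to [0, 1] (assumed standardization)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant variable has no spread; place it mid-axis.
        return [0.5 for _ in values]
    return [(v - lo) / span for v in values]


def parallel_polylines(columns):
    """Return one polyline per case: (axis index, rescaled value) pairs."""
    scaled = [rescale_unit(col) for col in columns]
    n_cases = len(columns[0])
    return [[(j, scaled[j][i]) for j in range(len(columns))]
            for i in range(n_cases)]


def cauchy_sample(n, rng):
    """Draw n Cauchy values as ratios of independent Normal(0, 1) draws."""
    out = []
    while len(out) < n:
        denominator = rng.gauss(0, 1)
        if denominator != 0.0:  # skip the (practically impossible) zero
            out.append(rng.gauss(0, 1) / denominator)
    return out
```

For example, `parallel_polylines([sorted(a), sorted(b, reverse=True)])` with two normal samples `a` and `b` gives the line endpoints for the sorted-normals experiment described above.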


 Quadwise Plot

Creator: Matthew C. Hutcheson

Posting date: 12/08/95

The Quadwise plot is an attempt to view four-dimensional data. This template plots two y-variables and two x-variables. y1 vs. x1 is plotted on the left-hand side of the "quadwise" scatterplot, and y2 vs. x2 is plotted on the right-hand side. Lines connect each case in the left-hand scatterplot with its corresponding case in the right-hand scatterplot.


 Simple Regression Intervals

Creator: John H. Walker

Posting date: 8/21/00

This template draws prediction and confidence interval bands for a simple regression of Y vs. X on a scatterplot of the data. It also calculates the exact endpoints of these intervals for a user-defined X-value. The user has control over the confidence level, the X-value for the calculated interval, and the color of the lines on the interval plot.
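For readers who want to check the interval endpoints by hand, the Python sketch below applies the textbook formulas for simple linear regression. It is not the template's code: the function name `regression_intervals` is hypothetical, and the critical value `t_crit` (the t quantile with n - 2 degrees of freedom for the chosen confidence level) is assumed to be supplied by the caller, for example from a t table.

```python
import math


def regression_intervals(xs, ys, x0, t_crit):
    """Confidence and prediction intervals for a simple regression at x0."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three cases")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual standard deviation with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))

    fit = intercept + slope * x0
    leverage = 1 / n + (x0 - mx) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(leverage)      # mean response
    pi_half = t_crit * s * math.sqrt(1 + leverage)  # new observation
    return {
        "fit": fit,
        "confidence": (fit - ci_half, fit + ci_half),
        "prediction": (fit - pi_half, fit + pi_half),
    }
```

With four cases and a 95% level, `t_crit` would be approximately 4.303 (2 degrees of freedom). Evaluating the function over a range of `x0` values traces out the two bands; the prediction band is always the wider of the two.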


 Ternary Plots

Creator: Ken Helmold

Posting date: 11/03/03

Ternary plots are a way of displaying the distribution of a three-component mixture. The ternary display is a triangle with sides scaled from 0 to 1. Each side represents one of the three components. The plot is particularly useful to chemists and geochemists.

The data must be expressed as fractions (e.g., 0.5). Data expressed as percentages (e.g., 50%) should be converted to fractions before plotting. Data in the three variables are automatically normalized to sum to 1 before plotting. Data that are already normalized are not affected by this process.
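The normalization described above, followed by a mapping from three components to a point in the triangle, might look like this in Python. The function name `ternary_to_xy` and the placement of the three corners are assumptions made for the example, not details taken from the template.

```python
import math


def ternary_to_xy(a, b, c):
    """Normalize three components to sum to 1 and map them into a triangle.

    Assumed layout: pure A at the lower-left corner (0, 0), pure B at the
    lower-right corner (1, 0), pure C at the top (0.5, sqrt(3)/2).
    """
    total = a + b + c
    if total <= 0:
        raise ValueError("components must have a positive sum")
    a, b, c = a / total, b / total, c / total
    x = b + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return x, y
```

Because of the normalization step, a case that already sums to 1 maps to the same point before and after rescaling.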

The user can choose between 4 ternary diagrams that differ in the internal divisions of the triangle. The different divisions are based largely on usage by geologists but can be adapted to any field of study:

  1. Standard - the triangle is not divided into internal divisions. This is the most general plot.
  2. Folk - internal divisions based on the sandstone classification scheme of Folk (1980). Used largely by geologists and sedimentary petrologists - has little use outside this field.
  3. Polymict - triangle divided into 4 regions of equal area. The central division is polymictic because it consists of a variety of components.
  4. Folkpoly - triangle divided into 3 regions of equal area. Based on the classification scheme of Folk (1980) for subdividing the lithic fraction of sandstones into sedimentary, volcanic and metamorphic components. Used largely by geologists and sedimentary petrologists, but can be adapted for general use.


 Update Scatterplot/Snake

Creator: Matthew C. Hutcheson

Posting date: 12/18/95

This graphical display contains two different functions: it is an updating scatterplot as well as a scatterplot snake. The updating scatterplot was developed here in Data Desk (a paper is currently being written). Plot y vs. x. Another "ordering variable" determines which data are displayed at any one time. This ordering variable is rescaled to [0, 1]. Control the data that are displayed using the 'location' and 'bandwidth' parameters. For example, if you have 100,000 observations, plotting them all at once is just a mess. You can use the density plot discussed above together with this plot to get an understanding of the large dataset.

For example, you might set the location slider to 0 and set the bandwidth slider to 0.05. Then, slide the location slider from 0 up to 1. As you move the slider, the plot continually updates and only displays the points between location +/- bandwidth as determined by the ordering variable. Initially, the data between 0 and 0.05 are displayed. Once you get to, say, location = 0.50, the data displayed lie between 0.45 and 0.55 of the ordering variable. In other words, the middle 10% (0.55 - 0.45) of the ordering variable's range is displayed on the plot.
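The windowing rule described above (show only the cases whose rescaled ordering value lies within location +/- bandwidth) can be sketched in Python. The function name `visible_indices` is hypothetical, and min-max rescaling of the ordering variable to [0, 1] is an assumption.

```python
def visible_indices(order_values, location, bandwidth):
    """Indices of the cases shown for a given location and bandwidth.

    Assumption: the ordering variable is min-max rescaled to [0, 1].
    """
    lo, hi = min(order_values), max(order_values)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant variable
    scaled = [(v - lo) / span for v in order_values]
    return [i for i, u in enumerate(scaled)
            if location - bandwidth <= u <= location + bandwidth]
```

Calling the function repeatedly while stepping `location` from 0 to 1 mimics dragging the location slider. Setting `location` to 0 and `bandwidth` to 1 shows every case.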

It is useful to use random numbers as the ordering variable. Then, as you move the location and bandwidth parameters, you get a basic unstructured view of the data. Then, replace the random variable with a "real" ordering variable (say income) and update through the data.

A scatterplot snake is also programmed into this plot; it displays lines dynamically as you move through the ordering variable. This implementation is much more powerful than those in other programs because you have control of both the location and the bandwidth, instead of starting the snake and letting it run to the end. If you want to do that, just set the location to 0, then increase the bandwidth from 0 to 1. I like to set the bandwidth to an amount that doesn't put so many lines on the plot that it is distracting, then use the location parameter to move through the data.


 ZipMap

Creator: Matthew C. Hutcheson

Posting date: 02/15/96

This template is useful for displaying data geographically. It contains a database of latitudes and longitudes associated with five-digit zip codes for the continental US. If you have a variable that contains five-digit zip codes, drop it into the "socket" and click on the button named Display Map.

 

Specialized Plots
Templates primarily concerned with displaying data.

Specialized Statistics
Templates that perform specialized computations.

Advanced Analyses
Templates that perform whole analyses, usually involving both statistics and plots.

Teaching/Illustration
Templates and simulations whose primary purpose is educational in nature.

Quick Pick
A list of template names only, designed for quick access to specific templates.