
QM Course Guide

Organizing Data with Tables and Graphs

We gather data to tell us something about a population, but a spreadsheet full of raw data doesn’t tell us much. 

To analyze the data we collect, we always follow the same 3-step strategy:

  1. Make a graph

Choose the best graph based on the level of measurement.

  2. Identify patterns and deviations

Look at:

  • shape
  • center
  • spread

  3. Choose a numerical summary

Use a few numbers to describe:

  • measures of center (mean, median, mode)
  • measures of spread (standard deviation, five-number summary)

In this section, we look at the first two steps for distributions of single variables.

1. We choose the best table or graph to display the data.

2. We identify patterns and deviations in the data. (This helps us choose the best numerical summaries in Step 3.)

Tables and Graphs

Frequency Tables  

A frequency distribution is one way to organize raw data. 

It shows two things:

  • the categories of the variable
  • how many times each category is recorded as a response (its frequency)
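
As a minimal sketch of the idea, a frequency table can be tallied in a few lines of Python. This guide works in SPSS, so the language, the `responses` data, and the variable names here are illustrative assumptions, not part of the course materials.

```python
from collections import Counter

# Hypothetical survey responses for one categorical variable (made-up data).
responses = ["bus", "car", "bike", "car", "bus", "car", "walk", "car"]

# Count how many times each category was recorded.
frequency = Counter(responses)
total = sum(frequency.values())

# Print each category with its frequency and relative frequency,
# listing the most common category first.
print(f"{'Category':<10}{'Frequency':>10}{'Percent':>10}")
for category, count in frequency.most_common():
    print(f"{category:<10}{count:>10}{count / total:>10.1%}")
```

The first column lists the categories of the variable and the second lists the frequencies; the percent column is an optional extra that makes categories easier to compare.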

This video shows how to construct and interpret a frequency table.

Types of graphs and their uses

The most common graphs for categorical variables are:

  • pie charts
  • bar graphs

The most common graphs for quantitative variables are:

  • histograms
  • stemplots

This video gives an excellent overview of these graphs.

Choosing the Best Graph

It’s important to choose a graph that is appropriate for your data set.

Before you create a graph, identify the type of variable:

  • qualitative (categorical)
  • quantitative (numeric)

This video can help you choose an appropriate graph to display the distribution of your variable.

Graphs by Level of Measurement

Graphs for Categorical variables

Pie charts are good:

  • when there are just a few categories
  • for nominal variables

Bar charts are good:

  • to compare frequencies between variables
  • for ordinal variables

Pareto charts are good:

  • to easily see the largest and smallest frequencies
  • for nominal variables only

Dot plots are good:

  • when you need to tally data by hand

https://ec.europa.eu/eurostat/web/products-eurostat-news/-/DDN-20180920-1

Moore, Statistics: Concepts and Controversies, 9e, 2017 by W. H. Freeman and Co

 Moore, Statistics: Concepts and Controversies, 9e, 2017 by W. H. Freeman and Co

https://www.pinterest.co.uk/pin/406168460117709867/

How to construct a pie chart

Create a pie chart in SPSS

How to construct a bar graph

Create a Bar Chart in SPSS

How to construct a Pareto chart

Create a Pareto chart in SPSS

How to construct a dot plot

Create a dot plot in SPSS
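
The step that sets a Pareto chart apart from an ordinary bar chart is the ordering: bars run from the largest frequency to the smallest. The sketch below shows only that ordering step in Python, with a running cumulative share added as a common companion. The `complaints` data and all names are made up for illustration.

```python
from collections import Counter

# Hypothetical nominal data: reasons recorded for customer complaints.
complaints = ["late", "damaged", "late", "wrong item",
              "late", "damaged", "late", "billing"]

counts = Counter(complaints)
total = sum(counts.values())

# Order categories from largest to smallest frequency and
# track the cumulative share of all responses.
rows = []
cumulative = 0
for category, count in counts.most_common():
    cumulative += count
    rows.append((category, count, cumulative / total))

for category, count, share in rows:
    print(f"{category:<12}{count:>3}  {'#' * count:<6}{share:>7.0%}")
```

Because the categories are reordered by frequency, this only makes sense for nominal variables; reordering an ordinal variable would scramble its natural order.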

Graphs for Quantitative variables

Histograms are good:

  • for large data sets
  • as the most common graph for quantitative variables

Frequency polygons are good:

  • to compare distributions

Stem-and-leaf plots are good:

  • for small data sets
  • to see the details of a distribution

Boxplots are good:

  • for skewed distributions

Moore, Statistics: Concepts and Controversies, 9e, 2017 by W. H. Freeman and Co

https://courses.lumenlearning.com/introstats1/chapter/histograms-frequency-polygons-and-time-series-graphs/

Moore, Statistics: Concepts and Controversies, 9e, 2017 by W. H. Freeman and Co

https://www.onlinemath4all.com/analyzing-box-plots-worksheet.html

How to construct a histogram

Create a histogram in SPSS

How to construct a frequency polygon

Create a frequency polygon in SPSS

How to construct a stem and leaf plot

Create and interpret a stemplot in SPSS

How to construct a box plot

Create a boxplot in SPSS
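
For a small data set, a stem-and-leaf plot can also be built by hand or with a short script. The following is a minimal sketch in Python, assuming non-negative whole numbers where the last digit is the leaf and the remaining digits are the stem. The `scores` data and the `stemplot` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical two-digit data values (made up for illustration).
scores = [62, 75, 71, 88, 93, 67, 75, 81, 79, 90, 68, 84]

def stemplot(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    stems = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)
    # Include every stem between the smallest and largest so gaps stay visible.
    return {stem: stems.get(stem, [])
            for stem in range(min(stems), max(stems) + 1)}

plot = stemplot(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Every original value can be read back from the plot, which is why stem-and-leaf plots show the details of a distribution in a way a histogram does not.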

Graphs to show change over time 

Time-series graphs are good:

  • to show change over time

Moore, Statistics: Concepts and Controversies, 9e, 2017 by W. H. Freeman and Co

How to construct a time-series graph

Identifying Patterns and Deviations in a Graph

In Step 1, we choose the best graph to display the data.

Now, in Step 2, we identify patterns and deviations in the graph. 

Shape, center, and spread together make up the pattern of the data; outliers are exceptions to the pattern.

https://courses.lumenlearning.com/wmopen-concepts-statistics/chapter/dotplots-2-of-2/

To find patterns and deviations, we look at:

  • shape: whether the data distribution is relatively symmetric
  • center: where most of the data values cluster in the data distribution
  • spread (variation): how far the values spread from the center of the data distribution, and whether there are outliers

Shape of a distribution

To describe the shape of a distribution, look at:

  • number of modes
  • whether it is symmetric or skewed

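Number of modes: a mode is a peak in the distribution. A distribution with one peak is unimodal, one with two peaks is bimodal, and one with more than two peaks is multimodal.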

http://www.lynnschools.org/classrooms/english/faculty/documents/tim_serino/Printable_Assignments/24_notes__describing_quantitative_data.pdf

This video briefly describes how to identify whether a distribution is symmetric or skewed.

Symmetric or skewed distribution?

Symmetric

  • data values are evenly distributed around the center of a unimodal distribution
  • the left and right sides of the distribution are mirror images
  • mode, mean, and median are the same

Skewed left (negatively)

  • data values are more spread out on the left side
  • the tail goes to the left
  • outliers pull the mean to the left

Skewed right (positively)

  • data values are more spread out on the right side
  • the tail goes to the right
  • outliers pull the mean to the right
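
Because the tail pulls the mean toward it, comparing the mean with the median gives a rough numerical check on what the graph shows. The sketch below is a rule of thumb only, not a formal test. The `describe_skew` helper, its `tolerance` cutoff, and the three data sets are assumptions made up for illustration.

```python
import statistics

def describe_skew(values, tolerance=0.05):
    """Rough skew check: compare the mean with the median.

    `tolerance` is a hypothetical cutoff, expressed as a fraction of the
    standard deviation, for treating the mean and median as "the same".
    """
    mean = statistics.mean(values)
    median = statistics.median(values)
    spread = statistics.stdev(values)
    if abs(mean - median) <= tolerance * spread:
        return "roughly symmetric"
    # The mean is pulled toward the longer tail.
    return "skewed right" if mean > median else "skewed left"

symmetric = [1, 2, 3, 4, 5, 6, 7]
right_tail = [1, 2, 2, 3, 3, 4, 20]       # one large value stretches the right tail
left_tail = [1, 17, 18, 18, 19, 19, 20]   # one small value stretches the left tail

for name, data in [("symmetric", symmetric),
                   ("right_tail", right_tail),
                   ("left_tail", left_tail)]:
    print(name, "->", describe_skew(data))
```

A check like this supports the graph rather than replacing it: it cannot tell how many modes a distribution has, so the graph still comes first.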

 

all images from Statistical Reasoning for Everyday Life, 5e

Center

The center is the location where most of the data values cluster in a distribution. Think about it as a “typical” value of the data set. 
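
As a small illustration, the three measures of center named earlier in this guide (mean, median, mode) can be computed with Python's standard `statistics` module. The `values` data set is made up for the example.

```python
import statistics

# Hypothetical data set (made up for illustration).
values = [2, 3, 3, 4, 5, 5, 5, 6, 7]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted data
mode = statistics.mode(values)      # most frequent value

print(f"mean={mean:.2f}, median={median}, mode={mode}")
```

Here the three measures land close together, each giving a slightly different answer to what a "typical" value is.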

Spread (Variation)

Variation, or spread, describes how far the values are spread out from the center of the data distribution, and whether there are outliers.

In the picture below, you can see increasing variation in each image as you move from left to right.  The center of the data stays the same, but the values get more spread out.

Small variation

Moderate variation

Large variation

https://www.spss-tutorials.com/standard-deviation/
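
The same idea can be shown with numbers. In this sketch, three made-up data sets share the same center but differ in spread, measured here with the sample standard deviation from Python's `statistics` module. The data sets and names are assumptions for illustration.

```python
import statistics

# Three hypothetical data sets with the same center (mean of 10).
small = [9, 10, 10, 10, 11]
moderate = [6, 8, 10, 12, 14]
large = [0, 5, 10, 15, 20]

for name, data in [("small", small), ("moderate", moderate), ("large", large)]:
    center = statistics.mean(data)
    spread = statistics.stdev(data)  # sample standard deviation
    print(f"{name:<9} mean={center:.1f}  standard deviation={spread:.2f}")
```

The mean stays at 10 in every case, while the standard deviation grows as the values move farther from the center.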

Outliers

An outlier is a value in a data set that is either very high or very low when compared to the other values.

An outlier increases variation in a data set.

To find an outlier, we must first create a graph.

 Tip:  An outlier strongly affects the mean of a data set, but has little or no effect on the median.
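
A quick calculation makes the tip concrete. The sketch below adds one very high value to a small made-up data set and compares the mean and median before and after. The numbers are hypothetical.

```python
import statistics

# Hypothetical data set, then the same data with one very high value added.
data = [21, 23, 24, 25, 27]
with_outlier = data + [90]

mean_before = statistics.mean(data)
median_before = statistics.median(data)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(f"without outlier: mean={mean_before}, median={median_before}")
print(f"with outlier:    mean={mean_after}, median={median_after}")
```

The single outlier moves the mean from 24 to 35, while the median moves only from 24 to 24.5.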

 

https://statisticsbyjim.com/basics/histograms/

https://online.stat.psu.edu/stat462/node/170/

How To Spot A Bad Graph

Sometimes, graphs may not present an accurate display of the data.  This may be accidental or intentional.