##### Table of contents

- Story
- Introduction
- Abstract
- Bio
- Slides
- Slide 1 Six World Series
- Slide 2 What To Measure?
- Slide 3 NASA (1972 Dec 7) Apollo 17
- Slide 4 Location
- Slide 5 Metadata
- Slide 6 Are the Data True?
- Slide 7 Sources
- Slide 8 Years
- Slide 9 V-Axis: range (var)
- Slide 10 V-Axis: [0,2 x mean]
- Slide 11 Linear Fit and R-Square
- Slide 12 Residuals
- Slide 13 Autocorrelation
- Slide 14 Autocorrelation of Differences
- Slide 15 Exponential Smoothing Forecasts
- Slide 16 Nonlinear Fit of Solar
- Slide 17 What Causes What?
- Slide 18 Precip Predicted by Temp
- Slide 19 Per Capita
- Slide 20 Units
- Slide 21 Causal Paths
- Slide 22 What Is To Be Done?
- Slide 23 Carbon Emissions
- Slide 24 Key Questions
- Slide 25 References
- Slide 26 Contact Information

- Data Science for Six World Series: Time Series Analysis and Forecasting

- Slides
- Spotfire Dashboard
- Research Notes

## Story

Data Science DC Meetup, Thursday, October 29, 2015

#### Introduction

For our October Data Science DC Meetup, we're talking about the World Series. No, not that World Series* -- time-series data about the world! Long-time DSDC attendee Lee De Cola will be showing how to think about, analyze, and forecast important world-wide metrics that have been collected for over 50 years. Expect to learn about rigorous ways to analyze temporal data. And maybe there'll be baseball puns.

#### Abstract

This presentation will use statistical and visualization features of R to explore yearly time series that characterize key global changes since 1950, a period sometimes called the Great Acceleration. Linear regression supplemented with autocorrelation diagnostics can provide most of the key descriptive information about these data, while nonlinear estimation and exponential smoothing can be used to provide forecasts. However, forecasting, in the sense of providing point predictions of future values, should be treated as descriptive and as a source of warnings. Widespread understanding of these data is of surpassing importance to the future welfare of life on Earth.

Participants who would like to follow along are welcome to bring computers loaded with R and to download *in advance* (WiFi will not be available) the data used for this presentation at ldecola.net/projects/global/.
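The linear regression and autocorrelation diagnostics mentioned in the abstract can be sketched in a few lines. The talk itself uses R; the block below is only an illustrative Python sketch, and the function names `linear_fit` and `autocorrelation`, as well as the synthetic series, are assumptions made here and are not taken from the presentation or its data.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares fit y = a + b*x.

    Returns (intercept, slope, r_squared, residuals).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared, residuals


def autocorrelation(values, lag):
    """Sample autocorrelation at the given lag (lag-k autocovariance / variance)."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t + lag] - mean) for t in range(n - lag))
    return num / denom


# Synthetic yearly series (NOT one of the six data sets):
# a linear trend plus a slow cycle that the straight line cannot capture.
years = list(range(1950, 2015))
values = [2.0 + 0.5 * (y - 1950) + 3.0 * math.sin((y - 1950) / 8.0) for y in years]

intercept, slope, r_squared, residuals = linear_fit(years, values)
lag1 = autocorrelation(residuals, 1)
print(f"slope={slope:.3f}, R^2={r_squared:.3f}, lag-1 residual autocorrelation={lag1:.3f}")
```

A high R-square together with strongly autocorrelated residuals, as in this synthetic case, suggests that the straight line describes the trend but leaves structure unexplained.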

#### Bio

Lee De Cola runs DATA to Insight, a data visualization consulting and training enterprise. For 21 years he was a research scientist at the U.S. Geological Survey in Reston, Virginia, where he used GIS and applied statistics to understand landscape dynamics and the health of regions and their inhabitants. Lee has published on land cover analysis, spatial epidemiology, urban systems complexity, fractals in geography, and urbanization in Africa. He has taught at a number of local institutions of higher education as well as in Nigeria, Vermont, West Virginia, and California. Lee volunteers at local public schools and enjoys playing the clarinet, kayaking, and sailing. Follow him on Twitter: @ldecola.

#### Slides

##### Slide 7 Sources

http://lasp.colorado.edu/home/sorce/data/tsi-data/

http://data.giss.nasa.gov/gistemp/

http://www.epa.gov/climatechange/sci...ipitation.html

http://www.census.gov/population/int...population.php

http://cdiac.ornl.gov/trends/emis/tre_glob.html

#### Data Science for Six World Series: Time Series Analysis and Forecasting

This data science work was possible because:

1. The 6 data sets were readily downloadable as CSV files and correctly formatted for time series analysis (time as rows and parameters as columns);

2. The 6 data sets were readily imported into Spotfire with 7 tabs, as shown in the screen-capture slides below; and

3. The Holt-Winters Forecast uses TIBCO Spotfire Enterprise Runtime for R to compute the Holt-Winters filtering of a time series or anything that can be coerced to a time series. This is an exponentially weighted moving average filter of the level, trend, and seasonal components of a time series. The smoothing parameters are chosen to minimize the sum of the squared one-step-ahead prediction errors.
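The idea in item 3 of choosing smoothing parameters that minimize the squared one-step-ahead errors can be sketched as follows. This is a simplified Python illustration, not the TIBCO Enterprise Runtime for R implementation: it drops the seasonal component (the series here are yearly), which reduces Holt-Winters to Holt's linear-trend method, and a coarse grid search stands in for a proper numerical optimizer. The names `holt_sse`, `fit_holt`, and `holt_forecast` are hypothetical.

```python
def holt_sse(series, alpha, beta):
    """Run Holt's linear-trend smoothing over the series.

    Returns (sse, level, trend), where sse is the sum of squared
    one-step-ahead prediction errors and level/trend are the final states.
    Assumption: the level starts at the second observation and the trend
    at the first difference, so errors are accumulated from the third point on.
    """
    level = series[1]
    trend = series[1] - series[0]
    sse = 0.0
    for y in series[2:]:
        prediction = level + trend  # one-step-ahead prediction
        error = y - prediction
        sse += error * error
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return sse, level, trend


def fit_holt(series, steps=20):
    """Grid-search alpha and beta in (0, 1] to minimize the one-step-ahead SSE.

    Returns (sse, alpha, beta, level, trend) for the best pair found.
    """
    best = None
    for i in range(1, steps + 1):
        for j in range(1, steps + 1):
            alpha, beta = i / steps, j / steps
            sse, level, trend = holt_sse(series, alpha, beta)
            if best is None or sse < best[0]:
                best = (sse, alpha, beta, level, trend)
    return best


def holt_forecast(level, trend, horizon):
    """Point forecasts for 1..horizon steps ahead: a straight line from the final state."""
    return [level + h * trend for h in range(1, horizon + 1)]


# Synthetic, perfectly linear series (illustration only, not talk data).
series = [2.0 * t + 1.0 for t in range(10)]
sse, alpha, beta, level, trend = fit_holt(series)
print(holt_forecast(level, trend, 3))
```

On real, noisy data the grid resolution (`steps`) trades speed against how closely the chosen parameters approach the true minimum.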

The output of a Holt-Winters Forecast is three different curves: a fitted curve showing the general variation of the measure of interest, a forecast curve predicting the future trend, and a confidence interval showing how the uncertainty increases the further the prediction reaches from the known values.
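Why the interval widens with the forecast horizon can be seen in the simplest member of the exponential smoothing family. The Python sketch below uses simple exponential smoothing (level only, additive errors), a deliberately simpler technique than the Holt-Winters Forecast, for which the h-step forecast variance is σ²(1 + (h − 1)α²). The function name `ses_prediction_intervals`, the fixed `alpha`, the normal-error multiplier `z=1.96` (roughly a 95% interval), and the sample numbers are all assumptions for illustration.

```python
import math


def ses_prediction_intervals(series, alpha, horizon, z=1.96):
    """Simple exponential smoothing with approximate prediction intervals.

    Returns a list of (lower, point_forecast, upper) for steps 1..horizon.
    The forecast variance at step h is sigma^2 * (1 + (h - 1) * alpha^2),
    where sigma^2 is estimated from the one-step-ahead errors.
    """
    level = series[0]
    errors = []
    for y in series[1:]:
        errors.append(y - level)  # one-step-ahead error
        level = alpha * y + (1 - alpha) * level
    sigma2 = sum(e * e for e in errors) / len(errors)

    intervals = []
    for h in range(1, horizon + 1):
        half_width = z * math.sqrt(sigma2 * (1 + (h - 1) * alpha ** 2))
        intervals.append((level - half_width, level, level + half_width))
    return intervals


# Made-up numbers for illustration only (not talk data).
bands = ses_prediction_intervals([10.0, 12.0, 11.0, 13.0, 12.0, 14.0], alpha=0.5, horizon=4)
for step, (lower, point, upper) in enumerate(bands, start=1):
    print(f"h={step}: {lower:.2f} .. {point:.2f} .. {upper:.2f}")
```

The point forecast stays flat in this level-only model, while the band around it grows with each step ahead; models with trend and seasonal terms follow the same pattern with more elaborate variance formulas.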

TIBCO Enterprise Runtime for R and open-source R return different prediction intervals for multiplicative seasonal models. TIBCO Enterprise Runtime for R assumes that the seasonal and error components are multiplicative in effect, and it uses the formula for prediction variance found in section 6.4.2 of Hyndman et al. (2008). See the references for the Holt-Winters Forecast listed below.

The parameters used can be shown in labels or tooltips.

References for Holt-Winters Forecast

Rob J. Hyndman and George Athanasopoulos (2013), Forecasting: Principles and Practice. http://otexts.com/fpp/7/1

Rob J. Hyndman, Anne B. Koehler, J. Keith Ord, and Ralph D. Snyder (2008), Forecasting with Exponential Smoothing: The State Space Approach, Springer.

Please see our November 2, 2015 Data Science for Random Forests: TIBCO Enterprise Runtime for R

Thank you, Lee, for an excellent data set! I have enjoyed our association in the past and wish you well with your Data to Insight work and meetup. Best regards, Brand

## Slides

## Spotfire Dashboard

**Please Note: The Holt-Winters Forecast lines/curves shown in the Spotfire screen captures above do not appear in the Spotfire Web Player.**

For Internet Explorer users and those wanting full-screen display, use: Web Player. Get Spotfire for iPad App.
