Table of contents
  1. Story
    1. Free Data Visualization and Analysis Tools Comparison
    2. OMB Data Visualization Tool Requirements Analysis
  2. Emails
  3. Spotfire
  4. Research Notes
  5. Unleash the agility of R for the Enterprise
    1. Benefits of TERR
    2. Technical Advantages of TERR
    3. TERR Developer's Edition and TERR Community
    4. Integration options for TERR
    5. TIBCO and TERR
  6. Predictive Analytics with TIBCO Spotfire and TIBCO Enterprise Runtime for R
    1. Predictive Analytics with TIBCO Spotfire
      1. Benefits of TIBCO Spotfire and Predictive Analytics
      2. Predictive Analytics in the Spotfire Platform
    2. TIBCO Spotfire Statistics Services
      1. Predictive Analytics Ecosystem
      2. Benefits
      3. TIBCO Spotfire full analytic application authoring
      4. In-Database Predictive Analytics via Teradata Aster
    3. TIBCO Enterprise Runtime for R (TERR)
      1. TERR Developer’s Edition and TERR Community
      2. Integration Options for TERR
        1. TERR in Spotfire
        2. TERR in Statistics Services
        3. Embeddable TERR Engine
      3. TIBCO and the R Community
    4. Predictive Modeling Tools
      1. Custom Advanced Analytic Tools
    5. For more information
  7. Chart and image gallery: 30+ free tools for data visualization and analysis
  8. 22 free tools for data visualization and analysis
    1. Want to see all the tools at once?
    2. Data cleaning
      1. Data Wrangler
      2. Google Refine
    3. Statistical analysis
      1. The R Project for Statistical Computing
    4. Visualization applications and services
      1. Google Fusion Tables
      2. Impure
      3. Tableau Public
      4. Many Eyes
      5. VIDI
      6. Zoho Reports
    5. Code help: Wizards, libraries, APIs
      1. Choose (under development)
      2. Exhibit
      3. Google Chart Tools
      4. JavaScript InfoVis Toolkit
      5. Protovis
    6. GIS/mapping on the desktop
      1. Quantum GIS (QGIS)
    7. Web-based GIS/mapping
      1. OpenHeatMap
      2. Open Layers
      3. OpenStreetMap
    8. Temporal data analysis
      1. Time Flow
    9. Text/word clouds
      1. IBM Word-Cloud Generator
    10. Social and other network analysis
      1. Gephi
      2. NodeXL

Free Data Visualization and Analysis Tools

Story

Free Data Visualization and Analysis Tools Comparison

Word

Introduction: I got the email inquiry below and responded using this earlier work based on 22 tools and an updated analysis of 42 tools compared to Spotfire 5.5 as follows:

 

Category | Number of Tools | Spotfire 5.5 | Comment
Data cleaning | 2 | Yes | I call it pre-conditioning
Statistical analysis | 1 | Yes | S-Plus (R)
Visualization app/service | 13 | Yes | Tufte's adjacent visualizations
Framework | 1 | Yes | I call it a Data Ecosystem
Library | 10 | Yes | I call it a Library and a Telescope
GIS/mapping: Desktop | 1 | Yes | Acquisition of Maporama
GIS/mapping: Web | 4 | Yes | Acquisition of Maporama
Temporal data analysis | 1 | Yes | Advanced statistical methods
Word clouds | 1 | Probably | I need to try to do this!
Network analysis | 2 | Yes | Module developed for the CIA
CSV file analysis | 1 | Yes | Like Excel but for larger row numbers
Create sortable, searchable tables | 3 | Yes | Default with faceted search and filtering
Data reformatting | 2 | Yes | Pivot, etc.
Analysis and charting | 1 | Yes | Advanced statistical methods and visualization types
Data acquisition, data reformatting | 1 | Yes | 30+ data connectors
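
As a quick cross-check of the table, the following Python sketch tallies its rows. The tuples are transcribed from the table; the variable names are illustrative only. The per-category counts are summed exactly as listed, so any tool that appears under more than one category is counted more than once.

```python
# Rows transcribed from the comparison table:
# (category, number of tools, Spotfire 5.5 support as stated in the table)
rows = [
    ("Data cleaning", 2, "Yes"),
    ("Statistical analysis", 1, "Yes"),
    ("Visualization app/service", 13, "Yes"),
    ("Framework", 1, "Yes"),
    ("Library", 10, "Yes"),
    ("GIS/mapping: Desktop", 1, "Yes"),
    ("GIS/mapping: Web", 4, "Yes"),
    ("Temporal data analysis", 1, "Yes"),
    ("Word clouds", 1, "Probably"),
    ("Network analysis", 2, "Yes"),
    ("CSV file analysis", 1, "Yes"),
    ("Create sortable, searchable tables", 3, "Yes"),
    ("Data reformatting", 2, "Yes"),
    ("Analysis and charting", 1, "Yes"),
    ("Data acquisition, data reformatting", 1, "Yes"),
]

categories = len(rows)                                    # number of categories
entries = sum(count for _, count, _ in rows)              # sum of per-category tool counts
supported = sum(1 for _, _, s in rows if s == "Yes")      # categories marked "Yes"
unconfirmed = [c for c, _, s in rows if s != "Yes"]       # categories not yet confirmed

print(categories, entries, supported, unconfirmed)
```

The tally gives 15 categories, 14 of them marked Yes and one (Word clouds) marked Probably. The per-category counts sum to 44, slightly more than the 42 tools cited in the introduction; that would be consistent with a couple of tools being listed under more than one category, although the table itself does not say so.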

Conclusion: Spotfire 5.5 is a comprehensive Tool that supports the 15 Categories, provides Multi-purpose Visualizations that are dynamically linked to one another, a Desktop and Browser-based Mapping Platform for all 4 Skill Levels (see below), with Data Stored or Processed In-memory and In-Database, that is Designed for Web Publishing in the Cloud.

The Skill Levels are represented as numbers from easiest to most difficult to learn and use:

  1. Users who are comfortable with basic spreadsheet tasks 
  2. Users who are technically proficient enough not to be frightened off by spending a couple of hours learning a new application
  3. Power users
  4. Users with coding experience or specialized knowledge in a field like GIS or network analysis.

OMB Data Visualization Tool Requirements Analysis

  • Background:
    • Computerworld: 22 free tools for data visualization and analysis, Sharon Machlis, April 20, 2011.
    • Federal News Radio Interview: Lessons for Managing Dashboards, November 10, 2011.
    • Computerworld: Chart and image gallery: 30+ free tools for data visualization and analysis, Sharon Machlis, March 13, 2013.
  • OMB Request:
    • Expert familiar with the strengths and weaknesses of all these tools.
  • My Conclusion:
    • Spotfire 5.5 is a comprehensive Tool that supports the 15 Categories, provides Multi-purpose Visualizations that are dynamically linked to one another, a Desktop and Browser-based Mapping Platform for all 4 Skill Levels, with Data Stored or Processed In-memory and In-Database, that is Designed for Web Publishing in the Cloud.
  • My OMB Data Visualization Tool Requirements Analysis:
    • Used common data sets to see if 11 Gartner Magic Quadrant Leaders (see below) could produce dynamically-linked, small adjacent multiples (Ed Tufte). They could not!

Gartner Magic Quadrant Leaders (2013):

Emails

From: Nina Preuss [mailto:nina.preuss@tcg.com]

Sent: Wednesday, July 03, 2013 3:41 PM
To: bniemann@cox.net
Cc: Lisa Alferieff
Subject: Data visualization tools comparison guru

Brand, You may remember speaking with my husband, Don Preuss in the past; he suggested I reach out to you.

I'm working with some OMB folks looking into procurement of a data visualization tool for a new OMB mostly internal facing web site that will be providing a lot of federal data to analysts and empowering them to use a visualization tool to derive new understandings. 

They have looked quite a bit already into: Tableau (Yes), TIBCO Spotfire (Yes), Pentaho (Long ago), and even Excel/Sharepoint (plus Jackbe (Yes) if this is the case, or maybe Socrata (Yes)). 

They are interested in us bringing in a SME familiar with the strengths and weaknesses of all these tools.  Are you familiar with a majority of these tools or mostly TIBCO?  If the former, would you have any time to come into a meeting with OMB next week for a 1 hour meeting and then go from there?   If the latter, do you have a recommendation for someone who has a up-to-date broad perspective on these tools?

I can be reached at:  202-742-8479 or via email and will be working on Friday.

Best,

Nina Preuss, PMP
Turner Consulting Group, Yes, it can be done!
TCG Tel:  202-742-8479 Cell:  410-991-9337
http://www.tcg.com

On Jul 3, 2013 5:24 PM, "Brand Niemann" <bniemann@cox.net> wrote:
Nina, Thanks for thinking of me for this.

Are you familiar with a majority of these tools or mostly TIBCO? Both, but feel TIBCO Spotfire is the best for “providing a lot of federal data to analysts and empowering them to use a visualization tool to derive new understandings”

My comparison of tools: http://semanticommunity.info/Data_Science/Free_Data_Visualization_and_Analysis_Tools

And my Spotfire library: https://silverspotfire.tibco.com/us/...niemann/Public

If the former, would you have any time to come into a meeting with OMB next week for a 1 hour meeting and then go from there? Yes

If the latter, do you have a recommendation for someone who has a up-to-date broad perspective on these tools? Yes, me

When would you like to talk?

Dr. Brand Niemann
Director and Senior Data Scientist
Semantic Community
http://semanticommunity.info
http://breakinggov.com/author/brand-niemann/
703-268-9314

From: Nina Preuss [mailto:nina.preuss@tcg.com]
Sent: Wednesday, July 03, 2013 5:56 PM
To: Brand Niemann
Cc: Lisa Alferieff
Subject: RE: Data visualization tools comparison guru

Great news. Will review your links and get back to you next week. Enjoy the holiday.
Please excuse typos as this is coming from my phone.

From: Brand Niemann [mailto:bniemann@cox.net]
Sent: Wednesday, July 03, 2013 10:58 PM
To: 'Nina Preuss'
Cc: 'Lisa Alferieff'
Subject: RE: Data visualization tools comparison guru

Thank you. I just updated my analysis of 42 tools compared to Spotfire.

I just remembered the following as well: Federal News Radio Interview: Lessons for Managing Dashboards, November 10, 2011, Wiki and Slides

MORE EMAILS TO BE ADDED

Spotfire

For Internet Explorer users and those wanting full-screen display, use: Web Player. Get Spotfire for iPad App.

Research Notes

Source: http://www.gartner.com/technology/re...t=130206&st=sb

Gartner BI Magic Quadrant: Spotfire Strengths and Cautions Excerpts

Tibco Spotfire (Complete): Strengths and Cautions

Tibco Spotfire is a flexible and easy-to-use platform for business user data discovery and analysis, for authoring analytic applications, for publishing interactive and visual dashboards, and for building predictive models and applications. Tibco Spotfire's interactive visualization capabilities are now enabled by a hybrid of in-memory processing and newly added direct query access, an approach that supports and leverages larger enterprise-managed datasets than previously possible.

Tibco Spotfire's strong product vision has been and continues to be a key strength. Its focus on advanced and real-time analytic applications and dashboards delivered to mobile devices contributes to its vision. Unlike the other data discovery platforms (for example, QlikView and Tableau), Tibco Spotfire is leveraging the acquisition of Insightful and its newly released runtime engine for R for data mining, as well as its integration with Tibco middleware, Tibco Software's acquisition of LogLogic, and Tibco Software's social capability, tibbr, to broaden the possible spectrum of end-user-driven interactive analysis and data sources, and to incorporate business events and predictive analytics.

Because of Tibco Spotfire's ease of use, more users can leverage the benefits of more advanced analytics. This paradox typifies why data discovery capabilities in general, and Tibco Spotfire in particular, are so compelling for business users, and why they are proliferating.

Tibco Spotfire's cloud version of its software (Silver) allows business users to author and share Tibco Spotfire visualizations and dashboards without having to install the software on-premises.

Tibco Spotfire is well-suited to building analytic content ranging from basic interactive visualizations and dashboards to advanced interactive analytic applications, but the perception of Tibco Spotfire's license cost and packaging continues to be a factor limiting its consideration beyond high-end requirements. As a result, Tibco Spotfire is not included on shortlists as frequently as its primary competitors — QlikTech and Tableau in particular — when basic, mainstream data discovery capabilities are required, even though Spotfire's awareness has increased substantially over the past year. 

Its true sweet spot is in providing a flexible, easy-to-use environment for advanced analysis.

Unleash the agility of R for the Enterprise

Source: http://spotfire.tibco.com/en/discover-spotfire/what-does-spotfire-do/predictive-analytics/tibco-enterprise-runtime-for-r-terr.aspx

Also Source: TIBCO Offers Free Software and Online Resource for Predictive Analytics
Posted Jun 4, 2013 - June 4, 2013 Issue
http://www.dbta.com/Articles/Editori...ics-90012.aspx

Cutting edge analytics with enterprise speed, reliability and support

 

TIBCO Enterprise Runtime for R (TERR)

TERR, a key component of Spotfire Predictive Analytics, is an enterprise-grade analytic engine that TIBCO has built from the ground up to be fully compatible with the R language, leveraging our long-time expertise in the closely related S+ analytic engine. This allows customers to continue to develop in open source R, but to then integrate and deploy their R code on a commercially-supported and robust platform—without the need to rewrite their code.

Prototypes are often developed in R, but then typically re-implemented in another language for production purposes because R was not built for enterprise usage. TERR brings enterprise-class scalability and stability to the agile R-language, and enables statisticians to broadly share their analyses through TIBCO Spotfire Statistics Services or by directly embedding the TERR engine.

TERR enables customers to rapidly iterate from prototyping to production without wasting time and effort recoding and retesting their analyses, allowing them to respond more rapidly to opportunities and threats, and to integrate standardized predictive analytics consistently across the organization.

Download the White Paper (My Note: See Below)

Benefits of TERR

  • Apply consistent models across multiple applications and uses, from prototyping to production--eliminating uncertainty when analytic models implemented on disparate platforms disagree
  • Easily compare multiple analytic approaches to find the hidden insights and to make the best decisions--and then broadly leverage these insights across the organization
  • Eliminate time/resources spent re-implementing R code for production, or spent prototyping on an unwieldy platform--reducing the need for multiple analytic platforms
  • Rapidly cycle from prototyping to production to deliver faster time to insight/market. Continual refinement of models and consistent application across the organization mean that everyone is using the right, best analytic

Technical Advantages of TERR

  • Higher performance and far more robust memory management than open source R, so that performance is more linear as larger data is analyzed.
  • TERR’s higher performance and more robust memory management enable seamless deployment of R code, opening new possibilities in contexts where the performance of open source R previously limited the production usage of R code.
  • Models can be built in open source R, then deployed to TERR--in R’s native format for model objects--for high-performance scoring/prediction.  TERR frees developers from the limitations associated with representing an R model in a model markup language or SQL.
  • Licensable for embedding and redistribution
  • Architected for ongoing investment, to ensure that TERR meets analytic needs both now and in the future.
  • Broad coverage of core R functionality and CRAN packages, giving customers access to cutting edge analytics in a production environment

TERR Developer's Edition and TERR Community

  • The free Developer Edition of TERR, available through the TERR Community Site, enables customers to test their R code prior to deployment and integration.
  • The Developer's Edition is a fully-featured TERR engine, limited to non-production use.
  • TERR Community site provides a forum for the feedback, support and collaboration of R/TERR users, and detailed information on topics such as TERR’s coverage of R functionality and CRAN packages.
  • TERR does not aim to replace open source R on developer desktops.  R developers can continue to develop code in open source R, using their preferred development environments.  Code written in open source R can typically be run in TERR without modification.

Integration options for TERR

  • TERR in TIBCO Spotfire for tools and applications powered by advanced analytics
  • TERR in Statistics Services for distributed analytics
  • Embeddable TERR Engine for tight, custom integration and grid deployment

TIBCO and TERR

TERR has wide integration across the TIBCO platform, enabling customers to deploy consistent analytics across their organization:

  • TERR is embedded in Spotfire Professional, where it powers the Predictive Modeling Tools
  • TERR can also be embedded in Spotfire applications, locally under Spotfire Professional, or remotely through Statistics Services.
  • TERR can be embedded in TIBCO Business Events for Complex Event Processing, to provide a scalable, high-throughput, low-latency analytic service, for real-time applications such as Fraud Detection and Customer Scoring. 
  • TERR can post messages to TIBCO's Enterprise Social Networking product, tibbr, enabling TERR/R users to collaborate with their peers, share their results to contribute to ongoing discussions, and easily send notifications of long-running analytic results. 

Predictive Analytics with TIBCO Spotfire and TIBCO Enterprise Runtime for R

Source: http://spotfire.tibco.com/en/discover-spotfire/what-does-spotfire-do/predictive-analytics/~/media/content-center/whitepapers/predictive-analytics-with-spotfire-and-terr.pdf (PDF)

Predictive Analytics with TIBCO Spotfire

TIBCO Spotfire is the premier data discovery and analytics platform, which provides powerful capabilities for our customers, such as dimension-free data exploration through interactive visualizations, and data mashup to quickly combine disparate data to gain insights masked by data silos or aggregations. Another key strength of the TIBCO Spotfire platform is broad predictive analytic functionality. Predictive analytics have entered the mainstream of business analytics in the last few years, and generally can be described as improving future decision-making by enabling learning from an organization’s past collective experience.

Benefits of TIBCO Spotfire and Predictive Analytics

  • Easily provide targeted, relevant predictive analytics to business users
    • Ensure compliance and proper usage
    • Get the answer when needed
  • Increase confidence and effectiveness in decision-making
    • Reduce uncertainty
    • Discover meaningful patterns, important data
    • Maximize ROI
  • Anticipate and react to emerging trends
  • Reduce/manage risk
    • Use scenario planning, forecasts, and fraud detection
  • Forecast specific behavior, preemptively act on it
    • Increase upsell, decrease churn

Predictive Analytics in the Spotfire Platform

There are three main aspects of predictive analytics in the Spotfire platform:
  • TIBCO Spotfire Statistics Services (TSSS) provides a predictive analytics ecosystem and enables seamless integration of your existing investments in R, S+, SAS, MATLAB into Spotfire and custom applications, as well as leveraging in-database predictive analytics through Teradata Aster to empower more effective decision-making across your organization
  • TIBCO Enterprise Runtime for R (TERR) provides an enterprise-class environment for running R scripts and packages, both within Spotfire and across an organization, enabling you to combine the agility of open source R with the speed and reliability of an enterprise platform
  • Predictive Modeling Tools in Spotfire provide deep predictive insights into your data as part of ad hoc analysis without requiring any statistical programming

TIBCO Spotfire Statistics Services

With TIBCO Spotfire Statistics Services, technical and business professionals gain the benefits of a full predictive analytics ecosystem. They gain more confidence in their decisions by using the latest, most relevant predictive analytics available in R, S+, SAS, or MATLAB – without requiring deep expertise in statistics. Organizations increase efficiency by leveraging their existing investments in predictive analytics, giving decision makers self-service access to easy-to-interpret analytic results through Spotfire applications. Scarce statistical resources deploy and control access to a centralized repository of R, S+, SAS, or MATLAB functions, ensuring only the most appropriate and statistically valid analytic methods are used.
 
For advanced users, TIBCO Spotfire Statistics Services also complements and enhances the usage of S+ and R by allowing statisticians to easily visualize the results of their models and analysis and to deploy these models inside Spotfire applications from a central location.

Predictive Analytics Ecosystem

SpotfirePredictiveAnalyticsFigure1.png

Benefits

  • Easily provide targeted, relevant advanced analytics to large, diverse communities of users, combined with the interactive visualization of Spotfire
    • Integrate R, S+, SAS, and MATLAB into Spotfire and custom applications
    • Enable your users to utilize powerful analytic capabilities without needing a stats background
    • Enable statisticians to ensure compliance and proper usage, while making their work more widely and easily available
    • Utilize the analytic power of Teradata Aster for in-database predictive analytics for applications such as determining the effectiveness of content, website “golden path” analysis, and viewer engagement
  • Leverage your existing analytic investments and skills to improve decision-making across your organization
  • Integrate with the new, enterprise-class, R-compatible statistical engine: TIBCO Enterprise Runtime for R (TERR), which can be run locally under Spotfire Professional for offline use, or remotely through TSSS
  • Tightly integrate with the Spotfire platform, as well as with open C# and Java APIs for integration of advanced analytics into custom applications
  • Ensure enterprise reliability, with features such as clustering and load-balancing
  • Enable users to get started rapidly using the provided OOTB (out-of-the-box) predictive analytics, and to quickly learn from templates and examples

SpotfirePredictiveAnalyticsFigure2.png

TIBCO Spotfire full analytic application authoring

Building Spotfire applications that leverage predictive analytics is a quick and easy process. It starts with the data scientist prototyping an analytic in their environment of choice (R, SAS, etc.), and then deploying the analytic to Statistics Services. As part of this process, the data scientist specifies the types of inputs and outputs the analytic expects. This makes it available to the Spotfire application developer who, without any coding or requiring any deep understanding of the details of the analytic, uses the information provided by the data scientist to integrate the analytic into a Spotfire application. This application can then be quickly shared to a wide community of users across your organization.
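
The idea of publishing an analytic together with its declared inputs and outputs can be illustrated with a small Python sketch. Everything here is hypothetical: `AnalyticSpec`, `linear_trend`, and the validation logic are invented for illustration and are not part of Spotfire or Statistics Services. The point is only that a declared contract lets an application developer call the analytic without reading its internals.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class AnalyticSpec:
    """Hypothetical contract a data scientist publishes alongside an analytic."""
    name: str
    inputs: Dict[str, type]    # parameter name -> expected Python type
    outputs: Dict[str, type]   # result name -> expected Python type
    run: Callable[..., Dict[str, Any]]

    def call(self, **kwargs: Any) -> Dict[str, Any]:
        # Check the caller's arguments against the declared inputs.
        missing = set(self.inputs) - set(kwargs)
        if missing:
            raise ValueError(f"missing inputs: {sorted(missing)}")
        for key, expected in self.inputs.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        result = self.run(**kwargs)
        # Check the analytic's results against the declared outputs.
        for key, expected in self.outputs.items():
            if not isinstance(result.get(key), expected):
                raise TypeError(f"output {key} must be {expected.__name__}")
        return result


def linear_trend(values: list) -> Dict[str, Any]:
    """Toy analytic: least-squares slope of values against their index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    return {"slope": sxy / sxx}


# The "data scientist" registers the analytic with its contract...
spec = AnalyticSpec("linear_trend", {"values": list}, {"slope": float}, linear_trend)

# ...and the "application developer" calls it using only that contract.
result = spec.call(values=[1.0, 3.0, 5.0, 7.0])
print(result)
```

In this sketch a call with a missing or wrongly typed input fails before the analytic runs, which is one plausible way a declared contract could support the "compliance and proper usage" goal described in this paper.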
 
SpotfirePredictiveAnalyticsFigure3.png

In-Database Predictive Analytics via Teradata Aster

The most recent addition (as of Spotfire 5.5) to the predictive analytic ecosystem is Teradata Aster. Spotfire users can now utilize Teradata Aster to do in-database predictive analytics on Big Data from Spotfire applications or TERR scripts. They can use the analytic power of Aster for applications such as determining the effectiveness of web content, website “golden path” analysis, and viewer engagement, all without unnecessarily moving the data from the database.
 
This connection is implemented as a TERR package called AsterDB, which generates the SQL/Map Reduce scripts needed to access the powerful functionality within Aster. This makes it easy to leverage these in-database advanced analytics from both TERR scripts and Spotfire applications, using example templates provided with TSSS.
 
SpotfirePredictiveAnalyticsFigure4.png

TIBCO Enterprise Runtime for R (TERR)

TERR is an enterprise-grade analytic engine that TIBCO built from the ground up to be fully compatible with the R language, leveraging our long-time expertise in the closely related S+ analytic engine. This allows customers to continue to develop in open source R, but to then integrate and deploy their R code on a commercially-supported and robust platform – without the need to rewrite their code.
 
TERR enables organizations to:
  • Apply consistent models across multiple applications and uses, from prototyping to production
    • Eliminating uncertainty when analytic models implemented on disparate platforms disagree
  • Easily compare multiple analytic approaches to find the hidden insights and to make the best decisions
    • And then broadly leverage these insights across the organization
  • Eliminate time/resources spent re-implementing R code for production, or time spent prototyping on an unwieldy platform
    • Reducing the need for multiple analytic platforms
  • Rapidly cycle from prototyping to production to deliver faster time to insight/market
    • And continually refine models and provide consistent application across the organization so that everyone is using the right, best analytic
 
The main technical advantages of TIBCO Enterprise Runtime for R are:
  • Higher performance and far more robust memory management – so that performance is linear as larger data is analyzed
  • Fully TIBCO IP, so that TERR is licensable for embedding and redistribution (unlike Open Source R, which is GPL, a particularly viral form of open source licensing)
  • An engine architected as a platform for ongoing investment to ensure analytic needs can be met both now and in the future
  • Broad coverage of core R functionality and CRAN packages
 
All these features were developed with the goal of delivering analytic power AND agility, so that customers can develop in open source R, and deploy/scale/integrate using Enterprise Runtime for R, without having to recode their analytics. People often build prototypes in R, but then typically re-implement in another language for production purposes because R was not built for enterprise usage. TERR brings enterprise-class scalability and stability to the agile R language, and enables statisticians to broadly share their analyses through TIBCO Spotfire Statistics Services or by directly embedding the TERR engine.
 
TERR enables customers to rapidly iterate from prototyping to production without wasting time and effort recoding and retesting their analyses, allowing them to respond more rapidly to opportunities and threats, and to integrate standardized predictive analytics consistently across the organization.

TERR Developer’s Edition and TERR Community

A free Developer Edition of TERR is available through the TERR Community Site. This enables customers to test their R code prior to deployment and integration on a full-featured version of the TERR engine, freely available for non-production use. The TERR Community site also provides a forum for feedback, support, and collaboration of R/TERR users, and detailed information on topics such as TERR’s coverage of R functionality and CRAN packages.
 
The TERR Developer’s Edition is currently a console-only version because we expect R users to continue to develop in their R environment of choice, and then to test their code in TERR Developer’s Edition prior to deployment and integration. The TERR Community does include advice on using TERR with some popular R interfaces, such as ESS-Emacs and Notepad++.

Integration Options for TERR

TERR provides three levels of integration options:
TERR in Spotfire
  • For: Ad hoc tools and interactive applications powered by advanced analytics
  • Full benefits of Spotfire Analytics platform:
    • Interactive visualization & data discovery
    • Easily build and share applications, leverage broad data access, etc.
TERR in Statistics Services
  • For: Distributed analytics
    • Managed pools of engines
    • Load balancing, queuing, failover, parallelization, etc.
    • High level APIs for loose custom integration, data i/o (C#, Java)
    • Central management of analytics, R packages
Embeddable TERR Engine
  • Custom (tight) integration, batch, existing grids, etc.
    • Faster than R, more robust, better memory management, fully supported
    • Low level APIs for tight integration
 
TERR has wide integration across the TIBCO platform, enabling customers to deploy consistent analytics across their organization:
  • TERR is embedded in Spotfire Professional, where it powers the predictive modeling tools
  • TERR can also be embedded in Spotfire applications, to make the power of advanced analytics available to all Spotfire users to enhance their decision-making. To enable this, it can be called locally under Spotfire Professional, or remotely through Statistics Services
  • TERR can be embedded in TIBCO Business Events for complex event processing, to provide a scalable, high-throughput, low-latency analytic service for real-time applications such as fraud detection and customer scoring
  • TERR can post messages to TIBCO’s enterprise social networking product, tibbr, enabling TERR/R users to collaborate with their peers, share their results to contribute to ongoing discussions, and easily send notifications of long-running analytic results

TIBCO and the R Community

As the commercial provider of S+ since the acquisition of Insightful in 2008, we are uniquely suited to contribute to the R community, building on the long history of collaboration within the joint R/S+ community. Our contributions include:
  • The freely available TERR Developer Edition for non-production use by all R users
  • The TERR Community, a support and collaboration forum
  • As we develop new functionality in TERR, or port existing functionality from S+, the frequent release of these capabilities as CRAN packages or directly to R authors, so that R users can continue to develop in OS R, and deploy in TERR
  • This includes S+SeqTrial, S+FlexBayes, and time series packages
  • Benefactor of the R Foundation
  • Co-sponsor of useR! conferences
  • Ongoing contributor to R Core, and as we develop TERR, sharing of our observations and identified bugs with the R Core team, so that open source R development can benefit from our investment
 
TIBCO is committed to helping our customers leverage their investments in R and S+, and other analytic environments, so they can solve high value problems and make better decisions across their organization.
  • Deployment of R/S+, as well as SAS and MATLAB scripts through TIBCO Spotfire
  • Collaboration with partners building customer solutions on R and S+
  • Ongoing support of S+, R’s commercial sibling, based on the same S language originally developed at Bell Labs
 
TIBCO strongly supports open source software (OSS) development, which sparks creativity, innovation and productivity, benefiting users, developers, and vendors. Beyond our work with the R Community, TIBCO has made major contributions to the OSS community, including releasing Ajax General Interface source code to the Dojo Foundation, and the core of the TIBCO PageBus to the OpenAjax Alliance.

Predictive Modeling Tools

Predictive modeling tools in Spotfire provide deep predictive insights into your data as part of ad hoc analysis. These tools support a full workflow for “real” predictive modeling and enable you to create, evaluate, and iterate predictive models while leveraging the full interactivity and powerful visualizations of the Spotfire platform. You can also test models on existing data, apply predictions to new data, and embed predictive models in applications – all without requiring any R/S+ programming.
 
These predictive modeling tools – Linear and Logistic Regression, and Classification and Regression Trees – are available in Spotfire Professional. They are located directly in the Spotfire menu and execute locally, using the TERR engine behind the scenes. Statistics Services is not required to use the tools for ad hoc analysis, but it does enable models developed with these tools to be deployed to applications running in the Spotfire Web Player.
 
SpotfirePredictiveAnalyticsFigure5.png

Custom Advanced Analytic Tools

Spotfire is an extensible platform, so customers can also create their own, custom advanced analytic tools in the Spotfire platform, through C# coding in the Spotfire interface and through scripting their analytics. These analytics can be written in R and run locally using TERR, or written in any of the analytic engines that comprise the Spotfire Predictive Analytics ecosystem and executed remotely using Statistics Services. This enables Spotfire users to provide custom tools highly targeted to specific users and workflows to improve their productivity, and leverage their existing investments in R scripts and other advanced analytics.
 
SpotfirePredictiveAnalyticsFigure6.png

For more information

TIBCO Spotfire Website: http://spotfire.tibco.com/
TIBCO Spotfire Demo Library: http://spotfire.tibco.com/home/demos
Trends and Outliers - TIBCO Spotfire’s Business Intelligence Blog: http://spotfire.tibco.com/blog/
 
TIBCO Software Inc. (NASDAQ: TIBX) is a provider of infrastructure software for companies to use on-premise or as part of cloud computing environments. Whether it’s efficient claims or trade processing, cross-selling products based on real-time customer behavior, or averting a crisis before it happens, TIBCO provides companies the two-second advantage® – the ability to capture the right information, at the right time and act on it preemptively for a competitive advantage. More than 4,000 customers worldwide rely on TIBCO to manage information, decisions, processes and applications in real time. Learn more at http://www.tibco.com.
 
 
TIBCO Spotfire
212 Elm Street
Somerville, MA 02144
US: 1-866-240-0491
EMEA: +44-800-520-0443
Fax: +1 617-702-1700
 
©2013, TIBCO Software Inc. All rights reserved. TIBCO, the TIBCO logo, and Spotfire are trademarks or registered trademarks of TIBCO Software Inc. in the United States and/or other countries. All other product and company names and marks in this document are the property of their respective owners and mentioned for identification purposes only.

Chart and image gallery: 30+ free tools for data visualization and analysis

Source: http://www.computerworld.com/s/article/9214755/Chart_and_image_gallery_30_free_tools_for_data_visualization_and_analysis

March 13, 2013 12:45 PM ET

Skill levels are represented as numbers from easiest to most difficult to learn and use:

  1. Users who are comfortable with basic spreadsheet tasks
  2. Users who are technically proficient enough not to be frightened off by spending a couple of hours learning a new application
  3. Power users
  4. Users with coding experience or specialized knowledge in a field like GIS or network analysis.

Data visualization and analysis tools


Tool | Category | Multi-purpose visualization | Mapping | Platform | Skill level | Data stored or processed | Designed for Web publishing?
1. Data Wrangler | Data cleaning | No | No | Browser | 2 | External server | No
2. OpenRefine (formerly Google Refine) | Data cleaning | No | No | Browser | 2 | Local | No
3. R Project | Statistical analysis | Yes | With plugin | Linux, Mac OS X, Unix, Windows XP or later | 4 | Local | No
4. Google Fusion Tables | Visualization app/service | Yes | Yes | Browser | 1 | External server | Yes
5. Impure | Visualization app/service | Yes | No | Browser | 3 | Varies | Yes
6. Many Eyes | Visualization app/service | Yes | Limited | Browser | 1 | Public external server | Yes
7. Tableau Public | Visualization app/service | Yes | Yes | Windows | 3 | Public external server | Yes
8. VIDI | Visualization app/service | Yes | Yes | Browser | 1 | External server | Yes
9. Zoho Reports | Visualization app/service | Yes | No | Browser | 2 | External server | Yes
10. Choosel | Framework | Yes | Yes | Chrome, Firefox, Safari | 4 | Local or external server | Not yet
11. Exhibit | Library | Yes | Yes | Code editor and browser | 4 | Local or external server | Yes
12. Google Chart Tools | Library and Visualization app/service | Yes | Yes | Code editor and browser | 2 | Local or external server | Yes
13. JavaScript InfoVis Toolkit | Library | Yes | No | Code editor and browser | 4 | Local or external server | Yes
14. D3 | Library | Yes | Yes | Code editor and browser | 4 | Local or external server | Yes
15. Quantum GIS (QGIS) | GIS/mapping: Desktop | No | Yes | Linux, Unix, Mac OS X, Windows | 4 | Local | With plugin
16. OpenHeatMap | GIS/mapping: Web | No | Yes | Browser | 1 | External server | Yes
17. OpenLayers | GIS/mapping: Web, Library | No | Yes | Code editor and browser | 4 | Local or external server | Yes
18. OpenStreetMap | GIS/mapping: Web | No | Yes | Browser or desktops running Java | 3 | Local or external server | Yes
19. TimeFlow | Temporal data analysis | No | No | Desktops running Java | 1 | Local | No
20. IBM Word-Cloud Generator | Word clouds | No | No | Desktops running Java | 2 | Local | As image
21. Gephi | Network analysis | No | No | Desktops running Java | 4 | Local | As image
22. NodeXL | Network analysis | No | No | Excel 2007 and 2010 on Windows | 4 | Local | As image
23. CSVKit | CSV file analysis | No | No | Mac OS X or Linux with Python installed | 3 | Local | No
24. DataTables | Create sortable, searchable tables | No | No | Code editor and browser | 3 | Local or external server | Yes
25. FreeDive | Create sortable, searchable tables | No | No | Browser | 2 | External server | Yes
26. Highcharts* | Library | Yes | No | Code editor and browser | 3 | Local or external server | Yes
27. Mr. Data Converter | Data reformatting | No | No | Browser | 1 | Local or external server | No
28. Panda Project | Create searchable tables | No | No | Browser with Amazon EC2 or Ubuntu Linux | 2 | Local or external server | No
29. PowerPivot** | Analysis and charting | Yes | No | Excel 2010 and some 2013 versions on Windows | 3 | Local | No
30. Weave | Visualization app/service | Yes | Yes | Flash-enabled browsers; Linux server on backend | 4 | Local or external server | Yes
31. Statwing | Visualization app/service | Yes | No | Browser | 1 | External server | Not yet
32. Infogr.am | Visualization app/service | Yes | Limited | Browser | 1 | External server | Yes
33. Datawrapper | Visualization app/service | Yes | No | Browser | 1 | Local or external server | Yes
34. Cascading Tree Sheets | Library | Yes | Yes | Browser | 1 | Local or external server | Yes
35. Dataset | Library | No | No | Browser | 4 | Local or external server | Yes
36. Leaflet | Library | No | Yes | Browser | 4 | Local or external server | Yes
37. Searchable Fusion Table Map Template | Library | No | Yes | Browser | 3 | Local or external server | Yes
38. Tabletop | Library | No | No | Browser | 3 | Local or external server | Yes
39. Data Explorer** | Data acquisition, data reformatting | No | No | Excel 2010 and 2013 on Windows | 2 | Local | No
40. eSpatial | GIS/mapping | No | Yes | Browser | 2 | External | Yes
41. Jolicharts | Visualization app/service | Yes | Yes | Browser | 1 | External | Yes
42. Silk | Visualization app/service | Yes | Yes | Browser | 1 | External | Yes

*Highcharts is free for non-commercial use and $80 for most single-site-wide licenses. **While add-ons are free, Excel (which is required to run them) is not.

22 free tools for data visualization and analysis

Got data? These useful tools can turn it into informative, engaging graphics.

By Sharon Machlis
April 20, 2011 06:00 AM ET

Sharon Machlis is online managing editor at Computerworld. Her email address is smachlis@computerworld.com. You can follow her on Twitter (@sharon000), on Facebook or by subscribing to her RSS feeds.

 
Computerworld - You may not think you've got much in common with an investigative journalist or an academic medical researcher. But if you're trying to extract useful information from an ever-increasing inflow of data, you'll likely find visualization useful -- whether it's to show patterns or trends with graphics instead of mountains of text, or to try to explain complex issues to a nontechnical audience.

There are many tools around to help turn data into graphics, but they can carry hefty price tags. The cost can make sense for professionals whose primary job is to find meaning in mountains of information, but you might not be able to justify such an expense if you or your users only need a graphics application from time to time, or if your budget for new tools is somewhat limited. If one of the higher-priced options is out of your reach, there are a surprising number of highly robust tools for data visualization and analysis that are available at no charge.

Here's a rundown of some of the better-known options, many of which were demonstrated at the Computer-Assisted Reporting (CAR) conference last month. Others are not as well known but show great promise. They range from easy enough for a beginner (i.e., anyone who can do rudimentary spreadsheet data entry) to expert (requiring hands-on coding). But they all share one important characteristic: They're free. Your only investment: time.

Data cleaning

Before you can analyze and visualize data, it often needs to be "cleaned." What does that mean? Perhaps some entries list "New York City" while others say "New York, NY" and you need to standardize them before you can see patterns. There might be some records with misspellings or numerical data-entry errors. The following two tools are designed to help get your data in tip-top shape to be analyzed.
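The kind of standardization described above can be sketched in a few lines of Python. This is only an illustration: the alias table and the `standardize_city` helper are hypothetical names, not part of either tool reviewed here.

```python
# Hypothetical alias table: maps known variants to one canonical spelling.
CITY_ALIASES = {
    "new york city": "New York, NY",
    "new york, ny": "New York, NY",
    "nyc": "New York, NY",
}


def standardize_city(value):
    """Return the canonical form of a city name, or the trimmed input if unknown."""
    cleaned = " ".join(value.split())  # collapse stray whitespace
    return CITY_ALIASES.get(cleaned.lower(), cleaned)


records = ["New York City", "new york,  NY", "Boston"]
standardized = [standardize_city(r) for r in records]
print(standardized)
```

A lookup table like this only catches variants you already know about, which is why the tools below suggest candidates for you instead.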

Data Wrangler

What it does: This Web-based service from Stanford University's Visualization Group is designed for cleaning and rearranging data so it's in a form that other tools such as a spreadsheet app can use.

Click on a row or column, and DataWrangler will suggest changes. For example, if you click on a blank row, several suggestions pop up such as "delete row" or "delete empty rows."

There's also a history list that allows for easy undo -- a feature that's also available in Google Refine (reviewed next).

What's cool: Text editing is especially easy. For example, when I selected "Alabama" in one row of sample data headlined "Reported crime in Alabama" and then selected "Alaska" in the next group of data, it led to a suggestion to extract every state name. Hover your mouse over a suggestion, and you can see affected rows highlighted in red.

Figure: DataWrangler helps format table data so it can be better used and analyzed by other applications.

Drawbacks: I found that unexpected changes occurred as I attempted to explore DataWrangler's options; I constantly had to click "clear" to reset. And not all suggestions are useful ("promote row to header" seemed an odd suggestion when the row was blank) or easy to understand ("fold split 1 using 2 as key").

And while the fact that DataWrangler is a Web-based service makes it convenient to use, don't forget that it sends your data off to an external site -- which means it isn't an option for sensitive internal information. However, there are plans for a future release of a stand-alone desktop version. Another important thing to keep in mind is that DataWrangler is currently alpha code, and its creators say it's "still a work in progress."

Skill level: Advanced beginner.

Runs on: Any Web browser.

Learn more: There's a screencast on the Data Wrangler home page. Also, see this post on using DataWrangler to format data (from Tableau Public's blog).

Google Refine

What it does: Google Refine can be described as a spreadsheet on steroids for taking a first look at both text and numerical data. Like Excel, it can import and export data in a number of formats including tab- and comma-separated text files and Excel, XML and JSON files.

Figure: Google Refine can make data 'cleaner' by helping to find errors or different versions of the same proper names.

Refine features several built-in algorithms that find text items that are spelled differently but actually should be grouped together. After importing your data, you simply select "edit cells," then "cluster and edit," and choose which algorithm you want to use. After Refine runs, you decide whether to accept or reject each suggestion. For example, you could say yes to combining Microsoft and Microsoft Corp., but no to combining Coach Inc. with CQG Inc. If it's offering too few or too many suggestions, you can change the strength of the suggestion function.
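One general technique for this kind of grouping is sometimes called key collision: each value is reduced to a normalized "fingerprint," and values that share a fingerprint become merge candidates. The Python sketch below is a simplified illustration of that idea under my own assumptions; the `fingerprint` and `cluster` helpers are hypothetical and are not claimed to match Refine's implementation.

```python
import re
from collections import defaultdict


def fingerprint(text):
    """Build a normalized key: lowercase, strip punctuation, sort unique tokens."""
    lowered = text.strip().lower()
    no_punct = re.sub(r"[^\w\s]", "", lowered)
    tokens = sorted(set(no_punct.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; keep groups with 2+ distinct spellings."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(members) for members in groups.values() if len(members) > 1]


names = ["Microsoft Corp.", "microsoft corp", "Corp. Microsoft", "Coach Inc.", "CQG Inc."]
clusters = cluster(names)
print(clusters)
```

A key like this catches differences in case, punctuation and word order, but it would not group "Microsoft" with "Microsoft Corp."; looser matches of that kind need a distance-based method, which is where false positives tend to creep in.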

There are also numerical options that offer quick and easy overviews of data distributions. This functionality can reveal anomalies that might be the result of data input errors, such as $800,000 instead of $80,000 for a salary entry. It can also expose inconsistencies, such as differences in the way compensation data is reported from entry to entry, with some showing, say, hourly wages and others showing weekly pay or yearly salaries.
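A rough Python sketch of the same idea, spotting an entry that is far out of line with the rest of a column. The `flag_suspect_values` helper, the factor of 5 and the sample salaries are all assumptions made for illustration; Refine itself presents this visually through distribution overviews.

```python
from statistics import median


def flag_suspect_values(values, factor=5.0):
    """Flag positive values more than `factor` times above or below the median."""
    mid = median(values)
    return [v for v in values if v > mid * factor or v < mid / factor]


# Invented sample data: the last entry has an extra zero.
salaries = [72000, 80000, 85000, 91000, 800000]
suspects = flag_suspect_values(salaries)
print(suspects)
```

The median is used instead of the mean because a single mistyped value can drag the mean far enough to hide itself.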

Beyond data housekeeping, Google Refine offers some useful analysis tools, such as sorting and filtering.

What's cool: Once you get used to which commands do what, this is a powerful tool for data manipulation and analysis that strikes a good balance between functionality and ease of use. The undo/redo list of every action you've taken lets you roll back when needed. And text functions handle Java-syntax regular expressions, allowing you to look for patterns (such as, say, three numbers followed by two digits) as well as specific text strings and numbers.
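As a small illustration of pattern matching with regular expressions, here is a Python sketch. The pattern is a hypothetical example (three digits, a hyphen, two digits) and uses only constructs such as `\d`, `{n}` and `\b` that Java-syntax and Python regular expressions share.

```python
import re

# Hypothetical pattern: exactly three digits, a hyphen, then exactly two digits.
PATTERN = re.compile(r"\b\d{3}-\d{2}\b")

# Invented sample cells.
cells = ["Order 123-45 shipped", "Ref 12-345", "Code 987-65 and 111-22"]
matches = [PATTERN.findall(cell) for cell in cells]
print(matches)
```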

Finally, while this is a browser-based application, it works with files on your desktop, so your data remains local.

Drawbacks: Although Google Refine looks like a spreadsheet, you can't do typical spreadsheet calculations with it; for that, you must export to a conventional spreadsheet application. If you've got a large data set, carve out some time in your day to go through all of Refine's suggested changes, since it can take a while. And, depending on the data set, be prepared when looking for text items to merge: You're likely to get either a lot of false positives or missed problems -- or both.

Skill level: Advanced beginner. Knowledge of data analysis concepts is more important than technical prowess; power Excel users who understand data-cleaning needs should be comfortable with this.

Runs on: Windows, Mac OS X (if it appears to do nothing after loading on a Mac, point a browser manually to http://127.0.0.1:3333/ ), Linux.

Learn more: These three screencasts give a good overview of why and how you'd use Refine; there's also fairly detailed documentation on the Google Code project area.

Statistical analysis

Sometimes you need to combine graphical representation of your data with heftier numerical analysis.

The R Project for Statistical Computing

What it does: R is a general statistical analysis platform (the authors call it an "environment") that runs on the command line. Need to find means, medians, standard deviations, correlations? R can handle that and much more, including "linear and generalized linear models, nonlinear regression models, time series analysis, classical parametric and nonparametric tests, clustering and smoothing," according to the project website.
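To make those summary statistics concrete, the sketch below computes them with Python's standard library. It is not R code: in R the corresponding built-in functions are `mean()`, `median()`, `sd()` and `cor()`. The sample data and the `pearson` helper are invented for illustration.

```python
from math import sqrt
from statistics import mean, median, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented sample data.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

summary = {
    "mean": mean(scores),
    "median": median(scores),
    "stdev": stdev(scores),  # sample standard deviation
    "correlation": pearson(hours, scores),
}
print(summary)
```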

Figure: The R Project for Statistical Computing provides a wide range of data analysis options.

R also graphs, charts and plots results. There are numerous add-ons to this open-source project that significantly extend functionality. For users who prefer a GUI, Peter Aldhous, San Francisco bureau chief for New Scientist magazine, suggests RExcel, which offers access to the R engine through Excel.

What's cool: There is a great deal of functionality in R, including quite a number of visualization options as well as numerical and spatial analysis.

Drawbacks: The fact that R runs on the command line means that users will have to take the time to learn which commands do what, and not all users will be comfortable with a text-only interface. In addition, Aldhous says those dealing with large data sets may hit a memory barrier (if so, there's a commercial option from Revolution Analytics).

Skill level: Intermediate to advanced. Comfort with command-line prompts and a knowledge of statistics are musts for the core application.

Runs on: Linux, Mac OS X, Unix, Windows XP or later.

Learn more: Try R for Statistics: First Steps (PDF) by Peter Aldhous, Hands-on R, a step-by-step tutorial (PDF) by Jacob Fenton, and the project's own An Introduction to R. The R Statistics blog has a number of visualization samples.

Visualization applications and services

These tools offer a number of different visualization options. While some stick to conventional charts and graphs, many offer a range of other choices such as treemaps and word clouds. A few offer geographical mapping as well, although if you're interested in maps, our sections on GIS/mapping focus specifically on that.

Google Fusion Tables

What it does: This is one of the simplest ways I've seen to turn data into a chart or map. You can upload a file in several different formats and then choose how to display it: table, map, heatmap, line chart, bar graph, pie chart, scatter plot, timeline, storyline or motion (animation over time). It's somewhat customizable, allowing you to change map icons and style info windows.

Figure: Google Fusion Tables is a user-friendly tool that makes it easy to map data.

There are some data editing functions within Fusion Tables, although changing more than a few individual cell entries can quickly become tedious. You can also join tables (which is important when the data you want to map is in multiple tables), and filter, sort and add columns and so on. There are also options to allow others to make comments on the data itself.

Mapping goes beyond just placing points, as many of us are accustomed to with Google Maps. Fusion Tables can also map multiple polygons with variations in color based on underlying data, such as this intensity map showing the percentage of households with Internet access by state from 2007 U.S. Census Bureau data.

The Knight Digital Media Center notes that a handy undocumented feature allows the use of Fusion Tables' "templating" export to generate a JSON file from data in other formats. JSON is required by some APIs and JavaScript libraries.
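For readers who would rather do that conversion locally, here is a minimal Python sketch that turns CSV text into JSON. It does not use or reproduce the Fusion Tables export feature; the `csv_to_json` helper and the sample data are assumptions for illustration.

```python
import csv
import io
import json

# Invented sample data standing in for a CSV file with a header row.
CSV_TEXT = """name,value
alpha,1
beta,2
"""


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]
    return json.dumps(rows, indent=2)


json_text = csv_to_json(CSV_TEXT)
print(json_text)
```

Note that every value comes through as a string; numeric columns would need explicit conversion before a charting library could treat them as numbers.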

Unlike IBM's Many Eyes, Google lets you designate your data as private or unlisted as well as public, although your data still resides on Google's servers -- a benefit or drawback, depending on whether server bandwidth costs or data privacy is more important to you.

What's cool: Fusion Tables offers relatively quick charting and mapping, including geographic information system (GIS) functions to analyze data by geography. The service also automatically geocodes addresses, which is useful when trying to place numerous points on a map. This is an excellent tool for beginners and advanced beginners to use to get comfortable with analyzing data; it's also a good fit for people who don't program. For more advanced users, there's an API.

Drawbacks: Functionality, customization and data capacity are all limited compared with desktop applications or custom code, and interacting with large data sets on the site can be sluggish. And it has its limitations -- the site choked on March 11, the day of the devastating earthquake and tsunami in Japan. (It is still a Google Labs beta project.)

Skill level: Beginner.

Runs on: Any Web browser.

Learn more: A Google Fusion Tables tour and several tutorials are available. We've also got some examples of what it can do in our story "H-1B Visa Data: Visual and Interactive Tools." Also see the Fusion Tables Example Gallery.

Impure

What it does: Impure is sort of a Yahoo Pipes for data visualization, designed for creating numerous types of highly polished graphical representations of data using a drag-and-drop workspace. The service includes a library of objects and various methods, and -- as with Yahoo Pipes -- it allows you to click and drag to connect modules so that the output of one becomes the input of another. It was developed by Spanish analytics firm Bestiario.

What's cool: Impure offers a highly visual interface for the task of creating visualizations -- which is not as common as you might expect. It has a sleek user interface and numerous modules, including quite a few APIs that are designed to pull data from the Web. It features numerous visualization types that are searchable by keywords like numeric, tables, nodes, geometry and map. And although it saves your workspaces to the Web, you can copy and save the code behind your workspaces locally, so you can back up your work or maintain your own libraries of code snippets.

Drawbacks: Users of Impure face a surprisingly steep learning curve despite its drag-and-drop functionality. The documentation is detailed in some areas, but lacking in others. For instance, while it was easy to find a list of APIs, it was more difficult to find basic instructions on how to use the workspace -- or even figure out that there was a workspace, let alone how to use the various objects and methods.

Once you save your workspace, it's on the public Web, although it's unlikely that anyone else will be able to find it unless you share the URL. And I found some of the samples not all that helpful in understanding the underlying data, even if they were visually striking.

Skill level: Intermediate.

Runs on: Any Web browser.

Learn more: To get started, I'd suggest the videos "Interface Basics" (7 minutes) and "Workspaces and Code." You can find a sample called The Pay Gap Between Men and Women Mapped at the website of British newspaper The Guardian.

Tableau Public

What it does: This tool can turn data into any number of visualizations, from simple to complex. You can drag and drop fields onto the work area and ask the software to suggest a visualization type, then customize everything from labels and tool tips to size, interactive filters and legend display.

Figure: Tableau Public can turn data into any number of visualizations, from simple to complex.

What's cool: Tableau Public offers a variety of ways to display interactive data. You can combine multiple connected visualizations onto a single dashboard, where one search filter can act on numerous charts, graphs and maps; underlying data tables can also be joined. And once you get the hang of how the software works, its drag-and-drop interface is considerably quicker than manually coding in JavaScript or R for most users, making it more likely that you'll try additional scenarios with your data set. In addition, you can easily perform calculations on data within the software.

Drawbacks: In the free version of Tableau's business intelligence software, your visualization and data must reside on Tableau's site. Whenever you save your work, it gets sent up to the public website -- which means you can't save work in progress without running the risk that it will be seen before it's ready (while Tableau's site won't deliberately expose your work, it relies on security by obscurity -- so someone could see your work if they guess your URL). And once it's saved, viewers are invited to download your entire workbook with data. Upgrading to a single-user desktop edition costs $999.

Not surprisingly, all that functionality comes at a cost: Tableau's learning curve is fairly steep compared to that of, say, Fusion Tables. Even with the drag-and-drop interface, it'll take more than an hour or two to learn how to use the software's true capabilities, although you can get up and running doing simple charts and maps before too long.

Skill level: Advanced beginner to intermediate.

Runs on: Windows 7, Vista, XP, 2003, Server 2008, 2003.

Learn more: There are seven short training videos on the Tableau site, where you can also find downloadable data files that you can use to follow along.

You can see a sample in our article "Tech Unemployment Climbs; Self-employment Steady."

Many Eyes

What it does: A pioneer in Web-based data visualization, IBM's Many Eyes project combines graphical analysis with community, encouraging users to upload, share and discuss information. It's extremely easy to use and very well documented, including suggestions on when to use what kind of visual data representation. Many Eyes includes more than a dozen output options -- from charts, graphics and word clouds to treemaps, plots, network diagrams and some limited geographic maps.

You'll need a free account to upload and post data, although anyone can browse. Formatting is basic: For most visualizations, the data must be in a tab-separated text file with column headers in the first row.
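A file in that shape (tab-separated, headers in the first row) is easy to produce from code. The Python sketch below writes one with the standard `csv` module; the rows are invented sample data, not real visa figures.

```python
import csv
import io

# Invented sample rows; the first row holds the column headers.
rows = [
    ["employer", "visas"],
    ["Example Corp", 120],
    ["Sample LLC", 85],
]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
tsv_text = buffer.getvalue()
print(tsv_text)
```

To write to disk instead, the same writer can be pointed at a file opened with `newline=""`.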

It took me about three minutes to create a bar chart of top H-1B visa employers.

Figure: It takes just a few minutes to create online charts like this with Many Eyes.

It took perhaps another minute to create a treemap of the same data.

Figure: Many Eyes offers a number of ways to visualize data, such as treemaps.

What's cool: Visualization can't get much easier, and the results look considerably more sophisticated than you'd expect based on the minimal amount of effort needed to create them. Plus, the list of possible visualization types includes explanations of the types of data each one is best suited for.

Drawbacks: Both your visualizations and your data sets are public on the Many Eyes site and can be easily downloaded, shared, reposted and commented upon by others. This can be great for certain types of users -- especially government agencies, nonprofits, schools and other organizations that want to share visualizations on someone else's server budget -- but an obvious problem for others. (IBM does offer a contact form for businesses interested in hosting their own version of the software.) In addition, customization is limited, as is data file size (5MB).

Skill level: Beginner.

Runs on: Java and any modern Web browser that can display Flash.

Learn more: IBM's website features pages explaining data formatting for Many Eyes and visualization choices.

You can see some featured visualizations on the Many Eyes home page or browse through some of the tens of thousands of uploads. One interesting map shows popular surnames in the U.S. from the 2000 Census by Martin Wattenberg, one of the creators of Many Eyes.

VIDI

What it does: Although VIDI's website bills this as a tool for the Drupal content management system, graphics created by the site's visualization wizard can be used on any HTML page -- no Drupal required.

Upload your data, select a visualization type, do a bit of customization selection, and your chart, timeline or map is ready to use via auto-generated embed code (using an iframe, not JavaScript or Flash).

Figure: Graphics created by VIDI's visualization wizard can be used on any HTML page -- no Drupal required.

What's cool: This is about as easy as Many Eyes -- with more mapping options and no need to make your visualization and data set public on its website. There are quick screencasts explaining each visualization type and several different color customization options. And the file-size limit of 30MB is six times Many Eyes' 5MB maximum.

Drawbacks: Oddly, the visualization wizard was a lot easier to use than the embed code -- my embedded iframe didn't display while trying to preview it on the VIDI website; I needed to save the visualization and go to the "My VIDI" page to get embed code that actually worked. Also, as with any cloud service, if you're using this for Web publishing, you'll want to feel confident that the host's servers can handle your traffic and will be available longer than your need to display the data.

Skill level: Beginner.

Runs on: Any Web browser.

Learn more: The VIDI home page features a link to an 11-minute video tutorial.

It took me less than five minutes to create a sample: a map of earthquakes of 7.0 magnitude or more since Jan. 1, 2000.

Zoho Reports

What it does: One of the more traditional corporate-focused business analytics offerings in this group, Zoho Reports can take data from various file formats or directly from a database and turn it into charts, tables and pivot tables -- formats familiar to most spreadsheet users.

What's cool: You can schedule data imports from sources on the Web. Data can be queried using SQL and can be turned into visualizations, and the service is set up for Web publishing and sharing (although if it's accessed by more than two users, you will need a paid account).

Zoho Reports provides traditional business charts and graphs.

Drawbacks: Visualization options are fairly basic and limited. Interacting live with the Web-based data can be sluggish at times. Data files are limited to 10MB. I found the navigation confusing at times -- for example, after I saved a copy of a sample database, I was told it was in the folder "My reports," yet I had a hard time finding that.

Skill level: Advanced beginner.

Runs on: Any Web browser.

Learn more: There are video demos and samples on Zoho's website.

Code help: Wizards, libraries, APIs

Sometimes nothing can substitute for coding your own visualization -- especially if the look and feel you're after can't be achieved with an existing desktop or Web app. But that doesn't mean you need to start from scratch, thanks to a wide range of available libraries and APIs.

Choosel (under development)

What it does: This open-source Web-based framework is designed for charts, clouds, graphs, timelines and maps. Right now, it is geared more for developers who create applications than it is for end users who need to save and/or embed their work; but there's an interactive online demo that lets you quickly upload some data to visualize.

Still under development, Choosel has potential as an easy way to create online graphics.

What's cool: As with Tableau Public, you can have more than one visualization on a page and connect them so that, for example, mousing over items on a chart will highlight corresponding items on a map.

Drawbacks: This is not yet an application that end users can use to store and share their work. And I found the online demo to be finicky about uploading data -- even after I corrected field formats for dates (dd/mm/yyyy) and location (latitude/longitude) as documented, my data wouldn't load until I had another text field added (rather than just having numerical fields). It was also unclear how to customize labels. This project shows promise if it's further developed and documented.
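The field-format fixes described above can be scripted before upload. The following is a minimal Python sketch, not Choosel's documented import routine: the column names (`label`, `date`, `latitude`, `longitude`, `magnitude`) and the sample rows are hypothetical. It rewrites dates as dd/mm/yyyy and keeps a text column alongside the numeric ones, since the demo refused purely numeric data.

```python
import csv
import io
from datetime import date

FIELDS = ["label", "date", "latitude", "longitude", "magnitude"]  # hypothetical headers

def to_upload_csv(rows):
    """Return CSV text with each date formatted as dd/mm/yyyy."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Copy the row so the caller's data is left untouched.
        writer.writerow(dict(row, date=row["date"].strftime("%d/%m/%Y")))
    return out.getvalue()

# Made-up sample rows; "label" supplies the extra text field.
events = [
    {"label": "Event A", "date": date(2001, 2, 3), "latitude": 10.0, "longitude": 20.0, "magnitude": 7.1},
    {"label": "Event B", "date": date(2004, 11, 25), "latitude": -5.5, "longitude": 100.0, "magnitude": 7.9},
]
print(to_upload_csv(events))
```

Keeping the conversion in one function makes it easy to adjust if the demo's expected formats change.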

Skill level: Expert.

Runs on: Chrome, Safari and Firefox.

Learn more: There's a short video called Choosel -- Timeline and Basic Features and a sample titled Earthquakes With 1,000 or More Deaths Since 1900.

Exhibit

What it does: This spin-off of the MIT Simile Project is designed to help users "easily create Web pages with advanced text search and filtering functionalities, with interactive maps, timelines and other visualization." Billed as a publishing framework, the JavaScript library allows easy additions of filters, searches and more. The Easy Data Visualization for Journalists page offers examples of the code in use at a number of newspaper websites.

Of course, "easy" is in the eye of the beholder -- what's easy for the professionals at MIT who created Exhibit might not be that simple for a user whose comfort level stops at Excel. Like most JavaScript libraries, Exhibit requires more hand-coding than services such as Many Eyes and Google Fusion Tables. On the other hand, Exhibit has clear documentation for beginners, even those with no JavaScript experience.

What's cool: For those who are comfortable coding, Exhibit offers a number of views -- maps, charts, timeplots, calendars and more -- as well as customized lenses (ways to format an individual record) and facets (properties that can be searched or sorted). You're much more likely to get the exact presentation you want with Exhibit than, say, Many Eyes. And your data stays local unless and until you decide to publish.

Drawbacks: For newcomers unused to coding visualizations, it takes time to get familiar with coding and library syntax.

Skill level: Expert.

Learn more: There are a number of examples you can look at, including Red Sox-Yankees Winning Percentages Through the Years, U.S. Cities by Population and others.

Note: There are numerous other JavaScript libraries to help create visualizations, such as the recently released Data-Driven Documents and the jQuery Visualize plug-in. Six Revisions' list of 20 Fresh JavaScript Data Visualization Libraries gives you an idea of how many there are to choose from.

Google Chart Tools

What it does: Unlike Google Fusion Tables, which is a full-fledged, self-contained application for uploading and storing data, and generating charts and maps, Chart Tools is designed to visualize data residing elsewhere, such as your own website or within Google Docs.

Google Chart Tools offers both a wizard and an API for creating Web graphics from data.

Google offers both a Chart API using a "simple URL request to a Google chart server" for creating a static image and a Visualization API that accesses a JavaScript library for creating interactive graphics. Google offers a comparison of data size, page load, skills needed and other factors to help you decide which option to use.
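To make the "simple URL request" concrete, here is a minimal Python sketch that assembles a static-chart URL. It assumes the Image Charts endpoint and the `cht`, `chs`, `chd` and `chl` parameters as Google documented them for that service; the service has since been deprecated, so treat the URL as an illustration of the request style rather than something guaranteed to return an image. The data values and labels are made up.

```python
from urllib.parse import urlencode

# Endpoint as documented for the (since deprecated) Image Charts service.
CHART_ENDPOINT = "https://chart.googleapis.com/chart"

def static_chart_url(values, labels, size="250x100", chart_type="p3"):
    """Build a chart URL; every option travels in the query string."""
    params = {
        "cht": chart_type,                                # chart type; p3 is a 3-D pie
        "chs": size,                                      # image size, WIDTHxHEIGHT in pixels
        "chd": "t:" + ",".join(str(v) for v in values),   # data in plain-text encoding
        "chl": "|".join(labels),                          # labels, pipe-separated
    }
    # Keep ":", "," and "|" readable instead of percent-encoding them.
    return CHART_ENDPOINT + "?" + urlencode(params, safe=":,|")

url = static_chart_url([60, 40], ["Hello", "World"])
print(url)
```

Because the whole chart is described by the URL, it can be dropped straight into an image tag, which is what makes this route simpler than the JavaScript-based Visualization API.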

For the simpler static graphics, there's a wizard to help you create a chart from some sample formats; it goes as far as helping you input data row by row, although for any decent-size data set -- say, more than half a dozen or so entries -- it makes more sense to format it in a text file.

The visualization API includes various types of charts, maps, tables and other options.

What's cool: The static image chart is reasonably easy to use and features a Live Chart Playground, which allows you to tweak code and see your results in real time.

The more robust API lets you pull data in from a Google spreadsheet. You can create icons that mix text and images for visualizations, such as this weather forecast note, and what it calls a "Google-o-meter" graphic. The Visualization API also has some of the best documentation I've seen for a JavaScript library.

Drawbacks: The static charts tool requires a bit more work than some of the other Web-based services, and it doesn't always offer lots of extras in return. And for the API, as with other JavaScript libraries, coding is required, making this more of a programming tool than an end-user business intelligence application.

Skill level: Advanced beginner to expert.

Runs on: Any Web browser.

Learn more: See Getting Started With Charts and Interactive Charts. There are also samples in the Google Visualization API Gallery.

JavaScript InfoVis Toolkit

What it does: InfoVis is probably not among the best known JavaScript visualization libraries, but it's definitely worth a look if you're interested in publishing interactive data visualizations on the Web. The White House agrees: InfoVis was used to create the Obama administration's Interactive Budget graphic.

What sets this tool apart from many others is the highly polished graphics it creates from just basic code samples. InfoVis creator Nicolas García Belmonte, senior software architect at Sencha Inc., clearly cares as much about aesthetic design as he does about the code, and it shows.

This sunburst of a directory tree shows some of the visualization capabilities of the JavaScript InfoVis Toolkit. You can see a larger, interactive version on the InfoVis website.

What's cool: The samples are gorgeous and there's no extra coding involved to get nifty fly-in effects. You can choose to download code for only the visualization types you want to use to minimize the weight of Web pages.

Drawbacks: Since this is not an application but a code library, you must have coding expertise in order to use it. Therefore, this might not be a good fit for users in an organization who analyze data but don't know how to program. Also, the choice of visualization types is somewhat limited. Moreover, the data should be in JSON format.
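Because the toolkit expects JSON, tabular data usually needs a conversion step first. The Python sketch below turns a flat list of file paths and sizes into a nested tree and serializes it. The `id`/`name`/`data`/`children` layout is an assumption modeled on the toolkit's tree demos and should be checked against the demo source code; the paths and sizes are invented.

```python
import json

def build_tree(entries, root_name="root"):
    """Nest (path, size) pairs into a dict tree keyed by path segments."""
    root = {"id": root_name, "name": root_name, "data": {}, "children": []}
    for path, size in entries:
        node = root
        parts = path.strip("/").split("/")
        for depth, part in enumerate(parts):
            node_id = "/".join(parts[: depth + 1])
            # Reuse an existing child with this id, or create it.
            child = next((c for c in node["children"] if c["id"] == node_id), None)
            if child is None:
                child = {"id": node_id, "name": part, "data": {}, "children": []}
                node["children"].append(child)
            node = child
        node["data"]["size"] = size  # the node at the end of each path gets the size
    return root

# Invented sample data.
entries = [("docs/readme.txt", 12), ("docs/guide.txt", 30), ("src/main.js", 48)]
tree = build_tree(entries)
print(json.dumps(tree, indent=2))
```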

Skill level: Expert.

Runs on: JavaScript-enabled Web browsers.

Learn more: See demos with source code.

Protovis

What it does: Billed as a "graphical toolkit for visualization," this project from Stanford University's Visualization Group is one of the more popular JavaScript libraries for turning data into visuals; it's designed to balance simplicity with control over the display.

What's cool: One of the best things about Protovis is how well it's documented, with plenty of examples featuring visualization and sample code. There are also a large number of sample visualization types available, including maps and some statistical analyses. This is a robust tool, capable of building graphics like this color-coded U.S. map with timeline slider.

Drawbacks: As is the case with other JavaScript libraries, it's pretty much essential for users to have knowledge of JavaScript (or at least some other programming language). While it's possible to copy, paste and modify code without really understanding what it's doing, I find it difficult to recommend that approach for nontechnical end users.

Skill level: Expert.

Runs on: JavaScript-enabled Web browsers.

Learn more: Try the How-to: Get Started Guide. You can also find examples of the types of graphics you can build with Protovis at the Protovis Gallery.

GIS/mapping on the desktop

There's a wide range of business uses for geographic information systems (GIS), ranging from oil exploration to choosing sites for new retail stores. Or, as The Miami Herald did for its Pulitzer Prize-winning coverage of Hurricane Andrew, you can compare maximum wind speeds with damage reports and building information (and perhaps discover, for example, that the worst damage didn't happen in the areas suffering the heaviest winds, but in areas with a lot of new, shoddy construction).

Quantum GIS (QGIS)

What it does: This is full-fledged GIS software, designed for creating maps that offer sophisticated, detailed data-based analysis of geographic regions.

The best-known desktop GIS software is probably Esri's ArcView, a robust, well-supported application that costs quite a bit of money. The open-source QGIS is an alternative to ArcView.

Quantum GIS (QGIS) offers full-fledged geospatial visualization and analysis on the desktop.

As OpenOffice is to Microsoft Office, QGIS is to ArcView. ArcView enthusiasts argue that Esri's offering is a couple of years ahead of open-source alternatives, has a better-developed interface, enjoys commercial support and is better suited for print output. But QGIS users say the open-source alternative is an excellent program that does a great deal of useful GIS work -- and may even be better than ArcView when it comes to generating maps for the Web, thanks to a plug-in dedicated to generating HTML image maps.

What's cool: QGIS has an enormous amount of GIS functionality, including the ability to create maps, overlay various types of data, do spatial analysis, publish to the Web and more. It can also be enhanced with plug-ins that add support for numerous undertakings, including geocoding, managing underlying table data, exporting to MySQL and generating HTML image maps.

Drawbacks: As with any sophisticated GIS application, learning to use this software entails a serious commitment of time and training. Even in hour-long hands-on sessions with first ArcView and then QGIS, I noticed things that were easier to do in the commercial option. For example, ArcView had a one-click "normalize" function to immediately calculate, say, the percentage of people 65 and over versus the total population from a data table with both columns; in QGIS, I needed to pull up a "field calculator" and create a new column with the formula to do that calculation myself.
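The calculation behind that "normalize" step is simple enough to write out. This Python sketch computes the share of residents 65 and over for each row; the column names and figures are hypothetical, and it is not QGIS field-calculator syntax, only the arithmetic the new column would hold.

```python
def percent_65_and_over(rows):
    """Return copies of the rows with a pct_65_plus value (0-100); None when population is zero."""
    result = []
    for row in rows:
        total = row["total_pop"]
        pct = round(100.0 * row["pop_65_plus"] / total, 1) if total else None
        result.append({**row, "pct_65_plus": pct})
    return result

# Hypothetical table with both columns present.
tracts = [
    {"tract": "A", "pop_65_plus": 150, "total_pop": 1000},
    {"tract": "B", "pop_65_plus": 90, "total_pop": 400},
    {"tract": "C", "pop_65_plus": 0, "total_pop": 0},
]
normalized = percent_65_and_over(tracts)
for row in normalized:
    print(row["tract"], row["pct_65_plus"])
```

Guarding against a zero total matters in practice, since unpopulated areas are common in census-style tables.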

Runs on: Linux, Unix, Mac OS X, Windows. (This is one case where installation is more complicated on OS X, since it requires manual installation of several dependencies. There's a one-click installer for Windows.)

Skill level: Intermediate to expert.

Learn more: Timothy Barmann of The Providence Journal posted two very useful tutorials for the CAR conference that are still available: Introduction to QGIS and The Latest in Mapping With JavaScript and jQuery. Barmann also offers a sample: Rhode Island's Ethnic Mosaic. Another resource to help you get started: QGIS Tutorial Labs from Richard E. Plant, professor emeritus at the University of California, Davis.

Note: If you're interested in GIS and want to consider other free software options, download this PDF listing of Open Source/Non-Commercial GIS Products. And if you're looking for a free open-source desktop GIS program that might be fairly easy to use, Jacob Fenton, director of computer-assisted reporting at American University's Investigative Reporting Workshop, recommends taking a look at the System for Automated Geoscientific Analyses (SAGA) site. Finally, if analyzing geographic data in a conventional database sounds interesting, PostGIS "spatially enables" the PostgreSQL relational database, according to the site.

Web-based GIS/mapping

Most of us are familiar with mapping tools from major companies like Google (which has a number of third-party front ends such as Map A List, an add-on that adds info to a Google Map from a spreadsheet). There's also Yahoo Maps Web Services and Bing Maps -- all with APIs. But there are numerous other options from smaller organizations or lone open-source enthusiasts that were designed from the ground up to map geographic data.

OpenHeatMap

What it does: This user-friendly website generates color-coded maps; the colors change depending on underlying info such as population change or average income. It can also place markers on a map, varying the size of the markers based on a data table.

OpenHeatMap is extremely easy to use for creating data-based maps, although there are still occasional bugs in this well-thought-out service.

In addition to providing the Web-based service, author Pete Warden has also packaged OpenHeatMap as a jQuery plug-in for those who don't want to rely on hosting at OpenHeatMap.com. However, not all data formats work correctly when hosted locally. "My recommended way is to embed the maps from the site," Warden wrote via Skype chat.

What's cool: It is astonishingly easy to create a color-coded map from many types of location data -- even IP addresses (just use the column header ip_address).

It took me about 60 seconds to create a basic map from a spreadsheet of magnitude 7 or higher earthquakes around the world since Jan. 1, 2000, then a couple of minutes more to customize the rollover box to display both date and magnitude. (You can see a larger version on OpenHeatMap.com.)

Marker transparency, size and color are extremely simple to customize; you can also upload your own marker image, and customize what appears in the tooltips rollover by adding a tooltip column to your data source.

OpenHeatMap automatically figures out and maps locations based on a wide range of place definitions, relying on how the location columns are named -- "address," "country," "fips_code" (used by the U.S. Census Bureau), "zip_code_area" (for five-digit ZIP codes), "lat" (latitude), "lon" (longitude) and so on.
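Since the service keys off column names, preparing a file mostly means getting the header row right. The Python sketch below writes a small CSV using the `lat` and `lon` headers named above plus a `tooltip` column; the `value` header and the sample rows are assumptions for illustration, so check the OpenHeatMap documentation for the exact names it accepts.

```python
import csv
import io

def heatmap_csv(points):
    """Serialize (lat, lon, value, tooltip) tuples under the header names described in the text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lat", "lon", "value", "tooltip"])  # "value" is an assumed header
    for lat, lon, value, tooltip in points:
        writer.writerow([lat, lon, value, tooltip])
    return out.getvalue()

# Made-up sample points; the csv module quotes tooltips that contain commas.
points = [
    (10.5, 20.25, 7.2, "2001-02-03, magnitude 7.2"),
    (-5.0, 100.0, 7.9, "2004-05-06, magnitude 7.9"),
]
print(heatmap_csv(points))
```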

This is a well-thought-out interface from a onetime Apple engineer. (Warden said he worked on several software projects at Apple, including Final Cut Studio.)

Drawbacks: There's no way to delete data once it's been uploaded (you can get around this by using a Google Spreadsheet as a data source), and editing time is limited to as long as your browser is open and you haven't started a new map. Embedded OpenHeatMap.com-hosted maps may be slow to load.

The documentation doesn't make it clear whether you can set where the map is centered or what the default zoom level should be; Warden told me by e-mail that the system remembers where you last positioned and zoomed the map before saving. And this feature still can occasionally be buggy, although Warden is responsive to bug reports.

Skill level: Beginner.

Runs on: Web browsers enabled for Flash or HTML 5 Canvas.

Learn more: Its title notwithstanding, the four-minute video "How OpenHeatMap Can Help Journalists" offers a clear explanation for anyone interested in using the service. You can also view samples on the OpenHeatMap Gallery and check out this Guardian interactive map of where Facebook is used.

OpenLayers

What it does: OpenLayers is a JavaScript library for displaying map information. It's aimed at providing functionality similar to those big companies' code libraries -- but with open-source code. OpenLayers works with OpenStreetMap and other maps, as this tutorial about use with Google shows.

Other projects build on it to add functionality or ease of use, such as GeoExt, which adds more GIS capabilities. For users who are comfortable hand-coding JavaScript and prefer not to use a commercial platform such as Google or Bing, this can be a compelling option.

Drawbacks: OpenLayers is not yet as developed or as easy to use as, say, Google Maps. The project page notes that it is "still undergoing rapid development."

Skill level: Expert.

Runs on: Any Web browser.

Learn more: Try this OpenLayers Simple Example. A good sample is Ushahidi's Haiti map.

There are other JavaScript libraries for overlaying information on maps, such as Polymaps. And there are a number of other mapping platforms, such as Google Maps, which offers numerous mapping APIs; Yahoo Maps Web Services, with its own APIs; the Bing Maps platform and APIs; and GeoCommons.

OpenStreetMap

What it does: OpenStreetMap is somewhat like the Wikipedia of the mapping world, with various features such as roads and buildings contributed by users worldwide.

What's cool: The main attraction of OpenStreetMap is its community nature, which has led to a number of interesting uses. For example, it is compatible with the Ushahidi mobile platform used to crowdsource information after the earthquakes in Haiti and Japan. (While Ushahidi can use several different providers for the base map layer, including Google and Yahoo, some project creators feel most comfortable sticking with an open-source option.)

Drawbacks: As with any project accepting public input, there can be issues with contributors' accuracy at times (such as the helicopter landing pad someone once placed in my neighborhood -- it's actually quite a few miles away). Although, to be fair, I've encountered more than one business listing on Google Maps that was woefully out of date. In addition, the general look and feel of the maps isn't quite as polished as commercial alternatives.

Skill level: Advanced beginner to intermediate.

Runs on: Any Web browser.

Learn more: See the Quick Tutorial on the OpenLayers site.

Temporal data analysis

If time is an important component of your data, traditional timeline visualizations may show patterns, but they don't allow for sophisticated analysis or a great deal of interaction. That's where this project comes in.

TimeFlow

What it does: This desktop software is for analyzing data points that involve a time component. In a demo I wrote about last summer, creators Fernanda Viégas and Martin Wattenberg -- the pair behind the Many Eyes project who are now working at Google -- showed how TimeFlow can generate visual timelines from text files, with entries color- and size-coded for easy pattern spotting. It also allows the information to be sorted and filtered, and it gives some statistical summaries of the data.

TimeFlow offers a number of different ways to easily visualize data with an important time component.

What's cool: TimeFlow makes it incredibly easy to interact with data in various ways, such as switching views or filtering by criteria such as date ranges or earthquakes of magnitude 8 or more. The timeline view offers a slider so you can zero in on a time period. While many applications can plot bar graphs, fewer also offer calendar views. And unlike Web-based Google Fusion Tables, TimeFlow is a desktop application that makes it quick and painless to edit individual entries.

Drawbacks: This is an alpha release designed to help individual reporters doing investigative work. There are no facilities for publishing or sharing results other than taking a screen snapshot, and additional development appears unlikely in the near future.

Skill level: Beginner.

Runs on: Desktop systems running Java 1.6, including Windows and Mac OS X.

Learn more: Check out Top tips.

Note: If you're looking to publish visualized timelines, better options include Google Fusion Tables, VIDI or the SIMILE Timeline widget.

Text/word clouds

Some data visualization geeks think word clouds are either not very serious or not very original. You can think of them as the tiramisu of visualizations -- once trendy, now overused. But I still enjoy these graphics that display each word from a text file once, with the size of the words varying depending on how often each one appears in the source.
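The sizing rule is easy to express in code. As a rough sketch (not how any particular tool implements it), this Python snippet counts words and scales each one's font size linearly between a minimum and a maximum; the point sizes and the sample sentence are arbitrary choices.

```python
import re
from collections import Counter

def word_sizes(text, min_pt=10, max_pt=40):
    """Map each distinct word to a font size that grows with its frequency."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    if not counts:
        return {}
    low, high = min(counts.values()), max(counts.values())
    span = high - low or 1  # avoid dividing by zero when all counts match
    return {
        word: min_pt + (max_pt - min_pt) * (n - low) / span
        for word, n in counts.items()
    }

sizes = word_sizes("data maps data charts data maps")
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(word, size)
```

Each word appears once in the result, and only its size reflects how often it occurred.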

IBM Word-Cloud Generator

What it does: Several tools mentioned previously can create word clouds, including Many Eyes and the Google Visualization API, as well as the website Wordle (which is a handy tool for making word clouds from websites instead of text files). But if you're looking for easy desktop software dedicated to the task, IBM's free Word-Cloud desktop application fits the bill.

What's cool: This is a quick, fun and easy way to find frequency of words in text.

Drawbacks: Because it's trying to ignore words such as "a" and "the," the basic configuration can miss some important terms. In my tests, it didn't know the difference between "it" and "IT," and completely missed "AT&T."
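Both misses are typical of naive tokenizing. The Python sketch below is an assumption about the kind of preprocessing that causes them, not IBM's actual code: lowercasing before a stop-word check makes "IT" indistinguishable from "it," and splitting on non-letters breaks "AT&T" into fragments. The stop-word list is a tiny illustrative sample.

```python
import re
from collections import Counter

STOP_WORDS = {"a", "the", "it", "at", "and"}  # tiny illustrative list

def naive_counts(text):
    """Lowercase everything, split on non-letters, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS)

def is_stop(token):
    # Treat "the" and "The" as stop words, but leave all-caps tokens like "IT" alone.
    return token in STOP_WORDS or (token.istitle() and token.lower() in STOP_WORDS)

def careful_counts(text):
    """Keep case and '&' so acronyms and names like AT&T survive."""
    tokens = re.findall(r"[A-Za-z]+(?:&[A-Za-z]+)*", text)
    return Counter(t for t in tokens if not is_stop(t))

sample = "The IT budget at AT&T grew and it surprised the IT staff"
naive = naive_counts(sample)
careful = careful_counts(sample)
print(naive)
print(careful)
```

The case-sensitive version is still a heuristic: a sentence that happens to start with an acronym written in title case would be misread.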

Skill level: Advanced beginner. This app runs on the command line, so users should be able to find file paths and plug them into a sample command.

Runs on: Windows, Mac OS X and Linux running Java.

Learn more: Check the examples that come with the download.

Social and other network analysis

These tools use a pre-Facebook/Twitter definition of "social network analysis" (SNA), referring to the discipline of finding connections between people based on various data sets. Investigative journalists have used such tools to, for example, find links between people who are involved in development projects or who are members of various boards of directors.

An understanding of statistical theories of network node analysis is necessary in order to use this category of software. Since I've only had a very basic introduction to that discipline, this is one category of tools I did not test hands-on. But if you're seeking software to do such analysis, one of these might meet your needs.

Gephi

What it does: Billed as a Photoshop for data, this open-source beta project is designed for visualizing statistical information, including relationships within networks of up to 50,000 nodes and half a million edges (connections or relationships) as well as network analyses of factors such as "betweenness," closeness and clustering coefficient.
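For readers new to these measures, a small worked example may help. The Python sketch below computes closeness and the local clustering coefficient on a made-up four-node network using the standard textbook definitions; it is not Gephi's implementation, and betweenness is omitted for brevity.

```python
from collections import deque

# Made-up undirected network: a triangle (A, B, C) with D attached to C.
EDGES = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

def adjacency(edges):
    """Build a node -> set-of-neighbors map from an edge list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def closeness(graph, source):
    """(n - 1) divided by the sum of shortest-path lengths from source; assumes a connected graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search gives shortest paths in an unweighted graph
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(graph) - 1) / total if total else 0.0

def clustering(graph, node):
    """Fraction of pairs of the node's neighbors that are themselves connected."""
    nbrs = sorted(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i, u in enumerate(nbrs) for v in nbrs[i + 1:] if v in graph[u])
    return links / (k * (k - 1) / 2)

graph = adjacency(EDGES)
for name in sorted(graph):
    print(name, round(closeness(graph, name), 2), round(clustering(graph, name), 2))
```

In this toy network C is the most central node by closeness, yet it has a lower clustering coefficient than A or B because its neighbor D connects to nothing else.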

Gephi can visualize networks of up to 50,000 nodes.

Runs on: Windows, Linux, Mac OS X running Java 1.6.

Learn more: Try this Quick Start tutorial (PDF).

NodeXL

What it does: This Excel plug-in displays network graphs from a given list of connections, helping you analyze and see patterns and relationships in the data.

NodeXL merges the older and current definitions of SNA. It's "optimized for analyzing online social media -- it includes built-in connections to query the APIs of Twitter, Flickr and YouTube, allowing you to draw networks of users and their activity," according to Peter Aldhous, San Francisco bureau chief for New Scientist magazine.

It also handles e-mail and conventional network analysis files (including data created by the popular -- but not free -- analysis tool UCINET).

Runs on: Excel 2007 and 2010 on Windows.

Learn more: Download this detailed free NodeXL tutorial (PDF) or these basic step-by-step instructions on analyzing your own Facebook social network (PDF). One Facebook app for downloading your own friend information for use in NodeXL is Name Gen Web.
