Table of contents
  1. Story
  2. Story
  3. Story
  4. Slides
    1. Slide 1 Big Data for Government Symposium
    2. Slide 2 Big Data in Space and Earth Sciences
    3. Slide 3 Astronomy Example
    4. Slide 4 LSST = Large Synoptic Survey Telescope 1
    5. Slide 5 LSST = Large Synoptic Survey Telescope 2
    6. Slide 6 LSST = Large Synoptic Survey Telescope 3
    7. Slide 7 LSST Key Science Drivers: Mapping the Dynamic Universe
    8. Slide 8 LSST Key Science Drivers: Observing Strategy
    9. Slide 9 LSST Summary
    10. Slide 10 The LSST Big Data Challenges
    11. Slide 11 What is Big Data Science good for? 1
    12. Slide 12 Characteristics of Big Data Science
    13. Slide 13 Rationale for Big Data Science
    14. Slide 14 Characterizing and Exposing the Big Data Hype: 3 V’s or ? 1
    15. Slide 15 Characterizing and Exposing the Big Data Hype: 3 V’s or ? 2
    16. Slide 16 Insufficient Variety: stars & galaxies are not separated in this parameter
    17. Slide 17 Sufficient Variety: stars & galaxies are separated in this parameter
    18. Slide 18 The 3 important D’s of Big Data Variety
    19. Slide 19 This graph says it all …
    20. Slide 20 Data-Driven Discovery
    21. Slide 21 Correlation Discovery
    22. Slide 22 Initial impression 1
    23. Slide 23 Initial impression 2
    24. Slide 24 Novelty Detection = Surprise Discovery!
    25. Slide 25 Novelty Detection = improved discovery
    26. Slide 26 Association Discovery between events on the Sun and around the Earth​
    27. Slide 27 Association Discovery = finding interesting co-occurring associations
    28. Slide 28 Knowledge Discovery for Multi-source Data
    29. Slide 29 Space Weather Example
    30. Slide 30 What is Big Data Science good for? 2
    31. Slide 31 Case Study - Mars Rovers 1
    32. Slide 32 Case Study - Mars Rovers 2
    33. Slide 33 Mars Rover 3
    34. Slide 34 Big Data Science meets Big Science Simulations
    35. Slide 35 The Big Picture of Big Data in Space and Earth Sciences
  5. Spotfire Dashboard
  6. Research Notes
  7. Web Addresses
  8. TAEM Interview with Dr. Kirk Borne of George Mason University
    1. TAEM-Professor Borne, please tell our readers about your educational training for your fields
    2. TAEM-You are also a member of your university’s SPACS program. Please tell us about it and the goals that it has set forth
    3. TAEM-What part do you play in its agenda?
    4. TAEM-Your specialties are listed as a Data Scientist, Astrophysicist, Big Data Science Consultant, and Public Speaker
    5. TAEM-Please tell us in detail about Transdisciplinary Data Science.
    6. TAEM-How alike are Big Data and Large Data Bases, and how do they interact ?
    7. TAEM-How does Data Mining and knowledge discovery figure into these ?
    8. TAEM-We understand that you basically invented the idea of astro-informatics
    9. TAEM-We understand that your findings involving Large Database Astronomy has pinpointed Groups and Clusters of Galaxies
    10. TAEM-What advice and support can you give to NASA for its future programs of space exploration?
    11. TAEM-What programs are you working on that would pertain to this, and what research projects are you planning on for the future?
    12. TAEM-What information can you give us so that the many students who read our publication learn more about you and the school’s programs ?
    13. TAEM-Dr. Borne, it has been a sincere honor to be able to interview you for our magazine
  9. Sample Project Summary
    1. Intellectual merit
    2. Broader impacts
  10. CV Dr. Kirk D. Borne
    1. Title
    2. Professional Preparation
    3. Appointments
    4. Related Products
    5. Other Significant Products
    6. Synergistic Activities
    7. Collaborators and Co-Editors (last 4 years)
    8. Graduate and Postdoctoral Advisors and Advisees

Big Data Science for Astronomy & Space


Story

Big Data Science for Astronomy and Space

Astronomy is the way humans originally devised to understand the universe, and space is the word we now use to describe everything that is out there.

The Large Synoptic Survey Telescope (LSST), which is the primary NSF project for the decade, will be the most powerful astronomical imaging device ever built. One of the reasons for LSST is to track 100 m objects that could wipe out cities.

Interestingly, a group of former astronauts recently announced an effort to accomplish the same goal with a space-based telescope rather than a ground-based one.

Far More Asteroids Have Hit The Earth Than We Thought, Astronauts Say

Bad news, earthlings. A former NASA scientist says it's mere happenstance that an Armageddon-style asteroid hasn't hit a densely populated area in the last few years.

On Tuesday, the B612 Foundation, which is devoted to preventing the next deep impact, will present data from a nuclear-weapons test warning satellite showing that far more asteroids have hit earth in the past few years than previously thought, the organization announced on its website.

The data, collected from a nuclear missile detection system that picks up large blasts on earth, shows that since 2001, asteroids have caused 26 explosions on the scale of an atomic bomb.

“This data shows that asteroid impacts are NOT rare, but actually 3-10 times more common than we previously thought,” Ed Lu, one of the astronauts working on the project, said in a statement. "The fact that none of these asteroid impacts shown in the video was detected in advance is proof that the only thing preventing a catastrophe from a 'city-killer' sized asteroid is blind luck."

The silver lining? Scientists are working to deflect any future space rocks from our planet.

Lu, along with fellow ex-astronauts Tom Jones and Bill Anders, has been attempting to develop a better asteroid early-warning system, the Sentinel Infrared Space Telescope, which they hope will become "the principal means by which nearly all asteroid discoveries will be made." In an interview with Wired, Lu explained that the telescope will work by scanning the sky in infrared, which will allow it to calculate the trajectory and velocity of asteroids.

NASA has made efforts to track asteroids, but they haven't been as aggressive as the B612 scientists would like. In 1998, NASA established the Near-Earth Object Program Office to detect potentially hazardous comets and asteroids. In March, the agency announced a contest for scientists to develop asteroid-detecting algorithms, a year after a meteorite explosion in Russia made international headlines.

You can read more about B612 and its mission from Slate's Phil Plait here.

Story

Semantic Data Science Harvard’s ADS System

ADS is *the* repository of astronomy publications throughout recorded history, going back to Ptolemy and including all of today's journals.

Using journal collections from multiple disciplines is a good plan.

The ADS team is funded by NASA and is already doing excellent semantics work on its collection.

The NASA ADS team has their own linked data (articles + data sets) approach that is well developed and mature (and deeply discipline-specific).

Find the ADS presentation I attended in this knowledge base.

Story

Dr. Kirk Borne of George Mason University

Mine Tweets - See Book Developing Analytical Talent

See Workshops on Extremely Large Databases

See recent articles: Kirk Borne is blogging and consulting for MapR (also see Glossary for letter Y).

Creating Linked Data in RDF format in spreadsheets for input to Spotfire, Semantic Insights, and YarcData from publications, journals, books, Web pages, etc.

The published scientific data will be shared via the World Wide Web (WWW) using our Data Management Plan for sharing semantically relevant data. This directly benefits the global community by enabling rapid collaboration and enrichment of our novel engineering techniques with a fuller spectrum of analysis to include intellectual, descriptive, exploratory, inferential, predictive, causal, and mechanistic data analytics. For example, as our innovations with asteroid mapping deliver new data and discoveries, we will be able to cross reference these new resources with rich media in the Sloan Digital Sky Survey (www.sdss.org) to enable better educational and encyclopedic documentation for the non-scientific community.

Slides

Slide 1 Big Data for Government Symposium

http://www.ttcus.com

KirkBorne06172014Slide1.png

Slide 2 Big Data in Space and Earth Sciences

@KirkDBorne

KirkBorne06172014Slide2.png

Slide 3 Astronomy Example

KirkBorne06172014Slide3.png

Slide 4 LSST = Large Synoptic Survey Telescope 1

http://www.lsst.org/

KirkBorne06172014Slide4.png

Slide 5 LSST = Large Synoptic Survey Telescope 2

http://www.lsst.org/

KirkBorne06172014Slide5.png

Slide 6 LSST = Large Synoptic Survey Telescope 3

http://www.lsst.org/

KirkBorne06172014Slide6.png

Slide 7 LSST Key Science Drivers: Mapping the Dynamic Universe

KirkBorne06172014Slide7.png

Slide 8 LSST Key Science Drivers: Observing Strategy

KirkBorne06172014Slide8.png

Slide 9 LSST Summary

http://www.lsst.org/

KirkBorne06172014Slide9.png

Slide 10 The LSST Big Data Challenges

KirkBorne06172014Slide10.png

Slide 11 What is Big Data Science good for? 1

KirkBorne06172014Slide11.png

Slide 12 Characteristics of Big Data Science

KirkBorne06172014Slide12.png

Slide 13 Rationale for Big Data Science

KirkBorne06172014Slide13.png

Slide 14 Characterizing and Exposing the Big Data Hype: 3 V’s or ? 1

http://bit.ly/1hH6sB9

KirkBorne06172014Slide14.png

Slide 15 Characterizing and Exposing the Big Data Hype: 3 V’s or ? 2

http://bit.ly/1hH6sB9

KirkBorne06172014Slide15.png

Slide 16 Insufficient Variety: stars & galaxies are not separated in this parameter

http://bit.ly/1hH6sB9

KirkBorne06172014Slide16.png

Slide 17 Sufficient Variety: stars & galaxies are separated in this parameter

KirkBorne06172014Slide17.png

Slide 18 The 3 important D’s of Big Data Variety

KirkBorne06172014Slide18.png

Slide 19 This graph says it all …

http://www.cs.princeton.edu/courses/...BrunnerDPS.pdf

KirkBorne06172014Slide19.png

Slide 20 Data-Driven Discovery

KirkBorne06172014Slide20.png

Slide 21 Correlation Discovery

KirkBorne06172014Slide21.png

Slide 22 Initial impression 1

KirkBorne06172014Slide22.png

Slide 23 Initial impression 2

KirkBorne06172014Slide23.png

Slide 24 Novelty Detection = Surprise Discovery!

KirkBorne06172014Slide24.png

Slide 25 Novelty Detection = improved discovery

KirkBorne06172014Slide25.png 

Slide 26 Association Discovery between events on the Sun and around the Earth

KirkBorne06172014Slide26.png

Slide 27 Association Discovery = finding interesting co-occurring associations

KirkBorne06172014Slide27.png

Slide 28 Knowledge Discovery for Multi-source Data

KirkBorne06172014Slide28.png

Slide 29 Space Weather Example

KirkBorne06172014Slide29.png

Slide 30 What is Big Data Science good for? 2

KirkBorne06172014Slide30.png

Slide 31 Case Study - Mars Rovers 1

KirkBorne06172014Slide31.png

Slide 32 Case Study - Mars Rovers 2

KirkBorne06172014Slide32.png

Slide 33 Mars Rover 3

KirkBorne06172014Slide33.png

Slide 34 Big Data Science meets Big Science Simulations

KirkBorne06172014Slide34.png

Slide 35 The Big Picture of Big Data in Space and Earth Sciences

KirkBorne06172014Slide35.png

Spotfire Dashboard

Research Notes

TAEM Interview with Dr. Kirk Borne of George Mason University

Source: http://www.eeriedigest.com/wordpress/2013/01/taem-interview-with-dr-kirk-borne-of-george-mason-university/

TAEM- The Arts and Entertainment Magazine’s publisher, Joseph J. O’Donnell, issued a challenge in the December 15th issue of our publication to start a ‘grass roots movement’ to support NASA. This support is spreading over the academic world, and it began with the George Mason University faculty and student body. The challenge has centered not only on support of NASA, but also on giving the agency ideas for its future space exploration programs.

Dr. Kirk Borne, of GMU, is a Data Scientist and Astrophysicist, and is one of the many professors from the school who have stepped forward to offer insights into what can be achieved.

TAEM-Professor Borne, please tell our readers about your educational training for your fields

KB- My undergraduate B.S. degree was in Physics at Louisiana State University, with a lot of math and some astronomy.  My goal was to study astronomy in graduate school, so the math and physics coursework was essential.  I loved all of those topics, and astronomy gave me the opportunity to study them all. I went to graduate school at Caltech, receiving a PhD in astronomy in 1983. I studied under some of the great astronomers of that era. It was a fantastic experience. In the years since then, I worked on NASA’s Hubble Space Telescope project for 10 years and at NASA’s Astronomy Data Center within the Space Science Data Operations Office at the Goddard Space Flight Center for another 10 years, and I have now been at George Mason University since 2003.  All of my research and my work experiences at NASA always involved working with scientific data – this led me to the field of Data Science, which is the application of data methods and algorithms to the study of any discipline.

TAEM-You are also a member of your university’s SPACS program. Please tell us about it and the goals that it has set forth

KB- SPACS is the School of Physics, Astronomy, and Computational Sciences. This is a unique program among universities in the US. Our faculty and students focus on a wide range of research problems involving physics, astronomy, computational science, and data science.

TAEM-What part do you play in its agenda?

KB- I have helped to develop the Data Science curriculum within the school. In that capacity, I am the undergraduate advisor for students in the Computational & Data Sciences B.S. degree program.  I also advise many graduate students within our CSI (Computational Science & Informatics) PhD program. I teach courses related to Data Science, including Scientific Databases, Computational Data Science, Data Mining, and Data Ethics.  In addition to advising and teaching in this program, I also carry out data science research – mostly in astronomy, but covering many other fields.

TAEM-Your specialties are listed as a Data Scientist, Astrophysicist, Big Data Science Consultant, and Public Speaker

Please describe your capacities in these venues and how they are connected.

KB- I have many years of experience working with data, databases, and data science methodology (including data mining, statistics, and visualization).  This experience includes teaching and research, but it also has led me to assist, advise, and consult with other organizations and federal agencies regarding their data activities.  This has gained me some notoriety, so I receive many (10 to 20) invitations to speak at conferences and universities worldwide each year on the topic of Data Science, specifically “Big Data”. My two most amazing experiences in this capacity are these: first, in 2001, I was asked to brief the US President on data mining; and second, in 2011, I was the conference keynote speaker at the Medicare and Medicaid Statistics Conference.  I never imagined such experiences when I was focusing only on my astronomy research years ago.

TAEM-Please tell us in detail about Transdisciplinary Data Science.

KB- Transdisciplinary is different from multi-disciplinary or interdisciplinary in that it refers to the fact that Data Science transcends traditional discipline boundaries. I can work with financial experts, climate scientists, agriculture specialists, criminologists, drug safety organizations, and library staff on data issues without necessarily requiring me to learn their field or requiring them to learn astronomy. The language of Data Science (databases, data, metadata, statistics, visualization, data mining) transcends those discipline-specific concepts. Data Science enables productive, meaningful, and enlightening research experiences across discipline boundaries for everyone involved. For me personally, I like to think of myself as a Transdisciplinary Data Scientist because my research on data mining algorithms, data structures, data management methods, and statistics is applicable to almost any discipline.

TAEM-How alike are Big Data and Large Data Bases, and how do they interact ?

KB- Big Data is a concept that conveys many meanings and implications to different audiences. It includes large databases, but it includes tons of other things, including large data collections that are not in databases (such as Internet blogs, social network postings, news reports, online video content, images, audio, publications, articles, and anything else that is in digital or non-digital form).  Large databases are only a small subset of Big Data collections. The Big Data concept also conveys additional meanings, such as the challenges associated with discovery, access, mining, analysis, and interpretation of massive data collections.  These challenges include the large data volume, but also the complexity of the data (as indicated above, the large variety of data types), as well as the enormous rate at which data are being generated in the world today. The data production rate doubles every year, which means that the world will have at least 1000 times more data ten years from now, one million times more data 20 years from now, and so on.  One estimate of the total amount of data created in the world each and every day is three exabytes, which is three million terabytes or three billion gigabytes.  This is more than a thousand times all of the information in all of the books and journals in all of the libraries in the world that have ever been published in the history of humanity. This is the amount that we create each new day this year – we will create roughly this same amount of data every 90 seconds ten years from now, and roughly this same amount every 1/10 of a second 20 years from now! We must teach the next generation how to handle, deal with, cope with, and make use of this information flood.  Big Data therefore refers to the combination of all of these challenges and issues (both technological and human).
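Editor's note: the arithmetic in this answer can be checked with a short sketch. It assumes, as the answer does, that the data production rate doubles every year and that the world creates three exabytes per day today; the function names are illustrative only and do not come from the interview.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def growth_factor(years, doubling_period_years=1.0):
    """Multiplier on the data production rate after `years`, assuming steady doubling."""
    return 2 ** (years / doubling_period_years)


def seconds_to_match_todays_daily_volume(years):
    """Time needed, `years` from now, to create what is created in one full day today."""
    return SECONDS_PER_DAY / growth_factor(years)


# Unit conversions quoted in the answer (decimal units).
TERABYTES_PER_EXABYTE = 10**18 // 10**12  # one million
GIGABYTES_PER_EXABYTE = 10**18 // 10**9   # one billion

for years in (10, 20):
    factor = growth_factor(years)
    seconds = seconds_to_match_todays_daily_volume(years)
    print(f"{years} years: x{factor:,.0f}, one of today's days every {seconds:.3f} s")
```

Under these assumptions the growth factors are 1,024 and 1,048,576, and the matching intervals are about 84 seconds and about 0.08 seconds, which agrees with the rounded figures of "1000 times", "one million times", "every 90 seconds", and "every 1/10 of a second" quoted above.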

TAEM-How does Data Mining and knowledge discovery figure into these ?

KB- The collection of large data into archives and databases is useless unless you intend to use the data for something; and that “something” is discovery of patterns, trends, correlations, features, outliers, anomalies, and unexpected “knowledge” nuggets hidden within these enormous data sets (i.e., finding the proverbial needle in the haystack).  That is what we call data mining, which is also called Knowledge Discovery from Data.  This process is also referred to as “Learning from Data”.  When it is applied to making decisions about future behaviors of consumers, or systems, or physical processes, data mining is called “Predictive Analytics”.  Yes, people use data mining to predict the future.  There are numerous movies and TV shows that highlight these techniques – the techniques are real (even though the shows are fictional).  Finding new knowledge is what science is all about.  The amazing thing now is that businesses, agencies, sports teams, grocery stores, entertainers, social networking sites, and everyone else have realized that they too can discover new knowledge (e.g., predict behaviors or outcomes) from their data collections: ticket sales, purchases, buying patterns, behavior patterns, system logs, and more.

TAEM-We understand that you basically invented the idea of astro-informatics

Please tell our readers about this and some of the Big Data issues that face science today.

KB- Informatics is the application of Data Science to a specific discipline, though we sometimes simply define Informatics as Data Science. Some science-specific informatics disciplines are well established, including Bioinformatics and Geoinformatics. It occurred to me about 10 years ago that astronomy’s big (and growing) data collections would require a similar methodological subdiscipline of the field of astronomy – I called this Astroinformatics. I used this word for many years without much uptake by the astronomy community. I published a journal paper and another paper for the National Academy of Sciences on Astroinformatics 3 years ago – now, it is a very commonly used term, there are Astroinformatics conferences every year, and there are Astroinformatics committees within the major astronomy professional societies (in the US and internationally). I am a member of all of these committees. It is very exhilarating to see how far the field has come in such a short time.  The real reasons for these informatics approaches to science are the same reasons mentioned earlier: scientists are generating huge amounts of data from their experiments, we want to explore these data collections as effectively and efficiently as possible, we want to discover all of the knowledge that is hidden in these data, and we want to unlock the mysteries of the world and Universe around us that our huge experiments can now reveal to us.

TAEM-We understand that your findings involving Large Database Astronomy has pinpointed Groups and Clusters of Galaxies

How do Massive Databases and Large Sky Surveys make this possible?

KB- My current research is focused on outlier detection, which I prefer to call Surprise Discovery – finding the unknown unknowns and the unexpected patterns in the data.  These discoveries may reveal data quality problems (i.e., problems with the experiment or data processing pipeline), but they may also reveal totally new astrophysical phenomena: new types of galaxies or stars or whatever. That discovery potential is huge within the huge data collections that are being generated from the large astronomical sky surveys that are taking place now and will take place in the coming decades. I haven’t yet found that one special class of objects or new type of astrophysical process that will win me a Nobel Prize, but you never know what platinum-plated needles may be hiding in those data haystacks.
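Editor's note: as a rough illustration of what outlier detection means in practice, the sketch below flags values that sit far from the bulk of a sample. It uses a generic robust score based on the median absolute deviation (MAD), which is a stand-in chosen for brevity and is not the method used in Dr. Borne's research. The function name, the threshold of 3.5, and the sample values are all assumptions made for this example.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose MAD-based robust score exceeds `threshold`."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical; the score is undefined, so flag nothing.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a z-score for normal data.
    return [i for i, v in enumerate(values) if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical brightness measurements with one surprising value at index 6.
brightness = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 25.0, 10.0]
print(mad_outliers(brightness))
```

A flagged value is only a candidate: as the answer notes, it may be a data quality problem rather than a new phenomenon, so each one still needs follow-up.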

TAEM-What advice and support can you give to NASA for its future programs of space exploration?

KB- In the context of data science, the most important lesson is that the data generated from all NASA missions be made openly available to the research community in useful and self-explanatory ways, to facilitate new and interesting uses of the data for research and discovery. It is also helpful if these data are annotated or tagged with rich metadata, which will further enable integration and fusion of data products from multiple missions, thus enabling far greater discovery potential. One of the primary functions of metadata is to provide a short-hand condensed representation of the data product. This helps to address some of the challenges associated with Big Data: making the data more manageable and conveniently usable. We are already familiar with such hierarchical data structures – anyone who has used Google or Bing Maps experiences this – the map of the world is initially presented to you in very low resolution mode, but the resolution gets higher and higher as you drill down to some specific location – you are finally able to view your backyard or some destination at the highest resolution available from some satellite image collection, but you are definitely not viewing the whole Earth at one time at that same high resolution. Exploratory data analysis makes use of such hierarchical data structures, which NASA missions should generate for their science data users.
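Editor's note: the map analogy in this answer describes a multi-resolution hierarchy, sometimes called an image pyramid. The sketch below builds one by repeatedly averaging 2x2 blocks of a small grid. It is a minimal illustration, not a description of any NASA or mapping-service data format, and it assumes a square grid whose side length is a power of two; the function names and sample values are hypothetical.

```python
def downsample(grid):
    """Halve the resolution by averaging each 2x2 block (requires even dimensions)."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            (grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def build_pyramid(grid):
    """Return all levels, from full resolution (index 0) down to a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1 and len(levels[-1][0]) > 1:
        levels.append(downsample(levels[-1]))
    return levels


# Hypothetical 4x4 "image"; a real data product would be far larger.
image = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
pyramid = build_pyramid(image)
for level, data in enumerate(pyramid):
    print(f"level {level}: {len(data)}x{len(data[0])}")
```

A viewer would start from the coarsest level and fetch finer levels only for the region being examined, which is why the whole Earth is never loaded at full resolution at once.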

TAEM-What programs are you working on that would pertain to this, and what research projects are you planning on for the future?

KB- I have been working on Citizen Science projects, such as Galaxy Zoo and Zooniverse.org, in which volunteer citizens tag and annotate our scientific data products (e.g., images, or time series, or model outputs, or whatever) with descriptive characteristics. This characterization of complex data products becomes part of the metadata associated with that data product, thus enabling linkages between different data products and discovery of new patterns, trends, correlations, and behaviors.  I envision extending this to other projects in the future, examining the role of non-experts in metadata generation (characterization) and conducting exploratory research into the “best” pattern recognition methods for discovering interesting, surprising, and informative features in large science data collections.

TAEM-What information can you give us so that the many students who read our publication can learn more about you and the school’s programs?

KB- I invite others to check out some of the research and academic programs within Mason’s SPACS school at http://spacs.gmu.edu and to check out my own research and teaching interests at http://classweb.gmu.edu/kborne . If you are a Twitter user, you can follow me there as I actively tweet about Big Data, Data Science, and Astronomy under the handle @KirkDBorne.

TAEM-Dr. Borne, it has been a sincere honor to be able to interview you for our magazine.

We have discovered that, like yourself, the faculty at George Mason University is a virtual well of information, and that your school has produced one of the best-trained student bodies in the nation. We want to thank you for your time and look forward to talking with you again in the very near future. TAEM

Sample Project Summary

Source: Word

Intellectual merit

While the most important projects in modern observational astrophysics are based on robotic telescopes generating massive datasets of astronomical data, the existing computational methodology lags behind and is currently not able to effectively analyze these data. The purpose of this project is to develop and apply algorithms and methods that will mine large datasets of astronomical images and detect peculiar celestial objects of extreme scientific value. Peculiar and irregular galaxies are of paramount scientific interest, and are important for understanding the most fundamental questions about the early, present, and future universe. Since they are very different from the majority of galaxies, peculiar galaxies carry crucial information on the history of interactions of objects, their formation and evolutionary history.

The algorithms will be applied to the most significant digital sky surveys: Sloan Digital Sky Survey (SDSS), which is the most powerful digital sky survey that exists today,  and the Large Synoptic Survey Telescope (LSST), which is the primary NSF project for the decade, and will be the most powerful astronomical imaging device ever built. SDSS acquired images of over 200 million galaxies, while LSST will acquire images of approximately 20 billion galaxies.  Hence, automated detection of rare galaxies through pattern recognition is absolutely essential.  For example, even extremely rare one-in-a-million types of galaxies will occur over 10,000 times in the LSST imaging survey -- we need automated software to find these very rare but critically important objects. JFW is a collaborator on SDSS and initiator of Galaxy Zoo Mergers. KDB is the chair of informatics and statistics of LSST, and also a member of LSST galaxy research collaboration. LS and JFW are members of LSST Informatics and Statistics collaboration.
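As a rough check of the figure quoted above, the expected number of one-in-a-million objects follows directly from the stated survey size. The symbols $f$ (the fraction of galaxies of the rare type) and $N_{\mathrm{gal}}$ (the approximate number of galaxies imaged by LSST) are introduced here only for illustration:

```latex
N_{\mathrm{rare}} \approx f \times N_{\mathrm{gal}}
  = 10^{-6} \times \left(2 \times 10^{10}\right)
  = 2 \times 10^{4}
```

This gives roughly 20,000 objects, consistent with the "over 10,000" stated in the text. The same fraction applied to the 200 million galaxies imaged by SDSS would give only about 200.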

Broader impacts

The broader impact of the project can be divided into four primary parts:

1) The data products generated by applying these methods to SDSS and LSST will be used by hundreds of scientists (LSST alone has over 300 formal collaborators), and will be publicly available to serve many more. Fundamental scientific questions that can be addressed include the derivation of shapes of dark matter halos with follow-up 2D spectroscopic observations of identified polar ring galaxies, the determination of extragalactic extinction laws from E/S0 galaxies with dust lanes from SDSS observed in the ultraviolet by GALEX (Galaxy Evolution Explorer), etc. The detected gravitational lenses can assist in understanding some of the most fundamental questions about the universe, with impact far beyond the scope of this project.

2) The algorithms will be applied to the already-available SDSS data and the future LSST, but the algorithms and source code will become publicly available and will provide solutions to other large-scale surveys such as the Dark Energy Survey (DES) and Pan-STARRS, which are also among the most important scientific ventures of the decade. Since data analysis is the main bottleneck of digital sky surveys, the developed methods will allow better utilization of the power of these important projects.

3) Due to the volume and quality of SDSS and especially LSST data, it is likely that the proposed project will lead to high-impact discoveries as described in the proposal. In addition to their astrophysical importance, these discoveries and the computational methods that enable them can promote the field of Astroinformatics and increase awareness of this emerging field within the Computer Science community.

4) The project will include the participation of graduate and undergraduate students, who will contribute to the development of the code, as well as the practical experiments and performance evaluation. This will expose graduate and undergraduate Computer Science students to interdisciplinary research, and will provide them with the opportunity to take part in Computer Science projects that aim at practical solutions to fundamental scientific questions.

Key Words: Astroinformatics; Computational Astrophysics; Digital Sky Surveys; Big data.

CV Dr. Kirk D. Borne

Source: PDF

Title

Professor of Astrophysics & Computational Science
School of Physics, Astronomy, & Computational Sciences
George Mason University, Fairfax, VA 22030
Phone: (703) 993-8402, E-mail: kborne@gmu.edu
WWW: http://classweb.gmu.edu/kborne/

Professional Preparation

1975 B.S., Physics, Summa Cum Laude, Louisiana State University
1980 M.S., Astronomy, Caltech, Pasadena
1983 Ph.D., Astronomy, Caltech, Pasadena (Advisor: James E. Gunn)
1981-83 Postdoctoral Fellow, Astronomy, University of Michigan
1983-85 Carnegie Fellow, Dept. of Terrestrial Magnetism, CIW

Appointments

2011- Professor of Astrophysics and Computational Science, GMU

2003-2011 Associate Professor of Astrophysics and Computational Science, GMU

2002-2007 Adjunct Associate Professor, UMUC Graduate School, Database Systems Technologies Program

2005-2007 Program Manager, SSDOO (Space Science Data Operations Office) Project, QSS Group Inc., NASA/GSFC

1999 Sabbatical Visitor, STScI

1995-2002 Astrophysics Department Manager, Astrophysics Data Facility and Astronomical Data Center, Raytheon ITSS / Hughes STX, NASA/GSFC

1992-1995 ST-DADS Project Scientist, Hubble Space Telescope Science Institute

1985-1995 Scientist, Hubble Space Telescope Science Institute

1983-1985 Carnegie Fellow, DTM - Carnegie Institution of Washington

1981-1983 Teaching Fellow, Dept. of Astronomy, University of Michigan

Related Products

1. Borne, K., "A Machine Learning Classification Broker for the LSST Transient Database," Astronomische Nachrichten, 329, 255 (2008).

2. Borne, K., "Scientific Data Mining in Astronomy," in Next Generation Data Mining (CRC Press: Taylor & Francis), pp. 91-114 (2009). arXiv.org:0911.0505

3. Das, K., Bhaduri, K., Arora, S., Griffin, W., Borne, K., Giannella, C., & Kargupta, H., "Scalable Distributed Change Detection from Astronomy Data Streams using Local, Asynchronous Eigen-Monitoring Algorithms," peer-reviewed proceedings of SIAM Data Mining SDM09 (2009).

4. Borne, K., "Astroinformatics: Data-Oriented Astronomy Research and Education," Journal of Earth Science Informatics, 3, 5 (2010).

5. Bhaduri, K., Das, K., Borne, K., Giannella, C., Mahule, T., & Kargupta, H., "Scalable, Asynchronous, Distributed Eigen-Monitoring of Astronomy Data Streams," Journal of Statistical Analysis and Data Mining, 4(3), 336 (2011).

Other Significant Products

1. Borne, K. D., "Distributed Data Mining in the National Virtual Observatory," in SPIE Data Mining & Knowledge Discovery V, vol. 5098, p. 211 (2003).

2. Giannella, C., Dutta, H., Borne, K., Wolff, R., & Kargupta, H., "Distributed Data Mining for Astronomy Catalogs," SIAM Scientific Data Mining (2006).

3. Borne, K., & Eastman, T., "Collaborative Knowledge-Sharing for E-Science," AAAI Semantic Web for Collaborative Knowledge Acquisition (2006), available at http://www.aaai.org/Papers/Symposia/...S06-06-017.pdf

4. Dutta, H., Giannella, C., Borne, K., & Kargupta, H., "Distributed Top-K Outlier Detection from Astronomy Catalogs using the DEMAC System," SIAM Scientific Data Mining (2007).

5. Olmedo, O., Zhang, J., Wechsler, H., Poland, A., Borne, K., "Automatic Detection and Tracking of CMEs in Coronagraph Time Series," Solar Physics, 248, 485 (2008).

Synergistic Activities

Senior Science Personnel, National Virtual Observatory Project

Chairman, LSST Informatics and Statistics Science Collaboration Team

Member, ISI Executive Committee for Astrostatistics

Science Organizing Committee, Conference on Intelligent Data Understanding (2010, 2011)

Science Organizing Committee, Statistical Challenges in Modern Astronomy (SCMA 2011)

Collaborators and Co-Editors (last 4 years)

S.Arora (UMBC), T.Axelrod (LSSTC), J.Babu (PSU), A.C.Becker (U.Wash), J.Becla (SLAC), K.Bhaduri (UMBC), T.Boroson (NOAO), K.Bowyer (UND), D.L.Burke (SLAC), D.Carr (GMU), A.Chang (MIT), C.Claver (NOAO), A.Connolly (U.Wash), K.Cook (LLNL), D.Darg (Oxford), K.Das (SGT Inc.), M.DeMaria (NOAA), G.Djorgovski (Caltech), H.Dutta (Columbia), T.Eastman (GSFC), E.Feigelson (PSU), T.Finin (UMBC), L.Fortson (Adler Planetarium and Univ. Minnesota), P.Fox (RPI), S.Fung (NASA), H.Ferguson (STScI), J.Gentle (GMU), C.Giannella (Mitre Corporation), D.K.Gilmore (SLAC), M.Graham (Caltech), A.Gray (Georgia Tech), E.Grayzeck (NASA), J.Green (NASA), W.Griffin (UMBC), L.Hall (U.South Florida), T.Hamilton (Shawnee St. U.), R.Hanisch (STScI), Z.Ivezic (U.Wash), S.Jacoby (LSSTC), R.L.Jones (U.Wash), A.Joshi (UMBC), M.Juric (IAS), S.M.Kahn (SLAC), H.Kargupta (UMBC), L.Kerschberg (GMU), K.-T.Lim (SLAC), J.Lin (GMU), C.Lintott (Oxford), Z.Liu (GMU), T.Loredo (Cornell), J.Lotz (NOAO), R.Lupton (Princeton), A.Mahabal (Caltech), T.Mahule (UMBC), T.Matheson (NOAO), T.McGlynn (NASA), R.McGuire (NASA), J.Miller (NASA), D.G.Monet (USNO-FS), T.Narock (NASA, UMBC), O.Olmedo (GMU), W.Pence (NASA), P.Pinto (U.Arizona), A.Poland (GMU), M.J.Raddick (JHU), S.Ridgway (NOAO), N.Samatova (NCSU/ORNL), A.Saha (NOAO), D.Sawyer (NASA), B.Sesar (U.Wash), R.Shaw (NOAO), E.Shaya (UMD), A.Smith (Oxford), C.W.Stubbs (CFA/Harvard), V.Sugumaran (Oakland U.), D.Sun (GMU), A.Szalay (JHU), J.A.Tyson (UCDavis), K.Wagstaff (JPL), J.Wallin (GMU), H.Wechsler (GMU), E.Wegman (GMU), R.Weigel (GMU), R.Williams (Caltech), R.Wolff (UMBC), R.Yang (GMU), J.Zhang (GMU), X.Zhu (UMBC)

Graduate and Postdoctoral Advisors and Advisees

Graduate students: T. Boggs, R.Dun, D.Ghoshal, C.Grieg, G.Jacobs, P.Nayak, A.Vedachalam; PhD advisor: J.Gunn (Princeton); Postdoctoral sponsors: D.Richstone (U.Michigan), V.Rubin (DTM-CIW)
