Cloud Computing AND Big Data Forum and Workshop January 15-17 2013


Story

220px-Steven_VanRoekel_Headshot.jpg

Canada's Shared Services Cloud Upstages Federal CIO VanRoekel at NIST Cloud AND Big Data Event

Federal CIO Steven VanRoekel gave the keynote at the NIST Cloud Computing AND Big Data Forum & Workshop, describing the progress being made by FedRAMP (one organization certified and 78 more coming soon) and by the Digital Strategy (progress on its 12 months of deliverables).

But last-minute substitute speaker Steven Woodward (CEO and founder of Cloud Perspectives, a member of the Shared Services Canada Advisory Committee, a leading contributor to NIST FedRAMP, and Government CTO of the Canadian Cloud Council) delivered the real meat when he announced that the Canadian government had created a new centralized shared-services-in-the-cloud agency, eventually reallocating 7,500 employees from 43 departments and agencies.

Shared Services Canada has the following goals:

  • Moving 43 partner organizations from separate email services to one consolidated, secure, reliable, and cost-effective email platform for the Government of Canada.
  • Moving from more than 300 Government of Canada data centres to fewer than 20.
  • Moving the Government of Canada to a single, shared telecommunications network infrastructure.

The Canadian Cloud Council is developing an online directory of cloud services, toolkits, and resources for companies that want to sell cloud services and companies that want to move into the cloud. The toolkit is based on NIST and open standards and will promote open standards and best practices in order to accelerate the adoption of cloud technologies.

So the Canadians have become the world leader in government shared services by reorganizing and by adopting and implementing NIST FedRAMP and other standards. There is a lesson there about the slow pace of US Federal Government adoption of more agile approaches in general.

The NIST Cloud AND Big Data Conference also revealed that the former NIST Cloud Executive, Dawn Leaf, has moved on to become the Deputy CIO at the Department of Labor, and that both the Cloud and Big Data programs are now in the capable hands of:

  • Dr. Christopher L. Greer, Acting Senior Advisor for Cloud Computing and Associate Director, Program Implementation Office, ITL, NIST
  • Dr. Charles H. Romine, Director, NIST Information Technology Laboratory (ITL)
  • Dr. Robert Bohn, NIST Program Manager for Cloud Computing

Vinton Cerf, widely known as one of the Fathers of the Internet and currently Google Vice President and Chief Internet Evangelist, explained that, while he is not an expert on cloud and big data, these areas need a secure sharing environment, loose coupling, and inter-cloud interoperability. He used the new Google driverless car to illustrate the work Google is doing on these problems.

The big announcement at the event was that NIH had formed a Data and Informatics Working Group and would soon hire a new NIH Associate Director - Chief Data Scientist.

Now we just need a Chief Data Officer for the entire US Government to treat data projects as data projects rather than IT projects, as I recommended to Congressional staff last fall and to the Data Transparency Coalition advising the 113th Congress on the reintroduction of the Data Act.

The Webcast and slides are available.

Story

Professor Geoffrey Fox, School of Informatics and Computing, Digital Science Center, Indiana University Bloomington

SPIDAL: Scalable Parallel Interoperable Data Analytics Library (My Spotfire 5 Library: https://silverspotfire.tibco.com/us/...iemann/Public/?)

Alex Szalay, Alumni Centennial Professor, Department of Physics and Astronomy, The Johns Hopkins University

We need new instruments: “microscopes” and “telescopes” for data

https://silverspotfire.tibco.com/us/...iemann/Public/?

Education, Architecture, Algorithms, and Experiments

Research Notes

Thank you for registering for a conference through NIST.

Your registration was received on December 8, 2012.  Please print this e-mail and keep it for your records.  It should serve as the confirmation for your registration.

Please verify the personal information below.  If there is an error in this information, please contact us as soon as possible via telephone: 301-975-2776, or fax: 301-948-2067.

    Dr. Brand Niemann

     Semantic Community

4191 Lochleven Trail #304

     Fairfax,  VA  22030

     USA

     Phone: (703)268-9314

     Fax:

The conference information is:

     Cloud Computing and Big Data Forum

     January 15, 2013 - January 17, 2013

     NIST

     Gaithersburg, MD

There will be no fee charged for registration in this conference.

Additional information may be found at http://www.nist.gov/allevents.cfm

If your meeting will be held at NIST and you need directions to the NIST facility, please visit http://www.nist.gov  and click on Visitor Information.  The building and room locations for the meeting will be given to you when you arrive.  Valid photo ID must be presented at the Visitor Center.  International attendees are required to present a passport.

If you have any questions, please call 301-975-2776 or contact the NIST Conference Program Office by fax at 301-948-2067.

Slides

Data-Intensive Science: The Fourth Paradigm

Slides

AlexSzalay01152013Slide1.png

Big Data in Science

AlexSzalay01152013Slide2.png

Gray’s Laws of Data Engineering

Scientific computing is revolving around data
• Need scale-out solution for analysis
• Take the analysis to the data!
• Start with “20 queries”
• Go from “working

AlexSzalay01152013Slide3.png

Non-Incremental Changes

AlexSzalay01152013Slide4.png

Data in HPC Simulations

AlexSzalay01152013Slide5.png

Visualizing Petabytes

AlexSzalay01152013Slide6.png

The Long Tail

AlexSzalay01152013Slide7.png

VOSpace/VOBox

AlexSzalay01152013Slide8.png

Changing Sociology

AlexSzalay01152013Slide9.png

Summary

AlexSzalay01152013Slide10.png

 

 

Big Data and Clouds: Challenges and Opportunities

Slides

Slide1.GIF

Charge to Presenters

Slide2.GIF

Some Topics

Slide3.GIF

Education and Training

Slide4.GIF

Xinformatics

Slide5.GIF

Clouds for Scientific Data Analysis

Slide6.GIF

Data Analytics Futures?

Slide7.GIF

FutureGrid Offers Computing Testbed As a Service

Slide8.GIF

FutureGrid Key Concepts

Slide9.GIF

4 Use Types for FutureGrid Testbeds

Slide10.GIF


Paper

Source: http://grids.ucs.indiana.edu/ptliupa.../CloudDB12.pdf (PDF)

 

Large Scale Data Analytics on Clouds
Geoffrey Fox
School of Informatics and Computing
Indiana University
Bloomington IN 47408, USA

1. The Clouds + Exascale Infrastructure

There are several important trends driving computing. We have the Data Deluge from commercial (e.g. Amazon, e-commerce), community (e.g. Facebook, search), and scientific applications (e.g. analysis of LHC data, genomics), with the examples given being representative of many others [1]. We have lightweight clients ranging from smartphones and tablets to sensors. The multicore chip architecture is reawakening parallel computing, while it and GPGPUs (even more cores) are behind Exascale initiatives, which will continue the drive to the high end with a simulation orientation. Clouds, offering cheaper, greener, easier-to-use IT for (some) applications, are growing in importance. They enable the lightweight clients by acting as a backend resource and answer the difficult question "what do we do with all those cores on a chip?". As that question is not so easy to answer on a conventional client, it is one driver toward lighter weight clients (using smaller CPU chips), while on a server each core can host a separate cloud service. These developments drive both research and education and will weave together as we look at data analysis in the clouds. Curricula based on the "Science of Clouds" [2, 3] and/or "Data Science" [4] are attractive, as both areas are predicted to generate several million jobs and to face a shortage of the needed skills. Finally, the need for data analytics links old business (e.g. finance, retail) and new (Web 2.0) business with science.
 
Clouds have many interesting characteristics, including on-demand service, measured service, scalable elastic service, broad any-time any-where network access, and pooling of resources leading to economies of scale in performance and electrical power (Green IT). These correspond to Infrastructure as a Service (IaaS), but there are also powerful new software models corresponding to Platform as a Service (PaaS) that are equally important. We will see examples such as cloud support of sensors (lightweight clients), where IaaS with broad access drives cloud data analysis, and others where novel MapReduce algorithms (i.e. PaaS) are most important. Areas like genomics are driven both by the need for the most effective computing and by interest in new programming models like MapReduce [5]. The most visible and major data intensive area – analysis of LHC data from CERN – could use clouds (as can typical high throughput computing loads), but it already has an effective operational grid solution.
 
Simulations have been explored on clouds, but traditional supercomputers are typically required to get good performance on large, highly parallel jobs. Clouds currently get clearly good performance only on "bags of simulation tasks" with many small jobs that are not individually sensitive to synchronization costs. Synchronization costs are higher in clouds because virtualization introduces overheads, both from software costs and from difficulties in preserving locality. Thus we expect classic HPC systems, now moving inevitably to Exascale, to remain a critical part of the computing Cyberinfrastructure.
 
The above analysis suggests a "Clouds+Exascale" Cyberinfrastructure scenario, and in the next section we ask how data intensive applications map into this ecosystem.

2. Example Applications

Previously we have used the MapReduce paradigm to classify parallel applications into four major groups [6, 7, 8, 9].
 
Map-only applications are bags of independent tasks and are clearly suitable for clouds. This pleasingly parallel case includes not only LHC and similar science analyses but also support of the "Internet of Things" (IoT) [10], where each of the world's distributed devices (including smartphones) is backended by the cloud. The IoT is forecast to grow to 24 billion devices on the Internet by 2020. Robots are an important sub-class of the IoT, and cloud-backed robotics is very promising. The map-only case also includes "the long tail of science" (or indeed "the long tail of most things"), where one has parallelism over users, each running smallish jobs that run effectively on clouds.
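As a minimal illustration of this map-only pattern (not taken from the paper), the sketch below runs a set of hypothetical, fully independent analysis tasks in parallel; the task function and task count are placeholders, and on a cloud a scheduler would distribute the same tasks across virtual machines rather than local processes.

    from concurrent.futures import ProcessPoolExecutor

    def analyze(task_id, n_events=100_000):
        """Hypothetical independent analysis task: it never communicates with other tasks."""
        # Stand-in computation: count "events" passing a simple deterministic cut.
        passed = sum(1 for i in range(n_events) if (i * 2654435761) % 97 < 30)
        return task_id, passed

    if __name__ == "__main__":
        # Each map task is independent; a cloud scheduler would spread these across
        # virtual machines instead of local processes.
        with ProcessPoolExecutor() as pool:
            for task_id, passed in pool.map(analyze, range(8)):
                print(f"task {task_id}: {passed} events passed the cut")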
MapReduce jobs consist of independent maps and reducers, with communication between tasks happening at the link between Map and Reduce. These of course cover many "Life-style Informatics" applications such as those used in the social media and search industries. Clouds can support this problem class well. There are some scientific applications in this class, including for example the basic statistical analysis (such as histogramming) common at the final stage of LHC analysis.
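A minimal local sketch of such a histogramming MapReduce job, assuming toy data and hypothetical function names; in production this would typically run on Hadoop or a similar MapReduce runtime, with the shuffle between map and reduce being the only inter-task communication.

    from collections import defaultdict

    def map_phase(values, bin_width=10.0):
        """Map: each value independently emits a (bin, 1) key-value pair."""
        for v in values:
            yield int(v // bin_width), 1

    def reduce_phase(pairs):
        """Reduce: sum the counts for each bin key (the only inter-task communication)."""
        hist = defaultdict(int)
        for bin_id, count in pairs:
            hist[bin_id] += count
        return dict(hist)

    if __name__ == "__main__":
        data = [3.2, 7.9, 12.4, 15.0, 27.3, 28.8, 31.1]  # toy measurements
        print(reduce_phase(map_phase(data)))             # {0: 2, 1: 2, 2: 2, 3: 1}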
 
Classic MPI jobs are those identified for supercomputers above and typically involve many small point-to-point messages. This class is the target of HPC systems and the domain of the "Exascale" component of the computing ecosystem.
 
The final category has been called Iterative MapReduce [11, 12, 13, 14, 15, 16] and appears clearly in many data analysis applications. Many data analytics algorithms involve linear algebra at their core, where the parallelism is well understood. These do not have the geometric parallelism of simulations but rather that of matrix rows, columns or blocks. Correspondingly we do not get many small messages but rather large reduction or broadcast (multicast) messages, which are not as sensitive to the latency overheads that matter for the MPI structure of particle dynamics or partial differential equation solvers. Thus clouds are an interesting architecture here, and one can introduce a "Map Collective" programming abstraction that can be supported by either MPI or iterative versions of MapReduce.
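To make the "Map Collective" idea concrete, here is a minimal, hypothetical K-means sketch (not from the paper): each iteration maps data chunks to their nearest centres and then performs a single collective reduction/broadcast step to recompute and redistribute the centres. The data, chunking, and iteration count are illustrative placeholders.

    import random

    def map_assign(points, centres):
        """Map stage: assign each point to its nearest centre and accumulate partial sums."""
        partial = {i: (0.0, 0) for i in range(len(centres))}
        for p in points:
            i = min(range(len(centres)), key=lambda c: abs(p - centres[c]))
            s, n = partial[i]
            partial[i] = (s + p, n + 1)
        return partial

    def reduce_centres(partials, old_centres):
        """Collective stage: merge partial sums from all map tasks and broadcast new centres."""
        totals = {i: (0.0, 0) for i in range(len(old_centres))}
        for part in partials:
            for i, (s, n) in part.items():
                S, N = totals[i]
                totals[i] = (S + s, N + n)
        return [totals[i][0] / totals[i][1] if totals[i][1] else old_centres[i]
                for i in range(len(old_centres))]

    if __name__ == "__main__":
        random.seed(0)
        data = [random.gauss(0, 1) for _ in range(500)] + [random.gauss(10, 1) for _ in range(500)]
        chunks = [data[i::4] for i in range(4)]      # four "map tasks"
        centres = [0.5, 5.0]                         # initial guess, broadcast to all tasks
        for _ in range(10):                          # the iteration loop absent from plain MapReduce
            centres = reduce_centres([map_assign(c, centres) for c in chunks], centres)
        print([round(c, 2) for c in centres])        # converges to roughly [0.0, 10.0]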
 
Supporting the three categories suitable for clouds raises important issues, especially regarding the data architecture, where one needs to move the computing to the data, which is typically not easy in today's HPC or cloud environments. We discussed this in a previous note. In the last section we discuss a missing component that must be addressed.

3. Data Analytics Libraries

Here we note that in the hugely successful but largely simulation-oriented HPCC activities starting around 1990, an important activity was the design and construction of core libraries such as PETSc and SCALAPACK (becoming PLASMA now [17]) and of underlying technologies such as BLACS and MPI. Data intensive cloud applications require scalable parallel analysis routines, and these will cross many application areas just as the earlier HPCC libraries enabled differential equation solvers and linear algebra across many disciplines. We further expect that reliable data analysis will need new robust algorithms, mimicking the oft-quoted observation that HPC progress has benefited equally from Moore's Law-driven hardware improvements and from new algorithms. These observations motivate the introduction of SPIDAL, the Scalable Parallel Interoperable Data Analytics Library, to address the analysis of big data. Figure 1 shows the components of the project. We include communities with data intensive applications, which need to identify what library members should be built. Good existing examples are R [18] and Mahout [19], but these are not aimed at the high performance needed for large scale applications. As shown in Figure 1, we identify six layers and also five broad abstraction areas [20] whose definition allows library members to be built in a portable way. One abstraction is Jobs, where we can identify the Pilot Job concept [21, 22] to obtain interoperability; another is Communication, where we need both MPI and MapReduce patterns and will use iterative MapReduce to design a common abstraction; a third is the Data Layer, where one needs abstractions to support storage, access and transport (since SPIDAL algorithms will need to run interoperably with databases, NOSQL, wide area file systems and file systems like Hadoop's HDFS [23]). One also needs an Application Level Data abstraction between L2 and L3. Our final abstraction is the virtual machine or Appliance used to deploy applications, where one could use a recently developed template approach [24, 25, 26, 27, 28, 29, 30] that can be realized on bare metal or on commercial and private cloud VM managers. This supports both interoperability between different resources and preservation, so that scientific results using SPIDAL will be reproducible.

Figure 1: A Data Analytics Architecture with abstractions

Figure1ADataAnalyticsArchitecture.png
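As a purely illustrative sketch of the Communication abstraction described above (the class and method names are hypothetical, not SPIDAL APIs), a library kernel can be written against a small collective interface that could equally be backed by MPI or by an iterative-MapReduce runtime; only the trivial single-process backend is shown here.

    from abc import ABC, abstractmethod

    class Collective(ABC):
        """Common collective-communication abstraction that library algorithms code against."""
        @abstractmethod
        def allreduce_sum(self, local_partial):
            """Combine per-task partial vectors and return the global sum to every task."""

    class LocalCollective(Collective):
        """Trivial single-process backend, standing in for an MPI or iterative-MapReduce runtime."""
        def __init__(self, partials_from_all_tasks):
            self.partials = partials_from_all_tasks
        def allreduce_sum(self, local_partial):
            # A real backend would use MPI_Allreduce or a MapReduce "reduce";
            # here we already hold every task's partial vector, so we just sum columns.
            return [sum(col) for col in zip(*self.partials)]

    def library_kernel(local_partial, comm):
        """A library routine written only against the abstraction, so it stays portable."""
        return comm.allreduce_sum(local_partial)

    if __name__ == "__main__":
        partials = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # partial sums from three tasks
        comm = LocalCollective(partials)
        print(library_kernel(partials[0], comm))           # [9.0, 12.0]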

4. Acknowledgements

I would like to acknowledge Shantenu Jha, Madhav Marathe, Judy Qiu and Joel Saltz for discussions of SPIDAL architecture. This material is based upon work supported in part by the National Science Foundation under Grant No. 0910812 for "FutureGrid: An Experimental, High-Performance Grid Test-bed."

References

1

Geoffrey Fox, Tony Hey, and Anne Trefethen, Where does all the data come from? , Chapter in Data Intensive Science Terence Critchlow and Kerstin Kleese Van Dam, Editors. 2011. http://grids.ucs.indiana.edu/ptliupa...0from%20v7.pdf.

2

IDC. Cloud Computing's Role in Job Creation. 2012 [accessed 2012 March 6]; Sponsored by Microsoft Available from: http://www.microsoft.com/presspass/d...hite_Paper.pdf.

3

Cloud Computing to Bring 2.4 Million New Jobs in Europe by 2015. 2011 [accessed 2011 March 6]; Available from: http://www.eweek.com/c/a/Cloud-Compu...y-2015-108084/.

4

James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and A.H. Byers. Big data: The next frontier for innovation, competition, and productivity. 2011 [accessed 2012 August 23]; McKinsey Global Institute Available from: http://www.mckinsey.com/insights/mgi...for_innovation.

5

Jeffrey Dean and Sanjay Ghemawat, MapReduce: simplified data processing on large clusters. Commun. ACM, 2008. 51(1): p. 107-113. DOI:http://doi.acm.org/10.1145/1327452.1327492

6

Fox, G.C., R.D. Williams, and P.C. Messina, Parallel computing works! 1994: Morgan Kaufmann Publishers, Inc. http://www.old-npac.org/copywrite/pc...00000000000000

7

Geoffrey C. Fox, Data intensive applications on clouds, in Proceedings of the second international workshop on Data intensive computing in the clouds. 2011, ACM. Seattle, Washington, USA. pages. 1-2. DOI: 10.1145/2087522.2087524.

8

Jaliya Ekanayake, Thilina Gunarathne, Judy Qiu, Geoffrey Fox, Scott Beason, Jong Youl Choi, Yang Ruan, Seung-Hee Bae, and Hui Li, Applicability of DryadLINQ to Scientific Applications. January 30, 2010, Community Grids Laboratory, Indiana University. http://grids.ucs.indiana.edu/ptliupa...ryadReport.pdf.

9

Judy Qiu, Jaliya Ekanayake, Thilina Gunarathne, Jong Youl Choi, Seung-Hee Bae, Yang Ruan, Saliya Ekanayake, Stephen Wu, Scott Beason, Geoffrey Fox, Mina Rho, and H. Tang, Data Intensive Computing for Bioinformatics. December 29, 2009.
 

10

Kai Hwang, Geoffrey Fox, and Jack Dongarra, Distributed and Cloud Computing : from Parallel Processing to The Internet of Things. 2011: Morgan Kaufmann Publishers

11

Thilina Gunarathne, Bingjing Zhang, Tak-Lon Wu, and Judy Qiu, Scalable Parallel Computing on Clouds Using Twister4Azure Iterative MapReduce. Future Generation Computer Systems, 2012. To be published. http://grids.ucs.indiana.edu/ptliupa..._cr_submit.pdf

12

Judy Qiu, Thilina Gunarathne, and Geoffrey Fox, Classical and Iterative MapReduce on Azure, in Cloud Futures 2011 workshop. June 2-3, 2011. Microsoft Conference Center Building 33 Redmond, Washington United States. http://grids.ucs.indiana.edu/ptliupa...une2-2011.pptx.

13

Yingyi Bu, Bill Howe, Magdalena Balazinska, and Michael D. Ernst, HaLoop: Efficient Iterative Data Processing on Large Clusters, in The 36th International Conference on Very Large Data Bases. September 13-17, 2010, VLDB Endowment: Vol. 3. Singapore. http://www.ics.uci.edu/~yingyib/pape...mera_ready.pdf.

14

SALSA Group. Iterative MapReduce. 2010 [accessed 2010 November 7]; Twister Home Page Available from: http://www.iterativemapreduce.org/.

15

J.Ekanayake, H.Li, B.Zhang, T.Gunarathne, S.Bae, J.Qiu, and G.Fox, Twister: A Runtime for iterative MapReduce, in Proceedings of the First International Workshop on MapReduce and its Applications of ACM HPDC 2010 conference June 20-25, 2010. 2010, ACM. Chicago, Illinois. http://grids.ucs.indiana.edu/ptliupa...submission.pdf.

16

Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, and Ion Stoica, Spark: Cluster Computing with Working Sets, in 2nd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '10). June 22, 2010. Boston. http://www.cs.berkeley.edu/~franklin...s/hotcloud.pdf.

17

Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) project. [accessed 2012 September 6]; Available from: http://icl.cs.utk.edu/plasma/index.html.

18

The Comprehensive R Archive Network. [accessed 2012 August 22]; Available from: http://cran.r-project.org/.

19

Apache Mahout Scalable machine learning and data mining [accessed 2012 August 22]; Available from: http://mahout.apache.org/.

20

Shantenu Jha, Murray Cole, Daniel S. Katz, Manish Parashar, Omer Rana, and J. Weissman, Distributed Computing Practice for Large-Scale Science & Engineering Applications Concurrency and Computation: Practice and Experience (in press), 2012.

21

Andre Luckow, Mark Santcroos, Ole Weidner, Andre Merzky, Pradeep Mantha, and Shantenu Jha, P*: A Model of Pilot-Abstractions, in 8th IEEE International Conference on e-Science. 2012.

22

Pradeep Kumar Mantha, Andre Luckow, and S. Jha, Pilot-MapReduce: an extensible and flexible MapReduce implementation for distributed data, in Third international workshop on MapReduce and its Applications. 2012.

23

Apache. HDFS Overview. 2010 [accessed 2010 November 6]; Available from: http://hadoop.apache.org/hdfs/.

24

Jonathan Klinginsmith, M. Mahoui, and Y. M. Wu, Towards Reproducible eScience in the Cloud., in Third International Conference on Cloud Computing Technology and Science (CloudCom). November 29 - December 1, 2011. DOI: 10.1109/CloudCom.2011.89.

25

Jonathan Klinginsmith and Judy Qiu, Using Cloud Computing for Scalable, Reproducible Experimentation. August, 2012.

26

Gregor von Laszewski, Hyungro Lee, Javier Diaz, Fugang Wang, Koji Tanaka, Shubhada Karavinkoppa, Geoffrey C. Fox, and Tom Furlani, Design of an Accounting and Metric-based Cloud-shifting and Cloud-seeding framework for Federated Clouds and Bare-metal Environments, in Workshop on Cloud Services, Federation, and the 8th Open Cirrus Summit. September 21, 2012. San Jose, CA (USA).

27

Geoffrey C. Fox, Gregor von Laszewski, Javier Diaz, Kate Keahey, Jose Fortes, Renato Figueiredo, Shava Smallen, Warren Smith, and Andrew Grimshaw, FutureGrid - a reconfigurable testbed for Cloud, HPC and Grid Computing, Chapter in On the Road to Exascale Computing: Contemporary Architectures in High Performance Computing, Jeff Vetter, Editor. 2012, Chapman & Hall/CRC Press http://grids.ucs.indiana.edu/ptliupa...ka-chapter.pdf

28

Javier Diaz, Gregor von Laszewski, Fugang Wang, and Geoffrey Fox, Abstract Image Management and Universal Image Registration for Cloud and HPC Infrastructures, in IEEE CLOUD 2012 5th International Conference on Cloud Computing June 24-29 2012. Hyatt Regency Waikiki Resort and Spa, Honolulu, Hawaii, USA http://grids.ucs.indiana.edu/ptliupa...12_id-4656.pdf

29

J. Diaz, A. J. Younge, G. von Laszewski, F. Wang, and G. C. Fox, Grappling cloud infrastructure services with a generic image repository, in CCA11: Cloud Computing and Its Applications. April 12-13, 2011. Argonne National Laboratory, USA. http://grids.ucs.indiana.edu/ptliupa...gerepo-cca.pdf.

30

Javier Diaz, Gregor von Laszewski, Fugang Wang, Andrew J. Younge, and Geoffrey Fox, FutureGrid Image Repository: A Generic Catalog and Storage System for Heterogeneous Virtual Machine Images, in 3rd IEEE International Conference CloudCom on Cloud Computing Technology and Science. November 29 - December 1 2011. Athens Greece. http://grids.ucs.indiana.edu/ptliupa...oudCom2011.pdf

Agenda

Source: http://www.nist.gov/itl/cloud/upload...hop_agenda.pdf (PDF)

Slides: http://collaborate.nist.gov/twiki-cl...rumCCBGIAgenda

Webcast: http://www.nist.gov/itl/cloud/nist-j...op-webcast.cfm

http://www.nist.gov/itl/cloud/upload...inal_11413.pdf (PDF) Updated 1/14/13

National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD

DAY 1 - January 15, 2013

Cloud Computing AND Big Data
Forum & Workshop Agenda
January 15, 16 and 17, 2013
National Institute of Standards and Technology
100 Bureau Drive, Building 101, Gaithersburg, MD 20899
DAY 1 – Tuesday, January 15, 2013
 
8:00 Check-in (Administration Building 101)
 
8:00-9:00 Real World Collaboration with Academia, Government, Industry & Standards Development Organizations (Admin. Bldg. 101, Flag Hallway & Portrait Room)
FORUM – (NIST, Bldg. 101, Red Auditorium)
 
9:00-9:10 Welcome & Opening Remarks
Patrick Gallagher, Under Secretary of Commerce for Standards and Technology and Director, National Institute of Standards and Technology (NIST)
 
9:10-9:20 Keynote Steven L. VanRoekel, United States Chief Information Officer and Administrator, Office of Electronic Government, Office of Management and Budget
 
9:30-10:40 Panel: Convergence of Cloud and Big Data
Moderator: Charles H. Romine, Director, NIST Information Technology Laboratory (ITL)
1. Thomas Cynkin, Vice President and General Manager, Chief Corporate Representative, Fujitsu Ltd. Slides
2. Cyrus Wadia, Assistant Director for Clean Energy & Materials Research and Development, White House Office of Science and Technology Policy
3. Alex Szalay, Alumni Centennial Professor, Department of Physics and Astronomy, The Johns Hopkins University Slides
4. Geoffrey Fox, Professor of Informatics and Computing, Physics, Indiana University Slides Slides
5. Peter Tseronis, Chief Technology Officer, Department of Energy
 
10:40-11:00 Introduction to Forum and Workshop
Christopher L. Greer, Acting Senior Advisor for Cloud Computing and Associate Director, Program Implementation Office, ITL, NIST
USG Cloud Computing Technology Roadmap Progress
Robert Bohn, NIST Program Manager for Cloud Computing
 
11:00-11:30 Break (on your own; cafeteria accepts cash, credit cards)
 
 
11:30-12:30 Progress on Standards for Interoperability between Clouds
Moderator: John Messina, NIST Cloud Computing Reference Architecture and Taxonomy Working Group Co-chair
1. Chris Davis, Compliance Solutions Architect, Verizon Enterprise Solutions
2. Rob Vietmeyer, Chief Information Officer, Department of Defense (CANCELLED)
3. Alan Yoder, Manager, Storage Standards at Huawei; SNIA Liaison Officer at ISO JTC 1 SC 38; Secretary, Governing Board at Storage Management Initiative, SNIA Slides
Steven Woodward, CEO Cloud Perspectives Slides
 
12:30-1:30 Focus Session: USG Cloud Computing Technology Roadmap Priority Action Plan (PAP) Progress and Examples
Moderator: Robert Bohn, NIST Cloud Computing Program Manager and Reference Architecture and Taxonomy Working Group Co-chair
Requirement 1: International voluntary consensus based interoperability, portability and security standards
Stephen L. Diamond, Chair, Institute of Electrical and Electronics Engineers (IEEE) Cloud Computing Standards Committee Slides
Requirement 2: Solutions for High Priority Security Requirements
Sharad Mehrotra, Professor of Computer Science, University of California, Irvine
Requirement 3: Technical specifications to enable development of consistent, high quality Service Level Agreements
Ken Stavinoha, Solutions Architect, Cisco Systems
Requirement 4: Clearly & consistently categorized cloud services
Joel J. Fleck, II, Senior Standards Architect, Global Technology Program, HP Labs, Hewlett-Packard Company Slides
 
1:30-2:45 Lunch (On your own; cafeteria accepts cash, credit cards)
Real-World Collaboration with Academia, Government, Industry & Standards Development Organizations (Admin. Bldg. 101, Flag Hallway & Portrait Room)
 
Optional Session, NITRD Interagency Big Data Senior Steering Group (BDSSG), Lecture Room A
This special topic briefing occurs concurrently from 1:45 – 2:30 pm PLEASE BRING YOUR LUNCH & JOIN US
• George O. Strawn, Director, Federal Networking and Information Technology Research and Development (NITRD), National Coordination Office (NCO)
• Suzanne Iacono, Senior Science Advisor for the Directorate for Computer and Information Science and Engineering (CISE) at the National Science Foundation (NSF)
• Wendy Wigen, Coordinator for BDSSG
 
2:45-4:15 Focus Session: USG Cloud Computing Technology Roadmap Priority Action Plan (PAP) Progress and Examples
Moderator: Michael Hogan, NIST Cloud Computing Standards Roadmap Working Group Co-chair
Requirement 5: Frameworks to support seamless implementation of federated community cloud environments
Alan Sill, Senior Scientist, and Site Director, Center for Cloud and Autonomic Computing at Texas Tech University; Vice President of Standards, Open Grid Forum (OGF), Co-Chair, Standards Acceleration to Jumpstart the Adoption of Cloud Computing (SAJACC) Working Group Slides
Requirement 6: Updated Organization Policy that reflects the Cloud Computing Business and Technology model
Agnieszka Wodecka, Legal Officer, Unit E2: Software and Services, Cloud Computing at DG Connect, European Commission Slides
Requirement 7: Defined unique government regulatory requirements, technology gaps, and solutions
Craig Lee and Melvin Greer, Network Centric Operations Industry Consortium (NCOIC) Slides
Requirement 8: Collaborative parallel strategic “future cloud” development initiatives
Kyoung-Sook Kim, Researcher, National Institute of Communications Technology, Japan
Requirement 9: Defined and implemented reliability design goals
Shawn Veney, Principal GRC (Governance, Risk & Compliance) Architect for Microsoft Office 365
Requirement 10: Defined and implemented cloud service metrics
Daniel Burton, Senior Vice President and General Manager, Global Public Sector, Salesforce Slides
 
4:15-4:30 Closing Remarks
Christopher L. Greer, NIST Senior Advisor for Cloud Computing and Associate Director, Program Implementation Office, ITL, NIST

DAY 2 – January 16, 2013

 
8:00 Check-in (Administration Building 101)
 
8:00-9:00 Real World Collaboration with Academia, Government, Industry & Standards Development Organizations (Admin. Bldg. 101, Flag Hallway and Lecture Room B)
FORUM – (NIST, Bldg. 101, Red Auditorium)
 
9:00-9:10 Welcome & Opening Remarks
Charles H. Romine, Director, NIST Information Technology Laboratory
 
9:10-9:40 Keynote Vint Cerf, Vice President and Chief Internet Evangelist, Google
 
9:40-10:40 Big Data Use-Cases Government Perspectives
Moderator: Ashit Talukder, Chief, Information Access Division, ITL, NIST
1. Peter Levin, Chief Technology Officer, Veterans Affairs Slides
2. Peter Lyster, Program Director, Division of Biomedical Technology, Bioinformatics, and Computational Biology, National Institute of General Medical Sciences, National Institutes of Health, (NIH/NIGMS) Slides
3. Sasi Pillay, Chief Technology Officer for IT, National Aeronautics and Space Administration (NASA)
 
10:40-11:00 Break (on your own; cafeteria accepts cash, credit cards)
Real World Collaboration with Academia, Government, Industry & Standards Development Organizations (Admin. Bldg. 101, Flag Hallway and Portrait Room)
 
11:00-12:00 Big Data Lifecycle Management: Measurement Science, Benchmarking and Evaluation Challenges
Moderator: John Henry Scott, Physicist, Materials Measurement Science Division, Material Measurement Laboratory, NIST
1. Brian Athey, Office of the Chair, Dept. of Computational Medicine and Bioinformatics (DCM&B), University of Michigan Slides
2. Zachary G. Goldstein, Deputy Chief Information Officer, National Oceanic and Atmospheric Administration (NOAA)
3. Michael L. Norman, Director of the San Diego Supercomputer Center and Distinguished Professor of Physics, University of California, San Diego
 
12:00-1:00 Panel: Big Data Analytics and Solutions
Moderator: John Garofolo, Senior Advisor, Information Access Division, ITL, NIST
1. Adam Fuchs, Chief Technology Officer, Sqrrl Slides
2. Will Cukierski, Data Scientist, Kaggle Slides
3. Philip (Flip) Kromer, Co-Founder and Chief Technology Officer, Infochimps Slides
4. Chaitan Baru, Director, Center for Large-scale Data Systems research (CLDS), San Diego Supercomputer Center Slides
 
1:00-2:30 Lunch (on your own; cafeteria accepts cash, credit cards)
Real World Collaboration with Academia, Government, Industry & Standards Development Organizations (Admin Bldg. 101, Flag Hallway & Portrait Room)
 
Optional Sessions: 1:15-2:20pm
Please bring your lunch and Join us
Session One, Lecture Room A: Standards Roadmap Working Group Meeting
Michael Hogan and Annie Sokol, NIST Cloud Computing Standards Roadmap Working Group Co-chairs
Session Two, Lecture Room B: NIST Cloud Computing Security Working Group Meeting
Michaela Iorga, NIST Senior Security Technical Lead for Cloud Computing; NIST Chair, Cloud Computing Security Working Group
Session Three, Lecture Room C: Standards Acceleration to Jumpstart the Adoption of Cloud Computing (SAJACC) Working Group Meeting
Alan Sill, Co-Chair, SAJACC Working Group; Senior Scientist, and Site Director, Center for Cloud and Autonomic Computing at Texas Tech University; Vice President of Standards, Open Grid Forum (OGF)
Session Four, Lecture Room D : Cloud Metric Working Sub-Group Meeting
Frederic de Vaulx, Chair, Cloud Metrics Sub-group of the Reference Architecture and Taxonomy Working Group
 
2:30-3:30 Big Data Analytics, Processing and Interaction: Measurement Science, Benchmarking and Evaluation Challenges
Moderator: Ashit Talukder, Chief, Information Access Division, ITL, NIST
1. Christopher D. Carothers, Professor, Department of Computer Science, Rensselaer Polytechnic Institute Slides
2. Ramakrishna Akella, Professor of Information Systems and Technology Management, University of California Santa Cruz
3. Chris White, Program Manager, Defense Advanced Research Projects Agency (DARPA)
 
3:30-3:45 Break (on your own; cafeteria accepts cash, credit cards)
Real World Collaboration with Academia, Government, Industry & Standards Development Organizations (Admin. Bldg. 101, Flag Hallway & Portrait Room)
 
3:45-4:45 Big Data Infrastructure
Moderator: Mary Brady, Leader, Information Systems Group, Software and Systems Division, ITL, NIST
1. Amr Awadallah, Chief Technology Officer, Cloudera Slides
2. Mark Ryland, Amazon Web Services Slides
3. Jason Matheny, Program Manager, Intelligence Advanced Research Projects Activity (IARPA) Slides
 
4:45-5:00 Summary of first Big Data event and Day 2 Closing Remarks
Christopher L. Greer, Acting Senior Advisor for Cloud Computing, Associate Director, Program Implementation Office, ITL, NIST

DAY 3 – January 17, 2013

 
8:00 Check-in (Administration Building 101)
 
9:00 Welcome, James A. St. Pierre, NIST ITL Deputy Director
 
9:10 Cloud Workshop Introduction, Robert Bohn, NIST Program Manager for Cloud Computing
 
9:20 Big Data Workshop Introduction, Ashit Talukder, Chief, Information Access Division, ITL, NIST
 
9:30 Morning Breakout Sessions (130 Minutes)
B01: USG Cloud Computing Technology Roadmap Volume III Progress
Eric Simmon, NIST Cloud Computing Roadmap Volume III primary author and lead, Co-Chair, Federal Cloud Computing Standards and Technology Working Group
• Business Use Cases
David A. Lifka, Director, Cornell University Center for Advanced Computing (CAC); Director of Research Computing, Weill Cornell Medical College (WCMC); Adjunct Associate Professor, Cornell Computing and Information Science (CIS), Cornell University
• Standards Acceleration to Jumpstart the Adoption of Cloud Computing (SAJACC)
Alan Sill, Co-Chair, SAJACC Working Group, Senior Scientist, and Site Director, Center for Cloud and Autonomic Computing at Texas Tech University; Vice President of Standards, Open Grid Forum (OGF)
 
B02: International Cloud Computing Standards Progress and Considerations
Michael Hogan and Annie Sokol, NIST Cloud Computing Standards Roadmap Working Group Co-chairs
• Status Report on JTC1 & ITU-T Collaborative Teams on Cloud Computing
• Progress Update of NIST SP 500-291 – Standards Roadmap
 
B03: NIST Research and Development in Cloud Metrics and Cloud Security (progress, future work and discussion)
• NIST Cloud Computing Metrics
Frederic De Vaulx, NIST Associate, Chair, Metrics Working Sub-Group of Reference Architecture & Taxonomy
• NIST Cloud Computing Security Reference Architecture
Michaela Iorga, NIST Senior Security Technical Lead for Cloud Computing; NIST Chair, Cloud Computing Security Working Group; NIST Co-Chair, Cloud Computing Forensic Science Working Group Slides
 
B04: Discussion of Big Data Potential Collaborative Alliances
John Garofolo, Senior Advisor, Information Access Division, ITL, NIST
 
B05: Big Data Definition & Characteristics, and Standards & Measurement Science Needs
Christopher L. Greer, Acting Senior Advisor for Cloud Computing, Associate Director, Program Implementation Office, ITL, NIST and Ashit Talukder Chief, Information Access Division, ITL, NIST
 
11:40 Report Out & Workshop Next Steps: Cloud Computing and Big Data
 
12:30 Closing Remarks
Christopher L. Greer, Acting Senior Advisor for Cloud Computing, Associate Director, Program Implementation Office, ITL, NIST
 
12:40 Adjourn