Supercomputing and the Scientist
Debbie Bard
Group Lead for Data Science Engagement
NERSC is the mission HPC and data facility for the U.S.
Department of Energy Office of Science
Largest funder of physical science research in U.S.
Science areas: Biology, Energy, Environment; Computing; Materials, Chemistry, Geophysics; Particle Physics, Astrophysics; Nuclear Physics; Fusion Energy, Plasma Physics
Simulations at scale
Data analysis support for
DOE’s experimental and
observational facilities
7,000 Users
800 Projects
700 Codes
~2000 publications per year
NERSC Systems: present and future
• 2013: NERSC-7 (Edison), multicore CPU
• 2016: NERSC-8 (Cori), manycore CPU. NESAP launched: transition applications to advanced architectures
• 2020: NERSC-9, CPU and GPU nodes. Continued transition of applications and support for complex workflows
• 2024: NERSC-10, Exa system
• 2028: NERSC-11, Beyond Moore
What’s a supercomputer anyway?
Supercomputers have super-fast interconnect
between nodes
Images: cluster cabinet; Cori cabinet
Supercomputers have specialist storage systems
• Scale out the file system to 100s of
storage servers
• Access FS over high-speed
interconnect: high aggregate
bandwidth
• Global, coherent namespace
– Easy for scientists to use
– Hard to scale up metadata
operations!
Diagram: Compute Nodes, IO Nodes, Storage Servers
How do you distribute PBs of data and millions of files to
hundreds of thousands of compute cores, with no latency?
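The trade-off on this slide (high aggregate bandwidth, but metadata operations that are hard to scale) can be illustrated with a back-of-envelope model. This is only a sketch: the function names, the server count, the per-server bandwidth and the metadata rate are hypothetical values chosen for illustration, not measurements of any NERSC file system.

```python
def aggregate_bandwidth(n_servers, per_server_gbs):
    """Ideal aggregate bandwidth in GB/s when data is striped over all servers."""
    return n_servers * per_server_gbs


def read_time_seconds(total_gb, n_files, n_servers, per_server_gbs, metadata_ops_per_sec):
    """Rough read time: ideal streaming plus one metadata operation per file."""
    streaming = total_gb / aggregate_bandwidth(n_servers, per_server_gbs)
    metadata = n_files / metadata_ops_per_sec
    return streaming + metadata


# Hypothetical system: 200 storage servers at 3.5 GB/s each,
# and a metadata service that handles 50,000 operations per second.
SERVERS, PER_SERVER_GBS, METADATA_RATE = 200, 3.5, 50_000

# The same 1 PB (1,000,000 GB) stored as a thousand files or as a billion files.
few_large = read_time_seconds(1_000_000, 1_000, SERVERS, PER_SERVER_GBS, METADATA_RATE)
many_small = read_time_seconds(1_000_000, 1_000_000_000, SERVERS, PER_SERVER_GBS, METADATA_RATE)

print(f"few large files:  {few_large:,.0f} s")
print(f"many small files: {many_small:,.0f} s")
```

In this toy model the streaming term is identical in both cases; the difference comes entirely from the per-file metadata term, which is why adding storage servers alone does not help workloads made of many small files.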
Supercomputing usability for experimental science
Easier to:
• Write and run large-scale
parallelized code over
10,000s nodes.
• Read in and write out
huge data files
Harder to:
• Port code directly from your
laptop/cluster.
• Read and write lots of small files.
• Get fast turnaround on your
compute jobs.
• Stream data from external source.
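One common way to soften the "lots of small files" problem is to bundle them into a single container file before moving them to the parallel file system. The sketch below uses Python's standard tarfile module; the function names and the directory layout are assumptions made for illustration, and other container formats would serve equally well.

```python
import tarfile
import tempfile
from pathlib import Path


def bundle_small_files(src_dir, archive_path):
    """Pack every regular file under src_dir into one uncompressed tar archive.

    archive_path should live outside src_dir so the archive does not include itself.
    Returns the number of files packed.
    """
    src = Path(src_dir)
    count = 0
    with tarfile.open(archive_path, "w") as tar:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                tar.add(path, arcname=path.relative_to(src).as_posix())
                count += 1
    return count


def read_member(archive_path, name):
    """Read one file back out of the archive without unpacking the rest."""
    with tarfile.open(archive_path, "r") as tar:
        return tar.extractfile(name).read()


# Demo with 100 tiny files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    frames = Path(tmp) / "frames"
    frames.mkdir()
    for i in range(100):
        (frames / f"frame_{i:03d}.txt").write_text(f"reading {i}\n")
    archive = Path(tmp) / "frames.tar"
    packed = bundle_small_files(frames, archive)
    print(packed, read_member(archive, "frame_007.txt"))
```

The file system then sees one large sequential file instead of a hundred separate creates and opens, which matches the "easier" column above.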
Cori: Pre-Exascale System for DOE Science
• Cray XC System
• >9600 68-core Intel KNL compute nodes, >2800 32-core Intel Haswell nodes
• Cray Aries Interconnect
• NVRAM Burst Buffer, 1.6PB of SSDs, 1.7TB/sec I/O
• Lustre file system 28 PB of disk, >700 GB/sec I/O
• Investments to support large scale data analysis
– High bandwidth direct connection between experimental facilities and compute
nodes
– Virtualization capabilities (Shifter/Docker)
– More login nodes for managing advanced workflows
– Support for real time and high-throughput queues
#5 most powerful computer on the planet in Nov 2016.
#12 today.
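To put the Cori figures above in perspective, the following sketch turns them into a few derived numbers. It treats the ">" values as lower bounds and assumes decimal units (1 PB = 1000 TB = 1,000,000 GB); the constant and function names are made up for this example.

```python
# Lower bounds quoted on the slide.
KNL_NODES, KNL_CORES_PER_NODE = 9600, 68
HASWELL_NODES, HASWELL_CORES_PER_NODE = 2800, 32


def total_cores():
    """Minimum total core count across both node types."""
    return KNL_NODES * KNL_CORES_PER_NODE + HASWELL_NODES * HASWELL_CORES_PER_NODE


def seconds_to_transfer(capacity, bandwidth_per_sec):
    """Time to stream `capacity` at `bandwidth_per_sec` (same units for both)."""
    return capacity / bandwidth_per_sec


cores = total_cores()
burst_buffer_s = seconds_to_transfer(1600, 1.7)    # 1.6 PB in TB, at 1.7 TB/s
lustre_s = seconds_to_transfer(28_000_000, 700)    # 28 PB in GB, at 700 GB/s

print(f"cores: at least {cores:,}")
print(f"burst buffer end to end: about {burst_buffer_s / 60:.1f} minutes")
print(f"Lustre end to end: about {lustre_s / 3600:.1f} hours")
```

Under these assumptions the burst buffer can be read or written in full in roughly a quarter of an hour, while streaming the whole Lustre file system at peak rate would take on the order of half a day.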
NERSC-9: A System Optimized for Science
• Cray Shasta System providing 3-4x capability of Cori system
• First NERSC system designed to meet needs of both large scale
simulation and data analysis from experimental facilities
– Includes both NVIDIA GPU-accelerated and AMD CPU-only nodes
– Cray Slingshot network for Terabit-rate connections to system
– Optimised data software stack enabling analytics and Machine Learning at scale
– All-flash file system for accelerated IO
The needs of experimental facilities drive the design of our
supercomputers
Figure labels: Experiments operating now; Future experiments; BioEPIC
How Computing impacts experimental science
• Inform Experiments
– Simulations guide instrument design
– Simulations guide experimental methodology
– Real-time feedback guides experimental
operations
GEANT4 ATLAS model
Advanced Light Source data analysis
• Analyze Data
– Convert measured phenomena into
meaningful statistics
– Compare theory to measurement
• Replace Hardware
– Why solve a problem in hardware if you can
solve it in software?
GEANT4 ATLAS model
Viz of mouse brain ions
LSST CCD tree ring effects
Enabling new discoveries by coupling experimental science with
large scale data analysis and simulations
Supercomputing for real-time experiments
• How does photosynthesis happen?
• How do drugs dock with proteins in our cells?
• Why do jet engines fail?
Super-intense femtosecond X-ray pulses, >10PB data, up to 100 PF required for analysis
Supercomputing for data analysis
A billion proton-proton collisions per second and multi-GB of data per second.
• What is the relationship
between fundamental
particles?
• What is the mechanism that
gives matter mass?
Supercomputing for sequencing
• How does the soil
microbiome impact crop
success?
• How did viruses evolve?
• Can we engineer
enzymes for more
effective carbon
fixation?
>170 trillion bases sequenced per year, >7PB of archived data, >100,000 users
Supercomputing to enable radical new detectors
FPGA-based
readout system
4D scanning transmission electron microscope, >1TB/sec data
• How does the structure
of batteries impact
their performance?
• Can nanocrystals be
used to store carbon
dioxide?
Custom computing for scientific data
• Enable experimentation with data
reduction and analysis techniques
– Enable higher frame rates
– Real-time data quality feedback
– New analysis algorithms
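The slides do not say which data reduction techniques are used, so the following is only a generic illustration of the idea: keep the detector pixels that exceed a threshold and store them as sparse events instead of full frames. The function names, the threshold and the toy frame are all hypothetical.

```python
def reduce_frame(frame, threshold):
    """Keep only pixels above threshold, as (row, column, value) triples."""
    return [
        (r, c, value)
        for r, row in enumerate(frame)
        for c, value in enumerate(row)
        if value > threshold
    ]


def reduction_factor(frame, events):
    """Ratio of numbers stored in the dense frame to numbers stored as events.

    Each event stores three numbers (row, column, value).
    """
    n_pixels = sum(len(row) for row in frame)
    return n_pixels / (3 * max(len(events), 1))


# Toy 4x4 frame with two bright pixels on a noisy background.
frame = [
    [0, 9, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 7],
    [0, 0, 1, 0],
]
events = reduce_frame(frame, threshold=5)
print(events, reduction_factor(frame, events))
```

On real detector frames, where hits are far sparser than in this toy example, the same approach would give a much larger reduction factor, and the event count per frame could double as a simple data quality signal.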
Supercomputing and Machine Learning
• Scientific data is typically
large and complex
– Harder to find optimal
hyperparameters
– Need lots of prototyping and
model evaluation
• Key metric: time to
scientific insight
– Don’t want to wait for days to train
a single model
– Fast turnaround of ideas and
exploration
→ use supercomputers to scale machine
learning algorithms to multiple nodes
Physics papers on the arXiv with abstracts
containing the phrase “deep learning”:
36 in 2016
133 in 2017
335 in 2018
109 in 2019
Determining the fundamental constants of cosmology
https://arxiv.org/abs/1808.04728
• Achieved unprecedented accuracy in cosmological parameter estimation.
• Scaled out to 8192 CPU nodes; 20min training time; 3.5PF sustained performance.
• Largest application of TensorFlow on CPU-based system with fully-synchronous updates.
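"Fully-synchronous updates" means every worker computes a gradient on its own shard of the data, the gradients are averaged across all workers, and every model replica applies the same update. The toy below simulates that pattern in plain Python for a one-parameter model; it is not the paper's TensorFlow implementation, and the function names, data and learning rate are assumptions for illustration.

```python
def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an allreduce: every worker ends up with the mean of all values."""
    return sum(values) / len(values)


def synchronous_sgd(shards, w=0.0, learning_rate=0.01, steps=200):
    """Data-parallel SGD where all replicas apply the same averaged gradient."""
    for _ in range(steps):
        # On a real system each worker computes its gradient in parallel.
        grads = [local_gradient(w, shard) for shard in shards]
        w -= learning_rate * allreduce_mean(grads)
    return w


# Synthetic data from y = 3x, split evenly across 4 simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w_fit = synchronous_sgd(shards)
print(w_fit)
```

With equally sized shards the averaged gradient equals the full-batch gradient, so the simulated 4-worker run follows the same trajectory as a single worker holding all the data; the cost on a real machine is that every step waits for the slowest worker and for the allreduce over the interconnect.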
Characterising Extreme Weather in a Changing Climate
https://arxiv.org/abs/1810.01993
• High quality segmentation results obtained for climate data.
• Network scaled out to 4560 Summit nodes (27,360 Volta GPUs).
• 60min training time, 0.99 EF sustained performance in 16-bit precision.
• Largest application of TensorFlow on GPU-based system, first Exascale DL application.
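The headline numbers for the two deep learning runs can be broken down per device with simple arithmetic. This sketch uses only the figures quoted on the slides; the variable names are made up here. The two results are not directly comparable, because the GPU figure is quoted in 16-bit precision and the slides do not state the precision of the CPU figure.

```python
# Climate segmentation run on Summit (figures from the slide).
SUMMIT_NODES = 4560
SUMMIT_GPUS = 27_360
SUMMIT_SUSTAINED_FLOPS = 0.99e18   # 0.99 EF, 16-bit precision

# Cosmology run on a CPU-based system (figures from the earlier slide).
CPU_NODES = 8192
CPU_SUSTAINED_FLOPS = 3.5e15       # 3.5 PF

gpus_per_node = SUMMIT_GPUS // SUMMIT_NODES
per_gpu_tflops = SUMMIT_SUSTAINED_FLOPS / SUMMIT_GPUS / 1e12
per_cpu_node_gflops = CPU_SUSTAINED_FLOPS / CPU_NODES / 1e9

print(f"GPUs per Summit node: {gpus_per_node}")
print(f"sustained per GPU: about {per_gpu_tflops:.1f} TF/s")
print(f"sustained per CPU node: about {per_cpu_node_gflops:.0f} GF/s")
```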
Exascale Deep Learning is driving innovation in
supercomputing
Open Questions
– How much of the future supercomputer workload will be
machine learning?
• How far will scientists be able to use machine learning?
• Will the interpretability problem ever be solved?
– Can specialist devices be used for other algorithms?
– What does the ideal storage system look like for AI?
• Supercomputers play an increasingly important role
in experimental science.
• High performance computing can change the way
scientists form their questions, and open new
possibilities in detector design, experiment
operations and data analysis.
• We need a co-evolution of experimental and
computing techniques to leverage bleeding-edge
technologies for scientific insight.
Thanks!
NERSC is hiring! nersc.gov/careers

More Related Content

What's hot

The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV DataThe DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV DataAnubhav Jain
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningRafael Ferreira da Silva
 
The World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC DatagridThe World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC DatagridSwiss Big Data User Group
 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009Ian Foster
 
10 Abundant-Data Computing
10 Abundant-Data Computing10 Abundant-Data Computing
10 Abundant-Data ComputingRCCSRENKEI
 
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open SourceRCCSRENKEI
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersIan Foster
 
Spark for Behavioral Analytics Research: Spark Summit East talk by John W u
Spark for Behavioral Analytics Research: Spark Summit East talk by John W uSpark for Behavioral Analytics Research: Spark Summit East talk by John W u
Spark for Behavioral Analytics Research: Spark Summit East talk by John W uSpark Summit
 
13 Supercomputer-Scale AI with Cerebras Systems
13 Supercomputer-Scale AI with Cerebras Systems13 Supercomputer-Scale AI with Cerebras Systems
13 Supercomputer-Scale AI with Cerebras SystemsRCCSRENKEI
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for ScienceIan Foster
 
Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)inside-BigData.com
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the ContinuumIan Foster
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Ian Foster
 
Scientific
Scientific Scientific
Scientific marpierc
 
Exascale Computing Project - Driving a HUGE Change in a Changing World
Exascale Computing Project - Driving a HUGE Change in a Changing WorldExascale Computing Project - Driving a HUGE Change in a Changing World
Exascale Computing Project - Driving a HUGE Change in a Changing Worldinside-BigData.com
 
Taming Big Data!
Taming Big Data!Taming Big Data!
Taming Big Data!Ian Foster
 
Sgg crest-presentation-final
Sgg crest-presentation-finalSgg crest-presentation-final
Sgg crest-presentation-finalmarpierc
 
Neural network-based low-frequency data extrapolation
Neural network-based low-frequency data extrapolationNeural network-based low-frequency data extrapolation
Neural network-based low-frequency data extrapolationOleg Ovcharenko
 
Transfer learning for low frequency extrapolation from shot gathers for FWI a...
Transfer learning for low frequency extrapolation from shot gathers for FWI a...Transfer learning for low frequency extrapolation from shot gathers for FWI a...
Transfer learning for low frequency extrapolation from shot gathers for FWI a...Oleg Ovcharenko
 
Feasibility of moment tensor inversion for a single-well microseismic data us...
Feasibility of moment tensor inversion for a single-well microseismic data us...Feasibility of moment tensor inversion for a single-well microseismic data us...
Feasibility of moment tensor inversion for a single-well microseismic data us...Oleg Ovcharenko
 

What's hot (20)

The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV DataThe DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
The DuraMat Data Hub and Analytics Capability: A Resource for Solar PV Data
 
The Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource ProvisioningThe Interplay of Workflow Execution and Resource Provisioning
The Interplay of Workflow Execution and Resource Provisioning
 
The World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC DatagridThe World Wide Distributed Computing Architecture of the LHC Datagrid
The World Wide Distributed Computing Architecture of the LHC Datagrid
 
Computing Outside The Box September 2009
Computing Outside The Box September 2009Computing Outside The Box September 2009
Computing Outside The Box September 2009
 
10 Abundant-Data Computing
10 Abundant-Data Computing10 Abundant-Data Computing
10 Abundant-Data Computing
 
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
09 The Extreme-scale Scientific Software Stack for Collaborative Open Source
 
Many Task Applications for Grids and Supercomputers
Many Task Applications for Grids and SupercomputersMany Task Applications for Grids and Supercomputers
Many Task Applications for Grids and Supercomputers
 
Spark for Behavioral Analytics Research: Spark Summit East talk by John W u
Spark for Behavioral Analytics Research: Spark Summit East talk by John W uSpark for Behavioral Analytics Research: Spark Summit East talk by John W u
Spark for Behavioral Analytics Research: Spark Summit East talk by John W u
 
13 Supercomputer-Scale AI with Cerebras Systems
13 Supercomputer-Scale AI with Cerebras Systems13 Supercomputer-Scale AI with Cerebras Systems
13 Supercomputer-Scale AI with Cerebras Systems
 
Learning Systems for Science
Learning Systems for ScienceLearning Systems for Science
Learning Systems for Science
 
Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)Update on the Exascale Computing Project (ECP)
Update on the Exascale Computing Project (ECP)
 
Coding the Continuum
Coding the ContinuumCoding the Continuum
Coding the Continuum
 
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
Accelerating the Experimental Feedback Loop: Data Streams and the Advanced Ph...
 
Scientific
Scientific Scientific
Scientific
 
Exascale Computing Project - Driving a HUGE Change in a Changing World
Exascale Computing Project - Driving a HUGE Change in a Changing WorldExascale Computing Project - Driving a HUGE Change in a Changing World
Exascale Computing Project - Driving a HUGE Change in a Changing World
 
Taming Big Data!
Taming Big Data!Taming Big Data!
Taming Big Data!
 
Sgg crest-presentation-final
Sgg crest-presentation-finalSgg crest-presentation-final
Sgg crest-presentation-final
 
Neural network-based low-frequency data extrapolation
Neural network-based low-frequency data extrapolationNeural network-based low-frequency data extrapolation
Neural network-based low-frequency data extrapolation
 
Transfer learning for low frequency extrapolation from shot gathers for FWI a...
Transfer learning for low frequency extrapolation from shot gathers for FWI a...Transfer learning for low frequency extrapolation from shot gathers for FWI a...
Transfer learning for low frequency extrapolation from shot gathers for FWI a...
 
Feasibility of moment tensor inversion for a single-well microseismic data us...
Feasibility of moment tensor inversion for a single-well microseismic data us...Feasibility of moment tensor inversion for a single-well microseismic data us...
Feasibility of moment tensor inversion for a single-well microseismic data us...
 

Similar to How HPC and large-scale data analytics are transforming experimental science

Stories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresStories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresSpark Summit
 
NERSC, AI and the Superfacility, Debbie Bard
NERSC, AI and the Superfacility, Debbie BardNERSC, AI and the Superfacility, Debbie Bard
NERSC, AI and the Superfacility, Debbie BardPacificResearchPlatform
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores inside-BigData.com
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Databricks
 
NASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & EngineeringNASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & Engineeringinside-BigData.com
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitGanesan Narayanasamy
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Databricks
 
Modern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High PerformanceModern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High Performanceinside-BigData.com
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research PlatformLarry Smarr
 
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...Facultad de Informática UCM
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...confluent
 
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...BigDataEverywhere
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facilityinside-BigData.com
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsTal Lavian Ph.D.
 
TeraGrid Communication and Computation
TeraGrid Communication and ComputationTeraGrid Communication and Computation
TeraGrid Communication and ComputationTal Lavian Ph.D.
 
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry ImpactAccelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impactinside-BigData.com
 
Implementing AI: Hardware Challenges
Implementing AI: Hardware ChallengesImplementing AI: Hardware Challenges
Implementing AI: Hardware ChallengesKTN
 
CHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning PlatformCHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning PlatformLarry Smarr
 

Similar to How HPC and large-scale data analytics are transforming experimental science (20)

Stories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi TorresStories About Spark, HPC and Barcelona by Jordi Torres
Stories About Spark, HPC and Barcelona by Jordi Torres
 
NERSC, AI and the Superfacility, Debbie Bard
NERSC, AI and the Superfacility, Debbie BardNERSC, AI and the Superfacility, Debbie Bard
NERSC, AI and the Superfacility, Debbie Bard
 
HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores HPC Cluster Computing from 64 to 156,000 Cores 
HPC Cluster Computing from 64 to 156,000 Cores 
 
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
 
NASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & EngineeringNASA Advanced Computing Environment for Science & Engineering
NASA Advanced Computing Environment for Science & Engineering
 
AI Super computer update
AI Super computer update AI Super computer update
AI Super computer update
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on Summit
 
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
Experience of Running Spark on Kubernetes on OpenStack for High Energy Physic...
 
Modern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High PerformanceModern Computing: Cloud, Distributed, & High Performance
Modern Computing: Cloud, Distributed, & High Performance
 
The Pacific Research Platform
The Pacific Research PlatformThe Pacific Research Platform
The Pacific Research Platform
 
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
Enabling Insight to Support World-Class Supercomputing (Stefan Ceballos, Oak ...
 
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
Big Data Everywhere Chicago: High Performance Computing - Contributions Towar...
 
Future of hpc
Future of hpcFuture of hpc
Future of hpc
 
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
40 Powers of 10 - Simulating the Universe with the DiRAC HPC Facility
 
Grid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applicationsGrid optical network service architecture for data intensive applications
Grid optical network service architecture for data intensive applications
 
TeraGrid Communication and Computation
TeraGrid Communication and ComputationTeraGrid Communication and Computation
TeraGrid Communication and Computation
 
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry ImpactAccelerators at ORNL - Application Readiness, Early Science, and Industry Impact
Accelerators at ORNL - Application Readiness, Early Science, and Industry Impact
 
Implementing AI: Hardware Challenges
Implementing AI: Hardware ChallengesImplementing AI: Hardware Challenges
Implementing AI: Hardware Challenges
 
CHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning PlatformCHASE-CI: A Distributed Big Data Machine Learning Platform
CHASE-CI: A Distributed Big Data Machine Learning Platform
 

More from inside-BigData.com

Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...Preparing to program Aurora at Exascale - Early experiences and future direct...
Preparing to program Aurora at Exascale - Early experiences and future direct...inside-BigData.com
 
Transforming Private 5G Networks
Transforming Private 5G NetworksTransforming Private 5G Networks
Transforming Private 5G Networksinside-BigData.com
 
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...The Incorporation of Machine Learning into Scientific Simulations at Lawrence...
The Incorporation of Machine Learning into Scientific Simulations at Lawrence...inside-BigData.com
 
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...
How to Achieve High-Performance, Scalable and Distributed DNN Training on Mod...inside-BigData.com
 
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...
Evolving Cyberinfrastructure, Democratizing Data, and Scaling AI to Catalyze ...inside-BigData.com
 
HPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural NetworksHPC Impact: EDA Telemetry Neural Networks
HPC Impact: EDA Telemetry Neural Networksinside-BigData.com
 
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean MonitoringBiohybrid Robotic Jellyfish for Future Applications in Ocean Monitoring
Biohybrid Robotic Jellyfish for Future Applications in Ocean Monitoringinside-BigData.com
 
Machine Learning for Weather Forecasts
Machine Learning for Weather ForecastsMachine Learning for Weather Forecasts
Machine Learning for Weather Forecastsinside-BigData.com
 
HPC AI Advisory Council Update
HPC AI Advisory Council UpdateHPC AI Advisory Council Update
HPC AI Advisory Council Updateinside-BigData.com
 
Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19Fugaku Supercomputer joins fight against COVID-19
Fugaku Supercomputer joins fight against COVID-19inside-BigData.com
 
Energy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic TuningEnergy Efficient Computing using Dynamic Tuning
Energy Efficient Computing using Dynamic Tuninginside-BigData.com
 
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Versal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud AccelerationVersal Premium ACAP for Network and Cloud Acceleration
Versal Premium ACAP for Network and Cloud Accelerationinside-BigData.com
 
Zettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance EfficientlyZettar: Moving Massive Amounts of Data across Any Distance Efficiently
Zettar: Moving Massive Amounts of Data across Any Distance Efficientlyinside-BigData.com
 
Scaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's EraScaling TCO in a Post Moore's Era
Scaling TCO in a Post Moore's Erainside-BigData.com
 
CUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computingCUDA-Python and RAPIDS for blazing fast scientific computing
CUDA-Python and RAPIDS for blazing fast scientific computinginside-BigData.com
 
Introducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi ClusterIntroducing HPC with a Raspberry Pi Cluster
Introducing HPC with a Raspberry Pi Clusterinside-BigData.com
 


How HPC and large-scale data analytics are transforming experimental science

  • 1. Supercomputing and the Scientist Debbie Bard Group Lead for Data Science Engagement
  • 2. NERSC is the mission HPC and data facility for the U.S. Department of Energy Office of Science. Largest funder of physical science research in the U.S. Science areas: Biology, Energy, Environment; Computing; Materials, Chemistry, Geophysics; Particle Physics, Astrophysics; Nuclear Physics; Fusion Energy, Plasma Physics
  • 3. NERSC is the mission HPC and data facility for the U.S. Department of Energy Office of Science. Largest funder of physical science research in the U.S. Simulations at scale. Data analysis support for DOE’s experimental and observational facilities. 7,000 users, 800 projects, 700 codes, ~2000 publications per year
  • 4. NERSC Systems: present and future • 2013: NERSC-7: Edison, multicore CPU • 2016: NERSC-8: Cori, manycore CPU; NESAP launched: transition applications to advanced architectures • 2020: NERSC-9: CPU and GPU nodes; continued transition of applications and support for complex workflows • 2024: NERSC-10: Exa system • 2028: NERSC-11: Beyond Moore
  • 5. What’s a supercomputer anyway?
  • 6. Supercomputers have super-fast interconnect between nodes. Image labels: Cluster cabinet; Cori cabinet
  • 7. Supercomputers have specialist storage systems • Scale out the file system to 100s of storage servers • Access FS over high-speed interconnect: high aggregate bandwidth • Global, coherent namespace – Easy for scientists to use – Hard to scale up metadata operations! Diagram labels: Compute Nodes; IO Nodes; Storage Servers. How do you distribute PBs of data and millions of files to hundreds of thousands of compute cores, with no latency?
  • 8. Supercomputing usability for experimental science Easier to: • Write and run large-scale parallelized code over 10,000s of nodes. • Read in and write out huge data files. Harder to: • Port code directly from your laptop/cluster. • Read and write lots of small files. • Get fast turnaround on your compute jobs. • Stream data from an external source.
  • 9. Cori: Pre-Exascale System for DOE Science • Cray XC System • >9600 68-core Intel KNL compute nodes, >2800 32-core Intel Haswell nodes • Cray Aries Interconnect • NVRAM Burst Buffer, 1.6PB of SSDs, 1.7TB/sec I/O • Lustre file system 28 PB of disk, >700 GB/sec I/O • Investments to support large scale data analysis – High bandwidth direct connection between experimental facilities and compute nodes – Virtualization capabilities (Shifter/Docker) – More login nodes for managing advanced workflows – Support for real time and high-throughput queues
  • 10. Cori: Pre-Exascale System for DOE Science • Cray XC System • >9600 68-core Intel KNL compute nodes, >2800 32-core Intel Haswell nodes • Cray Aries Interconnect • NVRAM Burst Buffer, 1.6PB of SSDs, 1.7TB/sec I/O • Lustre file system 28 PB of disk, >700 GB/sec I/O • Investments to support large scale data analysis – High bandwidth direct connection between experimental facilities and compute nodes – Virtualization capabilities (Shifter/Docker) – More login nodes for managing advanced workflows – Support for real time and high-throughput queues. #5 most powerful computer on the planet in Nov 2016. #12 today.
  • 11. NERSC-9: A System Optimized for Science • Cray Shasta System providing 3-4x capability of Cori system • First NERSC system designed to meet needs of both large scale simulation and data analysis from experimental facilities – Includes both NVIDIA GPU-accelerated and AMD CPU-only nodes – Cray Slingshot network for Terabit-rate connections to system – Optimised data software stack enabling analytics and Machine Learning at scale – All-flash file system for accelerated IO
  • 12. The needs of experimental facilities drive the design of our supercomputers. Figure labels: Future experiments; Experiments operating now; BioEPIC
  • 13. How Computing impacts experimental science • Inform Experiments – Simulations guide instrument design – Simulations guide experimental methodology – Real-time feedback guides experimental operations. Figure labels: GEANT4 ATLAS model; Advanced Light Source data analysis
  • 14. How Computing impacts experimental science • Inform Experiments – Simulations guide instrument design – Simulations guide experimental methodology – Real-time feedback guides experimental operations • Analyze Data – Convert measured phenomena into meaningful statistics – Compare theory to measurement. Figure labels: GEANT4 ATLAS model; Viz of mouse brain ions
  • 15. How Computing impacts experimental science • Inform Experiments – Simulations guide instrument design – Simulations guide experimental methodology – Real-time feedback guides experimental operations • Analyze Data – Convert measured phenomena into meaningful statistics – Compare theory to measurement • Replace Hardware – Why solve a problem in hardware if you can solve it in software? Figure labels: GEANT4 ATLAS model; Viz of mouse brain ions; LSST CCD tree ring effects
  • 16. Enabling new discoveries by coupling experimental science with large scale data analysis and simulations
  • 17. Supercomputing for real-time experiments • How does photosynthesis happen? • How do drugs dock with proteins in our cells? • Why do jet engines fail? Super-intense femtosecond X-ray pulses, >10PB data, up to 100 PF required for analysis
  • 18. Supercomputing for data analysis • What is the relationship between fundamental particles? • What is the mechanism that gives matter mass? A billion proton-proton collisions per second and multi-GB of data per second.
  • 19. Supercomputing for sequencing • How does the soil microbiome impact crop success? • How did viruses evolve? • Can we engineer enzymes for more effective carbon fixation? >170 trillion bases sequenced per year, >7PB of archived data, >100,000 users
  • 20. Supercomputing to enable radical new detectors • How does the structure of batteries impact their performance? • Can nanocrystals be used to store carbon dioxide? FPGA-based readout system; 4D scanning transmission electron microscope, >1TB/sec data
  • 21. Custom computing for scientific data • Enable experimentation with data reduction and analysis techniques – Enable higher frame rates – Real-time data quality feedback – New analysis algorithms
  • 22. Supercomputing and Machine Learning • Scientific data is typically large and complex – Harder to find optimal hyperparameters – Need lots of prototyping and model evaluation • Key metric: time to scientific insight – Don’t want to wait for days to train a single model – Fast turnaround of ideas and exploration → use supercomputers to scale machine learning algorithms to multiple nodes. Physics papers on the arXiv with abstracts containing the phrase “deep learning”: 36 in 2016, 133 in 2017, 335 in 2018, 109 in 2019
  • 23. Supercomputing and Machine Learning • Scientific data is typically large and complex – Harder to find optimal hyperparameters – Need lots of prototyping and model evaluation • Key metric: time to scientific insight – Don’t want to wait for days to train a single model – Fast turnaround of ideas and exploration → use supercomputers to scale machine learning algorithms to multiple nodes
  • 24. Determining the fundamental constants of cosmology https://arxiv.org/abs/1808.04728
  • 25. Determining the fundamental constants of cosmology • Achieved unprecedented accuracy in cosmological parameter estimation. • Scaled out to 8192 CPU nodes; 20min training time; 3.5PF sustained performance. • Largest application of TensorFlow on CPU-based system with fully-synchronous updates. https://arxiv.org/abs/1808.04728
  • 26. Characterising Extreme Weather in a Changing Climate https://arxiv.org/abs/1810.01993
  • 27. Characterising Extreme Weather in a Changing Climate • High quality segmentation results obtained for climate data. • Network scaled out to 4560 Summit nodes (27,360 Volta GPUs). • 60min training time, 0.99 EF sustained performance in 16-bit precision. • Largest application of TensorFlow on GPU-based system, first Exascale DL application. https://arxiv.org/abs/1810.01993
  • 28. Exascale Deep Learning is driving innovation in supercomputing
  • 29. Exascale Deep Learning is driving innovation in supercomputing. Open Questions – How much of the future supercomputer workload will be machine learning? • How far will scientists be able to use machine learning? • Will the interpretability problem ever be solved? – Can specialist devices be used for other algorithms? – What does the ideal storage system look like for AI?
  • 30. • Supercomputers play an increasingly important role in experimental science. • High performance computing can change the way scientists form their questions, and open new possibilities in detector design, experiment operations and data analysis. • We need a co-evolution of experimental and computing techniques to leverage bleeding-edge technologies for scientific insight.
  • 31. Thanks! NERSC is hiring! nersc.gov/careers
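
The scale figures quoted on slides 9, 25 and 27 can be sanity-checked with a little arithmetic. The Python sketch below is not part of the talk. It only recombines numbers taken from the transcript, treats values given as ">N" as lower bounds, and assumes sustained performance is spread evenly across nodes or GPUs. The function and variable names are hypothetical, chosen for illustration.

```python
# Back-of-envelope checks on figures quoted in the slides.
# All inputs are copied from the transcript; ">N" values are used as lower bounds.

# Cori (slide 9): node counts and cores per node.
KNL_NODES, KNL_CORES_PER_NODE = 9600, 68
HASWELL_NODES, HASWELL_CORES_PER_NODE = 2800, 32


def cori_core_lower_bound() -> int:
    """Lower bound on Cori's total CPU core count."""
    return (KNL_NODES * KNL_CORES_PER_NODE
            + HASWELL_NODES * HASWELL_CORES_PER_NODE)


def per_unit_tflops(total_pflops: float, units: int) -> float:
    """Average sustained TFLOP/s per node or per GPU (1 PFLOP/s = 1000 TFLOP/s).

    Assumes the load is spread evenly, which is a simplification.
    """
    return total_pflops * 1000.0 / units


# Cosmology run (slide 25): 3.5 PF sustained on 8192 CPU nodes.
cosmo_tflops_per_node = per_unit_tflops(3.5, 8192)

# Climate run (slide 27): 0.99 EF = 990 PF sustained, 16-bit precision,
# on 27,360 GPUs spread over 4560 nodes.
climate_tflops_per_gpu = per_unit_tflops(990.0, 27360)
gpus_per_node = 27360 // 4560

if __name__ == "__main__":
    print(f"Cori cores (lower bound): {cori_core_lower_bound():,}")
    print(f"Cosmology run: {cosmo_tflops_per_node:.2f} TFLOP/s per CPU node")
    print(f"Climate run: {climate_tflops_per_gpu:.1f} TFLOP/s per GPU, "
          f"{gpus_per_node} GPUs per node")
```

Under these assumptions the averages come out to roughly 0.43 TFLOP/s per CPU node for the cosmology run and about 36 TFLOP/s per GPU for the climate run. The two are not directly comparable, since the climate figure is quoted in 16-bit precision.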