HPDA in eScience with the Ophidia framework
PyOphidia Usage Tutorial
High Performance Data Analytics with the Ophidia framework
Overview
This training gives an overview of the main features of the Ophidia HPDA framework for climate data analysis, followed by a practical tutorial that applies the framework to real-world analysis examples. It covers the key concepts needed to start using Ophidia effectively in HPDA applications. The contents are taken from the ESiWACE2 2021 course on HPDA and Visualisation. The code shown in the practical part is based on PyOphidia v1.9, the Ophidia Python bindings. For questions, please contact ophidia-info@cmcc.it.
Introduction to the Ophidia framework
Managing the growing data volumes in many scientific domains requires analysis tools that scale with these massive datasets.
Ophidia (http://ophidia.cmcc.it/) is a CMCC Foundation research effort addressing Big Data challenges for eScience. The Ophidia framework is an open-source solution for the analysis of multi-dimensional scientific data that combines HPC paradigms with Big Data approaches. It supports parallel, in-memory data processing, data-driven task scheduling and server-side analysis. Ophidia is used primarily in the climate change domain, although it has also been applied successfully in other scientific domains.
This section introduces the Ophidia framework, its design and the main features supported for data management and analysis.
Outline
- Ophidia framework overview and motivations
- Main features and interaction modes provided
- Framework architecture
- Storage model, data partitioning and mapping to NetCDF files
- Data analytics operators and primitives
- Introduction to PyOphidia
Note: the slides in the attachment differ slightly from those in the video, as they were updated after the event.
Practical tutorial with PyOphidia
This section is a practical walkthrough of the framework's features, using the PyOphidia Python module on real climate data. It includes step-by-step instructions for running the tutorial and a listing of the tutorial notebook.
The tutorial and hands-on notebooks, together with full instructions for running them, are available on GitHub: https://github.com/ESiWACE/hpda-vis-training/tree/master/Training2021/Session3
Outline
- Accessing and setting up the tutorial environment
- Loading a NetCDF file
- Managing the cube virtual space
- Analysing an Ophidia datacube
- Plotting the results on a map
- Performing time series analysis
Note: the slides in the attachment differ slightly from those in the video, as they were updated after the event.
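To give a feel for the first steps in the outline above (loading a NetCDF file, inspecting the cube virtual space, and analysing a datacube), here is a minimal sketch. It is not the tutorial notebook's code. The file path and variable name in the usage note are hypothetical, the helper names `import_params` and `run_sketch` are invented for illustration, and the PyOphidia calls should be checked against the documentation of the installed version. Running `run_sketch` requires access to an Ophidia server.

```python
def import_params(src_path, measure):
    """Build keyword arguments for the NetCDF import (illustrative values only)."""
    return {
        "src_path": src_path,   # NetCDF file readable by the Ophidia server
        "measure": measure,     # name of the variable to import
        "imp_dim": "time",      # assumption: time is stored as the array dimension
        "description": f"{measure} imported from NetCDF",
    }


def run_sketch(src_path, measure):
    """Import a variable, reduce it over time and return the result as a dict."""
    # Imported here so this file can be loaded without PyOphidia installed.
    from PyOphidia import cube

    # Connect to the server; connection details are read from the environment.
    cube.Cube.setclient(read_env=True)

    # Load the NetCDF variable into a new datacube.
    mycube = cube.Cube.importnc2(**import_params(src_path, measure))

    # Show the datacubes currently available in the virtual space.
    cube.Cube.list(level=2)

    # Compute the maximum over the whole time dimension for each grid point.
    maxcube = mycube.reduce(operation="max", description="Max over time")

    # Fetch the reduced values on the client side.
    data = maxcube.export_array()

    # Remove the datacubes created by this sketch.
    maxcube.delete()
    mycube.delete()
    return data
```

With a reachable server, a call such as `run_sketch("/path/to/tasmax.nc", "tasmax")` (both arguments hypothetical) would return the exported values, which could then be passed to a plotting library of choice.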