The goal of this partnership was to implement a scalable and sustainable multi-institutional cyberinfrastructure cloud federation model that provides data analysis building blocks in support of multiple research disciplines requiring flexible workflows and analysis tools for large-scale data sets.
The project supported a rich set of open-source software and cloud usage modalities (e.g., Apache Hadoop for large data sets, virtual machine (VM) snapshots of complex software systems, and dynamically sized SLURM clusters) and implemented and optimized frameworks for query-based exploration, modeling, and analysis.
Research scientists and users strategically explored problems of increasing complexity and correspondingly larger data volumes. Data challenges from a diverse range of fields (earth and atmospheric sciences, finance, chemistry, astronomy, civil engineering, genomics, and others) were addressed with collaborators from over 40 academic institutions, public agencies, and research labs, as well as citizen scientists. In addition to advancing scientific knowledge, this project advanced the understanding of federated and hybrid cloud computing and their potential roles as campus bridging paradigms.