The Cornell University Center for Advanced Computing, with partners University at Buffalo and University of California, Santa Barbara, is deploying data infrastructure building blocks (DIBBs) for multi-institutional cyberinfrastructure (CI). Our approach combines data analytics and flexible workflow management in a federated cloud model capable of supporting large-scale, shared, and collaborative data analysis. This project will be metrics-driven, leveraging system analytics provided by Open XDMoD and DrAFTS to ensure effective resource utilization and optimal time to science.
This project is multi-institutional in two important ways. First, it supports seven science use cases, with researchers and their collaborators at each partner site, and extends to collaborators located at other institutions. This gives extended research groups with common data interests and requirements access to shared data without each group having to replicate critical data assets.
Second, our cloud federation model enables the project’s storage assets to be shared among the three partner institutions, giving their researchers and collaborators elasticity that may not be financially or logistically feasible at any individual site, particularly as the number of researchers requiring large-scale data analysis grows.
An important goal of this project is to demonstrate a new model of cloud federation that offers institutions a sustainable and effective way to augment campus CI for collaborative data analysis. This model includes a common allocation and accounting mechanism that makes resource exchange between partner institutions transparent.