Overview of Use Case Development and Cross-Disciplinary Planning Activities
Before undertaking a full design of the prototype software and data archive, the DASPOS project will define use cases for archived data and software in HEP. The potential uses of the data determine almost everything about how it will be curated: the amount of mirroring required, when the data is archived, what connectivity to the storage is required, how the data is registered and retrieved, and any necessary policies around access and use.
While this preliminary effort focuses on the HEP community, many other disciplines face similar issues in preserving their data, software, and documentation. DASPOS will therefore engage other disciplines in this phase of the project. Doing so will capture insights from disciplines that have already begun to address archival issues and will identify commonalities that indicate opportunities for reusable infrastructure. Collaboration with existing interdisciplinary data preservation activities, such as the DataNet-initiated Data Conservancy project at Johns Hopkins, will ensure a coordinated approach and avoid duplication of effort. This task will also benefit greatly from the inclusion of experts from the OSG Consortium as contacts for this proposal, since the OSG is the major US platform for multi-disciplinary, distributed high-throughput computing.
Guided by the use case requirements, and in conjunction with the wider HEP community and DPHEP, data description vocabularies, dictionaries, and metadata formats will be developed to support efficient data discovery and possibly data visualization. At the same time, access and IP policies; standardization for data, software, and analysis documentation; and other issues necessary for data management and curation will be addressed.