This intermediate-level Storage Options and Data Management Best Practices workshop will discuss techniques and best practices for administering shared access to data, ensuring continuity of access, securing research data, and establishing a data lifecycle management policy. This workshop is intended for power users, lab PIs, and collection managers actively involved in generating, managing, and/or administering data.
While the emphasis here is on data stored within IU’s research computing systems, the principles in this workshop can be applied to many digital storage contexts and applications.
Topics covered include migrating data from cloud providers to on-premises IU digital storage (and vice versa), best practices for storing and managing data, the distinction between backing up and archiving, and how to manage permissions for a data share.
Workshop attendees will:
- Become familiar with current best practices for storing and managing digital data, particularly with regard to resiliency and longevity.
- Learn the options available for migrating data between IU research computing storage systems and cloud providers with whom IU has usage agreements (Google Drive / Microsoft OneDrive).
- Become familiar with basic concepts in data security and access control, such as the use of permissions or access control lists (ACLs).
- Be able to select between permissions and ACLs based on the needs of a particular project or use case.
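As a rough illustration of the permissions side of that choice, the sketch below sets classic owner/group/other permissions on a directory and prints the resulting mode string. It assumes a POSIX-style file system; the directory and the helper names `describe_mode` and `share_with_group` are hypothetical and not taken from the workshop materials.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Return the permission string for path, for example 'drwxr-x---'."""
    return stat.filemode(os.stat(path).st_mode)


def share_with_group(path):
    """Give the owner full access, the group read and traverse, others nothing."""
    os.chmod(path, 0o750)
    return describe_mode(path)


# Hypothetical project directory, created in a temporary location for the demo.
demo_dir = tempfile.mkdtemp(prefix="project_share_")
mode_string = share_with_group(demo_dir)
print(mode_string)

# Clean up the demo directory.
os.rmdir(demo_dir)
```

Plain permissions like these apply to exactly one owner, one group, and everyone else. When a project needs to grant access to several specific users or groups at once, ACLs are usually the better fit; on many Linux file systems they are managed with separate tools such as `getfacl` and `setfacl`.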
Prerequisites:
- Attendees will need an account on one of IU's research supercomputers (Big Red 200 or Quartz). Attendees can sign up for more than one, if eligible. Create Additional Accounts website: https://access.iu.edu/Accounts/Create.
- Accounts for Slate and the Scholarly Data Archive (SDA) are also recommended, as those services will be covered in this workshop.
- Attendees should be familiar with the eligibility requirements for IU supercomputing accounts and for the SDA. For more information on eligibility, see the following IU KB article: https://kb.iu.edu/d/aczn#research.
- If you believe you are eligible for an account and the option to open an account is not available to you, contact email@example.com.
Agenda:
- Introductions and housekeeping
- Establishing data lifecycle policies
- Managing access: sharing data, securing data, preventing orphaned data
- Integrating cloud storage
- Automating transfers with Globus
- ACTIVITY – Globus timers as part of an automated transfer workflow
- Wrapping up, Q&A
Go to the recording from the January 2023 workshop: Storage Options and Data Management Best Practices (on YouTube)
Review workshop materials: Workshop slides and resources (on Google Drive)
The Supercomputing for Everyone Series (SC4ES) aims to bring more users into the realm of advanced computing, whether it be visualization, computation, analytics, storage, or any related discipline. Research Technologies can take you to the next level of computing.
Supercomputing for Everyone Series workshops and seminars are led by personnel from Research Technologies, a division of University Information Technology Services and a center in the Pervasive Technology Institute at Indiana University.