Sequence Read Archive

The National Center for Genome Analysis Support (NCGAS) offered this workshop as part of ASM Microbe 2019, June 20-24, 2019, in San Francisco, CA.

The Sequence Read Archive (SRA) hosts ~11 PB of data from across the world. Mining this database is useful for research and lets researchers add more datasets to their analyses at no cost. The goal of this workshop is to help researchers who work with SRA, or are interested in doing so, run their bioinformatics workflows efficiently using computational resources available through NCGAS/XSEDE.

The NCGAS was funded by the National Science Foundation under Grant Nos. DBI-1062432 (2011), ABI-1458641 (2015), and ABI-1759906 (2018) to Indiana University.

Learning Objectives

By the end of this workshop, participants should leave with the following knowledge:

Introduction to NCGAS and HPC
Bioinformatics programs available to mine SRA
Using R to visualize the data

Prerequisite Skills

Basic Linux skills (signing in, moving around the file system, etc.) are needed, but expertise is not required. If you need a refresher, take the Unix: The Basics course.
For online courses, you will need your own computer and access to the internet.
- An SSH client is required. This knowledge base article is helpful: https://kb.iu.edu/d/ahjh
- Mac and Linux machines should have built-in SSH clients. On Windows, check whether you have a program called PuTTY. If you have Windows 10, you may also consider installing the Windows Subsystem for Linux: https://docs.microsoft.com/en-us/windows/wsl/install-win10

For in-person workshops, a laptop is required. If you do not have one, contact the organizer to borrow one.
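
As a rough way to check the SSH client prerequisite before the workshop, the short Python sketch below looks for common client programs on your PATH. It is only an illustration, not part of the course materials: the function name `find_ssh_clients` is hypothetical, and the candidate list (`ssh`, `putty`, `plink`) is an assumption. PuTTY may be installed without being on PATH, so a miss here does not prove it is absent.

```python
import shutil

# Candidate client names are assumptions for illustration: "ssh" is the
# OpenSSH client; "putty" and "plink" are programs that ship with PuTTY.
CANDIDATE_CLIENTS = ("ssh", "putty", "plink")


def find_ssh_clients(candidates=CANDIDATE_CLIENTS):
    """Return a dict mapping each client found on PATH to its full path."""
    found = {}
    for name in candidates:
        path = shutil.which(name)  # None when the program is not on PATH
        if path is not None:
            found[name] = path
    return found


if __name__ == "__main__":
    clients = find_ssh_clients()
    if clients:
        for name, path in clients.items():
            print(f"{name}: {path}")
    else:
        print("No SSH client found on PATH; see the knowledge base article above.")
```

If nothing is reported, the knowledge base article and the Windows Subsystem for Linux link above are the places to start.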

Agenda

Introduction to HPC and nationally available resources
Tools available to mine SRA
Visualization using R

Access the course materials

Mining SRA (via GitHub)

The Supercomputing for Everyone Series (SC4ES) aims to bring more users into the realm of advanced computing, whether it be visualization, computation, analytics, storage, or any related discipline. Research Technologies can take you to the next level of computing.

Supercomputing for Everyone Series workshops and seminars are led by personnel from Research Technologies, a division of University Information Technology Services and a center in the Pervasive Technology Institute at Indiana University.

Request a custom time/date for a particular workshop or seminar
