BioTrain - Bioinformatics and Statistical Training

Processing many modern datasets requires a Unix or Linux environment, and many people use one as their preferred operating system. In this course we look at how you can use the Unix command line to control the running of individual programs, to manage your data, and to perform some basic automation that makes large-scale processing easier.

We start by sorting out some terminology (Unix vs Linux vs POSIX) and then look at the different types of installation you might see. We look at the different options for connecting to a machine, both graphical and command line, using either password or key-based authentication.

We look at how to launch programs within a Unix environment using the command line. We show the use of switches to affect the behaviour of the program.
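As a rough illustration of how switches change what a program does, the sketch below runs `ls` with and without the `-a` switch. The directory and file names are invented for the example and are not part of the course material.

```shell
# Work in a throwaway directory so nothing real is touched (hypothetical files).
workdir=$(mktemp -d)
touch "$workdir/sample_A.txt" "$workdir/.hidden_config"

# No switches: ls prints only the visible file names.
plain_listing=$(ls "$workdir")

# The -a switch asks ls to include names that start with a dot.
all_listing=$(ls -a "$workdir")

# The -l switch asks for the long format (permissions, size, date).
ls -l "$workdir"

echo "Without -a: $plain_listing"
echo "With -a:"
echo "$all_listing"
```

The same program is launched each time; only the switches differ, which is the pattern most command line tools follow.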

Managing your data files is something which takes up much of your time. We look at how the Unix filesystem works, how to easily write out the location of files in commands, and at programs to view, edit, rename, copy and delete files and folders.
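A minimal sketch of the kind of file management commands involved is shown here, using relative paths inside a temporary directory. The `project` folder and the file names are assumptions made up for the example.

```shell
# Start in a throwaway directory (all names below are hypothetical).
workdir=$(mktemp -d)
cd "$workdir"

# Create nested folders in one step, then write a small file into them.
mkdir -p project/raw_data
echo "ACGT" > project/raw_data/reads.txt

# Copy the file, then rename (move) the copy.
cp project/raw_data/reads.txt project/reads_backup.txt
mv project/reads_backup.txt project/reads_copy.txt

# View the contents of the renamed copy.
cat project/reads_copy.txt

# Delete the original file and the now-empty folder.
rm project/raw_data/reads.txt
rmdir project/raw_data
```

Every path here is written relative to the current directory; the same commands accept absolute paths starting with `/`.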

In the final section we look at some options for automating longer-running processes. We look at redirection and piping. Redirection allows us to save output or debug messages from programs to files instead of having them printed to the screen. We also look at the use of pipes to link programs together into simple pipelines. Finally we show how you can use some simple Bash loop code to automate running similar commands many times, so you can process multiple files or try out multiple command options easily.
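The three ideas in this section (redirection, pipes and loops) can be sketched in a few lines of Bash. This is only an illustration under assumed file names such as `sample_1.txt`, not the exercises used in the course.

```shell
# Throwaway directory with two small, made-up input files.
workdir=$(mktemp -d)
cd "$workdir"
printf 'geneB\ngeneA\ngeneB\n' > sample_1.txt
printf 'geneC\n' > sample_2.txt

# Redirection: normal output goes to listing.txt, error messages to errors.log.
# ls exits non-zero because one file is missing; '|| true' lets the sketch carry on.
ls sample_1.txt missing_file.txt > listing.txt 2> errors.log || true

# Pipes: sort the lines, drop repeats, then count what is left.
unique_count=$(sort sample_1.txt | uniq | wc -l)
echo "Unique entries in sample_1.txt: $unique_count"

# Loop: run the same command on every matching file,
# writing each result to a file named after its input.
for input_file in sample_*.txt; do
    wc -l < "$input_file" > "${input_file%.txt}.count"
done
```

The `${input_file%.txt}` expansion strips the `.txt` suffix, so each input gets its own output file rather than overwriting a shared one.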

Introduction to Linux and Bash

Available Dates

Pre-Course Requirements & Suggestions

Course Content

Introduction and Logging In

Basic Unix commands

File System Basics

Redirection and Scripting