Advanced R (with tidyverse)

Available Dates

No public dates currently available for this course

This course follows on from the introductory course, continuing to familiarise you with the core R language.

In this course we focus on extending your language knowledge to include more advanced filtering and data manipulation. We look at restructuring data when your original data isn't in the right format, and at summarising datasets with repeated values. We also show how to deal with awkward data: files that don't import cleanly, have missing values or are inconsistently annotated. This course will make you a more practical and proficient R programmer.

Pre-Course Requirements & Suggestions

This course assumes that you have knowledge or skills equivalent to those taught in the following courses.

Introduction to R (with tidyverse)

Please ask us if you're unsure if you have the necessary knowledge or skills for this course.

Course Content

We will look in more depth at the read_ functions in the readr package. We'll look at how to fix common import problems such as mis-detected column types or files with additional header lines.
We are going to extend our knowledge of the dplyr package for selecting and filtering data. Here we review piped operations, and look at new functions for sorting and deduplicating, as well as some more advanced ways to select columns to work on.
Moving on from vectors, we look at some more complex data structures in R and how they can be used to model your data. We look at the tibble as the standard 2D data structure we'll be using. We introduce the 'tidyverse' and how it supplements the functionality of the core R language, and we look at some tidyverse functions for reading data from text files into tibbles.
We will look at what you can do if your data doesn't arrive with the 'tidy' structure which R works best with. We will look at restructuring data between 'long' and 'wide' formats, both of which may be needed during an analysis.
We look at how to modify datasets by either adding new columns computed from existing data, or by using grouped operations to summarise rows which are replicates of each other.
In this final tidyverse session we will look at how to extend existing datasets with more data. We do this both by adding more rows or columns and by combining related datasets with joining operations.
Writing your own functions is a great way to reuse code in more than one place in your script. Here we show the structure of a custom function and how it can be written to play nicely with other tidyverse functions.
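
As a rough illustration of the kind of custom function described in the last session, here is a minimal sketch rather than actual course material. The function name summarise_mean, the example data and its column names are all made up for this example; the `{{ }}` operator is the standard dplyr way of passing unquoted column names into a function.

```r
library(dplyr)

# Hypothetical helper (not from the course): mean of one column within
# the groups defined by another. {{ }} forwards the unquoted column names
# to the dplyr verbs, so the function can be used in a pipe like any other.
summarise_mean <- function(data, group_col, value_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(
      mean_value = mean({{ value_col }}, na.rm = TRUE),  # ignore missing values
      .groups = "drop"
    )
}

# Made-up example data with replicate measurements and one missing value
measurements <- tibble(
  sample = c("A", "A", "B", "B"),
  value  = c(1, 3, 10, NA)
)

result <- measurements %>% summarise_mean(sample, value)
print(result)
```

Because the data is the first argument and a tibble is returned, the function can sit in the middle of a longer pipeline alongside filter(), arrange() and similar verbs.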