Preface and Notes
0.1 What this is
The goal of this book is to provide an easy-to-follow, step-by-step guide to get you analyzing your own cytometry data with R. We’ve assumed zero experience, so we’ll cover the basics, including how to install R and all of the tools that you’ll need for the complete analysis.
0.2 What this isn’t
While I’ve tried to cover everything necessary to get you up and running with R, this is not a complete introduction to R programming. We’ll provide you with all of the tools necessary to analyze cytometry data but your R education will definitely be lacking in some major areas. If you enjoy R then there are many free resources that will give you a fuller education in R programming. R for Data Science is a fantastic resource for R with a focus on data science and is freely available.
0.3 Structure
After helping you install R and set up an environment, we’ll work through the standard steps of high-parameter data analysis. For the most part, the analysis pipeline shown here is agnostic of instrument or platform type - it will work as well for data from mass, flow or
- Packages and Libraries
- Data Import
- Data Wrangling
- Clean the data
- Transform the data
- Visualize the Data
- Interpret the Data
- Iterate
We’ll look at each of these stages in detail, subdividing them into simple steps that you can follow to complete your analysis.
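To give a feel for how these stages fit together, here is a minimal sketch in base R. It is not the code used later in the book: the data are simulated rather than imported from real files, the marker names (CD45, CD3) and sample label are hypothetical, and the arcsinh transform with a cofactor of 5 is just one assumed choice of transformation.

```r
# Data import (simulated here): rows are cells (events), columns are markers.
# In a real analysis these values would be read from your cytometry files.
set.seed(42)
events <- data.frame(
  CD45 = rexp(1000, rate = 1 / 200),  # hypothetical marker intensities
  CD3  = rexp(1000, rate = 1 / 50)
)
events$CD3[1:5] <- NA                 # pretend a few events are incomplete

# Data wrangling: record which sample each event came from
events$sample <- "sample_01"          # hypothetical sample label
marker_cols <- c("CD45", "CD3")

# Clean the data: keep only events with a value for every marker
keep  <- complete.cases(events[marker_cols])
clean <- events[keep, ]

# Transform the data: arcsinh with an assumed cofactor of 5
cofactor    <- 5
transformed <- clean
transformed[marker_cols] <- lapply(
  clean[marker_cols],
  function(x) asinh(x / cofactor)
)

# Visualize the data: a simple histogram of one marker
hist(transformed$CD45, breaks = 50,
     main = "CD45 (arcsinh transformed)", xlab = "Intensity")

# Interpret the data: median transformed intensity per marker
summary_stats <- sapply(transformed[marker_cols], median)
print(summary_stats)
```

Each block corresponds to one entry in the list above. The final stage, iterating, simply means going back and adjusting an earlier stage (for example the cleaning rule or the cofactor) and running the rest again.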
0.3.1 How to use this book
The first chapter deals with installing R and RStudio and setting up a file structure suitable for any analysis. We’ll work through where to get R, RStudio and the other tools you’ll need. If you’ve already installed R and RStudio for other analyses, feel free to skip this chapter. If you have never installed R before, follow the steps in order, as the order of operations is very important.
The second chapter deals with installing packages and libraries that expand R’s functionality and allow us to analyze data (and cytometry data in particular) more easily.
After that we’ll work through the analysis, beginning with importing the files onto your computer and then importing them into the R environment. Subsequent chapters will then work through the analysis and generate figures suitable for publication.
The data used as examples in this book come from a test measuring the effectiveness of various fixation methods on bone marrow cells.
We’ll work through a complete analysis using this example data (you’ll be able to download it from Dropbox or OneDrive in Chapter 2). This means you can completely replicate the analysis presented here and become confident in the code and commands needed. After this you can move on to analyzing your own data using the same workflow before finally adapting it to suit your specific requirements. Building up your skills and confidence in this way should provide a solid foundation.
Each chapter after the first deals with a single step in the analysis process. Your analysis may not need all of the steps but I recommend you work through all of the steps with the example data. This will give you a solid grasp of how the data is transformed during each step.
0.3.2 How to use each chapter
Each chapter is broken into two sections: one short section at the beginning and then a much longer section afterwards. The short section is designed to show you the most economical code required to achieve the task at hand. A short discussion of what the code does and why it was chosen follows. The longer section goes into more depth about the task and provides alternatives and options available if you’re particularly interested in that step.
0.3.3 A note on the code
What you’ll find here will NOT be the most efficient, elegant code possible. The code provided is laid out to make it clear and obvious what each step does. Each piece of code could almost certainly be written more elegantly and more economically, and I encourage you to experiment with this once you understand how things work and what each piece of code is trying to do. Furthermore, there is some repetition of code, as each step is broken down line by line and discussed. This allows more explanation of what is going on, at the cost of code snippets that may be longer than strictly necessary.
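As a small illustration of that trade-off, the sketch below performs the same calculation twice: once broken down line by line, in the style used throughout the book, and once as a single compact expression. The values and the cofactor of 5 are made up for the example.

```r
# Made-up intensity values, used only for illustration
values <- c(12, 250, 0, 1800, 95)

# Step-by-step style: one operation per line, each result given a name
scaled      <- values / 5           # divide by an assumed cofactor of 5
transformed <- asinh(scaled)        # apply the arcsinh transform
rounded     <- round(transformed, 2)

# Compact style: the same three operations nested in one expression
rounded_compact <- round(asinh(values / 5), 2)

# Both versions give the same result
print(identical(rounded, rounded_compact))
```

The step-by-step version is longer, but every intermediate result can be inspected on its own, which is useful while you are still learning what each function does.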