1 Introduction
High parameter cytometry is an expensive and extensive undertaking that demands a lot of time, money and hard work. Before you start, it’s crucial to have an idea of how you’ll go about analysing the data: which approaches and software you’ll use.
Here we’ll show you how to apply high parameter analysis approaches to your data in R.
1.1 Expert Manual Gating is not enough
Analyzing high parameter cytometry data can be daunting. There might be millions of events with up to 50 channels each. If you rely on traditional gating strategies you might quickly become overwhelmed by the amount of data. Traditional gating is also subjective and doesn’t analyse all of the data. What if there are conclusions that can be drawn from the populations you haven’t analyzed with your gating hierarchy? Why not use a technique that analyses all of the data?
1.3 Why?
To show how quickly and easily you can do your own high parameter analysis using open-source tools like R and RStudio.
1.4 How?
Together we’ll analyse some real files generated within our core. These files can be downloaded so that you can recreate the exact analysis done here. Brief step-by-step snippets of code are provided to show how each step of data analysis is done.
In addition, there are deeper explanations of what each snippet of code is doing, how you might modify it for different applications, and alternative methods of analysis.
So you’ve spent a lot of time and money:
- You’ve generated a panel, optimized it and carefully prepared well-stained samples.
- You’ve made sure to acquire all of the correct single colour and fluorescence-minus-one or metal-minus-one controls.
- You’ve set your voltages carefully and tweaked the compensation until it’s perfect.
- You’ve acquired a large number of live, single events so that you have hundreds or thousands of your rarest events.
High parameter analysis is the important next step. The analysis lets us take the hard work, time and cost put in so far and turn it into beautiful data and convincing figures.
1.5 Why choose R for your analysis?
While it is possible to do high parameter cytometry analysis with FlowJo, Cytobank or other proprietary software (and I encourage you to try them out if you have access), performing your analysis in a programming environment provides many benefits:
- Free
- R, RStudio and R libraries are all available for free
- Proprietary software now costs several hundred or even thousands of dollars every year.
- Flexibility in analysis and output
- Working with R lets us modify our figures however we like.
- ggplot and similar tools allow us to get every graph exactly as we like.
- We can output figures in a wide array of formats (pdf, html, png, Microsoft Office etc.)
- Novelty and choice
- Most new analysis approaches are released as R packages long before they are available through other avenues (if they ever are).
- The analysis capabilities available in software packages like FlowJo or Cytobank are often months or years behind what’s available in R.
- Access to the newest, most powerful algorithms and approaches.
- Almost every month we see new algorithms or approaches to cytometry analysis released, most often through R packages.
- The skills we use for cytometry analysis are similar to those required for genomics analysis so you may already have the skills and knowledge to undertake these analyses
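As a small illustration of the flexibility point above, here is a minimal sketch of building a plot with ggplot2 and saving it in two formats. It assumes the ggplot2 package is installed. The data are simulated, and the column names `marker_A` and `marker_B` and the file names are hypothetical placeholders, not part of the example data set used later.

```r
library(ggplot2)

set.seed(1)

# Simulated stand-in for two channels of cytometry data.
# marker_A and marker_B are hypothetical names, not real markers.
events <- data.frame(
  marker_A = rnorm(1000, mean = 2, sd = 0.5),
  marker_B = rnorm(1000, mean = 1, sd = 0.7)
)

# Build the plot once; every element can be adjusted afterwards.
p <- ggplot(events, aes(x = marker_A, y = marker_B)) +
  geom_point(alpha = 0.3, size = 0.5) +
  labs(x = "Marker A (simulated)", y = "Marker B (simulated)") +
  theme_minimal()

# Save the same plot in two formats; ggsave() picks the
# graphics device from the file extension.
out_dir  <- tempdir()
pdf_file <- file.path(out_dir, "example_plot.pdf")
png_file <- file.path(out_dir, "example_plot.png")

ggsave(pdf_file, plot = p, width = 4, height = 4)
ggsave(png_file, plot = p, width = 4, height = 4, dpi = 300)
```

The files are written to a temporary directory here so the sketch does not clutter your project folder. In a real analysis you would point `file.path()` at your own output folder.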
1.6 What we’ll introduce
To help you get to grips with this new way to analyze your data I’ll take you through a step-by-step analysis of high parameter data using the R programming language and show you a complete workflow for analyzing your data. This guide assumes zero experience with cytometry data and zero experience with R. If you are familiar with either R or cytometry data, feel free to skip sections (see the table of contents on the left). Otherwise, try to work through the entire document. It outlines a useful workflow that should apply to most data sets generated from the high parameter cytometers like the
We’ll be using some data from a technical test comparing fixation and permeabilization conditions in bone marrow. You’ll get a link to download the data to your computer in the Data Import chapter when we first show you how to import the files into R. After you’ve worked through the example data, work through it with your own data.
How we’ll work:
- Each chapter takes an important step in the analysis and breaks it down into two parts. The Essentials section offers the minimum code required to manage that step of the analysis. Use these code blocks to get things done and see how things work. The rest of each chapter is a Deeper Dive, which offers more explanation of how the code works for each step and why we do things in particular ways. It also offers alternative code and tools to achieve the same or similar steps. If you’re particularly interested in a specific step, the Deeper Dive will let you learn more about it.