Are you hearing about CUT&RUN but not sure where to get started? Maybe you're already running CUT&RUN experiments but want to better understand the assay and downstream analysis steps. This post will help biologists and bioinformaticians understand the foundations of designing a CUT&RUN experiment and analyzing the resulting data to get the most insight from it.
Since the late 2000s, the binding sites of DNA-associated proteins have most commonly been measured with ChIP-seq [1], a genome-wide epigenomic profiling assay. In 2017, Peter Skene and Steven Henikoff published a paper introducing a new assay, CUT&RUN [2]. In the authors' own words:
CUT&RUN is simple to perform and is inherently robust, with extremely low backgrounds requiring only ~1/10th the sequencing depth as ChIP, making CUT&RUN especially cost-effective for transcription factor and chromatin profiling.
Cleavage Under Targets and Release Using Nuclease
Although all methods measure protein-DNA interactions with the use of an antibody, there are some key pros and cons to consider when selecting an assay to test your hypothesis.
CUT&Tag was introduced in 2019 and addresses a similar question to CUT&RUN with a different wet-lab sample preparation: CUT&Tag uses Tn5, the same tagmentation enzyme used in ATAC-seq, to cut protein-bound DNA fragments [3].
A note on spike-in controls for normalization: Spike-in controls can be added to each sample in order to normalize your data for differences in sequencing depth as long as they are used appropriately. It is currently debated in the field whether there is an added benefit to using spike-ins for CUT&RUN because, unlike ChIP-seq, CUT&RUN already has a high signal-to-noise ratio. Spike-in controls require precise preparation in order to be useful - if misapplied, they may actually add noise to your experiment.
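As a rough illustration of how spike-in normalization is often applied, the sketch below computes a per-sample scale factor as a constant divided by the number of reads mapped to the spike-in genome. The sample names, read counts, and the constant of 10,000 are hypothetical values chosen for illustration, and this is one common convention rather than the only way to use spike-ins.

```python
def spike_in_scale_factors(spike_in_reads, constant=10_000):
    """Return a scale factor per sample: constant / spike-in read count.

    Assumes the same amount of spike-in material was added to every sample,
    so a sample with more spike-in reads is scaled down and vice versa.
    """
    factors = {}
    for sample, n_reads in spike_in_reads.items():
        if n_reads <= 0:
            # Without spike-in reads there is nothing to normalize against.
            raise ValueError(f"{sample}: no spike-in reads, cannot normalize")
        factors[sample] = constant / n_reads
    return factors


# Hypothetical counts of reads mapped to the spike-in genome for each sample.
spike_in_reads = {"target_rep1": 2_000, "target_rep2": 4_000, "igg_rep1": 1_000}
factors = spike_in_scale_factors(spike_in_reads)

for sample, factor in factors.items():
    print(f"{sample}: multiply coverage by {factor:.2f}")
```

Each sample's coverage track would then be multiplied by its factor before samples are compared. If the spike-in amounts were not consistent between samples, these factors would add noise rather than remove it.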
Antibodies - You'll need to purchase the primary antibody that targets your protein of interest
Cells or nuclei - Your samples!
Ideally, CUT&RUN experiments should include a sample with an H3K4me3 antibody as a positive control, in addition to an appropriately matched IgG negative control sample for each of your “target” samples. Target samples use an antibody that typically recognizes either a histone mark (e.g. H3K4me3) or a transcription factor (e.g. MLL), co-factor, or other protein of interest.
Control samples are a critical component of a well-designed CUT&RUN experiment with interpretable results. Unlike many other commonly used next-generation sequencing (NGS) experiments, assays that rely on peak calling algorithms to survey antibody binding, such as CUT&RUN, CUT&Tag, and ChIP-seq, require a good input control sample.
The input control for CUT&RUN and CUT&Tag experiments is usually a non-specific IgG antibody. The pAG-MNase nuclease used in CUT&RUN experiments has a high binding affinity for IgG antibodies from most commonly used host species, with the exception of mouse. Since pAG-MNase is a nuclease, it can still cut DNA non-specifically in the absence of the target antibody wherever chromatin is open enough for pAG-MNase to fit between regions of condensed chromatin.
We identify these "non-specific" regions of nuclease activity by assessing where pAG-MNase binds to IgG-coated regions of the chromatin, as a proxy for where DNA is being cut non-specifically across the genome. The number of reads in these regions in the IgG-only sample (no target antibody) serves as a good model of the background, or noise, in the data. This background is then modelled or "subtracted out" of the reads in the same regions of the target sample to determine whether a peak seen in the target is real, and therefore potentially biologically interesting, or non-specific, in which case it is deemed biologically insignificant and filtered out.
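To make the idea of comparing target signal against IgG background concrete, here is a minimal sketch that scores each region by its log2 enrichment over the IgG control after scaling both samples to counts per million (CPM). The region names, read counts, library sizes, and pseudocount are hypothetical, and real peak callers use more sophisticated statistical models than this simple ratio.

```python
import math


def cpm(count, library_size):
    """Scale a raw read count to counts per million mapped reads."""
    return count * 1_000_000 / library_size


def log2_enrichment(target_count, igg_count, target_total, igg_total,
                    pseudocount=1.0):
    """log2 ratio of target CPM over IgG CPM for one region.

    The pseudocount (an assumption of this sketch) avoids division by zero
    in regions where the IgG control has no reads.
    """
    target_cpm = cpm(target_count, target_total)
    igg_cpm = cpm(igg_count, igg_total)
    return math.log2((target_cpm + pseudocount) / (igg_cpm + pseudocount))


# Hypothetical read counts per region: (target sample, IgG control).
regions = {
    "region_A": (900, 10),  # strong target signal, little background
    "region_B": (40, 35),   # target signal similar to background
}
target_total = 5_000_000  # hypothetical mapped reads in the target sample
igg_total = 5_000_000     # hypothetical mapped reads in the IgG sample

scores = {
    name: log2_enrichment(t, g, target_total, igg_total)
    for name, (t, g) in regions.items()
}
for name, score in scores.items():
    print(f"{name}: log2 enrichment over IgG = {score:.2f}")
```

In this toy example, region_A stands well above the IgG background while region_B does not, so only region_A would be a candidate for a real peak.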
After you've prepared your samples and sent them off for sequencing, you'll receive FASTQ files. These large, raw sequencing files need to be processed through a multi-step CUT&RUN pipeline in order to ultimately generate consensus peak counts.
Common steps comprising a CUT&RUN pipeline (and some of the software used to perform them) include:
Running the above tools yourself will require compute infrastructure and advanced coding. Expect the CUT&RUN pipeline to run in about 1-3 hours per sample depending on your compute resources and parallelization approach.
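As an illustration of what "consensus peaks" can mean, the sketch below merges overlapping peaks from several replicates and keeps only the merged regions supported by a minimum number of replicates. The coordinates, the single-chromosome simplification, and the `min_replicates` threshold are assumptions made for this example. Established pipelines typically do this with dedicated interval tools rather than hand-written code.

```python
def consensus_peaks(replicate_peaks, min_replicates=2):
    """Merge overlapping peaks across replicates on a single chromosome.

    replicate_peaks: one list of (start, end) intervals per replicate.
    Returns the merged intervals supported by at least min_replicates.
    Intervals that touch (start == previous end) are treated as overlapping.
    """
    # Tag every peak with the replicate it came from, then sort by position.
    tagged = sorted(
        (start, end, rep)
        for rep, peaks in enumerate(replicate_peaks)
        for start, end in peaks
    )

    merged = []  # each entry: [start, end, set of supporting replicates]
    for start, end, rep in tagged:
        if merged and start <= merged[-1][1]:
            # Overlaps the current merged region: extend it and record support.
            merged[-1][1] = max(merged[-1][1], end)
            merged[-1][2].add(rep)
        else:
            merged.append([start, end, {rep}])

    return [(s, e) for s, e, reps in merged if len(reps) >= min_replicates]


# Hypothetical peak calls from three replicates on one chromosome.
rep1 = [(100, 200), (500, 600)]
rep2 = [(150, 250), (900, 950)]
rep3 = [(180, 220), (590, 650)]

consensus = consensus_peaks([rep1, rep2, rep3], min_replicates=2)
print(consensus)
```

Reads from each sample can then be counted within these consensus regions to produce a peak-by-sample count table for downstream analysis.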
Want to get to insights faster? With Pluto, you can run an end-to-end CUT&RUN pipeline in your browser, with no infrastructure or coding required. Learn more with a live, 15-minute demo.
There are a wide variety of analyses and visualizations you can use to investigate your CUT&RUN data. Here are a few ideas to get you started doing data analysis!
Create all of these plots (and more!) in Pluto. Contact us to get started running bioinformatics analyses to create the interactive plots shown in this blog, all from your browser.
Analyzing CUT&RUN data in your browser, without the need to manage pipelines, can be a big win for your team. Being able to collaborate and explore results live in a meeting or presentation can transform how you iterate on your scientific research.
Thanks for reading this brief overview of CUT&RUN experiments! To learn more about how your team can collaboratively analyze CUT&RUN data in your browser with Pluto, visit our website or our CUT&RUN page to get started today.