The RNA-guided nuclease Cas9 can be reengineered as a programmable transcription factor.
However, modest levels of gene activation have limited potential applications. We
describe an improved transcriptional regulator through the rational design of a tripartite
activator, VP64-p65-Rta (VPR), fused to nuclease-null Cas9. We demonstrate its utility
in activating endogenous coding and non-coding genes, targeting several genes simultaneously
and stimulating neuronal differentiation of human induced pluripotent stem cells (iPSCs).
Cas9 is an RNA-guided endonuclease that is directed to a specific DNA sequence through
complementarity between the associated guide RNA (gRNA) and its target locus
1,2
. Cas9 can be directed to nearly any arbitrary sequence with a gRNA, requiring only
a short protospacer adjacent motif (PAM) site proximal to the target
3–5
. Through mutational analysis, variants of Cas9 have been generated that lack endonucleolytic
activity but retain the capacity to interact with DNA
2,6,7
. These nuclease-null (dCas9) variants have been subsequently functionalized with
effector domains such as transcriptional activation domains (ADs), enabling Cas9 to
serve as a tool for cellular programming at the transcriptional level
6,8–10
. The ability to program the robust induction of expression at a specific target within
its native chromosomal context would provide a transformative tool for myriad applications,
including the development of therapeutic interventions, genetic screening, activation
of endogenous and synthetic genetic circuits, and the induction of cellular differentiation
11–13
.
In natural systems, transcriptional initiation occurs through the coordinated recruitment
of necessary machinery by a number of locally concentrated transcription factor activation
domains (ADs). As a result, we hypothesized that the tandem fusion of multiple ADs
would increase transcriptional activation by mimicking the natural cooperative recruitment
process. Toward this goal, a series of more than 20 candidate effectors with known
transcriptional roles were fused to the C terminus of Streptococcus pyogenes (SP)-dCas9,
and their potency was assessed by a fluorescent reporter assay performed in human
HEK 293T cells (Supplementary Figs. 1 and 2)
14
.
Of the hybrid proteins tested, dCas9-VP64, dCas9-p65, and dCas9-Rta showed the most
meaningful reporter induction. Nonetheless, neither the p65 nor the Rta hybrids were
stronger activators than the commonly used dCas9-VP64 protein. Taking dCas9-VP64 as
a starting scaffold, we subsequently extended the C-terminal fusion with the addition
of either p65 or Rta. As predicted, these bipartite fusions exhibited increased transcriptional
activity. Further improvement was observed when both p65 and Rta were fused in tandem
to VP64, generating a hybrid VP64-p65-Rta tripartite activator (hereafter referred to
as VPR) (Supplementary Fig. 3).
To begin characterizing VPR, we verified the importance of each of its constituent
domains (VP64, p65, and Rta) by replacing the respective member with mCherry, and
measuring the resulting protein’s activity by reporter assay. All fusions containing
mCherry exhibited decreased activity, demonstrating the essentiality of all three
domains (Supplementary Fig. 4). We further validated the importance of domain order
by shuffling the positions of the three domains, generating all possible non-repeating
dCas9 fusion proteins. Evaluation of the VPR permutations confirmed that the original
ordering was indeed optimal (Supplementary Fig. 5).
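The domain-shuffling experiment covers every non-repeating ordering of the three domains. As a minimal illustration, the Python sketch below enumerates those orderings. The "dCas9-A-B-C" naming scheme and the function name are hypothetical and are not taken from the deposited vectors.

```python
from itertools import permutations

# The three activation domains of VPR, listed in the original order.
DOMAINS = ("VP64", "p65", "Rta")


def domain_orderings(domains=DOMAINS):
    """Return every non-repeating ordering of the domains as a fusion name.

    The "dCas9-A-B-C" naming is illustrative only.
    """
    return ["dCas9-" + "-".join(order) for order in permutations(domains)]


orderings = domain_orderings()
for name in orderings:
    print(name)
```

Three domains give 3! = 6 orderings, one of which is the original VP64-p65-Rta arrangement.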
Given the potency of our SP-dCas9-VPR fusion, we investigated whether the VPR construct
would exhibit similar potency when fused to other DNA-binding scaffolds. Fusion of
VPR to a nuclease-null Streptococcus thermophilus (ST1)-dCas9, a designer transcription
activator-like effector (TALE), or a zinc-finger protein allowed for an increase in
activation relative to VP64 (Supplementary Fig. 6)
15
.
Having performed initial characterization of our SP-dCas9-VPR fusion, we sought to
assess its ability to activate endogenous coding and non-coding targets relative to
VP64. To this end, we constructed three to four gRNAs against a set of factors related
to cellular reprogramming, development, and gene therapy. When compared to the dCas9-VP64
activator, dCas9-VPR showed 22- to 320-fold greater activation of endogenous targets
(Fig. 1A). While VPR was able to induce each of our target genes to a much greater
extent than VP64, a marked difference in the relative levels of gene induction between
targets was observed. Furthermore, in accordance with previous studies
16
, we noted an inverse correlation between basal expression level and relative expression
gain induced by dCas9 activators (genes with high basal expression were less potently
activated) (Supplementary Fig. 7).
To place our observed levels of activation within a biologically relevant context,
we compared dCas9-VPR activation in HEK 293T cells with the expression of the same
gene within its native human tissue. Absolute comparisons in gene expression between
in vitro cell lines and native tissues are difficult, but our preliminary analysis
suggests that we were able to activate a number of our target genes to similar levels
as in their native tissues (Supplementary Fig. 8).
Cas9 enables multiplexed activation through the simple introduction of a collection
of guide RNAs against a desired set of genes. To determine the efficiency of multi-gene
targeting, we performed a pooled activation experiment simultaneously inducing four
of our initially characterized genes: MIAT, NEUROD1, ASCL1, and RHOXF2. VPR allowed
for robust multi-locus activation, exhibiting several-fold higher expression levels
than VP64 across the panel of genes (Fig. 1B).
After demonstrating dCas9-VPR’s ability to robustly activate gene expression in human
cells, we sought to further explore its versatility as a general tool for gene induction
within alternate model systems. Expression of dCas9-VPR in Saccharomyces cerevisiae,
Drosophila melanogaster S2R+ cells, and Mus musculus Neuro-2A cells led to 5- to 300-fold
greater activation than VP64-based activators (Supplementary
Fig. 9).
The ability to selectively upregulate gene expression provides a powerful means to
reprogram cellular identity for regenerative medicine and basic research purposes.
Previous work has shown that the ectopic expression of several cDNAs promotes the
differentiation of stem cells into multiple cell types. While such artificial induction
often requires multiple factors, it was recently shown that exogenous expression of
single transcription factors, Neurogenin2 (NGN2) or Neurogenic differentiation factor
1 (NEUROD1), is sufficient to promote differentiation of human iPS cells into induced
neurons (iNeurons)
17,18
. While our previous attempts to generate iNeurons from iPS cells using dCas9-VP64-based
activators were unsuccessful (data not shown), we were optimistic that the increased
potency of VPR might induce sufficient expression of NGN2 and/or NEUROD1 protein to
trigger differentiation.
Stable, doxycycline-inducible dCas9-VP64 and dCas9-VPR PGP1 iPS cell lines were generated
and transduced with lentiviral vectors containing a mixed pool of 30 gRNAs directed
against either NGN2 or NEUROD1. To determine differentiation efficiency, gRNA-containing
dCas9-AD iPS cell lines were cultured in the presence of doxycycline and monitored
for phenotypic changes (Supplementary Figs. 10 and 11). We observed that VPR, in contrast
to VP64, enabled rapid and robust differentiation of iPS cells into a neuronal phenotype.
Additionally, these cells stained positively for the neuronal markers beta III tubulin
and neurofilament 200 (Fig. 2A and Supplementary Fig. 12A, respectively). Subsequent
quantification of the staining revealed that dCas9-VPR cell lines showed a 10- to 37-fold
increase in the number of iNeurons observed through upregulation of either NGN2
or NEUROD1 (Fig. 2B and Supplementary Fig. 12B). Analysis by qRT-PCR revealed a 10-fold
and 18-fold increase in NGN2 and NEUROD1 mRNA expression levels, respectively, within
dCas9-VPR cells over their dCas9-VP64 counterparts (Fig. 2C).
Over the past year there have been a number of exciting advances in the field of Cas9-derived
transcriptional activators. Two-component systems that rely on innovative gRNA modifications
(e.g., synergistic activation mediator (SAM) and scaffold RNA (scRNA)) and epitope-based
attachment systems (e.g., SUperNova (SunTag)) continue to push the limits of activator
potency
16,19,20
. Notably, it was shown that the multimeric recruitment of even modestly effective
activation domains (i.e., VP64 and p65) can lead to abundant increases in transcriptional
output
16,19,20
. We believe that the rational selection and ordered fusion of individual activator
domains provides an approach that is highly effective while eliminating the delivery
and design complications generated by a two-component activator. In addition, as even
modestly potent activation domains have exhibited marked improvement in activity when
repeatedly recruited to a single dCas9 protein, we envision that our more potent VPR
activator should lead to drastically improved activation if multiply recruited to
a single dCas9 protein through technologies such as SAM, scRNA or SunTag.
Beyond the utility of VPR as a technological catalyst, we believe that our design
process brings to light several important generalizations for future synthetic effectors,
most notably the importance of screening large numbers of putative candidates and
the critical role of domain order in the emergent synergy of multi-component fusions.
Online Materials and Methods
Vectors used and designed
Activation domains were cloned using a combination of Gibson and Gateway assembly
or Golden Gate assembly methods. For experiments involving multiple activation domains,
ADs were separated by short glycine-serine linkers. Activator sequences are listed
in the Supplementary Data (vectors to be deposited in Addgene). All SP-dCas9 plasmids
were based on Cas9m4-VP64 (Addgene #47319)
6
. ST1-dCas9 plasmids were based on M-ST1n-VP64 (Addgene #48675)
15
. Sequences for gRNAs are listed in the supplementary information. gRNAs for endogenous
human gene activation were selected to bind between 1 and 1000 bp upstream of the
transcriptional start site (TSS). gRNAs for iPSC differentiation to iNeurons, targeting
NGN2 and NEUROD1, were selected to bind between 1 and 2000 base pairs upstream of
the transcriptional start site. All human gRNAs were either expressed from cloned
plasmids (Addgene #41817)
5
or integrated into the genome through lentiviral delivery (plasmid pSB700). Reporter-targeting
gRNAs were previously described (Addgene #48671 and #48672)
6
.
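As an illustration of the selection windows described above (1 to 1000 bp upstream of the TSS for endogenous activation, 1 to 2000 bp for NGN2 and NEUROD1), the following Python sketch filters candidate sites by distance from the TSS. The function names, the single-coordinate representation of a gRNA site, and the coordinate convention are assumptions made for illustration and do not describe the design procedure actually used.

```python
def upstream_distance(site_pos, tss_pos, strand):
    """Distance in bp of a site upstream of the TSS (negative if downstream).

    Assumes 1-based coordinates on the reference forward strand and a gene
    strand of "+" or "-". For a "-" strand gene, upstream means larger
    coordinates.
    """
    if strand == "+":
        return tss_pos - site_pos
    if strand == "-":
        return site_pos - tss_pos
    raise ValueError("strand must be '+' or '-'")


def in_window(site_pos, tss_pos, strand, max_upstream=1000):
    """True if the site lies between 1 and max_upstream bp upstream of the TSS."""
    distance = upstream_distance(site_pos, tss_pos, strand)
    return 1 <= distance <= max_upstream


# Hypothetical candidate positions for a "+" strand gene with its TSS at 2500.
candidates = [2400, 1800, 1000, 2600]
kept_1kb = [pos for pos in candidates if in_window(pos, 2500, "+")]
kept_2kb = [pos for pos in candidates if in_window(pos, 2500, "+", max_upstream=2000)]
print(kept_1kb, kept_2kb)
```

Widening the window from 1000 to 2000 bp admits the more distal site, mirroring the larger window used for the iNeuron guides.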
Mammalian cell culture and transfections
HEK 293T cells (gift from P. Mali, UCSD) and Neuro-2A cells (ATCC CCL-131) were maintained
in high glucose Dulbecco’s Modified Eagle’s Medium (Invitrogen) supplemented with
10% FBS (Invitrogen) and penicillin/streptomycin (Invitrogen). Cells were maintained
at 37°C and 5% CO2 in a humidified incubator and tested for mycoplasma yearly. Cells
were transfected in 24-well plates seeded with 50,000 cells per well. 200 ng of dCas9
activator, 10 ng of gRNA and 60 ng of reporter plasmid (when required) were delivered
to each well with Lipofectamine 2000 (HEK 293T) or Lipofectamine 3000 (Neuro-2A),
according to the manufacturer’s instructions. For multiplex activation, a 40 ng mix of
gRNAs was used, with 10 ng of total guide for each of the four gene targets.
For example, if four guide RNAs were used against an individual target, 2.5 ng of each
guide RNA was combined to obtain a 10 ng mix for that target; the four 10 ng
mixes were then combined to give 40 ng in total for transfection. Cells were grown for 36–48
hours after transfection before being assayed by fluorescence microscopy or flow
cytometry, or lysed for RNA purification and quantification.
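The multiplex gRNA arithmetic above can be sketched in a few lines of Python. The function name is hypothetical, and the guide counts per target are illustrative values chosen to match the four-guide example in the text rather than the exact number of guides used for each gene.

```python
def per_guide_ng(guides_per_target, ng_per_target=10.0):
    """Return ng of each gRNA so that every target receives ng_per_target in total."""
    amounts = {}
    for target, n_guides in guides_per_target.items():
        if n_guides < 1:
            raise ValueError("each target needs at least one gRNA")
        amounts[target] = ng_per_target / n_guides
    return amounts


# Illustrative guide counts (assumption: four guides for every target).
guide_counts = {"MIAT": 4, "NEUROD1": 4, "ASCL1": 4, "RHOXF2": 4}
amounts = per_guide_ng(guide_counts)

# Total gRNA delivered per well: sum over targets of (ng per guide x guides).
total_ng = sum(amounts[target] * n for target, n in guide_counts.items())
print(amounts, total_ng)
```

With four guides per target, each guide contributes 2.5 ng and the four 10 ng mixes sum to 40 ng, as described above.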
S. cerevisiae manipulation
Yeast strain W303 was used for all experiments. dCas9 activator constructs were cloned
into vector pAG414-GPD (Addgene #14144)
21
. gRNAs (located 100–200 bp upstream of the TSS) were expressed from the SNR52
promoter and cloned into the 2μ based pAG60-2u vector
22
. Cells were grown overnight at 30°C in synthetic complete media lacking tryptophan
and uracil. The following day, cells were diluted 1:100 into 5 ml of fresh media and
grown for an additional 7 hours at 30°C. 2 ml of culture was then spun down for RNA
extraction.
Drosophila culture
Drosophila S2R+ cells were grown in Schneider’s medium (Invitrogen) supplemented with
10% heat-inactivated FBS (JRH Biosciences) and penicillin/streptomycin (Sigma) at
25°C without CO2. Cells were transfected using Effectene Transfection Reagent (Qiagen)
according to the manufacturer’s instructions. Transfections were performed in 24-well
plates and cells were seeded at 30,000 cells per well. 150 ng of dCas9 activator and
50 ng of gRNAs were transfected, and cells were incubated for 3 days at 25°C before extraction of
total RNA. Five gRNAs were transfected against each of the indicated target genes.
Fluorescence Reporter Assay
SP-dCas9 reporter assays were performed by targeting all dCas9-ADs with a single guide
to a minimal CMV promoter, driving expression of a fluorescent reporter. Addgene plasmid
#47320 (ref. 6) was used to screen for novel ADs (Supplementary Figs. 1 and 2) or was altered
to contain a sfGFP reporter gene instead of tdTomato (Supplementary Figs. 3B, 4 and
5). In addition, to control for transfection efficiency (Supplementary Figs. 3B, 4
and 5), an EBFP2-expressing control plasmid was co-transfected at 25 ng per well (the EBFP2
plasmid was not co-transfected in Supplementary Fig. 2). To remove untransfected cells
from the analysis, sfGFP fluorescence was analyzed only in cells with >10^3 EBFP2 expression
(as determined by flow cytometry). For fusion of VPR to other programmable transcription
factors (ST1-dCas9, TALE, and zinc-finger protein) no EBFP2 plasmid was transfected.
ST1-dCas9 reporter assays were performed using the previously described tdTomato reporter
with an appropriate PAM inserted upstream of the tdTomato coding region (Addgene #48678)
15
. The binding sequences for the zinc finger and TALE are TAATTANGGGNG and ACCTCATCAGGAACATGTT,
respectively.
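The EBFP2 gating step used to exclude untransfected cells can be illustrated with a short Python sketch. The event values are made-up numbers in arbitrary fluorescence units, the function name is hypothetical, and the use of the median as a summary statistic is an assumption, since the summary statistic is not specified above.

```python
from statistics import median

# Gate read from the text: keep only cells with EBFP2 signal above 10^3.
EBFP2_THRESHOLD = 1e3


def gated_sfgfp(events, threshold=EBFP2_THRESHOLD):
    """Return sfGFP values for events whose EBFP2 signal exceeds the threshold.

    Each event is an (EBFP2, sfGFP) pair in arbitrary fluorescence units.
    """
    return [sfgfp for ebfp2, sfgfp in events if ebfp2 > threshold]


# Hypothetical flow cytometry events, for illustration only.
events = [
    (50.0, 12.0),      # untransfected: excluded
    (2500.0, 800.0),
    (1000.0, 30.0),    # exactly at the threshold: excluded
    (12000.0, 1500.0),
    (4000.0, 1100.0),
]

gated = gated_sfgfp(events)
summary = median(gated) if gated else None
print(gated, summary)
```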
qRT-PCR analysis
Yeast RNA was extracted using the YeaStar kit (Zymo Research), RNA from Drosophila S2R+ cells
was extracted using Trizol (Life Technologies), and RNA from human cells was extracted
using the RNeasy PLUS mini kit (Qiagen). Human tissue RNA was obtained from Life Technologies
(Human Brain Total RNA (AM7962), Human Heart Total RNA (AM7966) and Human Testes Total
RNA (AM7972)). 500 ng of RNA was used with the iScript cDNA synthesis Kit (Bio-Rad),
and 0.5 μl of cDNA was used for each qPCR reaction, utilizing the KAPA SYBR® FAST Universal
2X qPCR Master Mix. The Drosophila qPCR reaction used iQ SYBR. qPCR primers are listed
in Supplementary Table 1. qRT-PCR was run and analyzed on the CFX96 Real-Time
PCR Detection System (Bio-Rad), with all target gene expression levels normalized to
β-actin mRNA levels (human and M. musculus), FBA1 mRNA levels (S. cerevisiae) or RpL32
mRNA levels (D. melanogaster).
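Normalization to a reference gene and comparison between two activators can be sketched as follows. This sketch assumes the common 2^-ΔΔCt approach with perfect doubling per cycle; the exact quantification method is not stated above, and the Ct values and function names are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Target level relative to the reference gene, computed as 2^-(delta Ct).

    Assumes 100% amplification efficiency (one doubling per cycle).
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_a, ct_ref_a, ct_target_b, ct_ref_b):
    """Fold change of condition A over condition B after reference normalization."""
    level_a = relative_expression(ct_target_a, ct_ref_a)
    level_b = relative_expression(ct_target_b, ct_ref_b)
    return level_a / level_b


# Hypothetical Ct values: the target crosses threshold 5 cycles earlier in
# condition A than in condition B, with identical reference gene Ct values.
example = fold_change(22.0, 17.0, 27.0, 17.0)
print(example)
```

A 5-cycle difference at equal reference Ct corresponds to a 2^5 = 32-fold difference under these assumptions.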
Lentivirus production
Lentiviral particles were generated by transfecting 293T cells with the pSB700 sgRNA
expression plasmid (with cerulean reporter) and the psPAX2 & pMD2.G (Addgene #12260
and #12259) packaging vectors at a ratio of 4:3:1, respectively. Viral supernatants
were collected 48–72h following transfection and concentrated using the PEG Virus
Precipitation Kit (BioVision) according to the manufacturer’s protocol.
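The 4:3:1 plasmid ratio can be converted into per-plasmid amounts with a small helper. The sketch below assumes the ratio is by mass and uses a hypothetical total of 8 μg of DNA; neither the basis of the ratio nor the total amount is specified above.

```python
def split_by_ratio(total_ug, ratio=(4, 3, 1)):
    """Split a total DNA mass (in micrograms) according to an integer ratio."""
    parts = sum(ratio)
    return tuple(total_ug * share / parts for share in ratio)


# Order follows the text: pSB700 sgRNA plasmid, psPAX2, pMD2.G.
# The 8 ug total is an illustrative value, not taken from the protocol.
sgrna_ug, pspax2_ug, pmd2g_ug = split_by_ratio(8.0)
print(sgrna_ug, pspax2_ug, pmd2g_ug)
```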
iPSC culture and dCas9-AD cell line generation
PGP1 iPS cells were obtained from the Coriell Institute Biorepository (GM23338) and
maintained on Matrigel (Corning)-coated tissue culture plates in mTeSR1 Basal medium
(STEMCELL Technologies). To generate stable iPS dCas9-AD expressing cell lines, approximately
5x10^5 cells were nucleofected with 1.5 μg of dCas9-AD piggyBac expression vector
and 340 ng of transposase vector (System Biosciences) using the Amaxa P3 Primary Cell
4D-Nucleofector X Kit (Lonza), program CB-150. Following electroporation, cells were
seeded onto 24-well Matrigel-coated plates in the presence of 10 μM ROCK inhibitor
(R&D systems) and allowed to recover for two days before expanding to 6-well plates
in the presence of 20 μg/ml hygromycin to select for a mixed population of dCas9-AD
integrant-containing cells.
iPSC transduction and neural induction
iPS dCas9-AD cell lines were transduced with lentiviral preparations containing 30
gRNAs, targeted against either NEUROD1 or NGN2, one day after seeding onto Matrigel-coated
plates. Transduced cells were expanded and then sorted for the top 15% of cerulean-positive
cells (pSB700 gRNA expression). Sorted gRNA-containing dCas9-AD iPS cell
lines were seeded in triplicate onto Matrigel-coated 24-well plates with mTeSR +
10 μM ROCK inhibitor, in either the presence or absence of 1 μg/ml doxycycline. Fresh
mTeSR medium with or without doxycycline was added every day for 4 days, at which point cells were
analyzed by light microscopy and immunofluorescence and harvested for qRT-PCR analysis.
Immunostaining of Cas9 iNeurons
All steps for staining were performed at room temperature. Samples were washed once
with PBS then fixed with 10% formalin (Electron Microscopy Sciences) for 20 min followed
by permeabilization with 0.2% Triton X-100/PBS for 15 min. Samples were then blocked
with 8% BSA for 30 min followed by staining with primary antibodies diluted into 4%
BSA. Staining was performed for either 3 h with anti-beta III tubulin eFluor 660 conjugate
(eBioscience, catalog no. 5045-10, clone 2G10-TB3) or 1 h with anti-neurofilament 200
(Sigma, catalog no. N4142), both at a 1:500 dilution. Samples were then washed 3 times,
5 minutes each, with 0.1% Tween/PBS, followed by one wash with PBS. For neurofilament
200 staining, a secondary donkey anti-rabbit Alexa Fluor 647 (Life Sciences) antibody
was added at a 1:1000 dilution in 4% BSA for 1 h. Samples were again washed as described
above and then stained with NucBlue [Hoechst 33342] (Life Sciences) for 5 min.
Image acquisition and analysis of Cas9 iNeurons
24-well plates stained for NucBlue and neuronal markers were imaged with a 10x objective
on a Zeiss Axio Observer Z1 microscope. Zen Blue software (Zeiss) was used to program
acquisition of 24 images per well. Total cell (NucBlue) and iNeuron (Beta III tubulin
or Neurofilament 200) counts were quantified for each image using custom Fiji and
Matlab scripts and used to determine the percentage of iNeurons per well by the formula:
(number of Beta III tubulin-positive cells/number of NucBlue-positive cells) × 100. In preparation
for publication, individual channels were composited and pseudocolored, with equal
adjustments across samples and controls, in Fiji.
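The percentage formula above can be expressed as a short Python sketch. The function names and the example counts are hypothetical, and pooling the counts from all images in a well before taking the ratio is an assumption; this is not the custom Fiji and Matlab code used in the study.

```python
def percent_ineurons(marker_positive, total_nuclei):
    """Percentage of iNeurons: (marker-positive cells / NucBlue nuclei) x 100."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    return 100.0 * marker_positive / total_nuclei


def well_percent(image_counts):
    """Pool (marker_positive, total_nuclei) counts from all images in a well."""
    positives = sum(pos for pos, _ in image_counts)
    nuclei = sum(total for _, total in image_counts)
    return percent_ineurons(positives, nuclei)


# Hypothetical counts from two images of the same well.
example_well = [(10, 100), (30, 100)]
print(well_percent(example_well))
```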
Statistical analysis
All statistical comparisons are two-tailed t-tests calculated using the GraphPad Prism
software package (Version 6.0 for Windows. GraphPad Software, San Diego, CA). All
sample numbers listed indicate the number of biological replicates employed in each
experiment.
Code availability
Custom Fiji and Matlab scripts utilized in analyzing iPS cell differentiation are
available upon request.
Reproducibility
Throughout our study, we employed sample sizes that are frequently used for similar
kinds of experiments. No data were excluded from any of our analyses. No randomization
was employed and blinding was not used except in iNeuron image analysis where the
scientist quantifying each of the conditions was blind to the sample type.