Cell lines and cell culture
Human HNSCC cell lines FaDu and Detroit562 were purchased from ATCC. All cells were confirmed to be Mycoplasma-negative before this study (6601, TaKaRa). Cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM, 08458-45, Nacalai Tesque) supplemented with 10% fetal bovine serum (FBS, 172012, Sigma), 1% non-essential amino acids (NEAA, 06344-56, Nacalai Tesque), 1% L-glutamine (16948-04, Nacalai Tesque), and 1% penicillin/streptomycin solution (26253-84, Nacalai Tesque) at 37 °C in an atmosphere of 95% relative humidity and 5% CO2.
Transfection
FaDu and Detroit562 cells were transfected with pCAGIP-GFP, pCAGIP-GFP-SF3B2, pCAGIP-GFP-TAP, or pCAGIP-SF3B2-TAP [6] using Fugene HD transfection reagent (E2311, Promega). Cells were seeded into six-well plates (2 × 10⁵ cells/well), to which 3 µg/well plasmid DNA in Opti-MEM (31985070, Gibco) and 9 µL/well Fugene HD were added. To establish FaDu and Detroit562 cells stably expressing GFP or GFP-SF3B2, 3000 transfected cells were cultured in medium containing puromycin (0.2 µg/mL) in a 10-cm dish until colonies formed. Colonies were picked, and protein expression was analyzed using western blotting.
Immunofluorescence imaging
GFP- or GFP-SF3B2-expressing FaDu cells (3 × 10⁶) were seeded on cover glass in six-well plates. After 48 h, the cells were washed with PBS and fixed with 4% paraformaldehyde (PFA, 26126-25, Nacalai Tesque) for 5 min at room temperature. After three washes with PBS, the cells were stained with DAPI (1:10,000) (10236276001, Roche) for 5 min at room temperature. Next, they were washed with ultrapure water and mounted in Vectashield (H-1000, Vector Laboratories). Fluorescence images were obtained using an FV1200 confocal laser microscope (OLYMPUS).
Western blotting
Proteins were extracted from 1 × 10⁵ cells in 10 µL of sample buffer (1610737, BIO-RAD). The extracted proteins were separated in a 5%–20% polyacrylamide gradient gel (191-15011, Wako) and transferred onto a polyvinylidene difluoride membrane. The membrane was blocked with 3% skim milk at room temperature for 1 h and then incubated with anti-SF3B2 (sc-514976, Santa Cruz, 1:200 dilution) or anti-beta actin (A5441, Sigma, 1:5000 dilution) antibodies. Then, after washing twice and blocking with 3% skim milk, the membrane was incubated with an HRP-conjugated mouse secondary antibody. The signals were detected with Chemi-Lumi One or Chemi-Lumi One Super (02230-30, Nacalai Tesque) using an ImageQuant LAS 4000 mini system (GE Healthcare).
siRNA treatment
The following pre-designed MISSION siRNAs were purchased from Sigma-Aldrich: siSF3B2 #1, Hs_SF3B2_6217_s (5′-GUAUGUGACUGAAGAACCU-3′) and Hs_SF3B2_6217_as (5′-AGGUUCUUCAGUCACAUAC-3′); siSF3B2 #2, Hs_SF3B2_6219_s (5′-GAUUGAGUAUGUGACUGAA-3′) and Hs_SF3B2_6219_as (5′-UUCAGUCACAUACUCAAUC-3′); and siControl (SIC-001, SIGMA). siSF3B2 #2 (Hs_SF3B2_6219) was used for RNA-seq and PRO-seq. A mixture of 30 pmol siRNA in 500 µL Opti-MEM (31985070, Gibco) and 5 µL Lipofectamine RNAiMAX Transfection Reagent (13778150, Thermo Fisher Scientific) was added to each well of a six-well plate, and FaDu cells were then seeded (2.5 × 10⁵ cells/well). The transfected cells were incubated at 37 °C in an atmosphere of 95% relative humidity and 5% CO2 for 48 h.
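As a quick consistency check on the sequences listed above, the antisense strand of each siRNA duplex should be the reverse complement of its sense strand. The following is a minimal sketch of that check; the function and dictionary names are illustrative and not part of the original workflow.

```python
# Watson-Crick pairing for RNA (A-U, G-C).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


# Sense/antisense pairs exactly as listed in the text.
SIRNAS = {
    "Hs_SF3B2_6217": ("GUAUGUGACUGAAGAACCU", "AGGUUCUUCAGUCACAUAC"),
    "Hs_SF3B2_6219": ("GAUUGAGUAUGUGACUGAA", "UUCAGUCACAUACUCAAUC"),
}


def strands_are_complementary(sense: str, antisense: str) -> bool:
    """True if the antisense strand is the reverse complement of the sense strand."""
    return reverse_complement(sense) == antisense.upper()


for name, (sense, antisense) in SIRNAS.items():
    print(name, strands_are_complementary(sense, antisense))
```

Both listed duplexes pass this check, which is a simple way to catch transcription errors in oligonucleotide sequences.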
Human HNSCC xenograft model
The Osaka University Animal Experiments Committee approved all experiments using mice, and all experiments were performed in accordance with their guidelines. The in vivo tumor growth of human HNSCC cells was examined using a subcutaneous xenograft model. Cancer cells (2 × 10⁶ cells) in 50 µL PBS were transplanted into the flanks of 8-week-old NOD/SCID mice (Charles River) under deep anesthesia. The mice were maintained and handled according to approved protocols and the guidelines of the Animal Committee of Osaka University (Osaka, Japan), as previously described [6]. Tumor size was measured once every four days, and the tumor volume was calculated according to the following formula: tumor volume (mm³) = length × (width)² / 2.
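The volume formula above can be written as a small helper. This is only an illustrative sketch; the function name and the example measurements are hypothetical, and both dimensions are assumed to be in millimeters.

```python
def tumor_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Tumor volume (mm^3) = length x width^2 / 2, with both inputs in mm."""
    if length_mm < 0 or width_mm < 0:
        raise ValueError("length and width must be non-negative")
    return length_mm * width_mm ** 2 / 2


# Hypothetical measurement: a 10 mm x 6 mm tumor.
print(tumor_volume_mm3(10, 6))  # 180.0
```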
RNA-seq
Sequencing libraries from at least two biological replicate RNA samples were prepared using the NEBNext Ultra RNA Library Prep Kit for Illumina (E7530L, NEB) following the manufacturer's instructions, as previously described [6]. mRNA was enriched using NEBNext Oligo d(T)25 beads. The libraries were sequenced using the Illumina HiSeq X Ten platform.
PAR-CLIP
PAR-CLIP was performed using a previously published protocol [24], with some modifications [6, 36]. Briefly, FaDu cells stably expressing SF3B2-TAP were labeled with 100 µM 4-SU (T4509, Sigma) and then cross-linked by irradiation with 365 nm UV at 150 mJ/cm². The whole-cell lysate was collected and treated with 5 U/mL RNase T1 (EN0541, Fermentas) for 15 min at 22 °C, as described previously [37]. TAP tag fusion proteins were immunoprecipitated via overnight incubation (at 4 °C) with IgG Sepharose 6 Fast Flow (17-0969-01, GE Healthcare). The beads were treated with 0.2 U/µL MNase (M0247S, NEB) at 37 °C for 5 min and then with 0.5 U/µL calf intestinal alkaline phosphatase (M0525S, NEB) at 37 °C for 10 min. Subsequently, a 3′ linker was ligated to the RNA fragments by incubating the beads with 0.5 U/µL T4 RNA ligase (EL0021, Thermo Fisher Scientific) overnight at 16 °C, as described previously [38]. The fragmented RNA was radiolabeled using γ-[³²P]ATP. Proteins covalently bound to radiolabeled RNA were collected and then separated using NuPAGE Novex 4–12% Bis-Tris gels (NP0335BOX, Thermo Fisher Scientific). After the band corresponding to SF3B2-TAP was excised, the radiolabeled RNA was isolated from the RNA–protein complex using proteinase K. A 5′ linker was ligated to the isolated RNA fragments. The linker-ligated RNAs were separated using a 10% TBE-urea gel (EC68752BOX, Thermo Fisher Scientific), and the bands between 70 and 130 nt in length were excised. The extracted RNAs were subjected to RT-PCR, and the resulting PCR products were separated using a 10% TBE gel (Thermo Fisher Scientific). PCR products between 140 and 190 bp in length were excised and eluted. The libraries were sequenced using the Illumina HiSeq X Ten platform.
Tandem affinity purification and mass spectrometry
The SF3B2-TAP-expressing HNSCC cell line was established using FaDu cells. The SF3B2-TAP complex was purified from nuclear extracts using TAP technology, as previously described [39]. The purified proteins were concentrated using Amicon Ultra-0.5 mL 3 K (UFC500308, Merck) and separated via SDS-PAGE. After staining the gel with silver, the protein bands were excised, digested with trypsin in-gel, and analyzed via LC–MS/MS. GO enrichment in the SF3B2-associated proteins, found in two out of three independent experiments, was analyzed using Metascape [40].
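The reproducibility filter described above (proteins found in two out of three independent experiments) could be sketched as follows. This assumes the criterion means "detected in at least two experiments"; the function name and the placeholder protein identifiers are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def reproducible_hits(experiments: Iterable[Iterable[str]],
                      min_experiments: int = 2) -> List[str]:
    """Return proteins detected in at least `min_experiments` experiments.

    Each experiment is an iterable of protein identifiers; duplicates within
    one experiment are counted once.
    """
    counts = Counter(protein for exp in experiments for protein in set(exp))
    return sorted(p for p, n in counts.items() if n >= min_experiments)


# Placeholder identifiers, not real data.
exp1 = ["PROT_A", "PROT_B"]
exp2 = ["PROT_B", "PROT_C"]
exp3 = ["PROT_A", "PROT_B", "PROT_D"]
print(reproducible_hits([exp1, exp2, exp3]))  # ['PROT_A', 'PROT_B']
```

The resulting list is the kind of input that would then be submitted to an enrichment tool such as Metascape.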
ChIP-seq
ChIP was performed as previously described [41] using anti-CTCF (07-729, Millipore, 10 µL, 1:10 dilution) and anti-SMC1 (A300-055A, BETHYL, 10 µL, 1:10 dilution) antibodies. Sequencing libraries were generated using the NEBNext Ultra II DNA Library Prep Kit (E7103, NEB). The libraries were sequenced using the Illumina HiSeq X Ten platform.
CUT&Tag
CUT&Tag was performed with the CUT&Tag-IT Assay Kit (53160, ACTIVE MOTIF) in 1.5 × 10⁶ FaDu cells using anti-SF3B2 (sc-514976, Santa Cruz, 5 µL, 1:20 dilution) and anti-H3 (ab1791, Abcam, 1 µL, 1:100 dilution) antibodies. Rabbit anti-mouse IgG (ab6709, Abcam, 1 µL) was used to enhance the signal. The cells were collected using a cell scraper.
PRO-seq
PRO-seq was performed as previously described [42]. Briefly, 3 × 10⁶ FaDu cells were collected and washed with ice-cold PBS. Drosophila melanogaster S2 cells (10% of the human cell number) were added to each sample as a spike-in control for normalization. The combined cells were resuspended in cold permeabilization buffer [10 mM Tris–HCl, pH 7.4, 300 mM sucrose, 10 mM KCl, 5 mM MgCl2, 1 mM EGTA, 0.05% Tween-20, 0.1% NP40 substitute, 0.5 mM DTT, 1:100 protease inhibitor cocktail, and 4 U/mL SUPERaseIN (Invitrogen)] and incubated on ice. The permeabilized cells were then pelleted, washed twice with permeabilization buffer, and resuspended in ice-cold storage buffer (10 mM Tris–HCl, pH 8.0, 25% glycerol, 5 mM MgCl2, 0.1 mM EDTA, and 5 mM DTT) at a concentration of 2 × 10⁷ nuclei per 100 μL. Nuclear run-on (NRO) assays were performed using biotin-11-NTPs. In total, 2 × 10⁷ nuclei per 100 μL were thoroughly mixed with an equal amount of pre-heated 2 × NRO reaction mixture [10 mM Tris–HCl, pH 8.0, 5 mM MgCl2, 300 mM KCl, 1 mM DTT, 1% Sarkosyl, 50 μM each of Biotin-11-A/G/C/UTP (PerkinElmer, Waltham, MA), and 0.8 U/μL RNase inhibitor] and incubated at 37 °C for 3 min in a heat block. Nascent RNA was extracted, purified, and fragmented by base hydrolysis in 0.2 N NaOH on ice for 10 min. After neutralization, fragmented nascent RNA was bound to Dynabeads™ M-280 Streptavidin magnetic beads (Invitrogen) and incubated for 20 min at 4 °C. The beads were sequentially washed as follows: twice in high-salt buffer (2 M NaCl, 50 mM Tris–HCl, pH 7.4, 0.5% Triton X-100), twice in medium-salt buffer (300 mM NaCl, 10 mM Tris–HCl, pH 7.4, 0.1% Triton X-100), and once in low-salt buffer (5 mM Tris–HCl, pH 7.4, 0.1% Triton X-100). Biotinylated RNA was extracted from the beads and precipitated using ethanol. The 3′ RNA adaptors were ligated to biotinylated RNA, and the second round of biotin-streptavidin purification was performed. The mRNA cap was then removed, and the reverse 5′ RNA adaptor was ligated. After the third round of biotin-streptavidin purification, adaptor-ligated nascent RNA was reverse-transcribed into complementary DNA (cDNA) using the RP1 primer. cDNA was amplified with index primers, and amplicons of 120–350 bp were selected using AMPure XP beads (Beckman Coulter, Brea, CA). Equimolar concentrations of library fractions were pooled and sequenced using a high-output flow cell on the NovaSeq 6000 platform (Illumina).
RNA stability assay
Three days after transfection of FaDu cells with 10 nM siRNA using Lipofectamine RNAiMAX (13778150, Thermo Fisher Scientific), the cells were treated with 5 µg/mL actinomycin D for 0.5, 1, 2, and 4 h. RNA was collected using Sepasol-RNA I Super G (09379-55, Nacalai Tesque). The amounts of target RNA were measured using the iTaq Universal SYBR Green One-Step Kit (1725151, Bio-Rad) with the specific primers listed below. qRT-PCR was performed using the CFX Connect Real-Time PCR Detection System (Bio-Rad). FOSL1 F: GGCCTTGTGAACAGATCAGC and R: AGTTTGTCAGTCTCCGCCTG; MYC F: ACAGCTACGGAACTCTTGTGCGTA and R: CAGCCAAGGTTGTGAGGTTGCATT; FOS F: AGATTGCCAACCTGCTGAAGGAGA and R: TCAGATCAAGGGAAGCCACAGACA; RPS12 F: TCCGTCCTACCGGAAACCTA and R: TTCCAAACAGCAACCCACAC; 18S rRNA F: TCAACTTTCGATGGTAGTCGCCGT and R: TCCTTGGATGTGGTAGCCGTTTCT.
Statistical analysis
Student’s two-tailed t-test was used to compare two parametric samples, and the Tukey–Kramer test was used for comparisons between multiple parametric samples. Welch’s t-test was used to compare two non-homoscedastic samples with a normal distribution. The pairwise Wilcoxon rank-sum test with Bonferroni correction was used to compare multiple samples. Fisher’s exact test with Monte Carlo simulation was used for Fig. 3G.
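For readers unfamiliar with the Bonferroni adjustment mentioned above, the following minimal sketch shows the arithmetic: each raw p-value is multiplied by the number of comparisons and capped at 1. The function name and the example p-values are hypothetical; the analysis itself was run in R, and this Python snippet only illustrates the correction.

```python
from typing import List, Sequence


def bonferroni_adjust(p_values: Sequence[float]) -> List[float]:
    """Bonferroni-adjusted p-values: p * m, capped at 1, for m comparisons."""
    m = len(p_values)
    for p in p_values:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three pairwise comparisons.
adjusted = bonferroni_adjust([0.01, 0.04, 0.5])
print(adjusted)
```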
Bioinformatics
Tools
awk v4.1.3 (https://www.gnu.org).
Bedtools 2.26.0/2.29.2 [43]
Bowtie2 2.2.3/2.3.5.1 [44]
cERMIT 1.1 [45]
Cutadapt 1.9.1 [46]
DESeq2 1.26.0 [47]
FASTX-Toolkit 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit/).
FastQC 0.11.5 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
HOMER 4.9.1 [48]
IGV 2.9.2 [49, 50]
MAJIQ 0.9.2a [51]
PARalyzer 1.5 [52]
Perl 5.14.2 (https://www.perl.org).
R 3.6.3 (https://www.r-project.org).
RStudio 1.4.1103 (https://www.rstudio.com).
Samtools 0.1.17/1.7 [53]
STAR 2.5.3a [54]
StringTie v1.3.4b [55]
TrimGalore 0.6.6/0.6.4_dev (https://github.com/FelixKrueger/TrimGalore).
ucsc-bigWigAverageOverBed v2 (https://anaconda.org/bioconda/ucsc-bigwigaverageoverbed).
ucsc-wigToBigWig v4 (https://anaconda.org/bioconda/ucsc-wigtobigwig).
ucsc-bedGraphToBigWig v4 (https://anaconda.org/bioconda/ucsc-bedgraphtobigwig).
R packages
AnnotationDbi v1.48.0 (https://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html).
clusterProfiler v3.14.5 [56]
DOSE v3.12.0 [57]
dplyr v1.0.4 (https://cran.r-project.org/web/packages/dplyr/index.html).
enrichplot v1.6.1 (https://yulab-smu.top/biomedical-knowledge-mining-book/).
ggplot2 v3.3.3 [58] (http://ggplot2.org).
org.Hs.eg.db v3.10.0 (https://bioconductor.org/packages/release/data/annotation/html/org.Hs.eg.db.html).
plyr v1.8.6 [59]
reshape2 v1.4.4 [60]
rtracklayer v1.48.0 [61]
tidyr v1.1.2 (https://cran.r-project.org/web/packages/tidyr/index.html).
Clinical data analysis
SF3B2 mRNA expression, overall survival, and clinical data of patients with HNSCC were analyzed using cBioPortal [62]. Overlapping samples and patients were excluded. A total of 1316 patients were stratified by SF3B2 mRNA expression (RNA Seq V2 RSEM) level.
RNA-seq data analysis
Paired-end reads were mapped to the human reference genome hg19 using STAR after checking read quality with FastQC. StringTie and DESeq2 were used to merge the results and calculate normalized gene expression. Statistical analysis of gene expression was performed using DESeq2. Sequencing tracks were generated using IGVtools with the option -z 7, as previously described [6]. The correlations between replicate samples were analyzed using the cor.test function in R. Local splicing variations were calculated using MAJIQ and plotted using the R package ggplot2. The cumulative distribution of the log2 fold change in expression was calculated using the R package plyr and plotted using ggplot2. Gene set enrichment analysis was performed and plotted based on the log2 fold change calculated with DESeq2, using the R packages clusterProfiler, enrichplot, org.Hs.eg.db, and DOSE.
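The cumulative distribution of log2 fold changes mentioned above is an empirical cumulative distribution function (ECDF). The original analysis used the R packages plyr and ggplot2; the Python sketch below only illustrates the underlying calculation, and the function name and example values are hypothetical.

```python
from typing import List, Sequence, Tuple


def ecdf(values: Sequence[float]) -> List[Tuple[float, float]]:
    """Empirical CDF: for each sorted value x_i, the fraction of values <= x_i.

    Tied values each get their own point (rank / n), which is sufficient for
    drawing a step plot.
    """
    xs = sorted(values)
    n = len(xs)
    return [(x, (i + 1) / n) for i, x in enumerate(xs)]


# Hypothetical log2 fold changes for four genes.
for x, frac in ecdf([0.5, -1.0, 2.0, 0.0]):
    print(x, frac)
```

Plotting the resulting points as a step curve for two gene sets makes a shift in their fold-change distributions easy to compare visually.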
PRO-seq data analysis
After removing the adaptor sequences, single-end reads were mapped onto the human reference genome hg19 using bowtie2 with the options --end-to-end --no-unal. Correlations between replicate samples were calculated using the cor.test function in R. The mapped reads were normalized using HOMER. The sequencing tracks were generated using HOMER and IGVtools with the option -z 7. FPKM was calculated using the analyzeRepeats program in HOMER or the R package rtracklayer. The TSS-, TTS-, 5′ss-, and 3′ss-centered profiles were calculated using the annotatePeaks program with the options -size 6000 -hist 25 -pc 3. To normalize the PRO-seq data based on the D. melanogaster spike-in, the trimmed reads were mapped to the human genome hg19 and the D. melanogaster genome dm6 using bowtie2. Read counts were normalized according to the genomic coverage of mapped Drosophila reads using bedtools and samtools. The pausing index was calculated using the analyzeRepeats program in HOMER with the option -pc 3 or the R package rtracklayer.
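The two calculations named above, spike-in normalization and the pausing index, can be illustrated with a short sketch. This is not the authors' implementation (which used HOMER, bedtools, samtools, and rtracklayer). It assumes a common convention in which each sample is scaled so that spike-in read counts become equal across samples, and in which the pausing index is the ratio of promoter-proximal read density to gene-body read density; the window sizes, sample names, and read counts are hypothetical.

```python
from typing import Dict, Optional


def spike_in_scale_factors(spike_in_reads: Dict[str, int]) -> Dict[str, float]:
    """Per-sample scale factors from mapped spike-in (e.g. dm6) read counts.

    Assumption: the sample with the fewest spike-in reads is the reference
    (factor 1.0); the others are scaled down proportionally.
    """
    if not spike_in_reads or min(spike_in_reads.values()) <= 0:
        raise ValueError("every sample needs a positive spike-in read count")
    reference = min(spike_in_reads.values())
    return {sample: reference / n for sample, n in spike_in_reads.items()}


def pausing_index(promoter_reads: float, promoter_len: int,
                  body_reads: float, body_len: int) -> Optional[float]:
    """Promoter-proximal read density divided by gene-body read density.

    Returns None when the gene body has no reads, since the ratio is undefined.
    """
    if promoter_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    body_density = body_reads / body_len
    if body_density == 0:
        return None
    return (promoter_reads / promoter_len) / body_density


# Hypothetical counts for two samples and one gene.
factors = spike_in_scale_factors({"siControl": 200_000, "siSF3B2": 400_000})
print(factors)
print(pausing_index(promoter_reads=100, promoter_len=200,
                    body_reads=500, body_len=5000))
```

Multiplying each sample's human read counts by its factor puts the samples on a common scale before densities or pausing indices are compared.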
CUT&Tag and ChIP-seq data analysis
Paired-end reads were mapped to the human reference genome hg19 using bowtie2 with the options --local --sensitive-local --minins 0 --maxins 500 --no-discordant --no-mixed --fr --no-unal, after checking the quality using trim-galore. The mapped reads were normalized using HOMER. To calculate the correlation between replicate libraries, bigwig files were generated using bedGraphToBigWig. The counts in each bin with 10,000 windows were calculated using bigWigAverageOverBed, and the correlation was calculated using the cor.test function in R with the Pearson method. The sequencing tracks were generated using HOMER and IGVtools with the option -z 7. The SF3B2 CUT&Tag peaks were called using the findPeaks program in HOMER with the option -style factor, compared to H3 CUT&Tag reads. The H3 CUT&Tag, CTCF, and SMC1A peaks were compared to the sonicated genome of FaDu cells. The obtained peaks were annotated using the annotatePeaks program in HOMER. The overlap of peaks was calculated using the mergePeaks program in HOMER. Metagene- and splice junction-centered profiles were calculated using the makeMetaGeneProfile program in HOMER with options rna, splice3p, and splice5p. The motif enriched around SF3B2-binding chromatin regions was determined using the findMotifsGenome program in HOMER with the option -size 450 and rendered using the R packages ggplot2 and ggseqlogo.
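The replicate correlation described above is a Pearson correlation between per-bin signal values of two libraries. The original analysis used cor.test in R, which also reports a p-value; the sketch below computes only the coefficient in plain Python, with a hypothetical function name and made-up bin values.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


# Made-up per-bin mean signal for two replicate libraries.
rep1 = [1.0, 2.0, 3.0, 4.0]
rep2 = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(rep1, rep2))
```

In practice the two vectors would be the per-bin values reported by bigWigAverageOverBed for the same set of windows in each replicate.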
PAR-CLIP data analysis
After removing the adaptor sequences, PAR-CLIP data were analyzed using bowtie2, PARalyzer, and cERMIT, as previously described [6]. The correlation between replicate libraries was calculated as described in the CUT&Tag and ChIP-seq data analysis section. The obtained peaks were annotated using HOMER. Normalized sequencing tracks were generated using IGV and IGVtools. Metagene- and splice junction-centered profiles were calculated using the makeMetaGeneProfile program in HOMER. The SF3B2 PAR-CLIP motif was rendered using the R packages ggplot2 and ggseqlogo.