Acute leukemia is the most common pediatric malignancy and comprises diverse subtypes with varying prognoses and treatment outcomes. New and targeted treatment options are warranted for this disease. Patient-derived xenograft (PDX) models are increasingly being used for preclinical testing of novel treatment modalities. A novel approach involving targeted error-corrected RNA sequencing using the ArcherDX HemeV2 kit was employed to compare 25 primary pediatric acute leukemia samples and their corresponding PDX samples. A comparison of the primary samples and PDX samples revealed high concordance for single nucleotide variants and gene fusions, whereas other complex structural variants were not as consistent. The presence of gene fusions representing the major driver mutations at similar allelic frequencies in PDX samples compared to primary samples, and over multiple passages, confirms the utility of PDX models for preclinical drug testing. Characterization and tracking of these novel cryptic fusions and exonal variants in PDX models is critical in assessing response to potential new therapies.
Genomic characterization of the somatic landscape is essential for the robust clinical evaluation and classification of pediatric leukemias [
Chromosomal rearrangements generating gene fusions and other structural variants (StVs) are more common in pediatric malignancies compared to adults [
Acute lymphoblastic leukemia (ALL) is the most common type of cancer in children and adolescents. ALL represents 20% of all cancers diagnosed in individuals with less than 20 years of age [
The goal of this study was to characterize complex genomic variants in pediatric leukemias and to describe and monitor these variants in preclinical PDX models in comparison with the primary samples. The ability to track complex genomic lesions in primary samples and across passages in PDX lines is essential in ensuring that the model can be used for biologic and therapeutic modeling. RNA next generation sequencing (NGS) techniques enable a sensitive and broad approach for analyzing complex genomic lesions and identifying clinically relevant novel somatic mutations associated with pediatric leukemias.
All samples used in this study were procured by the Nemours Biobank following written informed consent. For the majority of samples, leukemic cells were isolated from human bone marrow aspirates; the exceptions were NTPL-59 and NTPL-109, which were isolated from apheresis products. Cells were isolated by Ficoll density gradient centrifugation and provided to us under an Institutional Review Board approved protocol (Nemours Office of Human Subjects Protection IRB# 267207). A summary of the subjects’ characteristics is presented in
PDX models were generated as described previously [
To optimize detection of structural and copy number variants in RNA, we prepared RNA–error-corrected sequencing libraries using the ArcherDX (Boulder, CO, USA) FusionPlex HemeV2 Kit (catalog no. AB0012) per the manufacturer’s protocols. Total RNA was extracted using the RNeasy Mini Kit (Qiagen, Hilden, Germany). Nucleic acid quantity and quality were then assessed using the Agilent (Santa Clara, CA, USA) TapeStation 4200 following the manufacturer’s protocol and using the High Sensitivity RNA Screen Tape (catalog no. 5067-5579). cDNA was made from 50 ng of RNA using the QIAseq kit. Each library was sequenced on the Illumina NextSeq platform (San Diego, CA, USA). The gene fusion data produced by the Archer panel were initially correlated with diagnostic fluorescence in situ hybridization data available for each primary sample.
The data were processed via the ArcherDX Analysis platform (v5.1.3), hosted in the cloud by Amazon Web Services, including FASTQ trimming, read deduplication, genome alignment, and variant detection and annotation. The analysis pipeline contains the following applications: ABRA [
FASTQ files were analyzed with FastQC for library quality. Error-corrected reads (Hamming distance of 2) were aligned to the hg19 genome build using BWA and Bowtie2, and alignment files were processed following GATK best practices [
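The text gives a Hamming distance of 2 for error correction but does not describe how the analysis platform applies it. As a minimal sketch only, assuming the distance is used to merge molecular barcodes (UMIs) that differ by at most two bases, one common greedy approach looks like the following. The function names, the greedy strategy, and the example barcodes are all hypothetical and are not taken from the ArcherDX pipeline.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length strings")
    return sum(c1 != c2 for c1, c2 in zip(a, b))


def collapse_umis(umi_counts: dict, max_distance: int = 2) -> dict:
    """Greedily merge UMIs into more abundant UMIs within max_distance.

    UMIs are visited from most to least abundant; each one is merged into
    the first existing cluster seed within max_distance, or it starts a
    new cluster. This is an illustrative heuristic (an assumption), not
    the vendor's algorithm.
    """
    clusters = {}
    ordered = sorted(umi_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for umi, count in ordered:
        for seed in clusters:
            if hamming(umi, seed) <= max_distance:
                clusters[seed] += count  # treat as a sequencing error of seed
                break
        else:
            clusters[umi] = count  # no close seed: a distinct molecule
    return clusters


# Hypothetical read counts per observed UMI.
observed = {"AAAAAAAA": 10, "AAAAAATT": 2, "CCCCCCCC": 5, "CCCCCCCG": 1}
collapsed = collapse_umis(observed, max_distance=2)
```

Seeding clusters from the most abundant barcodes reflects the usual assumption that low-count barcodes close to a high-count barcode are more likely errors than independent molecules.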
Variant allele frequencies (VAFs) were calculated for SNVs as the number of reads mapped to a given genomic location that support the alternative allele, divided by the total number of reads mapped to that location. VAFs for StVs were calculated by comparing the number of reads supporting the wild-type sequence/junction with the number of reads supporting the novel junction. Scatter plots were made in R, specifically with ggplot2 [
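The two VAF definitions above can be sketched as follows. The SNV formula follows the description directly. For StVs the text names only the two read counts being compared, so the ratio used here, novel-junction reads over the sum of novel and wild-type reads, is an assumption, and the function names are hypothetical.

```python
def snv_vaf(alt_reads: int, total_reads: int) -> float:
    """SNV VAF: reads supporting the alternative allele / all reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def stv_vaf(novel_junction_reads: int, wild_type_reads: int) -> float:
    """StV VAF, assumed here to be novel / (novel + wild type)."""
    if novel_junction_reads < 0 or wild_type_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = novel_junction_reads + wild_type_reads
    if total == 0:
        raise ValueError("no reads cover the junction")
    return novel_junction_reads / total


# Illustrative read counts only; not data from this study.
example_snv = snv_vaf(alt_reads=30, total_reads=120)
example_stv = stv_vaf(novel_junction_reads=40, wild_type_reads=60)
```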
To determine the concordance of RNA variants between primary and PDX samples for pediatric AML, a targeted RNA sequencing panel approach (HemeV2; ArcherDX) was utilized. In this report, we analyzed 5 AML primary-PDX sample pairs, and in total 31 allelic specific SNVs were identified with the following distribution: 1 frameshift, 11 missense, 2 splice region and 17 untranslated region (UTR) variants (
VAFs for all RNA StVs including gene fusions and alternative exon usage variants were graphed between the primary and PDX AML samples and results are displayed (
Multiple retained introns (n=14) were identified in the 5 primary and PDX AML samples in the following genes:
To determine the concordance of RNA variants between primary and PDX samples for pediatric ALL, a targeted RNA sequencing approach was utilized. Across 3 primary-PDX T-cell ALL (T-ALL) sample pairs, the correlation coefficients of VAF between primary and PDX samples were similar for SNVs and StVs (Pearson correlation coefficient, 0.88; p = 6.12e-10 and 0.73; p = 0.003, respectively) (
In total 14 StVs were identified in the primary and PDX models for T-ALL samples; 4 unique fusions (
The correlation between VAF from primary to PDX samples was analyzed for RNA StVs and SNVs in 17 B-cell ALL (B-ALL) samples. In total 114 RNA SNVs were identified in the primary and PDX B-ALL samples, and of those variants 4 were frameshift, 25 missense, 5 splice region, 2 stop gained and the rest were UTR variants (
The correlation between SNV VAFs from primary to PDX B-ALL samples was higher than the correlation between StV VAFs (Pearson correlation coefficient, 0.93; p = 2.2e-16 and 0.5; p = 9.5e-8, respectively) (
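For readers who want to reproduce this kind of comparison, the Pearson correlation coefficient between paired primary and PDX VAFs can be computed as in the sketch below. It uses only the Python standard library and does not compute p-values. The function name and the VAF values are hypothetical and are not taken from the study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two paired observations are required")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired VAFs: the same variants in primary and PDX samples.
primary_vaf = [0.10, 0.45, 0.50, 0.98]
pdx_vaf = [0.12, 0.40, 0.55, 1.00]
r = pearson_r(primary_vaf, pdx_vaf)
r_squared = r ** 2  # coefficient of determination for a simple linear fit
```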
Sequencing of primary acute leukemia patient samples and matching PDX samples showed concordance between the detected variants and their allelic frequencies for the majority of variants tested. The percentage of all variants with absolute delta VAFs <0.2 was 86.7%. This percentage was higher in SNVs (93.6%) compared to StVs (79.6%) across all primary and PDX samples analyzed. Among the different categories of StVs, the allelic frequencies of fusion genes, which are considered to be driver mutations, matched most consistently between the primary and PDX samples (
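The concordance percentages above are the share of variants whose absolute difference in VAF between primary and PDX samples is below 0.2. A minimal sketch of that calculation follows. The function name and the example pairs are hypothetical, and the strict inequality mirrors the "<0.2" wording in the text.

```python
def fraction_concordant(vaf_pairs, threshold=0.2):
    """Fraction of (primary VAF, PDX VAF) pairs with |delta VAF| < threshold."""
    if not vaf_pairs:
        raise ValueError("at least one VAF pair is required")
    concordant = sum(
        1 for primary, pdx in vaf_pairs if abs(primary - pdx) < threshold
    )
    return concordant / len(vaf_pairs)


# Illustrative pairs only; not data from this study.
pairs = [(0.50, 0.55), (0.10, 0.40), (1.00, 1.00), (0.30, 0.45)]
share = fraction_concordant(pairs)
percent = 100 * share
```

Applying the same function separately to SNV pairs and StV pairs would give the per-category percentages.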
We identified several SNVs, but no StVs, with sustained VAF = 1 in primary and PDX samples across all leukemia subtypes. These SNVs in genes
Retained intron variants were detected in all samples except NTPL-59. Retention of introns serves as another mode of regulation of gene expression [
As we have shown previously, error-correction via the introduction of a nucleic acid-specific UMI allows the removal of NGS errors, retaining only true mutations and significantly improving the sensitivity of NGS [
Taken together, advanced sequencing techniques are required to accurately detect and annotate complex StVs that are commonly associated with pediatric leukemias. Such complex variants are not detectable using DNA and short-read sequencing technology such as the Illumina sequencing platform. Additionally, the RNA molecules that are generated from these complex genomic rearrangements can be difficult to capture. Using the RNA sequencing approach described in this study, which combines AMP technology with a short-read sequencing platform, pediatric PDX models could be appropriately characterized and validated for concordance of somatic mutations with respect to primary samples. Such analysis is not feasible using standard DNA sequencing techniques. This is one of the first reports to describe pediatric PDX samples using an RNA sequencing approach.
Conceptualization: SPB, AG, EAK, ELC. Data curation: SPB, AG, NM, TED. Formal analysis: ELC. Writing - original draft: SPB, AG, NM, TED, EA, ELC. Writing - review & editing: SPB, AG, NM, TED, EA, ELC.
No potential conflict of interest relevant to this article was reported.
The authors would like to thank the Nemours Center for Cancer and Blood Disorders, Nemours Biobank, and the Nemours Biomedical Research Department for supporting this work. This work was supported by the NIH NCI CA211711-01 (PI Druley), Leukemia Research Foundation of Delaware (PI Kolb), and B+ Foundation (PI Barwe).
Supplementary data including two tables can be found with this article online at
Single nucleotide variants detected per subject.
Structural variants detected per subject.
Summary of primary and xenograft RNA variants in acute myeloid leukemia (AML). (A) Allelic specific single nucleotide variants. Variant allele frequency (VAF) at time of diagnosis, x-axis is plotted versus the VAF in the xenograft model, y-axis. (B) Structural RNA variants. VAF at time of diagnosis, x-axis is plotted versus the VAF in the xenograft model, y-axis. PDX, patient-derived xenograft.
Summary of primary and xenograft RNA variants in T-cell acute lymphoblastic leukemia (T-ALL). (A) Allelic specific single nucleotide variants. Variant allele frequency (VAF) at time of diagnosis, x-axis is plotted versus the variant allele frequency in the xenograft model, y-axis. (B) Structural RNA variants. VAF at time of diagnosis, x-axis is plotted versus the VAF in the xenograft model, y-axis. (C)
Summary of primary and xenograft RNA variants in B-cell acute lymphoblastic leukemia (B-ALL). (A) Allelic specific single nucleotide variants. Variant allele frequency (VAF) at time of diagnosis, x-axis is plotted versus the VAF in the xenograft model, y-axis. (B) Structural RNA variants. VAF at time of diagnosis, x-axis is plotted versus the VAF in the xenograft model, y-axis. PDX, patient-derived xenograft.
Waterfall graph for single nucleotide variants (SNVs) and structural variants (StVs) detected in B-cell acute lymphoblastic leukemia samples. Genes with either a coding SNV or StV were plotted (y-axis) per sample (x-axis). Mutations are colored based on type.
Comparison of variant allele frequencies of structural variants between primary bone marrow samples (x-axis) and matched xenograft samples (y-axis). (A) The variant allele frequencies for all gene fusions were plotted between the primary and xenograft model (R² = 0.7634). (B) The variant allele frequencies for all retained introns were plotted between the primary and xenograft model (R² = 0.2906). (C) The variant allele frequencies for all exon deletions were plotted between the primary and xenograft model (R² = 0.0078). (D) The variant allele frequencies for all exon duplications were plotted between the primary and xenograft model (R² = 0.0118).
Summary of leukemic samples utilized
Patient characteristic | AML | ALL |
---|---|---|
No. | 5 | 20 |
Age (yr), median (range) | 10 (1.5-14) | 5.5 (1-16) |
Sex | ||
Male | 40 | 55 |
Female | 60 | 45 |
Race | ||
Caucasian | 60 | 35 |
African American | 0 | 25 |
Hispanic | 20 | 20 |
Samples collected at diagnosis | 80 | 95 |
Cytogenetically normal (by karyotype analysis) | 0 | 55 |
Bone marrow origin | 100 | 90 |
Peripheral blood origin | 0 | 10 |
Average leukemic blast percentage | 76 | 78 |
Values are presented as percentage unless otherwise indicated.
AML, acute myeloid leukemia; ALL, acute lymphoblastic leukemia.