An integrated tumor, immune and microbiome atlas of colon cancer

Sep 01, 2023

Nature Medicine volume 29, pages 1273–1286 (2023)

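The lack of multi-omics cancer datasets with extensive follow-up information hinders the identification of accurate biomarkers of clinical outcome. In this cohort study, we performed comprehensive genomic analyses on fresh-frozen samples from 348 patients with primary colon cancer, encompassing RNA, whole-exome, deep T cell receptor and 16S bacterial rRNA gene sequencing of tumor and matched healthy colon tissue, complemented with tumor whole-genome sequencing for further microbiome characterization. A type 1 helper T cell, cytotoxic gene expression signature, called the Immunologic Constant of Rejection (ICR), captured the presence of clonally expanded, tumor-enriched T cell clones and outperformed conventional prognostic molecular biomarkers, such as the consensus molecular subtype and microsatellite instability classifications. Quantification of genetic immunoediting, defined as a lower number of neoantigens than expected, further refined its prognostic value. We identified a microbiome signature, driven by Ruminococcus bromii, associated with a favorable outcome. By combining the microbiome signature and the Immunologic Constant of Rejection, we developed and validated a composite score (mICRoScore) that identifies a group of patients with excellent survival probability. The publicly available multi-omics dataset provides a resource for better understanding colon cancer biology that could facilitate the discovery of personalized therapeutic approaches.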

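Although a substantial amount of research has been devoted to biomarkers for primary colon cancer, current clinical guidelines in the United States and Europe (including those of the National Comprehensive Cancer Network and the European Society for Medical Oncology) rely only on tumor-node-metastasis staging and the detection of DNA mismatch repair (MMR) deficiency or microsatellite instability (MSI), in addition to standard clinicopathological variables, to determine treatment recommendations1,2. MSI is caused by somatic or germline defects in MMR genes and leads to the accumulation of somatic mutations and neoantigens, resulting in immune recognition and a high density of tumor-infiltrating lymphocytes3.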

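The strength of the in situ adaptive immune response, captured for example by assessing the density and spatial distribution of T cells (Immunoscore), is associated with a reduced risk of relapse and death independently of other clinicopathological variables, including MSI status4,5.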

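However, despite the overwhelming evidence for the prognostic effect of Immunoscore and other immune-related parameters in colon cancer6,7, the research community has noted a lack of association between gene expression-based estimates of immune response and patient survival in The Cancer Genome Atlas (TCGA) colon adenocarcinoma (COAD) cohort8,9,10. TCGA represents an unparalleled dataset for omics analyses because of the richness and curation of its genomic data. However, collecting comprehensive clinical data, including survival outcomes, was neither a primary goal of TCGA nor a practical possibility given its worldwide scope and time constraints11. Consequently, the limited patient follow-up data associated with TCGA-COAD and other TCGA datasets have hampered statistically rigorous survival analyses11. In addition, TCGA did not include dedicated assays for T cell receptor (TCR) repertoire analysis or microbiome characterization; these were performed later using bulk DNA and RNA sequencing (RNA-seq) data, and only a small number of healthy solid tissue (for example, healthy colon) samples were included12,13. Moreover, because TCGA initially focused on cataloging the genomic and molecular alterations occurring in cancer cells, sample inclusion criteria based on stringent tumor purity thresholds were imposed, potentially biasing the population toward tumor specimens with low immune or stromal infiltration.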

[…] >0.1% in the tumor, which are at least 32 times higher in the tumor compared to normal) are highlighted. i, Correlation of proportion of tumor-enriched T cell clones in the tumor (in percent) with ICR score. Pearson's r and P value of the correlation are indicated in the plot. All P values are two-sided.

[…] >12 per Mb. Overall P value is calculated by log-rank test. c, Scatter-plot of ICR score by genetic immunoediting (GIE) value for ICR-high and ICR-low samples. Number of samples in each quadrant is indicated in the graph. Gray area delineates ICR scores from 5–9. d, Kaplan–Meier for OS by IES. Censor points are indicated by vertical lines and corresponding table of number of patients at risk in each group is included below the Kaplan–Meier plot. Overall P value is calculated by log-rank test. e, Violin plot of IES by productive TCR clonality (immunoSEQ) (left) and MiXCR-derived TCR clonality (right). Spearman correlation statistics are indicated above each plot. Significance within ICR low and high is indicated. Center line, box limits and whiskers represent the median, interquartile range and 1.5× interquartile range, respectively. P values are two-sided, n reflects the independent number of samples.

[…] > 2) (Fig. 5c and annotated in Supplementary Table 5). No major difference in α diversity (the variety and abundance of species within an individual sample) was observed between tumor and healthy samples (Extended Data Fig. 7b) and only a modestly reduced microbial diversity was observed in ICR-high versus ICR-low tumors (Extended Data Fig. 7b). Selenomonas and Selenomonas 3 were the taxa most significantly increased in ICR-high versus -low tumors (Fig. 5e, Extended Data Fig. 7c and Supplementary Table 6). In terms of survival analysis, the highest number of nominally significant associations was obtained using tumor data (rather than healthy colon data) and OS as the end point (Extended Data Fig. 7d and Supplementary Table 7).

[…] >20-fold coverage of at least 99% of targeted exons and >70-fold in at least 81% targeted exons. In healthy samples, sequencing achieved >20-fold coverage of at least 94% of targeted exons and >30-fold in at least 84% targeted exons. Adaptor trimming was performed using the tool trimadap (v.0.1.3). ConPair was run to evaluate concordance and estimate contamination between matched tumor–normal pairs. In eight of the pairs a mismatch was detected and for five pairs, a potential contamination was indicated. HLA typing data were used to validate these results. All potential mismatches and contaminations were excluded, retaining 281 patients for data analysis.

[…] >2 µg) and sample selection was exclusively based on DNA availability. TCR sequencing was performed using extracted DNA of 114 primary tissue samples and ten matched healthy colon tissues with sufficient DNA available.

[…] >0.1% were defined as tumor-enriched sequences, as previously implemented by Beausang et al.75. The fraction of tumor-enriched TCR sequences in the tumor was calculated by dividing the number of productive templates of tumor-enriched sequences by the total number of productive templates per tumor sample. Pearson's correlation coefficient between the fraction tumor-enriched TCR sequences and ICR score was calculated.

[…] >1% in the general population. After these technical exclusion criteria, biological filters were applied, including selection of nonsynonymous mutations (frame shift deletions, frame shift insertions, inframe deletions, inframe insertions, missense mutations, nonsense mutations, nonstop mutations, splice site and translation start site mutations). The resulting number of variants/mutations per Mb (capture size is 40 Mb) per sample is referred to as the nonsynonymous TMB.
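The nonsynonymous TMB definition above (retained nonsynonymous variants divided by the 40 Mb capture size) can be illustrated with a minimal sketch. It assumes the variants have already passed the technical filters; the `(sample_id, variant_class)` record layout and the MAF-style class labels are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

CAPTURE_SIZE_MB = 40  # capture size stated in the text

# Hypothetical MAF-style labels for the nonsynonymous classes listed in the text.
NONSYNONYMOUS = {
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
    "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
    "Splice_Site", "Translation_Start_Site",
}


def nonsynonymous_tmb(variants, capture_size_mb=CAPTURE_SIZE_MB):
    """Return {sample_id: nonsynonymous mutations per Mb}.

    `variants` is an iterable of (sample_id, variant_class) pairs that
    are assumed to have passed the technical exclusion criteria already.
    """
    counts = Counter()
    for sample_id, variant_class in variants:
        # Register every sample, so samples without nonsynonymous calls get 0.
        counts[sample_id] += 1 if variant_class in NONSYNONYMOUS else 0
    return {sample: n / capture_size_mb for sample, n in counts.items()}


# Toy input (made-up records, for illustration only).
toy_variants = [
    ("P1", "Missense_Mutation"),
    ("P1", "Silent"),
    ("P1", "Nonsense_Mutation"),
    ("P2", "Silent"),
]
tmb = nonsynonymous_tmb(toy_variants)
```

In this toy input, sample P1 carries two nonsynonymous variants, so its TMB is 2 / 40 per Mb, while P2 has none.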
Next, to identify most frequently mutated genes in our cohort that might play a role in cancer, we excluded variants that are predicted to be tolerated according to SIFT annotation or benign according to PolyPhen (polymorphism phenotyping). Finally, all artifact genes, which are typically encountered as bystander mutations in cancer that are mutated for example as a consequence of a high homology of sequences in the gene, were excluded76. The OncoPlot function from ComplexHeatmap (v.2.1.2) was used to visualize the most frequent somatic mutations.

[…] >5% of the tumor samples) with frequencies detected in previously published datasets containing colon cancer samples (TCGA-COAD and NHS-HPFS) as well as reported cancer driver genes32 or colon oncogenic mediators38. First, we extracted genes with a nonsynonymous mutation frequency >5% in the AC-ICAM cohort. Subsequently, only genes that are likely involved in cancer development, as described in the section ‘Cancer-related gene annotation’, were retained. All artifact genes (mutations typically encountered as bystander mutations in cancer that are mutated for example as a consequence of a high homology of sequences in the gene), were excluded. Genes that have previously been reported as colon cancer oncogenic mediator38 or cancer driver gene for colorectal cancer (COADREAD)32 were also excluded. Finally, only genes with a mutation frequency <5% in the NHS-HPFS colon cancer cohort37 and <5% in TCGA-COAD36 were maintained. As a final filter, only genes that had a nonsynonymous mutation frequency of at least twofold in AC-ICAM compared to TCGA-COAD were labeled as potentially new in colon cancer.

[…] > 0.4) or MSS (MANTIS score ≤ 0.4).

[…] > 500 nM, were used as criteria to infer neoantigens. Predicted neoantigens were used to calculate the GIE value. We calculated the GIE value by taking the ratio between the number of observed versus the number of expected neoantigens.
The expected number of neoantigens was based on the assumption of a linearity between TMB and the number of neoantigens. We therefore assumed that samples that have a lower frequency of neoantigens than expected (lower GIE values), display evidence of immunoediting. A higher frequency of neoantigens than expected indicates a lack of immunoediting, see calculations section for details.

[…] >60× coverage per sample. The median (across samples) of the average target coverage (per sample) was 76× (range of 50–92).

[…] > ±0.3. Clusters among the networks (groups of at least three correlated genera using the cutoffs specified above) were defined via a fast greedy clustering algorithm. All co-occurrence networks were made using the R package ‘NetCoMI (v.1.1.0) – Network Construction and Comparison for Microbiome Data’84 and visualized using Cytoscape (v.3.9.1).

[…] >0) and ‘low-risk’ (<0) groups as performed in the training set. Therefore, no cutoff optimization occurred in the validation phase.

[…] >2 μg). Securing additional funds allowed us to perform WGS and 16S rRNA sequencing and to expand the WES and TCR analyses to any sample with sufficient DNA available. No specific power calculation was performed at that time and the targeted sample size was based on the estimated number of samples that could be retrieved from LUMC (n = 400), which compared favorably with the sample size of similar studies in the field.

[…] >90% to detect a 10% mutational frequency in 90% of genes86.

[…] >80% for an HR of 0.5 with a two-sided α of 0.05. With 154 OS events in the whole cohort, our study has a power of 90% for an HR of 0.59 (assuming two groups of equal size) and a power of 90% for an HR of 0.57 (assuming groups with unequal sample size, 2:1) with a two-sided α of 0.05.
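The power figures quoted for 154 OS events can be sanity-checked with a short sketch. The text does not say which method the authors used; the code below assumes Schoenfeld's approximation for a two-sided log-rank test, and the function name and parameters are illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def schoenfeld_power(events, hazard_ratio, frac_group1=0.5, alpha=0.05):
    """Approximate power of a two-sided log-rank test.

    Uses Schoenfeld's approximation (an assumption of this sketch):
    power = Phi( sqrt(events * p1 * p2) * |ln(HR)| - z_{1 - alpha/2} ),
    where p1 and p2 are the fractions of patients in the two groups.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    p1, p2 = frac_group1, 1 - frac_group1
    z_effect = sqrt(events * p1 * p2) * abs(log(hazard_ratio))
    return normal.cdf(z_effect - z_alpha)


# 154 OS events, as stated in the text.
power_equal = schoenfeld_power(154, 0.59)                       # equal group sizes
power_unequal = schoenfeld_power(154, 0.57, frac_group1=2 / 3)  # 2:1 allocation
print(f"equal groups: {power_equal:.3f}, 2:1 groups: {power_unequal:.3f}")
```

Under this approximation both scenarios evaluate to a power of roughly 0.9, in line with the values stated above.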

[…] >0.1% in the tumor, that are at least 32 times more abundant in the tumor compared to the normal.

[…] >12/Mb) versus Low (<12/Mb) TMB. b, Same as a, but only including ICR Medium. c, Kaplan–Meier curves for OS by GIE status. d, Same as c in ICR Medium patients. Overall P value is calculated by log-rank test and P value corresponding to HR is calculated using Cox proportional hazard regression (a–d). e, Stacked bar charts of mutational load category (top) and MSI status (bottom) per IES. f, Kaplan–Meier curves for OS (left) and PFS (right) stratified by AJCC pathological stage (I, II, III) within IES4. Stratification was not performed for stage IV due to the limited number (n = 2). g, Stacked bar chart of distribution of AJCC Pathological Tumor Stage by IES. h, Multivariate Cox proportional hazards model for OS including IES (ordinal, IES1, IES2, IES3, IES4) and AJCC Pathological Tumor Stage (ordinal, Stage I, II, III, IV). P values corresponding to HR calculated by Cox proportional hazard regression analysis are indicated. i, Violin plot represents TCR clonality as determined by MiXCR in ICR Medium samples. Center line, box limits, and whiskers represent the median, interquartile range and 1.5× interquartile range, respectively. P value calculated by unpaired, two-sided t-test. j, Results of the multiple linear regression model showing the respective contributions of productive TCR clonality (X1) and (X2) for prediction of IES (Y). Corresponding significance of the effects are indicated in the scatter-plots (left). k, Local Polynomial Regression Fitting of productive TCR clonality by IES (ordinal variable). The gray band reflects the 95% confidence interval for predictions of the local polynomial regression model. All P values are two-sided; n reflects the independent number of samples in all panels. Overall Survival (OS). Tumor Mutational Burden (TMB). Genetic Immunoediting (GIE). ImmunoEditing Score (IES).

[…] > 0).
d, Concordance index of optimal multivariate Cox regression model per dataset. The cross-validation performance highlights the mean concordance of 10 different folds with the optimal hyperparameters (gamma and lambda), that is, the same parameters as the optimal model. e, Forest plot with HR (center), corresponding 95% confidence intervals (error bars), and P value calculated by Cox proportional hazard regression analysis for OS, using: 1) the 16S MBR score in AC-ICAM, 2) WGS R. bromii abundance, 3) PCR-based R. bromii abundance, 4) 16S Ruminococcus 2 relative abundance and 5) MBR score calculated using WGS data. f, Heat map of Spearman correlation between the relative abundance of the MBR classifier taxa in tumor samples and immune traits. Only correlations with an FDR > 0.1 are visualized. An additional row is added for Ruminococcus 2 showing all correlations, unfiltered for FDR. * The taxonomical order is indicated between brackets, as family was unassigned. g, Kaplan–Meier curve for PFS in AC-ICAM, with all patients stratified by mICRoScore High vs Low. HR and P value are calculated using Cox proportional hazard regression. h, AJCC pathological stage within the mICRoScore High group in AC-ICAM and within TCGA-COAD. i, Kaplan–Meier curve for PFS in AC-ICAM, with all patients with ICR High stratified by mICRoScore. Overall P value is calculated by log-rank test and P value corresponding to HR is calculated using Cox proportional hazard regression. Overall Survival (OS), Progression-Free Survival (PFS). All P values are two-sided; n reflects the independent number of samples in all panels.