The pivotal study of perioperative toripalimab in resectable non-small-cell lung cancer enrolled a population that differed in histological subtype from those of studies of other immunotherapy regimens, raising doubts about its therapeutic positioning. The aim of this study was to methodologically interpret the subgroup analysis by histological subtype of perioperative toripalimab in resectable non-small-cell lung cancer.
Methods
A validated tool for assessing the applicability of subgroup analyses was used. The tool has two parts: preliminary questions that directly rule out analyses lacking the minimum relevant conditions, and a checklist. The checklist assesses the statistical association, biological plausibility and consistency of subgroup results, and relates these criteria to recommendations on applicability.
Results
The preliminary question on differences in effect between subgroups (interaction p-value, p(i) < 0.1) was answered negatively, so the checklist was not applied because the analysis was directly discarded. Even if the checklist had been applied, the statistical association criterion would have been rated ‘nil’ owing to the absence of statistically significant differences. Biological plausibility would have been rated ‘probable’ because non-squamous histology is a negative prognostic factor. Consistency would have been rated ‘nil’ owing to the absence of heterogeneity between subgroups in similar studies.
Conclusions
This methodological interpretation recommended against applying histology-based subgroup results for perioperative toripalimab in resectable non-small-cell lung cancer, thereby avoiding ruling out the use of toripalimab in the non-squamous subgroup.
Adjuvant chemotherapy after surgical resection has long been the standard of care for resectable non-small-cell lung cancer (NSCLC).1 In this setting, the emergence of immunotherapy marked a significant advance. For example, the use of adjuvant pembrolizumab was shown to increase disease-free survival in patients with resectable NSCLC at stages IB–IIIA.2
Subsequently, the use of perioperative immunotherapy in resectable NSCLC was evaluated. The perioperative approach involves using drugs both before and after surgery. Several randomised phase III clinical trials have analysed the results of immune checkpoint inhibitor regimens in this clinical setting.3–6 These regimens offer notable benefits, as they are associated with increased event-free survival.
Studies are needed that indirectly compare the different perioperative immunotherapy regimens in order to clarify their relative therapeutic position in resectable NSCLC.7 In addition, analyses are needed to determine whether these clinical trials have comparable characteristics and populations or whether such indirect comparisons should be rejected.8 Evaluation of the different therapies must take into account biomarkers, including programmed death-ligand 1 (PD-L1), as well as patient characteristics such as lymph node count and disease extent. The efficacy of a given treatment may differ from that of others, for example, due to different PD-L1 expression levels or disease stage.
The Neotorch clinical trial evaluated the perioperative use of toripalimab in resectable NSCLC.6 The patient population in this clinical trial differed from those enrolled in studies of other perioperative immunotherapy regimens: 78% of patients in Neotorch had squamous histology, compared with 43% to 50% in the other studies. Given that histological subtype can influence disease prognosis, this difference could introduce bias when comparing different perioperative immunotherapy combinations in resectable NSCLC.9 Further analysis of the results according to histological subtype in this clinical setting could help to determine the most appropriate therapeutic approach. This would prevent the misinterpretation of subgroup results from adversely influencing clinical decision-making.
Subgroup analysis evaluates data obtained from a healthcare intervention across different subpopulations according to a given factor.10 This approach must be applied with caution, as it inherently increases uncertainty.11 On the one hand, multiple comparisons increase the likelihood of detecting differences that are not truly present (Type I error). On the other hand, the redistribution of patients into smaller subgroups reduces statistical power and can prevent the detection of real differences (Type II error). Therefore, subgroup analyses should be interpreted systematically before making clinical decisions or determining therapeutic approaches.
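The Type I error inflation described above can be illustrated with a short calculation. This is a minimal sketch, assuming independent tests and a per-test significance level of 0.05; neither assumption comes from the Neotorch trial, and the function name `familywise_error_rate` is purely illustrative.

```python
def familywise_error_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false-positive finding among n_tests
    independent comparisons, each tested at significance level alpha."""
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    # Complement of "no false positive in any of the n_tests comparisons"
    return 1.0 - (1.0 - alpha) ** n_tests


# Illustrative numbers of subgroup factors examined in one trial
for k in (1, 5, 10, 20):
    print(f"{k:>2} subgroup tests: P(at least one false positive) = "
          f"{familywise_error_rate(k):.3f}")
```

Under these assumptions, the chance of at least one spurious subgroup difference grows quickly with the number of factors examined, which is one reason a systematic interpretation is needed.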
The objective of this study was to methodologically interpret the subgroup analysis by histological subtype of perioperative toripalimab in resectable NSCLC in the Neotorch clinical trial.
Methods
Using a validated tool developed by Gil-Sierra et al., we systematically interpreted the subgroup analysis by histological subtype in order to enhance the clinical applicability of the results.12 The tool comprises 2 parts: 4 preliminary questions to determine whether the subgroup analysis meets the minimum relevant conditions, followed by a checklist.
The preliminary questions address the level of evidence of the study with subgroup analysis, the clinical relevance of the variable evaluated, the difference in effect between subgroups, and the presence of the factor determining the analysis prior to the healthcare intervention. Progression to the checklist requires that all preliminary questions be satisfied: failure to satisfy any of them results in exclusion of the subgroup analysis. This approach provides a rapid way to avoid applying a subgroup analysis that does not meet the minimum relevant conditions.
The second part of the tool (the checklist) is used to evaluate a set of criteria, including the statistical evidence supporting subgroup effects, the biological plausibility of differences between subgroups, and the consistency of findings with similar studies. The statistical evidence criterion was subdivided into assessment of the interaction p-value, which estimates the probability that observed differences between subgroups are attributable to chance, and consideration of the prespecification of the subgroup analysis, sample size, the number of factors evaluated, and the overall study outcome. Where the interaction p-value was not reported, it was calculated using the subgroup calculator described by Primo et al.13
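As an illustration of how an interaction p-value can be derived from published subgroup results, the sketch below applies a standard test of interaction to two hazard ratios and their 95% confidence intervals. It assumes that log(HR) is approximately normally distributed and that the two subgroups are independent. Whether the calculator by Primo et al. follows exactly this procedure is an assumption; the function names are illustrative, and the input values are those listed in Table 1.

```python
import math

Z_95 = 1.959964  # standard normal quantile behind a two-sided 95% CI


def se_log_hr(ci_low: float, ci_high: float) -> float:
    """Standard error of log(HR), recovered from the 95% CI limits."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)


def interaction_p_value(hr_a, ci_a, hr_b, ci_b) -> float:
    """Two-sided p-value for the difference between two log hazard ratios."""
    diff = math.log(hr_a) - math.log(hr_b)
    se_diff = math.hypot(se_log_hr(*ci_a), se_log_hr(*ci_b))
    z = diff / se_diff
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))


# (squamous HR, 95% CI), (non-squamous HR, 95% CI), as listed in Table 1
TABLE_1 = {
    "durvalumab": ((0.71, (0.49, 1.03)), (0.69, (0.48, 0.99))),
    "pembrolizumab": ((0.57, (0.41, 0.77)), (0.58, (0.43, 0.78))),
    "nivolumab": ((0.46, (0.30, 0.72)), (0.72, (0.49, 1.07))),
    "toripalimab": ((0.35, (0.23, 0.52)), (0.54, (0.26, 1.08))),
}

for agent, ((hr_sq, ci_sq), (hr_nsq, ci_nsq)) in TABLE_1.items():
    p = interaction_p_value(hr_sq, ci_sq, hr_nsq, ci_nsq)
    print(f"{agent}: interaction p = {p:.2f}")
```

Under these assumptions the recomputed values fall close to the interaction p-values in Table 1; small gaps are expected because the published hazard ratios and confidence limits are rounded to two decimals.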
For each criterion, the checklist was used to assign an association score reflecting the strength of evidence: probable (+3 points), possible (+2), doubtful (0), and nil (−3). The tool itself includes a guide containing clarifications and comments to facilitate the assignment of scores when evaluating criteria. The total score can be used to guide recommendations on the use of subgroup data in clinical decision-making. Higher scores indicate greater reliability: probable (7–9 points), supporting the application of subgroup results until another confirmatory randomised clinical trial is published; possible (5–6 points), allowing cautious use when justified by factors such as low drug tolerance, difficulty of administration, or cost; doubtful (3–4 points), generally precluding application except in exceptional cases; and nil (<3 points), indicating that the subgroup analysis should not be applied.
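The two-step logic of the tool, as described in this section, can be sketched as follows. This is a simplified illustration rather than the published tool: the function and key names are hypothetical, the guide's clarifications are not modelled, and only the point values and score bands quoted above are encoded.

```python
# Points per criterion rating, as quoted in the text
# (the lowest rating is written as either "nil" or "none")
CRITERION_POINTS = {"probable": 3, "possible": 2, "doubtful": 0, "nil": -3}


def recommendation(total: int) -> str:
    """Map a total checklist score to its applicability category."""
    if total >= 7:
        return "probable"
    if total >= 5:
        return "possible"
    if total >= 3:
        return "doubtful"
    return "nil"


def assess_subgroup_analysis(preliminary: dict, ratings: dict) -> dict:
    """Apply the preliminary gate, then score the checklist criteria."""
    total = sum(CRITERION_POINTS[rating] for rating in ratings.values())
    if not all(preliminary.values()):
        # Any unmet preliminary question excludes the analysis directly;
        # the total is kept only as a hypothetical score.
        return {"checklist_applied": False,
                "hypothetical_total": total,
                "recommendation": "nil"}
    return {"checklist_applied": True,
            "total": total,
            "recommendation": recommendation(total)}


# Example inputs taken from the ratings reported for the Neotorch histology analysis
neotorch = assess_subgroup_analysis(
    preliminary={
        "highest_level_of_evidence": True,
        "clinically_relevant_variable": True,
        "difference_between_subgroups_p_below_0_1": False,
        "factor_present_before_intervention": True,
    },
    ratings={
        "statistical_association": "nil",
        "biological_plausibility": "probable",
        "consistency": "nil",
    },
)
print(neotorch)
```

Keeping the gate separate from the scoring mirrors the structure of the tool: an analysis that fails a preliminary question is excluded regardless of how the individual criteria would have scored.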
Results
Of the 4 preliminary questions on minimum conditions, 3 were satisfied. Subgroup analysis was conducted in the Neotorch phase III randomised clinical trial. Phase III trials of this type provide the highest level of scientific evidence. We selected event-free survival as the variable for the subgroup analysis due to its clinical relevance. This variable was defined as the time from randomisation to disease progression precluding surgery, postoperative progression, local or metastatic recurrence, or death. The histological subtype evaluated in the subgroup analysis was determined before drug administration.
The preliminary question on differences in treatment effect between subgroups was not satisfied, as the interaction was not statistically significant at the 0.1 level. In the Neotorch trial, the interaction p-value between subgroups defined by histology was 0.29.6 Therefore, the checklist was not applied due to direct exclusion.

Had the checklist been applied, the statistical association criterion would have obtained a nil score (−3 points) due to the absence of a statistically significant difference between the squamous and non-squamous histology subgroups. The criterion of biological plausibility would have received a probable score (+3 points), given that a hypothesis supported by previous literature could explain the observation. Previous studies have suggested that non-squamous histology is a factor for poor prognosis.9 This finding could be consistent with the hazard ratio (HR) and confidence interval (CI) for the non-squamous histology subgroup in the Neotorch trial (HR = 0.54, 95%CI: 0.26–1.08), which includes the null value.6

However, the consistency criterion would have been rated as nil (−3 points). Previous studies evaluating the use of perioperative immunotherapy found no statistically significant interaction between the squamous and non-squamous histology subgroups. Regarding event-free survival, the interaction p-value was 0.94 for perioperative pembrolizumab, 0.13 for nivolumab, and 0.91 for durvalumab (Table 1). These findings support rejecting the assumption of differences between the 2 subgroups, despite numerically larger hazard ratios observed in the non-squamous histology subgroups for perioperative toripalimab, pembrolizumab, and nivolumab. Table 2 summarises the interpretation of the subgroup analysis by histological subtype in the Neotorch study.
Table 1. Interaction p-values between histological subgroups for event-free survival, as used in the consistency assessment.
| Study (author and year) | Immunotherapeutic agent used in the perioperative setting | EFS value in the squamous histology subgroup (HR and CI) | EFS value in the non-squamous histology subgroup (HR and CI) | Interaction p-value |
|---|---|---|---|---|
| Heymach et al. (2023) | Durvalumab | 0.71 (95%CI: 0.49–1.03) | 0.69 (95%CI: 0.48–0.99) | 0.91 |
| Wakelee et al. (2023) | Pembrolizumab | 0.57 (95%CI: 0.41–0.77) | 0.58 (95%CI: 0.43–0.78) | 0.94 |
| Cascone et al. (2024) | Nivolumab | 0.46 (95%CI: 0.30–0.72) | 0.72 (95%CI: 0.49–1.07) | 0.13 |
| Lu et al. (2024) | Toripalimab | 0.35 (95%CI: 0.23–0.52) | 0.54 (95%CI: 0.26–1.08) | 0.29 |
CI, confidence interval; EFS, event-free survival; HR, hazard ratio.
Table 2. Summarised interpretation of the subgroup analysis by histological subtype in the Neotorch study.
| Methodology | Criteria | | Variable: event-free survival |
|---|---|---|---|
| Validated tool (Gil-Sierra et al.)12 | Preliminary questions | Highest level of evidence in the study with subgroup analysis | Yes |
| | | Clear clinical relevance of the variable analysed or of the primary surrogate variable | Yes |
| | | Difference in effect between subgroups for the factor analysed (p < 0.1) | No |
| | | Factor determining the subgroup analysis prior to healthcare intervention | Yes |
| | Checklist | | Not applied |
| | | Statistical association (score) | Nil (−3 points) |
| | | Biological plausibility (score) | Probable (+3 points) |
| | | Consistency (score) | Nil (−3 points) |
| | | Recommendation for application (total score) | ‘Nil’ due to direct exclusion from preliminary questions: ‘do not consider subgroups’ |
This methodological interpretation suggests that histology-based subgroup results for perioperative toripalimab in resectable NSCLC should not be used in place of the overall results. This conclusion was reached through a systematic methodological assessment of the results from the squamous and non-squamous histology populations in the Neotorch trial, contextualised with the findings of other studies of perioperative immunotherapies in this setting.3–6,12
In the squamous subgroup, toripalimab showed a confidence interval that did not cross the null value and a numerically lower hazard ratio than in the non-squamous subgroup. However, no interaction between histological subtypes was found in the Neotorch study.6 Perioperative use of pembrolizumab and nivolumab in resectable NSCLC also showed numerically larger effects in the squamous subgroup than in the non-squamous subgroup, but these differences did not reach statistical significance. Durvalumab showed the opposite pattern, with a numerically lower hazard ratio in the non-squamous subgroup (0.69 vs 0.71), although this difference did not reach statistical significance either. Therefore, the consistent lack of heterogeneity between these subgroups for immunotherapies with similar mechanisms of action and clinical settings suggests that histology-based subgroup analysis should not be applied, based on current evidence. In addition, it should be noted that the low representation of patients with non-squamous histology in each arm (N = 45) of the Neotorch study further limits the applicability of subgroup analysis.
The main strength of the present study is the methodological assessment of subgroup analysis, which avoids interpretations based on unjustified a priori assumptions. Another strength of our study is that systematic evaluation of subgroup analyses in clinical trials is uncommon in the literature. A limitation inherent to subgroup results is the lack of conclusive findings from the clinical trials on which our research is based. A further limitation is that the perioperative immunotherapy regimen used in the Neotorch trial6 differed from the others: it consisted of 3 neoadjuvant cycles of toripalimab plus chemotherapy prior to surgery, followed by an additional adjuvant cycle of toripalimab with chemotherapy, followed by toripalimab monotherapy. In contrast, the other perioperative immunotherapy regimens included 4 neoadjuvant cycles of immunotherapy plus chemotherapy prior to surgery, followed by the immunotherapy agent as adjuvant monotherapy.
The subgroup analysis approach requires the systematic interpretation of a set of relevant statistical and clinical criteria. Previous literature recommends which subgroup characteristics should be evaluated, such as a plausible explanation for the observed effect of the healthcare intervention in specific subpopulations or an appropriate temporal sequence between the factor studied and the drug administered.10,12 In fact, previous literature has emphasised the lack of credibility in conducting subgroup analyses based on irrelevant characteristics, such as zodiac sign. It should be remembered that the exploratory nature of subgroup analyses can generate numerous hypotheses that lack a sound pathophysiological basis.
Inadequate interpretation of subgroup analysis by histological subtype for perioperative treatment in resectable NSCLC could negatively affect therapeutic decisions, leading to consequences similar to those observed in other subgroup analyses in oncohaematology.14 In this context, the use of toripalimab in the non-squamous subgroup could be mistakenly ruled out. This situation would reduce the number of therapeutic alternatives for perioperative treatment of resectable NSCLC. Consequently, price competition would be reduced, making it difficult to acquire drugs at lower prices and to make optimal therapeutic decisions. Therefore, our findings may help prevent the selective application of histology-based subgroup results, influenced by particular interests, in perioperative immunotherapy for resectable NSCLC.
Contribution to the scientific literature
A methodological analysis of histology-based subgroups for perioperative immunotherapy in resectable non-small-cell lung cancer. The results may have implications for the therapeutic positioning of perioperative immunotherapy.
Ethical responsibilities
All the necessary ethical responsibilities regarding authorship and redundant publication were fulfilled. Ethical approval regarding the protection of human and animal subjects, as well as informed consent, was not required due to the study design.
CRediT authorship contribution statement
Manuel David Gil-Sierra: Writing – review & editing, Writing – original draft, Visualisation, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Elisa Pizarro-Barron: Writing – review & editing, Writing – original draft, Visualisation, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Maria del Pilar Briceño-Casado: Writing – review & editing, Writing – original draft, Visualisation, Validation, Supervision, Software, Resources, Project administration, Methodology, Investigation, Formal analysis, Data curation, Conceptualization.
Funding
None declared.
None declared.





