Clinical Trial Endpoints
To gain regulatory approval, Phase III trials need to evaluate accepted clinical endpoints, which directly measure how a patient feels, functions, or survives. Survival is not a practical endpoint in the development of chronic CF drugs because these trials would take years and could enroll only patients with the most severe lung disease. A surrogate endpoint is a laboratory measure or a physical sign that can substitute for a clinically meaningful endpoint(24). Surrogate endpoints may be used if they are strongly associated with direct clinical endpoints and have been validated. Lung function is a surrogate endpoint accepted by regulatory agencies based on the relationship between lung function and mortality(25). However, how changes in lung function are measured will influence study design. Most CF drugs have been approved on the basis of an acute improvement in forced expiratory volume in 1 second (FEV1), evaluated after weeks to months. While an acute improvement in FEV1 may be beneficial to people with CF, a more meaningful goal of chronic CF therapies would be to slow the decline in lung function. This has been shown for a few CF therapies(26, 27), but these studies can take years to detect a meaningful slowing of lung function decline. Longer studies may require fewer patients, but such studies are also difficult to manage and can become very costly. In the era of highly effective modulator therapy (HEMT), the number of patients required in any study in CF may increase significantly as health outcomes improve(28, 29).
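The link between the size of the expected treatment effect and the number of patients required can be illustrated with a standard normal-approximation sample size formula for comparing two group means. The sketch below is illustrative only: the function name `patients_per_arm`, the assumed standard deviation, and the assumed treatment effects are hypothetical values chosen for the example and are not drawn from any CF trial.

```python
from math import ceil
from statistics import NormalDist


def patients_per_arm(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-arm comparison of means.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / delta)^2
    for a two-sided test. `delta` is the expected difference between arms and
    `sd` is the common standard deviation, both in the same units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical inputs: an assumed SD of 8 percentage points of FEV1 percent
# predicted, and two assumed treatment effects (4 points versus 2 points).
larger_effect = patients_per_arm(delta=4, sd=8)
smaller_effect = patients_per_arm(delta=2, sd=8)
print(larger_effect, smaller_effect)
```

Under these assumptions, halving the expected treatment effect roughly quadruples the required enrollment, which is one way to see why studies may need more participants as baseline health improves and the achievable gains shrink.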
Deciding between evaluating acute improvements in FEV1 or slowing the chronic decline of lung function may depend on both the intervention being studied and the patient population of interest. Patients with mild disease may not be able to demonstrate acute improvements in FEV1 (the so-called ceiling effect); these patients may also experience the fastest decline in lung function(30). Thus, drugs designed to address early lung disease or prevent the development of more severe disease, such as anti-inflammatory therapies, may be better suited to longer studies in healthier patients. More recent studies have focused on acute improvements in FEV1 and have then been followed by prolonged open-label, observational studies (see more details in Phase IV below), but as overall lung function improves, demonstrating acute improvements in FEV1 may become more difficult(29).
Reduction in the risk or rate of pulmonary exacerbations would seem to be an attractive clinical endpoint to demonstrate efficacy of a new drug because exacerbations directly affect how a patient feels, functions, and, in people with severe lung disease, survives(31, 32). However, pulmonary exacerbations have proven problematic as an endpoint for several reasons(33). First, there is no widely accepted and validated prospective definition. Several definitions have been put forward and used in clinical studies(6, 34, 35); the FDA has seemed to prefer a version of the Fuchs criteria first used in the pivotal study of dornase alfa. Second, there are several methods for determining changes in pulmonary exacerbations: time to the first pulmonary exacerbation, frequency of pulmonary exacerbations, total number of pulmonary exacerbations, etc. Time to the first pulmonary exacerbation requires the smallest number of participants, but may not be considered an acceptable outcome for registration studies by regulatory agencies. Alternative measures of pulmonary exacerbation require larger numbers of patients; this can be mitigated by enriching a study population for patients who have a higher risk of pulmonary exacerbations, but these patients may not be the ideal population for the drug, and such a study could lack generalizability to the wider CF population.
Patient-reported outcomes (PROs) that directly measure a patient's health or quality of life without interpretation by medical professionals may be validated as surrogate endpoints(36). There are several CF-related PRO tools (e.g., CFQ-R, CFRSD)(37, 38), and improvement in symptoms using the CFQ-R was the endpoint for the study that led to the FDA approval of inhaled aztreonam (Cayston®)(39). Growth has been used in the past as a validated surrogate endpoint, though not in any recent drug trials(40). As CFTR modulator studies expand to younger populations, treatment-associated effects on linear growth may become of interest(41).
The use of surrogate endpoints such as PROs may present difficulties with study interpretation(24). Outside of CF, pivotal studies may be so large that even small clinical differences may be statistically significant. The distinction between statistical and clinical significance may be less important in pediatric and/or orphan disease studies, because these are typically smaller studies and any statistically significant results are likely to also be clinically meaningful. However, determining what difference is clinically meaningful is important in many situations. An example would be a non-inferiority study, which is designed to show that a new therapy is not unacceptably worse than current standard therapy. This situation may arise in CF when comparing a new drug within the same therapeutic class (e.g., CFTR modulator or new inhaled antibiotic) to an established drug when comparison with a placebo would not be acceptable(42). Clinician surveys have indicated that many are uncomfortable withdrawing efficacious medications(43).
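The logic of a non-inferiority comparison can be sketched in a few lines: the new therapy is declared non-inferior if the lower confidence bound for the difference (new minus standard) lies above the negative of a prespecified margin. The example below is a minimal sketch under stated assumptions: a continuous outcome where higher values are better, a normal approximation, and a hypothetical function `is_non_inferior` with hypothetical inputs. The margin, means, standard deviations, and sample sizes are invented for illustration and do not come from any study cited here.

```python
from math import sqrt
from statistics import NormalDist


def is_non_inferior(mean_new, mean_std, sd_new, sd_std, n_new, n_std,
                    margin, alpha=0.025):
    """Check non-inferiority for a continuous outcome where higher is better.

    Returns (decision, lower_bound), where lower_bound is the one-sided
    (1 - alpha) lower confidence bound for mean_new - mean_std under a
    normal approximation. Non-inferiority is concluded when the lower
    bound exceeds -margin.
    """
    diff = mean_new - mean_std
    se = sqrt(sd_new ** 2 / n_new + sd_std ** 2 / n_std)  # standard error of the difference
    z = NormalDist().inv_cdf(1 - alpha)
    lower_bound = diff - z * se
    return lower_bound > -margin, lower_bound


# Hypothetical scenario: the new drug is 1 point worse on average, and a
# loss of up to 3 points is assumed to be clinically acceptable.
small_trial = is_non_inferior(9.0, 10.0, 8.0, 8.0, 100, 100, margin=3.0)
large_trial = is_non_inferior(9.0, 10.0, 8.0, 8.0, 200, 200, margin=3.0)
print(small_trial, large_trial)
```

In this hypothetical scenario the same observed difference fails the non-inferiority criterion with 100 patients per arm but meets it with 200, because the confidence interval narrows as enrollment grows. The choice of margin is itself a judgment about what difference is clinically meaningful.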
Biomarkers may be used as surrogate endpoints and could potentially speed up drug development. Biomarkers require validation to show that they reflect the biologic activity of a therapy as well as the relationship with clinical outcomes(21). Biomarkers that have been explored in CF include those that would reflect changes in CFTR activity (e.g., sweat chloride, nasal potential difference, intestinal current measurement), infection (e.g., bacterial density, detection of CF pathogens), and inflammation (e.g., sputum neutrophil elastase activity, cell counts, cytokines, and serum CRP). In order to validate biomarkers as useful clinical surrogate endpoints, the relationship between changes in a biomarker (or clinically meaningful thresholds) and the likelihood of subsequent clinical benefit must be clearly understood(24). Otherwise, the use of biomarkers is limited because risks and benefits cannot be fully assessed.
Recently, the FDA has demonstrated willingness to expand the use of results from biomarker studies in the approval process for CFTR modulators. Theratyping, the process of matching medications to specific CFTR mutations based upon in vitro testing results, has been used to expand the indication for ivacaftor beyond the original mutations for which it was approved through the traditional regulatory approval pathway(44). In 2017, after ivacaftor had been approved by the FDA for people with at least one G551D mutation and deemed safe in people with CF, ivacaftor was approved for people with CF with one of 23 other residual function mutations based on how cells from people with CF responded to ivacaftor in laboratory experiments(45). Similarly, elexacaftor/tezacaftor/ivacaftor has been approved for all CFTR mutations that can be demonstrated to respond in laboratory experiments and has not been restricted to just those mutations present in clinical trial participants.
Identifying appropriate clinical trial endpoints in young children may be particularly difficult. Many endpoints that may be used in studies in adults occur less frequently (e.g., deaths, pulmonary exacerbations), have not been validated (e.g., age-appropriate PROs), or cannot be measured readily (e.g., lung function in young children). Pivotal Phase III studies have been pursued in older adolescents and adults in part to avoid these limitations. Once drugs have been approved for use in older children, the FDA has relied on safety data alone as the primary endpoint in clinical trials in younger children with CF(46-48). This has the added benefit of significantly reducing study size: pivotal trials of ivacaftor included 100 patients 12 years and above, but only 38 children ages 2-5 years(48, 49). However, this strategy may not allow for the detection of rare adverse events.