Scott Miller, PhD, is a biostatistician at Clinipace.

I recently had the pleasure of hosting the webcast “Navigating Regulatory Biostatistical Requirements Throughout the Clinical Trial Lifecycle” with my colleague Ron Marks, Clinipace CSO, director of biostatistics and cofounder. We reviewed common statistical issues during the trial planning, conduct, analysis and submission phases, and discussed potential approaches to address them.

As a result of the great turnout and many excellent follow-up questions, we decided to address a number of those questions in a series of blog posts. This week we are covering protocol deviations and adaptive trial designs; today, we focus on considerations for adaptive trial designs.

If you have any additional thoughts or related questions to ask our experts, please share your comments below.

Q: Should we choose different alpha spending functions when the purpose of the interim analysis is futility assessment versus efficacy assessment?

The short answer is yes. When you plan to do an interim analysis, there are a couple of different things you can do with it.

“Interim analysis” is the formal term for an early analysis of the trial data accumulated so far. In a standard trial, you determine the sample size, enroll that many subjects, and then, at the end of the trial, conduct your primary and secondary analyses. There are several reasons you may want to look at the primary analysis before the end of the trial. We talked about one in the previous discussion on adaptive trial designs: sample size re-estimation (SSR), where you examine the accumulating data to verify that it is consistent with the treatment effect estimates you used in your sample size calculations.
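
To make the SSR idea concrete, here is a minimal sketch of the underlying sample size arithmetic, assuming a two-sample z-test on means with 80% power and a two-sided 5% alpha; the effect sizes and standard deviation are illustrative, not from the webcast, and a real SSR design would also pre-specify how type I error is preserved.

```python
# Minimal sketch: per-arm sample size for a two-sample z-test, and how
# the number changes if the interim effect estimate is smaller than planned.
import math
from scipy.stats import norm

def n_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided two-sample z-test on means."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided critical value
    z_beta = norm.ppf(power)            # power quantile
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

# Planning assumption: treatment effect of 10, standard deviation 25.
print(n_per_arm(delta=10, sigma=25))   # 99 per arm

# If the interim estimate of the effect is smaller (say 8), keeping the
# original power would require a noticeably larger sample size.
print(n_per_arm(delta=8, sigma=25))    # 154 per arm
```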

Another reason to look at the trial data prior to the end of the study is to possibly stop the study early. One reason to stop early is that the data so far are not consistent with what you expected to see, and even if you continue enrolling subjects, you have a very small chance of obtaining a significant p-value at the end of the trial. That is called a futility analysis. The idea behind a futility analysis is not that you are saying the treatment doesn’t work; you’re saying, “I powered the study for a treatment effect difference of 10, but based on the data so far, I only have a 20% chance of ending up with a significant p-value at the end of the study.” This quantity is typically called conditional power, and all it says is, “I started the study with 80% power; now, given the data I have, my conditional power is 25%.” And you might conclude, “Realistically, I either need to increase my sample size or I need to pull the plug, because there’s no reason to keep going at this rate.”
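
To make conditional power concrete, here is a minimal sketch using the standard Brownian-motion formulation for a one-sided z-test, under the assumption that the effect trend observed at the interim continues; the interim values are illustrative.

```python
# Minimal sketch of conditional power under the "current trend" assumption.
from scipy.stats import norm

def conditional_power(z_interim, info_frac, alpha=0.025):
    """P(significant final test | interim z-statistic), assuming the
    drift estimated at information fraction t = info_frac continues."""
    z_final = norm.ppf(1 - alpha)        # final-look critical value
    t = info_frac
    drift = z_interim / t ** 0.5         # drift estimated from the interim data
    numerator = z_interim * t ** 0.5 + drift * (1 - t) - z_final
    return norm.cdf(numerator / (1 - t) ** 0.5)

# Halfway through the trial with a lukewarm interim z-statistic:
print(round(conditional_power(z_interim=1.05, info_frac=0.5), 2))  # ~0.25
# That is the "started with 80% power, now down to 25%" situation above.
```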

A futility analysis hinges on the assumption that the treatment effect you powered the study for is the smallest effect that would be meaningful. Sometimes you power a trial for a certain effect size, but a smaller effect, if present, might still be clinically meaningful. That distinction is something to keep in mind when you’re doing a futility analysis.

A second reason you might want to stop early is for overwhelming efficacy. The data you see might be so favorable when you run the primary analysis that you get a significant p-value, so you don’t need to enroll any more subjects because you have already demonstrated that the treatment works better than the control. In that case, you no longer have equipoise since you have reason to believe the treatment is better. Therefore, you want to allow all subjects access to the treatment and submit the trial for regulatory review now rather than spend more money enrolling more subjects. So it is possible to perform an interim analysis for futility, efficacy, or both.

It’s also possible to do a futility or efficacy analysis without committing in advance to stop early. You might say, “I don’t want to commit to stopping early; I’m only going to potentially stop if the treatment looks like it’s not going to reach the desired efficacy level.” Sometimes the terms “binding” and “non-binding” are used in this context: “binding” means that if you meet a given criterion, you will stop no matter what the rest of the trial results look like, while “non-binding” means you will look at that criterion but will also consider other aspects of the trial, so you aren’t committing to stop no matter what.

Then there are times when the FDA says, “We’re not interested in you stopping that study early, because if you do, you’re only going to have 75 subjects treated with your product. That’s not enough for us to make an informed decision about safety. So whether or not you get a significant p-value at 150 subjects, we won’t be convinced there’s sufficient information for us to make an approval decision.” That is another reason to decide against stopping early for efficacy. Conversely, you might say, “I’m not interested in stopping the trial for futility; I’m going to keep going to the very end. I just want to be able to stop early for efficacy.”

Depending on your goals, you’ll use different boundaries, often referred to as alpha spending functions for efficacy or beta spending functions for futility. There are many boundaries to choose from, and the factors driving the choice include the overall type I error and power you want the trial to have, how many interim analyses you are conducting, whether you can stop early for futility, efficacy, or both, and how conservative you want to be: some boundaries make it fairly easy to stop early for efficacy, while others make it very hard.

One thing that really matters is that the FDA wants you to specify in advance which boundary you’re going to use, because different boundaries give different answers: for the same interim test statistic, some stopping boundaries would let you stop early for efficacy, while others would say you haven’t yet reached that point. It’s definitely not something you want to do without telling the FDA. If you simply report, “We did this, and we picked the O’Brien-Fleming boundary,” they could respond, “Why did you pick that one rather than another?” and they might fit some of the other boundaries themselves. If those give a different answer, that’s going to raise a red flag.
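
For illustration, here is a minimal sketch of how alpha spending translates into efficacy stopping boundaries, using the standard Lan-DeMets O’Brien-Fleming-type and Pocock-type spending functions; the two-look layout (one interim at half the information, then the final analysis) and the one-sided alpha of 0.025 are assumptions chosen to keep the example small.

```python
# Minimal sketch: efficacy boundaries for a two-look design derived from
# two common alpha spending functions (Lan-DeMets approximations).
import numpy as np
from scipy.optimize import brentq
from scipy.stats import multivariate_normal, norm

ALPHA = 0.025          # one-sided overall type I error
T1, T2 = 0.5, 1.0      # information fractions at the interim and final looks

def obf_spend(t, alpha=ALPHA):
    """O'Brien-Fleming-type spending: very little alpha spent early."""
    return 2 * (1 - norm.cdf(norm.ppf(1 - alpha / 2) / np.sqrt(t)))

def pocock_spend(t, alpha=ALPHA):
    """Pocock-type spending: alpha spent more evenly across looks."""
    return alpha * np.log(1 + (np.e - 1) * t)

for name, spend in [("O'Brien-Fleming", obf_spend), ("Pocock", pocock_spend)]:
    a1 = spend(T1)                    # cumulative alpha spent at the interim
    c1 = norm.ppf(1 - a1)             # interim efficacy boundary
    # The interim and final z-statistics are correlated with sqrt(t1/t2).
    corr = np.sqrt(T1 / T2)
    mvn = multivariate_normal(mean=[0, 0], cov=[[1, corr], [corr, 1]])
    # Solve for the final boundary c2 so the trial spends exactly the
    # remaining alpha: P(Z1 <= c1, Z2 > c2) = ALPHA - a1.
    f = lambda c2: (norm.cdf(c1) - mvn.cdf([c1, c2])) - (ALPHA - a1)
    c2 = brentq(f, 0.5, 5.0)
    print(f"{name}: stop at interim if z > {c1:.2f}, at final if z > {c2:.2f}")
```

Running this shows the trade-off described above: the O’Brien-Fleming-type boundary is very hard to cross at the interim (roughly z > 2.96) but sits close to the usual 1.96 at the end, while the Pocock-type boundary spreads the alpha more evenly across the two looks.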

There really isn’t a standard one-size-fits-all choice, because it is very specific to your goals for the interim analysis. Are you stopping for both futility and efficacy, or only for one or the other? The other issue is how confident you want to be before stopping early. There are many different stopping boundaries, and which one you pick affects your type I error and type II error. Ideally you plan all of this in advance, but sometimes, for whatever reason, that doesn’t happen. If you decide down the road that you want an interim analysis, you should share that with the regulatory agency and say, “Alright, we really want to add this. We understand this could potentially have implications. Here’s what we’re proposing to do. What do you think about that?”

If you’re interested in learning more about “Navigating Regulatory Biostatistical Requirements Throughout the Clinical Trial Lifecycle,” be sure to listen to our webcast in its entirety. Check back soon for questions regarding trial conduct and trial analysis/submission, as well as the related eBook!