Survival Analysis
When the dependent variable is continuous, we would ordinarily first think of linear regression. It’s a very good method when you want to look at the relationship between a continuous dependent variable and one or more independent variables.
But, like nearly all statistical techniques, it makes assumptions. One assumption is so obvious that it usually goes unstated: that we know the value of the dependent variable. Usually, this is not a problem. If we want to model, say, what people weigh, we can weigh them. But in one common type of analysis we don’t always know the dependent variable: when the dependent variable is time to an event. In that case, we need survival analysis.
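The key reason that we need survival analysis is that these data are often censored: for some subjects, we know only that the event had not yet happened when we stopped observing them. If, for example, we were looking at how long couples stay married, we could select some couples and follow them over time. But some couples won’t get divorced before we finish our study, so for them we know only that the marriage lasted at least as long as our follow-up. Similarly, some patients won’t die during our study, and so on.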
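As a rough sketch of how censored observations can still contribute information, here is the Kaplan-Meier estimator, a standard nonparametric method, written in plain Python. The function name `kaplan_meier` and the eight couples are invented for illustration; they do not come from any real study.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject
    observed:  True if the event happened at that time,
               False if the subject was censored (event-free when last seen)
    """
    # Count events (not censorings) at each distinct time.
    events = Counter(t for t, d in zip(durations, observed) if d)
    survival = 1.0
    curve = []
    for t in sorted(events):
        # Everyone whose follow-up reaches time t is still "at risk" at t,
        # whether they later have the event or are censored.
        at_risk = sum(1 for u in durations if u >= t)
        survival *= 1 - events[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: years of follow-up for 8 couples in a 10-year study.
years = [2, 3, 3, 5, 8, 10, 10, 10]
divorced = [True, True, False, True, True, False, False, False]

km_curve = kaplan_meier(years, divorced)
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
```

Each censored couple counts as at risk up to the time it leaves the study and then simply drops out of the denominator, without being treated as a divorce. With these made-up numbers, the estimated probability of still being married after 8 years is about 0.45.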
Types of survival analysis
Although there is a wide variety of techniques for doing survival analysis, they fall into three families: parametric, semiparametric, and nonparametric. The difference is in what we wish to assume about the distribution of survival times. In parametric survival analysis, we assume that survival times come from some specific statistical distribution; in semiparametric survival analysis, we do not need to make this assumption, but we do make another assumption, usually the proportional hazards assumption. In nonparametric analysis, we avoid even that assumption. Since the exact nature of the survival function is hard to know, and is critical to the results, semiparametric survival analysis is much more commonly used than parametric. And semiparametric analysis offers more useful output than nonparametric analysis, such as estimated effects of covariates. By far the most common semiparametric method is known as Cox proportional hazards regression.
Semiparametric methods, unlike parametric methods, do not by themselves predict a survival time; rather, they let you see differences between groups, or differences based on some other measure. For instance, a Cox model would not predict how long couples would stay together, but it could estimate how much more quickly (say) couples with a large age difference got divorced than couples with similar ages. This is often of primary interest.
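To see why a Cox model gives relative comparisons rather than survival times, it may help to write it out. In the usual formulation (the symbols are standard textbook notation, not taken from the text above), the hazard for a subject with covariate x is an unspecified baseline hazard h_0(t) multiplied by a factor that depends on x:

```latex
h(t \mid x) = h_0(t)\, e^{\beta x}
\qquad\Longrightarrow\qquad
\frac{h(t \mid x = 1)}{h(t \mid x = 0)}
  = \frac{h_0(t)\, e^{\beta}}{h_0(t)}
  = e^{\beta}
```

The baseline hazard cancels, so the ratio can be estimated without ever specifying h_0(t). As a purely hypothetical illustration, if x = 1 marked couples with a large age difference and the estimated β were 0.41, the hazard ratio would be e^0.41, or roughly 1.5: at any given time, those couples would be divorcing at about one and a half times the rate of couples with similar ages.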
In addition, recent developments allow us to look at multiple events – for instance, we might model repeated patterns of being arrested over time, or getting a particular disease.