Talk:Survival analysis

WikiProject Statistics (Rated C-class, High-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion. This article has been rated as C-Class on the quality scale and as High-importance on the importance scale.

WikiProject Death (Rated Start-class, Mid-importance)

This article is within the scope of WikiProject Death, a collaborative effort to improve the coverage of Death on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. This article has been rated as Start-Class on the project's quality scale and as Mid-importance on the project's importance scale.


This article needs some reorganization. For example, the references to eternal life in the survival function section do not add to the discussion. The section about fitting parameters does not talk about fitting per se but rather about different forms of censoring. I think a section should be added on non-parametric techniques such as the Kaplan-Meier estimator, the Nelson-Aalen estimator, or life-table methods, along with a reference to semi-parametric methods such as the Cox proportional hazards regression model and to fully parametric models based on, say, exponential or Weibull distributions. WT

The common distributions used in survival analysis are ones like the exponential or the Weibull. The Gaussian and logistic distributions are not used because their support is the whole real line; it makes more sense for a distribution of time to take only non-negative values, unless we are using truncated distributions. WT


There is a useful table in a paper I just read that I think would be good to include, although I'm not familiar enough with Wikipedia to format it and cite it correctly: Survival Analysis in Public Health Research, by Elisa T. Lee and Oscar T. Go. I would add something like Table 3 to this article. Table 3: Commonly used distribution in survival data analysis (has columns for Distribution, Parameter, Density and survival functions, and Hazard function). DannyLeigh13 (talk) 03:56, 15 October 2009 (UTC)


I added some to your table, but I'm too tired to finish. There isn't much on this page, so maybe we can add some. I didn't read the page closely, but is there nothing about mean residual life? mrl(x) = \frac{\int_x^{\infty} S(t)\,dt}{S(x)} I'm new, so if you have ideas.... Oh, and there is nothing about the exponential having a constant hazard rate (and being memoryless). Wiki matzo (talk) 05:32, 2 February 2010 (UTC)
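
As a rough numerical illustration of the two points above (the mean residual life formula and the memoryless exponential), here is a minimal Python sketch. The function names, the truncation point `upper`, and the rate `lam = 0.5` are all assumptions made for the example, not anything taken from the article. For an exponential lifetime the mean residual life should come out the same at every age x, namely 1/\lambda.

```python
import math

def survival_exponential(t, lam):
    """S(t) = exp(-lam * t) for the exponential distribution."""
    return math.exp(-lam * t)

def mean_residual_life(survival, x, upper=200.0, steps=100_000):
    """Approximate mrl(x) = (integral from x to infinity of S(t) dt) / S(x).

    Assumption: the infinite upper limit is truncated at `upper`, which is
    only reasonable when S(upper) is negligible. The integral is evaluated
    with the composite trapezoidal rule.
    """
    h = (upper - x) / steps
    total = 0.5 * (survival(x) + survival(upper))
    for i in range(1, steps):
        total += survival(x + i * h)
    return total * h / survival(x)

lam = 0.5  # hypothetical rate, chosen only for illustration
ages = (0.0, 1.0, 5.0)
mrl_values = [
    mean_residual_life(lambda t: survival_exponential(t, lam), x)
    for x in ages
]

for x, value in zip(ages, mrl_values):
    print(f"mrl({x}) is approximately {value:.6f}; 1/lambda = {1 / lam}")
```

The same `mean_residual_life` helper can be pointed at any other survival function, in which case the result will generally depend on x.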

| Distribution | Parameter(s) | Survival function | Density function | Hazard function | Mean |
|---|---|---|---|---|---|
| Exponential | \lambda > 0 | S(t) = e^{-\lambda t} | f(t) = \lambda e^{-\lambda t} | h(t) = \lambda (constant) | \frac{1}{\lambda} |
| Weibull | \lambda, \alpha > 0 | S(t) = \exp(-\lambda t^{\alpha}) | f(t) = \alpha \lambda t^{\alpha - 1} \exp(-\lambda t^{\alpha}) | h(t) = \alpha \lambda t^{\alpha - 1} | \frac{\Gamma(1 + 1/\alpha)}{\lambda^{1/\alpha}} |
| Gamma | \beta, \lambda > 0 | S(t) = 1 - I(\lambda t, \beta), where I(\cdot) denotes the incomplete gamma function | f(t) = \frac{\lambda^{\beta} t^{\beta - 1} \exp(-\lambda t)}{\Gamma(\beta)} | h(t) = \frac{f(t)}{S(t)} | \frac{\beta}{\lambda} |
| Gompertz | \alpha, \theta > 0 | S(t) = \exp\left(\frac{\theta}{\alpha}(1 - e^{\alpha t})\right) | | | \int_0^{\infty} S(t)\,dt |
| Lognormal | \sigma > 0 | S(t) = 1 - \Phi\left(\frac{\ln t - \mu}{\sigma}\right) | | | \exp(\mu + \sigma^2/2) |
| Log-logistic | \alpha, \lambda > 0 | S(t) = \frac{1}{1 + \lambda t^{\alpha}} | | | |
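
A quick way to sanity-check the exponential and Weibull rows of the table above is to confirm numerically that h(t) = f(t)/S(t) and that the Weibull row reduces to the exponential row when \alpha = 1. The sketch below does this in Python; the function names and the parameter values are assumptions chosen for illustration only.

```python
import math

def exponential(t, lam):
    """Return (S, f, h) at time t for the exponential row of the table."""
    s = math.exp(-lam * t)
    f = lam * math.exp(-lam * t)
    h = lam
    return s, f, h

def weibull(t, lam, alpha):
    """Return (S, f, h) at time t for the Weibull row of the table."""
    s = math.exp(-lam * t ** alpha)
    f = alpha * lam * t ** (alpha - 1) * math.exp(-lam * t ** alpha)
    h = alpha * lam * t ** (alpha - 1)
    return s, f, h

def weibull_mean(lam, alpha):
    """Mean of the Weibull in the table's parameterization."""
    return math.gamma(1 + 1 / alpha) / lam ** (1 / alpha)

# Hypothetical parameter values, used only to exercise the formulas.
lam, alpha = 0.5, 1.7
times = (0.1, 1.0, 3.0)

# The hazard should equal density divided by survival at every time point.
hazard_checks = []
for t in times:
    for s, f, h in (exponential(t, lam), weibull(t, lam, alpha)):
        hazard_checks.append(math.isclose(f / s, h, rel_tol=1e-12))

# With alpha = 1 the Weibull row should collapse to the exponential row.
reduces_to_exponential = all(
    math.isclose(a, b, rel_tol=1e-12)
    for t in times
    for a, b in zip(weibull(t, lam, 1.0), exponential(t, lam))
)

print(all(hazard_checks), reduces_to_exponential, weibull_mean(lam, 1.0))
```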

Censoring

Censoring needs some tidying-up. — Preceding unsigned comment added by 92.2.212.149 (talk) 05:34, 5 March 2013 (UTC)

"If a subject's lifetime is known to be less than a certain duration, the lifetime is said to be left-censored. [...] Left-censored data can occur when a person's survival time becomes incomplete on the left side of the follow-up period for the person. As an example, we may follow up a patient for any infectious disorder from the time of his or her being tested positive for the infection. We may never know the exact time of exposure to the infectious agent."

When a symptom occurs, the duration given by time-of-symptom minus time-of-infection, the putative duration of interest, is bounded below by time-of-symptom minus time-of-infection-detected. This sounds like an example of right-censoring, not left-censoring. The confusion seems to arise because while the duration is bounded below, i.e. is right-censored, the uncertainty arises at the left end of the span from infection-time to symptom-time. — Preceding unsigned comment added by 98.234.221.147 (talk) 08:33, 3 May 2013 (UTC)
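
To make the argument in the comment above concrete, here is a small Python sketch with made-up numbers. All three times are hypothetical, and `infection_time` would not be observed in practice; it is included only so the bound can be checked. Under the commenter's reading, what is observed is a lower bound on the duration of interest.

```python
def observed_lower_bound(detection_time, symptom_time):
    """Lower bound on (symptom time - infection time).

    Infection happened at or before detection, so the true duration is at
    least symptom_time - detection_time.
    """
    return symptom_time - detection_time

# Hypothetical timeline in days; infection_time is unknown to the analyst.
infection_time = 3.0
detection_time = 10.0
symptom_time = 25.0

true_duration = symptom_time - infection_time
lower_bound = observed_lower_bound(detection_time, symptom_time)

print(f"true duration = {true_duration}, observed lower bound = {lower_bound}")
```

Here the true duration is 22 days while only "at least 15 days" is known, which is the shape of information usually associated with right-censoring of the duration.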

Merger proposal

Survivor function into Survival analysis

I suggest not merging. I think the article Survivor function is needed separately as a topic in statistics, and it needs to be expanded to cover the use of log-survivor plots to help choose the survival distribution via data analysis. Melcombe (talk) 16:42, 24 June 2008 (UTC)

No merge. Survival Function is a clear, precise, well-defined item, that needs its own page. It shouldn't be buried somewhere in a page on survival analysis.--Zaqrfv (talk) 23:03, 1 September 2008 (UTC)

Discrete Distributions

It might be useful to note that these methods are also used for discrete distributions. For example, economists are interested in the length of time an unemployment spell lasts. But unemployment is defined on a weekly basis. In this case t = 1, 2, ... and the hazard at time t is the probability that the spell has ended at time t+1 given that it lasted until time t.

It might also be noted that it is the hazard rates that are often estimated. The reason is that researchers are often interested in how covariates might influence survival, and these may be time varying, e.g. smoking or drinking. CE —Preceding unsigned comment added by 140.254.199.38 (talk) 20:26, 1 April 2010 (UTC)
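
The discrete-time hazard described above can be estimated directly from spell lengths. The following Python sketch is one simple way to do it, under stated assumptions: the data are invented, each spell is a hypothetical `(weeks, ended)` pair where `ended=False` means the spell was still ongoing when observation stopped, and the indexing convention is that the hazard at week t is the share of spells ending in week t among those that lasted at least t weeks (a shift of one week relative to the wording in the comment).

```python
from collections import Counter

def discrete_hazard(spells):
    """Estimate the hazard at each week t = 1, 2, ... from (weeks, ended) pairs.

    hazard[t] = (spells that ended in week t) / (spells lasting at least t weeks).
    Spells with ended=False count as at risk through their last observed week
    but never contribute an ending.
    """
    ended_at = Counter(weeks for weeks, ended in spells if ended)
    max_week = max(weeks for weeks, _ in spells)
    hazard = {}
    for t in range(1, max_week + 1):
        at_risk = sum(1 for weeks, _ in spells if weeks >= t)
        if at_risk > 0:
            hazard[t] = ended_at[t] / at_risk
    return hazard

def survival_from_hazard(hazard):
    """S(t) = product over weeks k <= t of (1 - hazard[k])."""
    survival, running = {}, 1.0
    for t in sorted(hazard):
        running *= 1.0 - hazard[t]
        survival[t] = running
    return survival

# Invented unemployment spells, in weeks.
spells = [(1, True), (2, True), (2, True), (3, False), (4, True)]

hazard = discrete_hazard(spells)
survival = survival_from_hazard(hazard)

for t in sorted(hazard):
    print(f"week {t}: hazard = {hazard[t]:.2f}, survival = {survival[t]:.2f}")
```

Covariates, including time-varying ones such as those mentioned above, are not handled here; this sketch only shows the basic counting behind the hazard estimate.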

"The survival function is usually assumed to approach zero as age increases without bound, i.e., S(t) → 0 as t → ∞, although the limit could be greater than zero if eternal life is possible." - In healthcare economics it is common to look at the survival of an event which is not fatal in nature. As a result it is possible to survive the event completely despite having finite lifespan (dying before the event has a chance to happen). Do people think this deserves a mention? --Ts4079 (talk) 14:52, 16 August 2011 (UTC)

Cleaned up introduction

I re-arranged the text in the introduction to flow more smoothly. I deleted the following paragraph from the intro because it's vague and unsourced, and may be someone's attempt to promote their own research. If it is appropriate to re-add this, it should probably go into the article body, perhaps in a section about handling multiple events per subject.

More recently, many concepts in survival analysis have been explained by Counting Process Theory, which adds flexibility in that it allows modeling multiple (or recurrent) events. This type of modeling fits very well in many situations, when the event is significant but does not end the lifespan of the subject – e.g. people can go to jail multiple times, alcoholics can start and stop drinking multiple times, and people can get married and divorced multiple times.

Oanjao (talk) 17:56, 5 February 2012 (UTC)