# Quick Guide To Survival Analysis Using Kaplan Meier Curve (With Python Code)

Survival analysis is widely used in the pharmaceutical sector and in medical research. It studies the length of time that elapses until an event of interest, such as death, occurs. The Kaplan Meier estimator is a non-parametric estimator of the survival function computed from lifetime data. In medical research, it is frequently used to measure the fraction of patients still living a given amount of time after treatment.

Here, we will use the Kaplan Meier estimate to analyse how long patients survive after a heart attack, and in particular how likely a patient is to survive for at least one year.
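The Kaplan Meier estimate follows the product-limit rule: at each time where a death is observed, the running survival probability is multiplied by (1 - d / n), where d is the number of deaths at that time and n is the number of patients still at risk. Censored patients (those still alive when last seen) leave the risk set without lowering the curve. The following is a minimal pure-Python sketch of that rule; the function name `kaplan_meier` and the toy numbers are invented for illustration and are not taken from the dataset.

```python
from collections import Counter

def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each time with an observed death.

    durations: follow-up time of each patient
    events:    1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e == 1)
    exits = Counter(durations)   # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            survival *= 1 - d / at_risk   # product-limit step
            curve.append((t, survival))
        at_risk -= exits[t]               # deaths and censored both leave
    return curve

# Toy data (invented): months of follow-up and death flags.
durations = [2, 3, 3, 5, 8, 8, 12, 15]
events    = [1, 1, 0, 1, 1, 0, 0, 1]

for t, s in kaplan_meier(durations, events):
    print(f"t={t:>2}  S(t)={s:.3f}")
```

The library used in the rest of this article performs the same kind of calculation and additionally provides confidence intervals and plotting.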


The dataset can be downloaded from the following link. It gives the details of the patient’s heart attack and condition.

### Code Implementation

Install all the libraries required for this project.

```
# Install lifelines first from a shell (or prefix with ! in a notebook):
#   pip install lifelines
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statistics
from sklearn.impute import SimpleImputer
from lifelines import KaplanMeierFitter, CoxPHFitter
from lifelines.statistics import logrank_test
from scipy import stats
```

```
df = pd.read_csv("echocardiogram.csv")
```

### Data Pre-Processing

Let us check for missing values and impute them with mean values.

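```
# Impute missing values in the feature columns with the column mean
mean = SimpleImputer(missing_values=np.nan, strategy='mean')
Columns = ['age', 'pericardialeffusion', 'fractionalshortening', 'epss', 'lvdd', 'wallmotion-score']
X = mean.fit_transform(df[Columns])
# Reuse the original index so the concat below aligns rows correctly
df_X = pd.DataFrame(X, columns=Columns, index=df.index)
# 'survival' (months) and 'alive' (status) are kept as recorded, not imputed
keep = ['survival', 'alive']
df_keepcolumn = df[keep]
df = pd.concat([df_keepcolumn, df_X], axis=1)
# Drop rows where 'survival' or 'alive' is still missing
df = df.dropna()
print(df.isnull().sum())
print(df.shape)
```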
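To see what the check-and-impute step does, here is a small sketch on an invented toy frame. It uses plain pandas (`isnull().sum()` and `fillna` with the column means) rather than `SimpleImputer`; for numeric columns the effect should be the same, but treat this as an illustration under that assumption, not as the code used above.

```python
import numpy as np
import pandas as pd

# Invented toy frame standing in for the echocardiogram data.
toy = pd.DataFrame({
    "age":  [60.0, np.nan, 70.0, 65.0],
    "epss": [10.0, 12.0, np.nan, 14.0],
})

# Count missing entries per column before imputing.
missing_before = toy.isnull().sum()
print(missing_before)

# Replace each missing entry with the mean of its own column.
filled = toy.fillna(toy.mean())
print(filled)
```

Mean imputation keeps every row but shrinks the variance of the imputed columns, which is worth keeping in mind for a dataset this small.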

### Create a new column

```
# 'dead' is the event indicator: 1 if the patient died, 0 if still alive (censored)
df.loc[df.alive == 1, 'dead'] = 0
df.loc[df.alive == 0, 'dead'] = 1
```

### Kaplan Meier Curve

```
kmf = KaplanMeierFitter()
X = df['survival']  # duration in months
Y = df['dead']      # event indicator: 1 = death observed
kmf.fit(X, event_observed=Y)
kmf.plot()
plt.title("Kaplan Meier estimates")
plt.xlabel("Month after heart attack")
plt.ylabel("Survival")
plt.show()
```

From the plot we can see that the estimated survival probability decreases as the number of months increases. The Kaplan Meier estimate is 1 for the initial days following the heart attack. It gradually decreases to around 0.05 after 50 months.

`print("The median survival time :",kmf.median_survival_time_)`

The median survival time of patients is 29 months. Given below is the KM_estimate, which gives the estimated probability of surviving beyond each time point.

`print(kmf.survival_function_)`
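The survival function is a step function, so both the median survival time and the probability of surviving at least one year can be read straight off it. The sketch below shows the idea in pure Python on an invented curve; the helper names `survival_at` and `median_survival_time` are hypothetical. It assumes the usual convention that the median is the first time at which the curve is at or below 0.5. With the fitted object, `kmf.predict(12)` should give the corresponding 12-month estimate.

```python
import math

# Hypothetical step curve: (time in months, estimated survival probability).
# The values are invented for illustration.
curve = [(2, 0.875), (3, 0.75), (5, 0.6), (8, 0.45), (15, 0.0)]

def survival_at(curve, t):
    """Survival probability at time t for a right-continuous step curve."""
    s = 1.0                      # nobody has died before the first event time
    for time, prob in curve:
        if time > t:
            break
        s = prob
    return s

def median_survival_time(curve):
    """First time at which the curve is at or below 0.5, or infinity if never."""
    for time, prob in curve:
        if prob <= 0.5:
            return time
    return math.inf

print("S(12) =", survival_at(curve, 12))
print("median =", median_survival_time(curve))
```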
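
### Kaplan Meier Curve Using Age Group

Next, we split the patients at the median age and fit a separate curve for each group.

```
# True for patients younger than the median age
age_group = df['age'] < statistics.median(df['age'])
ax = plt.subplot(111)
# Labels assume a median age of about 62
kmf.fit(X[age_group], event_observed=Y[age_group], label='below 62')
kmf.plot(ax=ax)
kmf.fit(X[~age_group], event_observed=Y[~age_group], label='above 62')
kmf.plot(ax=ax)
plt.title("Kaplan Meier estimates by age group")
plt.xlabel("Month after heart attack")
plt.ylabel("Survival")
plt.show()
```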

### Kaplan Meier Curve Using Wallmotion Score

In the age-group plot, the Kaplan Meier estimate for the group below 62 is higher for the first 24 months after the heart attack. After that, its survival rate is similar to that of the group above 62. Since the difference between the age groups is small, it is worth analysing the data by wallmotion-score group as well.

```
# True for patients with a wallmotion-score below the median
score_group = df['wallmotion-score'] < statistics.median(df['wallmotion-score'])
ax = plt.subplot(111)
kmf.fit(X[score_group], event_observed=Y[score_group], label='Low score')
kmf.plot(ax=ax)
kmf.fit(X[~score_group], event_observed=Y[~score_group], label='High score')
kmf.plot(ax=ax)
plt.title("Kaplan Meier estimates by wallmotion-score group")
plt.xlabel("Month after heart attack")
plt.ylabel("Survival")
plt.show()
```

### Conclusion

In this article, we discussed survival analysis using the Kaplan Meier estimate. The Kaplan Meier plots show how the survival probability changes over time. We also compared the survival of different age groups and wallmotion-score groups after a heart attack. Finally, it is advisable to look into survival analysis in more detail.
