For our first lab, we are going to fit a logistic regression model to a heart disease dataset. The final column, 'target', indicates whether a patient has heart disease: 1 means heart disease is present, and 0 means it is not.
Our goals are to:
- Define appropriate X and y
- Normalize the data
- Split the data into train and test sets
- Fit a logistic regression model using scikit-learn
With that, let's have at it!
#Starter Code
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
import pandas as pd
#Starter Code
df = pd.read_csv('heart.csv')
df.head()
| | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | slope | ca | thal | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 63 | 1 | 3 | 145 | 233 | 1 | 0 | 150 | 0 | 2.3 | 0 | 0 | 1 | 1 |
| 1 | 37 | 1 | 2 | 130 | 250 | 0 | 1 | 187 | 0 | 3.5 | 0 | 0 | 2 | 1 |
| 2 | 41 | 0 | 1 | 130 | 204 | 0 | 0 | 172 | 0 | 1.4 | 2 | 0 | 2 | 1 |
| 3 | 56 | 1 | 1 | 120 | 236 | 0 | 1 | 178 | 0 | 0.8 | 2 | 0 | 2 | 1 |
| 4 | 57 | 0 | 0 | 120 | 354 | 0 | 1 | 163 | 1 | 0.6 | 2 | 0 | 2 | 1 |
Recall that the final column, 'target', indicates whether or not a patient has heart disease. With that in mind, define an appropriate X and y for modeling whether or not a patient has heart disease.
#Your code here
X =
y =
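One common way to separate the predictors from the label is sketched below. Since heart.csv itself is not reproduced here, the sketch uses a hypothetical `toy` DataFrame built from three of the columns in the five preview rows above. The names `toy`, `X_toy`, and `y_toy` are illustrative assumptions, not part of the lab.

```python
import pandas as pd

# Hypothetical stand-in for df: three feature columns plus the label,
# copied from the five preview rows shown earlier.
toy = pd.DataFrame({
    "age": [63, 37, 41, 56, 57],
    "chol": [233, 250, 204, 236, 354],
    "thalach": [150, 187, 172, 178, 163],
    "target": [1, 1, 1, 1, 1],
})

# Predictors: every column except the label.
X_toy = toy.drop(columns=["target"])

# Label: the 'target' column on its own.
y_toy = toy["target"]

print(X_toy.columns.tolist())
print(y_toy.tolist())
```

Applying the same two lines to `df` would yield a feature matrix with all thirteen predictor columns.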
Normalize the data prior to fitting the model.
#Your code here
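The lab does not say which normalization to use, so the sketch below assumes min-max scaling, which maps each column onto the range 0 to 1. Standardizing with `sklearn.preprocessing.StandardScaler` would be another reasonable choice. The `X_toy` frame is a hypothetical stand-in using values from the preview rows.

```python
import pandas as pd

# Hypothetical feature frame (values taken from the preview rows above).
X_toy = pd.DataFrame({
    "age": [63, 37, 41, 56, 57],
    "chol": [233, 250, 204, 236, 354],
})

# Min-max scaling: subtract each column's minimum, then divide by its range.
# pandas aligns the per-column minima and maxima with the matching columns.
X_scaled = (X_toy - X_toy.min()) / (X_toy.max() - X_toy.min())

print(X_scaled.round(3))
```

After scaling, every column has a minimum of 0 and a maximum of 1, so features measured in very different units contribute on a comparable scale.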
Split the data into train and test sets.
#Your code here
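A minimal sketch of the split, using the `train_test_split` function imported in the starter code. The data is made up for illustration, and the `test_size=0.2` and `random_state=0` settings are assumptions rather than values the lab prescribes.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical, made-up data: ten rows, one scaled feature, a binary label.
X_toy = pd.DataFrame(
    {"feature": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]}
)
y_toy = pd.Series([0, 0, 1, 0, 1, 0, 1, 1, 0, 1], name="target")

# Hold out 20% of the rows for testing; random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=0
)

print(len(X_train), len(X_test))
```

The function shuffles X and y together, so each row of `X_train` still lines up with its label in `y_train`.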
Fit an initial model to the training set. In scikit-learn, you do this by first creating an instance of the regression class, then calling that instance's fit method on the training data.
logreg = LogisticRegression(fit_intercept=False, C=1e12)  # Starter code
#Your code here
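The instantiate-then-fit pattern can be sketched as follows. The training arrays are made up for illustration, and the `_toy` names are hypothetical. The constructor arguments mirror the starter code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical, made-up training data: two scaled features and a binary label.
X_train_toy = np.array([
    [0.1, 0.9], [0.2, 0.3], [0.3, 0.7], [0.4, 0.2],
    [0.5, 0.6], [0.6, 0.4], [0.7, 0.8], [0.9, 0.1],
])
y_train_toy = np.array([0, 0, 1, 0, 1, 1, 0, 1])

# Step 1: create an instance of the classifier (same settings as the starter code).
logreg_toy = LogisticRegression(fit_intercept=False, C=1e12)

# Step 2: fit it to the training data; fit returns the fitted estimator itself.
model_toy = logreg_toy.fit(X_train_toy, y_train_toy)

# One learned coefficient per feature.
print(model_toy.coef_)
```

A very large `C` makes the regularization penalty negligible, so the fit is close to plain maximum likelihood.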
Generate predictions for the train and test sets. Use the predict method from the logreg object.
#Your code here
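As a rough illustration, `predict` takes a feature matrix and returns one predicted class label per row. The arrays below are made up, and the `_toy` and `y_hat_` names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical, made-up train and test features with training labels.
X_train_toy = np.array([
    [0.1, 0.9], [0.2, 0.3], [0.3, 0.7], [0.4, 0.2],
    [0.5, 0.6], [0.6, 0.4], [0.7, 0.8], [0.9, 0.1],
])
y_train_toy = np.array([0, 0, 1, 0, 1, 1, 0, 1])
X_test_toy = np.array([[0.8, 0.2], [0.2, 0.8]])

# Fit on the training data only.
logreg_toy = LogisticRegression(fit_intercept=False, C=1e12)
logreg_toy.fit(X_train_toy, y_train_toy)

# Predict class labels (0 or 1) for both sets.
y_hat_train = logreg_toy.predict(X_train_toy)
y_hat_test = logreg_toy.predict(X_test_toy)

print(y_hat_train)
print(y_hat_test)
```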
How many times was the classifier correct for the training set?
#Your code here
How many times was the classifier correct for the test set?
#Your code here
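Counting correct predictions comes down to comparing predicted labels with true labels element by element. The sketch below uses made-up label arrays; `y_true` and `y_pred` are hypothetical names standing in for a set's labels and the matching output of `predict`.

```python
import numpy as np

# Hypothetical true labels and predictions, made up for illustration.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Boolean array: True wherever the prediction matches the true label.
correct = y_true == y_pred

n_correct = int(correct.sum())     # number of matches
n_wrong = int((~correct).sum())    # number of mismatches
accuracy = n_correct / len(y_true)

print(n_correct, n_wrong, accuracy)  # prints: 6 2 0.75
```

The same fraction is available from `sklearn.metrics.accuracy_score(y_true, y_pred)`. Running the comparison once on the training set and once on the test set gives the two counts the questions ask for.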
Describe how well you think this initial model performs based on the train and test results. In your description, note how you evaluated performance here compared with our previous work with regression.
#Your answer here