This repository contains a project for the IBM course Exploratory Data Analysis for Machine Learning.
Breast cancer is cancer that forms in the cells of the breasts. After skin cancer, breast cancer is the most common cancer diagnosed in women in the United States. Breast cancer can occur in both men and women, but it is far more common in women. Substantial support for breast cancer awareness and research funding has helped create advances in the diagnosis and treatment of breast cancer. Breast cancer survival rates have increased, and the number of deaths associated with this disease is steadily declining, largely due to factors such as earlier detection, a new personalized approach to treatment, and a better understanding of the disease.
Signs and symptoms of breast cancer may include:
- A breast lump or thickening that feels different from the surrounding tissue
- Change in the size, shape or appearance of a breast
- Changes to the skin over the breast, such as dimpling
- A newly inverted nipple
- Peeling, scaling, crusting or flaking of the pigmented area of skin surrounding the nipple (areola) or breast skin
- Redness or pitting of the skin over your breast, like the skin of an orange
The data was downloaded from Kaggle:
- License CC0: Public Domain
- Visibility: Public
- Date created: 2021-08-05
- Current version: Version 1
The data covers a short time frame, but it is useful for hypothesis testing and statistical analysis.
There are fewer than 400 rows, so it is a great dataset for beginners.
This dataset consists of a group of breast cancer patients, who had surgery to remove their tumour. The dataset consists of the following variables:
Feature | Description |
---|---|
Patient_ID | unique identifier of a patient |
Age | age at diagnosis (Years) |
Gender | Male/Female |
Protein1, Protein2, Protein3, Protein4 | expression levels (undefined units) |
Tumour_Stage | I, II, III |
Histology | Infiltrating Ductal Carcinoma, Infiltrating Lobular Carcinoma, Mucinous Carcinoma |
ER status | Positive/Negative |
PR status | Positive/Negative |
HER2 status | Positive/Negative |
Surgery_type | Lumpectomy, Simple Mastectomy, Modified Radical Mastectomy, Other |
DateofSurgery | Date on which surgery was performed (in DD-MON-YY) |
DateofLast_Visit | Date of last visit (in DD-MON-YY) |
Patient_Status | Alive/Dead |
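
The two date columns are stored as text in DD-MON-YY form, so they need to be parsed before any time-based analysis. The following is a minimal sketch using only the Python standard library. The helper names `parse_visit_date` and `follow_up_days` are hypothetical, the sample dates are made up for illustration and are not taken from the dataset, and the sketch assumes English month abbreviations (such as `Jan`) with no missing values.

```python
from datetime import date, datetime

# Assumed format for DateofSurgery and DateofLast_Visit: DD-MON-YY, e.g. "15-Jan-17".
DATE_FORMAT = "%d-%b-%y"


def parse_visit_date(text: str) -> date:
    """Parse a DD-MON-YY string into a date (hypothetical helper)."""
    return datetime.strptime(text.strip(), DATE_FORMAT).date()


def follow_up_days(surgery: str, last_visit: str) -> int:
    """Number of days between the surgery date and the last visit date."""
    return (parse_visit_date(last_visit) - parse_visit_date(surgery)).days


if __name__ == "__main__":
    # Made-up example values, not real rows from the dataset.
    print(follow_up_days("15-Jan-17", "19-Jun-17"))
```

A derived follow-up duration like this can then be explored alongside `Patient_Status`. Rows with a missing date would need to be handled separately before calling these helpers.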