cancer-project's Introduction

Survival After Cancer Diagnosis

You can find the project article on Medium.

Project Overview

The goal of this project was to predict how long a person might survive after receiving a cancer diagnosis. Analysis was performed on a dataset obtained from the National Institutes of Health (NIH). Findings and analysis are discussed in my Medium article.

  • Achieved 70% accuracy for predicting survivability of a patient.

  • Analyzed and plotted relevant data to showcase the most important findings.

  • Shared findings with NIH and gained access to their private dataset for future projects.

Tech Stack

  • Jupyter
  • pandas
  • NumPy
  • scikit-learn (random forest classifier, k-means clustering)
  • XGBoost classifier
  • SciPy
  • ELI5
  • SHAP
  • PDPbox
  • seaborn
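
As a rough illustration of how the scikit-learn pieces of this stack fit together, the sketch below trains a random forest classifier and reports hold-out accuracy. It is not the project's notebook code: the data is synthetic (generated with `make_classification`), and the function name `train_and_score`, the feature count, and the hyperparameters are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def train_and_score(random_state=0):
    """Train a random forest on synthetic data and return (model, accuracy)."""
    # Synthetic stand-in data: this is NOT the NIH dataset used in the project.
    X, y = make_classification(
        n_samples=500,
        n_features=10,
        n_informative=5,
        random_state=random_state,
    )

    # Hold out 25% of the rows so accuracy is measured on unseen data.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )

    # Hyperparameters here are illustrative defaults, not the project's settings.
    model = RandomForestClassifier(n_estimators=100, random_state=random_state)
    model.fit(X_train, y_train)

    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy


if __name__ == "__main__":
    model, accuracy = train_and_score()
    print(f"Hold-out accuracy: {accuracy:.2f}")
```

The fitted model's `feature_importances_` attribute gives one score per input column, which is a common starting point before turning to the interpretation libraries listed above.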

Data Sources

National Institutes of Health cancer dataset:

Getting Started

Locally

Prerequisite Python packages

  • jupyterlab (or notebook)
  • seaborn
  • scikit-learn (imported as sklearn)
  • xgboost
  • pandas
  • scipy
  • yellowbrick

Everything else is installed with pip from within the Jupyter notebook.
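
Before opening the notebook, it can help to confirm that the prerequisites are importable. The following is a minimal sketch that uses only the standard library; the helper names `missing_packages` and `has_notebook_frontend` are hypothetical, and the list uses import names (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the prerequisite packages listed above.
REQUIRED = ["seaborn", "sklearn", "xgboost", "pandas", "scipy", "yellowbrick"]

# Either JupyterLab or the classic notebook is enough to run the analysis.
NOTEBOOK_FRONTENDS = ["jupyterlab", "notebook"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


def has_notebook_frontend(names=NOTEBOOK_FRONTENDS):
    """Return True if at least one notebook front end is installed."""
    return any(find_spec(name) is not None for name in names)


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisite packages are installed.")
    if not has_notebook_frontend():
        print("Install jupyterlab or notebook to run the analysis.")
```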

Analyze Data

Run the cells within the Jupyter notebook. Areas of interest are annotated with code comments or Markdown sections.

Contributing

When contributing to this repository, please first discuss the change you wish to make via issue, email, or any other method with the owners of this repository before making a change.

Issue/Bug Request

If you are having an issue with the existing project code, please submit a bug report under the following guidelines:

  • Check first to see if your issue has already been reported.
  • Check to see if the issue has recently been fixed by attempting to reproduce the issue using the latest master branch in the repository.
  • Create a live example of the problem.
  • Submit a detailed bug report including your environment & browser, steps to reproduce the issue, actual and expected outcomes, where you believe the issue is originating from, and any potential solutions you have considered.

Feature Requests

I would love to hear from you about new features that would improve this study and further the aims of the project. Please provide as much detail and information as possible to show why you think your new feature should be implemented.

Pull Requests

If you have developed a patch, bug fix, or new feature that would improve this project, please submit a pull request. It is best to communicate your ideas with the developer first before investing a great deal of time into a pull request to ensure that it will mesh smoothly with the project.

Pull Request Guidelines

  • Ensure any install or build dependencies are removed before the end of the layer when doing a build.
  • Update the README.md with details of changes to the interface, including new plist variables, exposed ports, useful file locations and container parameters.
  • Ensure that your code conforms to the existing code conventions.
  • Include the relevant issue number, if applicable.
  • Your Pull Request will be merged in once you have the sign-off of the developer.

Attribution

These contribution guidelines have been adapted from this good-Contributing.md-template.

Documentation

Refer to the SEER*Stat website for guidance on the definitions of the data columns and values.

cancer-project's People

Contributors

jonathanmendoza-tx
