ist491's Introduction

Lung Cancer EDA and Prediction

This repository contains Jupyter notebooks and scripts for Exploratory Data Analysis (EDA) and prediction on a lung cancer dataset. It also includes a Streamlit web application for interactive exploration and prediction.

Dataset

The dataset used is the Lung Cancer Prediction dataset, which contains information on patients with lung cancer, including their age, gender, air pollution exposure, alcohol use, dust allergy, occupational hazards, genetic risk, chronic lung disease, balanced diet, obesity, smoking, passive smoker, chest pain, coughing of blood, fatigue, weight loss, shortness of breath, wheezing, swallowing difficulty, clubbing of finger nails and snoring.

EDA

The EDA notebook provides a comprehensive analysis of the dataset, including data cleaning, data visualization, and statistical analysis. It helps to understand the factors contributing to lung cancer and their relationships.

Prediction

The prediction script includes machine learning models to predict the risk of lung cancer based on the provided features. It also provides an explanation of the models used and their performance.

Streamlit Web Application

The Streamlit web application allows users to interact with the dataset and the models in a user-friendly interface. Users can input their information and get a risk prediction for lung cancer.

Screenshots: Homepage, Dataset, Prediction-1, Prediction-2, Visualization, Contact

Usage

To use this repository, clone it and run the Jupyter notebooks for EDA and prediction. To use the Streamlit web application, navigate to the application directory and start the app with streamlit run followed by the app's script file.

Contribution

Contributions to this repository are welcome. If you have any suggestions or improvements, please submit a pull request.

License

This project is licensed under the terms of the MIT license.

Source

https://www.kaggle.com/datasets/thedevastator/cancer-patients-and-air-pollution-a-new-link?resource=download

ist491's People

Contributors

ashnumpy, f1rzen

ist491's Issues

Streamlit Interface Design

Interfaces inspired by Bento Grid. For the grid, the .grid class in pico.css can be used; for cards, the article element.

Overfitting Models

When I spoke with the instructor, they said the models are overfitting.

  1. We need to get rid of the overfitting in logistic regression.
  2. Ordered logistic regression may be an option. We need to check whether sklearn supports it.
  3. If the ordered approach does not work, the low and medium classes can be merged.
  4. If logistic regression does not work, naive Bayes can be considered. Its confusion matrix is more consistent.
  5. Power BI is not needed. The Streamlit application will continue.
  6. The Streamlit application will have a dynamic chart on a separate page.
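Item 3 can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the label strings "Low", "Medium" and "High" and the helper name merge_low_medium are assumptions, and the sample labels are made up.

```python
from collections import Counter

# Assumed label names; the real strings in the target column may differ.
MERGE_MAP = {"Low": "Low/Medium", "Medium": "Low/Medium", "High": "High"}


def merge_low_medium(levels):
    """Collapse three risk levels into two by merging Low and Medium."""
    return [MERGE_MAP[level] for level in levels]


# Made-up labels, only to show the effect on the class distribution.
levels = ["Low", "Medium", "High", "Medium", "Low", "High"]
merged = merge_low_medium(levels)
print(Counter(merged))
```

With only two classes left, a plain binary logistic regression can be fitted. It is worth checking the class balance after merging, since the merged class may dominate.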

Lookup sklearn: Ignore warnings

Nice work, İlkay, well done. However:
  1. Let's turn off the warnings in the notebook.
import warnings
warnings.filterwarnings('ignore')

I will merge once this item is fixed.

Originally posted by @AshNumpy in #7 (comment)
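A blanket filterwarnings('ignore') also hides warnings that may matter, such as convergence warnings from a model fit. As a possible alternative, the sketch below silences warnings only around one noisy call by using the standard library's warnings.catch_warnings. The function noisy_step is a hypothetical stand-in for whichever call emits the warning.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn("solver did not converge", UserWarning)
    return 42


# Suppress UserWarning only inside this block; filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UserWarning)
    result = noisy_step()

print(result)
```

Outside the with block the previous warning filters apply again, so unrelated warnings in later cells stay visible.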

Streamlit Chart Page

In the Streamlit application we should build a dynamic page that generates a chart, where the user can make choices with selectors, explore the data, and get a general idea of it.
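One possible shape for such a page is sketched below, under several assumptions: the rows and column names are made up, and value_counts and render_page are hypothetical helpers. The counting logic is kept free of Streamlit so it can be tested on its own. The page function uses st.selectbox for the selector and st.bar_chart for the chart.

```python
from collections import Counter

# Tiny made-up stand-in for the dataset; column names are assumptions.
ROWS = [
    {"Gender": "Male", "Smoking": 7, "Level": "High"},
    {"Gender": "Female", "Smoking": 2, "Level": "Low"},
    {"Gender": "Male", "Smoking": 2, "Level": "Medium"},
    {"Gender": "Female", "Smoking": 7, "Level": "High"},
]


def value_counts(rows, column):
    """Count how often each value of `column` occurs, ordered by value."""
    counts = Counter(row[column] for row in rows)
    return dict(sorted(counts.items(), key=lambda kv: str(kv[0])))


def render_page(rows):
    """Draw the page; meant to be called from a script run with `streamlit run`."""
    # Imported lazily so value_counts can be used without Streamlit installed.
    import streamlit as st

    st.title("Explore the data")
    column = st.selectbox("Column", list(rows[0].keys()))
    # One chart column named "count", indexed by the distinct values.
    st.bar_chart({"count": value_counts(rows, column)})
```

A page script would call render_page(ROWS) at the bottom, with the real dataset rows in place of ROWS. Each change of the selector reruns the script and redraws the chart.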

EDA Enhancement

Exploratory Data Analysis

Let's improve the EDA.ipynb file created under the exploratory data analysis heading.

  • #5
  • Let's examine the effects of the features on the target variable. For example, what effect does gender or alcohol use have on the cancer level?
  • Can feature engineering be applied to the dataset? If it can, it should be examined under the EDA heading or under a separate heading.
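The feature-versus-target question can be approached with a simple cross-tabulation. The sketch below computes, for each value of a feature, the share of each target level. The rows are made up, and the column names "Gender" and "Level" and the helper level_share_by_group are assumptions for illustration.

```python
from collections import Counter, defaultdict


def level_share_by_group(rows, feature, target="Level"):
    """For each value of `feature`, return the share of each target level."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature]][row[target]] += 1
    return {
        group: {level: n / sum(c.values()) for level, n in c.items()}
        for group, c in counts.items()
    }


# Made-up rows, only to demonstrate the calculation.
rows = [
    {"Gender": "Male", "Level": "High"},
    {"Gender": "Male", "Level": "High"},
    {"Gender": "Male", "Level": "Low"},
    {"Gender": "Female", "Level": "Low"},
]
shares = level_share_by_group(rows, "Gender")
print(shares)
```

Large differences in these shares between groups suggest that the feature is related to the cancer level. On the real data, pandas offers the same calculation through pandas.crosstab with normalize="index".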
