
diabetes-mellitus-prediction-in-pima-indians's Introduction

Diabetes-mellitus-prediction-in-Pima-Indians

This repository holds the material for the "Switch Fb developer circle" workshop, scheduled for 05-08-2017.

Title: "Proceso CRISP-DM aplicado a la predicción de la diabetes mellitus con datasets públicos" (in English: "The CRISP-DM process applied to predicting diabetes mellitus with public datasets")

The content of the workshop is divided into two stages:

Theoretical

1.-The CRISP-DM and BAB process.

2.-Agile Scrum and how to combine it with Data Science, based on my experience.

3.-Explanation of the problem to solve, and of the structure and problems of the dataset.
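
As an illustration of the kind of dataset problem the last theoretical point refers to, here is a minimal sketch. It assumes one commonly noted issue with the Pima data: zeros in physiological columns such as 'plas', 'pres' and 'mass' are implausible readings and are usually treated as missing values. The rows below are made up for the example, and median imputation is only one possible choice, not necessarily the one used in the workshop.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame that reuses the workshop's column names (not real records).
df = pd.DataFrame({
    "plas": [148, 85, 0, 89],
    "pres": [72, 66, 64, 0],
    "mass": [33.6, 26.6, 23.3, 0.0],
    "class": [1, 0, 1, 0],
})

# Columns where a zero is assumed to mean "not measured".
cols = ["plas", "pres", "mass"]

# Count the suspicious zeros per column before touching anything.
zero_counts = (df[cols] == 0).sum()

# Mark zeros as missing, then fill each column with its own median.
with_nan = df[cols].replace(0, np.nan)
imputed = with_nan.fillna(with_nan.median())
cleaned = df.assign(**{c: imputed[c] for c in cols})

print(zero_counts)
print(cleaned)
```

Counting the zeros first makes it easier to judge whether imputation is reasonable or whether a column has too many gaps to be useful.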

Hands on

1.-Understanding and characterization of the data.

2.-EDA for the Data Understanding

3.-Data preparation

4.-Application of logistic regression, the GridSearch algorithm and Random Forest models

5.-Performance analysis

6.-Conclusions
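
Steps 3 to 5 of the hands-on part could look roughly like the sketch below. It is not the workshop's notebook: synthetic data from make_classification stands in for the Pima dataset so the block runs offline, and the split ratio, parameter grid and metrics are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: 768 rows, 8 numeric features, binary label (assumed shape).
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Data preparation + logistic regression in one pipeline, so that scaling
# is fitted only on the training folds during cross-validation.
logit = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Grid search over the regularization strength C (illustrative grid).
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(logit, param_grid, cv=5, scoring="roc_auc")
search.fit(X_train, y_train)

# Random Forest with default-like settings for comparison.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

# Performance analysis on the held-out test split.
results = {}
for name, model in [("logistic", search.best_estimator_),
                    ("random_forest", forest)]:
    proba = model.predict_proba(X_test)[:, 1]
    results[name] = {
        "accuracy": accuracy_score(y_test, model.predict(X_test)),
        "roc_auc": roc_auc_score(y_test, proba),
    }
    print(name, results[name])

print("best C:", search.best_params_["clf__C"])
```

Wrapping the scaler and the classifier in a Pipeline keeps the test split untouched by preprocessing, which matters when comparing models in the performance analysis step.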

In the future: TBD, depending on the feedback from the first audience at the Dev Circle in Aug/2017 in SCL-CL.

May the force be with you

diabetes-mellitus-prediction-in-pima-indians's People

Contributors

iair, iairlinker


diabetes-mellitus-prediction-in-pima-indians's Issues

ANOVA test example

import pandas as pd
from pandas import read_csv
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
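# Notebook-only display settings (the older .magic() call is deprecated in IPython)
get_ipython().run_line_magic('matplotlib', 'inline')
get_ipython().run_line_magic('config', "InlineBackend.figure_format='retina'")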
#import plotly
#import plotly.plotly as py
#import plotly.graph_objs as go
#from plotly.tools import FigureFactory as FF
import seaborn as sbs
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import KFold
from sklearn.model_selection import train_test_split
# Evaluate using Cross Validation
from sklearn.model_selection import cross_val_score
from sklearn import metrics
from sklearn.linear_model import LogisticRegression

# The following snippet loads the Pima Indians diabetes onset dataset
# Link to the data: https://archive.ics.uci.edu/ml/datasets/pima+indians+diabetes
url = "https://goo.gl/vhm1eU"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
df = read_csv(url, names=names)
df.head()

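import scipy.stats as stats

# Keep only blood pressure ('pres') and the outcome label ('class')
voter_frame = df[['pres','class']]
# Map each class label to the row indices that belong to it
groups = voter_frame.groupby("class").groups

keys = list(groups.keys())

# Blood pressure values for each of the two classes
c0 = voter_frame[voter_frame.index.isin(groups[keys[0]])]['pres']
c1 = voter_frame[voter_frame.index.isin(groups[keys[1]])]['pres']

# One-way ANOVA: tests whether the mean of 'pres' differs between the classes
stats.f_oneway(c0, c1)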
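
To make the result of f_oneway easier to read, the sketch below recomputes the F statistic by hand as the ratio of between-group to within-group mean squares, and checks that with only two groups it equals the square of the pooled-variance t statistic. The two samples are randomly generated with arbitrary means and spreads; they are not the Pima values, and the helper one_way_f is a hypothetical name introduced only for this example.

```python
import numpy as np
from scipy import stats

# Arbitrary synthetic samples standing in for 'pres' in each class.
rng = np.random.default_rng(0)
g0 = rng.normal(68, 18, 500)
g1 = rng.normal(71, 21, 268)

def one_way_f(*groups):
    """F = (SS_between / (k - 1)) / (SS_within / (N - k))."""
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), all_values.size
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_manual = one_way_f(g0, g1)
f_scipy, p_value = stats.f_oneway(g0, g1)

# With two groups, one-way ANOVA matches the equal-variance t-test.
t_stat, t_p = stats.ttest_ind(g0, g1)

print("F (manual):", f_manual)
print("F (scipy): ", f_scipy, "p-value:", p_value)
print("t squared: ", t_stat ** 2, "p-value:", t_p)
```

A small p-value suggests the group means differ by more than chance alone would explain; it says nothing about how large or clinically relevant the difference is.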
