
kushal997-das / the-sparks-foundation


📌 This repo contains basic- to advanced-level data science, machine learning, and business analysis projects. 👨‍💻

Home Page: https://www.youtube.com/channel/UCIHj6mNCMnSnmWLHOxzIESw/videos?view_as=subscriber

License: MIT License

Jupyter Notebook 100.00%
machine-learning data-science dataanalysis python3 business-analytics exploratory-data-analysis dataset

the-sparks-foundation's Introduction





Hello programmer, welcome to this repo



Problem Statement:

  • Predict a student's percentage score based on the number of study hours.
  • This is a simple linear regression task, as it involves just 2 variables.
  • You can use R, Python, SAS Enterprise Miner or any other tool.
  • What will be the predicted score if a student studies for 9.25 hrs/day?
  • Here is the dataset : Dataset.csv
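As a rough illustration of the task above, here is a minimal least-squares sketch in plain Python. It is not the notebook's actual solution. The `hours` and `scores` lists are made-up placeholder values (not taken from Dataset.csv), and the helper names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(hours_studied, slope, intercept):
    """Predicted score for a given number of study hours."""
    return slope * hours_studied + intercept


# Placeholder data for illustration only (NOT the real dataset).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [12.0, 22.0, 32.0, 42.0, 52.0]

slope, intercept = fit_line(hours, scores)
predicted = predict(9.25, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

With the real dataset, the same two steps apply: fit the line on the (hours, score) pairs, then evaluate it at 9.25 hours.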

Solution: Prediction using Supervised ML

Demo: Prediction using Supervised ML




Problem Statement:

  • From the given ‘Iris’ dataset, predict the optimum number of clusters and represent it visually.
  • Use R or Python to perform this task.
  • Here is the dataset : Dataset.csv
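One common way to pick the number of clusters is the elbow method: run k-means for several values of k and look for the point where the within-cluster sum of squares (WCSS) stops dropping sharply. The sketch below is a plain-Python illustration of that idea, not the notebook's actual code. The `points` list is made-up toy data standing in for the Iris measurements, and the function names and the farthest-first initialization are assumptions chosen to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centers(points, k):
    """Deterministic initialization: start at the first point, then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def kmeans_wcss(points, k, max_iterations=100):
    """Run Lloyd's k-means and return the within-cluster sum of squares."""
    centers = farthest_first_centers(points, k)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(squared_distance(p, c) for c in centers) for p in points)


# Toy data for illustration only: three well-separated groups of four points.
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (8, 8), (8, 9), (9, 8), (9, 9),
    (1, 8), (1, 9), (2, 8), (2, 9),
]

wcss_by_k = {k: kmeans_wcss(points, k) for k in range(1, 5)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 2))
```

On this toy data the WCSS falls steeply up to k = 3 and barely changes afterwards, which is the "elbow" one would look for when plotting the curve for the real dataset.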

Solution: Prediction using Unsupervised ML

Demo: Prediction using Unsupervised ML




Problem Statement:

  • Perform ‘Exploratory Data Analysis’ on the dataset ‘Retail(Dataset).csv’.
  • As a business manager, try to find the weak areas where you can work to make more profit.
  • What business problems can you derive by exploring the data?
  • You can use any tool of your choice
    (Python/R/Tableau/PowerBI/Excel/SAP/SAS)
  • Here is the dataset : Dataset.csv


Solution: Exploratory Data Analysis-Retail

Demo: Exploratory Data Analysis-Retail




Problem Statement:

  • Create a Decision Tree classifier and visualize it graphically.
  • The purpose is that if we feed any new data to this classifier, it should be able to predict the right class accordingly.
  • Use R or Python to perform this task.
  • Here is the dataset : Dataset.csv

Solution: Prediction using Decision Tree Algorithm

Demo: Prediction using Decision Tree Algorithm




Problem Statement:

  • Explore business analytics on the dataset ‘superstore.csv’.

  • What business problems can you derive by exploring the data?

  • You can use any tool of your choice
    (Python/R/Tableau/PowerBI/Excel/SAP/SAS)

  • Here is the dataset : Dataset.csv

Solution: To explore Business Analytics

Demo: To explore Business Analytics




Problem Statement:

  • Perform ‘Exploratory Data Analysis’ on the dataset ‘Global Terrorism’.
  • As a security/defense analyst, try to find the hot zones of terrorism.
  • What security issues and insights can you derive through EDA?
  • You can use any tool of your choice (Python/R/Tableau/PowerBI/Excel/SAP/SAS)
  • Here is the dataset : Dataset.csv

Solution: Exploratory Data Analysis - Terrorism

Demo: Exploratory Data Analysis - Terrorism




Problem Statement:

  • Perform ‘Exploratory Data Analysis’ on the dataset ‘Indian Premier League’.
  • As a sports analyst, find out the most successful teams and players, and the factors
    contributing to a team's win or loss.
  • Suggest teams or players a company should endorse for its products.
  • You can use any tool of your choice (Python/R/Tableau/PowerBI/Excel/SAP/SAS)
  • Here is the dataset : Dataset.csv

Solution: Exploratory Data Analysis - Sports

Demo: Exploratory Data Analysis - Sports




Problem Statement:

  • Objective: Create a hybrid model for stock price/performance prediction using numerical analysis of historical stock prices and sentiment analysis of news headlines.
  • Stock to analyze and predict - SENSEX (S&P BSE SENSEX)
  • Use either R or Python, or both for separate analyses, and then combine the findings to create a hybrid model.
  • You are free to select a different stock and news dataset to analyze, as long as the objective of the task does not change.
  • Here is the dataset :

Solution: Stock Market Prediction using Numerical and Textual Analysis

Demo: Stock Market Prediction using Numerical and Textual Analysis




Problem Statement:

  • Create a storyboard showing the spread of Covid-19 cases in your country or any region (Asia, Europe, BRICS, etc.) using Tableau, Power BI or SAP.

  • Identify interesting patterns and possible reasons that helped Covid-19 spread, using basic as well as advanced charts.

  • Here is the dataset :

Solution: Timeline Analysis : Covid-19

Demo: Timeline Analysis : Covid-19



Let's connect! Find me on the web.



If you have any queries or suggestions, feel free to reach out to me.

Show some  ❤️  by starring some of the repositories!

the-sparks-foundation's People

Contributors

kushal997-das, programmer1473


the-sparks-foundation's Issues

Not able to run

I added the following to the first input cell:
import statsmodels
from statsmodels.tsa.stattools import adfuller

Then I ran the code up to:
#Stationarity test
def test_stationarity(timeseries):
...
...
and got stuck with this error message:

Results of dickey fuller test

---------------------------------------------------------------------------
MissingDataError Traceback (most recent call last)
Input In [102], in <cell line: 27>()
25 else:
26 print("Weak evidence against null hypothesis, time series is non-stationary ")
---> 27 test_stationarity(train['Close'])
Input In [102], in test_stationarity(timeseries)
16 plt.show(block = False)
18 print('Results of dickey fuller test')
---> 19 result = adfuller(timeseries, autolag = 'AIC')
20 labels = ['ADF Test Statistic','p-value','#Lags Used','Number of Observations Used']
21 for value,label in zip(result, labels):
File ~/.local/lib/python3.10/site-packages/statsmodels/tsa/stattools.py:321, in adfuller(x, maxlag, regression, autolag, store,regresults)
315 # 1 for level
316 # search for lag length with smallest information criteria
317 # Note: use the same number of observations to have comparable IC
318 # aic and bic: smaller is better
320 if not regresults:
--> 321 icbest, bestlag = _autolag(
322 OLS, xdshort, fullRHS, startlag, maxlag, autolag
323 )
324 else:
325 icbest, bestlag, alres = _autolag(
326 OLS,
327 xdshort,
(...)
332 regresults=regresults,
333 )
File ~/.local/lib/python3.10/site-packages/statsmodels/tsa/stattools.py:129, in _autolag(mod, endog, exog, startlag, maxlag, method, modargs, fitargs, regresults)
127 method = method.lower()
128 for lag in range(startlag, startlag + maxlag + 1):
--> 129 mod_instance = mod(endog, exog[:, :lag], *modargs)
130 results[lag] = mod_instance.fit()
132 if method == "aic":
File ~/.local/lib/python3.10/site-packages/statsmodels/regression/linear_model.py:906, in OLS.__init__(self, endog, exog, missing, hasconst, **kwargs)
903 msg = ("Weights are not supported in OLS and will be ignored"
904 "An exception will be raised in the next version.")
905 warnings.warn(msg, ValueWarning)
--> 906 super(OLS, self).__init__(endog, exog, missing=missing,
907 hasconst=hasconst, **kwargs)
908 if "weights" in self._init_keys:
909 self._init_keys.remove("weights")
File ~/.local/lib/python3.10/site-packages/statsmodels/regression/linear_model.py:733, in WLS.__init__(self, endog, exog, weights, missing, hasconst, **kwargs)
731 else:
732 weights = weights.squeeze()
--> 733 super(WLS, self).__init__(endog, exog, missing=missing,
734 weights=weights, hasconst=hasconst, **kwargs)
735 nobs = self.exog.shape[0]
736 weights = self.weights
File ~/.local/lib/python3.10/site-packages/statsmodels/regression/linear_model.py:190, in RegressionModel.__init__(self, endog, exog, **kwargs)
189 def __init__(self, endog, exog, **kwargs):
--> 190 super(RegressionModel, self).__init__(endog, exog, **kwargs)
191 self._data_attr.extend(['pinv_wexog', 'wendog', 'wexog', 'weights'])
File ~/.local/lib/python3.10/site-packages/statsmodels/base/model.py:267, in LikelihoodModel.__init__(self, endog, exog, **kwargs)
266 def __init__(self, endog, exog=None, **kwargs):
--> 267 super().__init__(endog, exog, **kwargs)
268 self.initialize()
File ~/.local/lib/python3.10/site-packages/statsmodels/base/model.py:92, in Model.__init__(self, endog, exog, **kwargs)
90 missing = kwargs.pop('missing', 'none')
91 hasconst = kwargs.pop('hasconst', None)
---> 92 self.data = self._handle_data(endog, exog, missing, hasconst,
93 **kwargs)
94 self.k_constant = self.data.k_constant
95 self.exog = self.data.exog
File ~/.local/lib/python3.10/site-packages/statsmodels/base/model.py:132, in Model._handle_data(self, endog, exog, missing, hasconst, **kwargs)
131 def _handle_data(self, endog, exog, missing, hasconst, **kwargs):
--> 132 data = handle_data(endog, exog, missing, hasconst, **kwargs)
133 # kwargs arrays could have changed, easier to just attach here
134 for key in kwargs:
File ~/.local/lib/python3.10/site-packages/statsmodels/base/data.py:700, in handle_data(endog, exog, missing, hasconst, **kwargs)
697 exog = np.asarray(exog)
699 klass = handle_data_class_factory(endog, exog)
--> 700 return klass(endog, exog=exog, missing=missing, hasconst=hasconst,
701 **kwargs)
File ~/.local/lib/python3.10/site-packages/statsmodels/base/data.py:88, in ModelData.__init__(self, endog, exog, missing, hasconst, **kwargs)
86 self.const_idx = None
87 self.k_constant = 0
---> 88 self._handle_constant(hasconst)
89 self._check_integrity()
90 self._cache = {}
File ~/.local/lib/python3.10/site-packages/statsmodels/base/data.py:134, in ModelData._handle_constant(self, hasconst)
132 exog_max = np.max(self.exog, axis=0)
133 if not np.isfinite(exog_max).all():
--> 134 raise MissingDataError('exog contains inf or nans')
135 exog_min = np.min(self.exog, axis=0)
136 const_idx = np.where(exog_max == exog_min)[0].squeeze()
MissingDataError: exog contains inf or nans
`train_log = np.log(`

I gave up on formatting this. I don't know how a stack trace like this is supposed to be shared with a developer. It is very frustrating.
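The final line of the traceback, `MissingDataError: exog contains inf or nans`, says that NaN or infinite values reached the regression that `adfuller` runs internally. A plausible cause (an assumption, since the data is not shown) is missing values in `train['Close']`, or a log transform applied to zero or negative prices. The sketch below shows the cleaning idea in plain Python; the helper names `drop_non_finite` and `safe_log` are hypothetical. With pandas, the analogous step would typically be `series.replace([np.inf, -np.inf], np.nan).dropna()` before calling `adfuller`.

```python
import math


def drop_non_finite(values):
    """Return only the entries that are real, finite numbers (no None, NaN, or inf)."""
    return [v for v in values if v is not None and math.isfinite(v)]


def safe_log(values):
    """Log-transform only finite, strictly positive entries, so the result
    contains no NaN or -inf."""
    return [math.log(v) for v in drop_non_finite(values) if v > 0]


# Made-up closing prices with typical problems: a gap, a NaN, a zero, an inf.
close = [101.5, None, 102.0, float("nan"), 0.0, 103.2, float("inf")]

cleaned = drop_non_finite(close)
logged = safe_log(close)
print(len(close), len(cleaned), len(logged))
```

Dropping rows changes the spacing of a time series, so for real data it may be preferable to fill gaps (for example by forward-filling) rather than delete them; which choice is appropriate depends on the dataset.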
