
tirthajyoti / machine-learning-with-python


Practice and tutorial-style notebooks covering a wide variety of machine learning techniques

Home Page: https://machine-learning-with-python.readthedocs.io/en/latest/

License: BSD 2-Clause "Simplified" License

Jupyter Notebook 99.78% Python 0.20% HTML 0.01% CSS 0.01%
artificial-intelligence classification clustering data-science decision-trees deep-learning dimensionality-reduction flask k-nearest-neighbours machine-learning matplotlib naive-bayes neural-network numpy pandas pytest random-forest regression scikit-learn statistics

machine-learning-with-python's Introduction

Welcome 👋

Hello! This is Tirtha. I am an explorer.

Work

I am working as VP, AI/ML, at Rhombus Power Inc., where I am building exciting and critically important solutions with AI, Data, and Math.

Before this, I was a Data Science and Solutions Engineering Manager at Adapdix Corp, putting the power of AI/ML on the Edge for Industry 4.0 and next-generation Smart Factory.

Even before that, I was a Sr. Principal Engineer developing power semiconductor technologies and applying AI/ML for semiconductor product/tech development at ON Semiconductor, also known as onsemi.

At the core of my work, I translate customer business problems into data-driven problems and help build solutions.

Currently...

  • 🔭 I'm currently working on: lectures/workshops, courses, and spreading knowledge on machine learning/statistical modeling. In particular, I am serving as the Track Chair of the "AI Optimization" track for the ValleyML AI Expo 2021. Also, I am developing course content for the ValleyML Fellowship program.

  • 🌱 I'm currently learning: ML flow management tools, Ray Serve and distributed computing, and how AI/ML applies to the various aspects of the Industrial IoT sector.

  • 👯 I'm looking to collaborate on: Data science/ML books. I will probably use the Jupyter Book and Leanpub platforms.

Books, lectures, articles

I regularly publish highly cited articles on data science and machine learning topics on leading platforms such as Towards Data Science, KDnuggets, and Analytics Vidhya.

I also teach IEEE/ACM workshops on data science/machine learning.

My first data science book, Data Wrangling with Python, was published in February 2019. In the future, I wish to self-publish a second book on hands-on mathematics/statistics for data scientists.

Skills

Open-source


My open-source projects span the following topics:

  • general data analytics,
  • machine learning,
  • deep learning,
  • computer vision and image processing,
  • math and statistics,
  • synthetic data generation, etc.

I have published multiple Python packages related to data analytics and statistical modeling. See this page for my projects.


Contribution to the technical community

I am currently on the organizing committee of ValleyML AI Expo 2021.

I served on the Technical Content Committee for the Open Data Science Conference (ODSC) West, 2020.

In 2015, I was elevated to the grade of Senior Member of IEEE for my contributions towards power electronics. I have authored/co-authored more than 25 peer-reviewed Transaction and Conference papers, 2 monographs/book chapters, and 4 U.S. Patents. Here is my Google Scholar Page.

I also serve on the technical program committee as Track/Topic chair for numerous IEEE conferences. I am the co-chair of the Semiconductor Committee of the Power Sources Manufacturers Association (PSMA).

machine-learning-with-python's People

Contributors

da115115, dependabot[bot], tirthajyoti


machine-learning-with-python's Issues

Using scipy's genetic algorithm for initial parameter estimation in gradient descent

I see you are writing Python code for optimization on GitHub. A general problem for gradient descent and other non-linear algorithms, particularly for more complex equations, is the choice of initial parameters to start the "descent" in error space. Without good starting parameters, the algorithm can stop in a local error minimum. For this reason, scipy includes a genetic algorithm that can be used for initial parameter estimation before gradient descent. The function is named scipy.optimize.differential_evolution.

I have used scipy's Differential Evolution genetic algorithm to determine initial parameters for fitting a double Lorentzian peak equation to Raman spectroscopy of carbon nanotubes and found that the results were excellent. The GitHub project, with a test spectroscopy data file, is:

https://github.com/zunzun/RamanSpectroscopyFit

If you have any questions, please let me know. My background is in nuclear engineering and industrial radiation physics, and I love Python, so I will be glad to help.
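The two-stage approach described in this issue can be illustrated with a minimal sketch. It assumes a single Lorentzian peak (a simplified stand-in for the double-peak model mentioned above), synthetic data, and made-up parameter bounds; the names `lorentzian`, `sum_squared_error`, and `bounds` are hypothetical and do not come from the linked project.

```python
import numpy as np
from scipy.optimize import curve_fit, differential_evolution


def lorentzian(x, amplitude, center, width):
    """Single Lorentzian peak (simplified stand-in for a double-peak model)."""
    return amplitude * width**2 / ((x - center) ** 2 + width**2)


# Synthetic, noisy data; the "true" parameters are invented for illustration.
rng = np.random.default_rng(0)
x_data = np.linspace(0.0, 100.0, 200)
true_params = (5.0, 60.0, 4.0)
y_data = lorentzian(x_data, *true_params) + rng.normal(0.0, 0.05, x_data.size)


def sum_squared_error(params):
    """Objective minimized by the global search."""
    return np.sum((y_data - lorentzian(x_data, *params)) ** 2)


# Step 1: global search within assumed bounds to obtain starting values.
bounds = [(0.1, 10.0), (0.0, 100.0), (0.5, 20.0)]
search = differential_evolution(sum_squared_error, bounds)
initial_guess = search.x

# Step 2: local least-squares refinement starting from the global estimate.
fitted_params, _ = curve_fit(lorentzian, x_data, y_data, p0=initial_guess)
print("initial guess:", initial_guess)
print("fitted params:", fitted_params)
```

The bounds only need to bracket plausible values; the global search does the work of finding a starting point that the local fit can refine.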

df1.csv? df2.csv?

May I know where I can download the df1.csv/df2.csv in the Pandas Operations notebook? Thanks

Wrong interpretation of the Shapiro-Wilk test

In the Regression_diagnostics notebook, you present the Shapiro-Wilk test.

The Shapiro-Wilk test's null hypothesis is that the data come from a Gaussian distribution. Therefore, the lower the p-value, the stronger the evidence for rejecting the Gaussian distribution. The notebook says the opposite.
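The reading of the p-value described in this issue can be checked with a short sketch using scipy.stats.shapiro. The sample sizes, the seed, and the significance level `alpha = 0.05` are assumptions chosen for illustration, not values taken from the notebook.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
samples = {
    "gaussian": rng.normal(loc=0.0, scale=1.0, size=500),
    "exponential": rng.exponential(scale=1.0, size=500),
}

alpha = 0.05  # assumed, conventional significance level

results = {}
for name, sample in samples.items():
    statistic, p_value = stats.shapiro(sample)
    # Null hypothesis: the sample was drawn from a normal distribution.
    # A small p-value is evidence AGAINST normality.
    reject = p_value < alpha
    results[name] = (statistic, p_value, reject)
    verdict = "reject normality" if reject else "cannot reject normality"
    print(f"{name}: W={statistic:.3f}, p={p_value:.3g} -> {verdict}")
```

A strongly skewed sample such as the exponential one yields a very small p-value, which is the case where normality is rejected.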

Add indications on how to run Jupyter notebooks with Docker in a few minutes

The https://github.com/machine-learning-helpers/docker-python-jupyter project builds a Docker image so that the (your) Jupyter notebooks can be run out-of-the-box on almost any platform in a few minutes.

It gives something like:

  • Initialization of the Git repository for the Jupyter notebooks:
$ mkdir -p ~/dev/ml
$ cd ~/dev/ml
$ git clone https://github.com/tirthajyoti/PythonMachineLearning.git
  • Initialization of the Docker image to run those Jupyter notebooks:
$ docker pull artificialintelligence/python-jupyter
  • Usage:
$ cd ~/dev/ml/PythonMachineLearning
$ docker run -d -p 9000:8888 -v ${PWD}:/notebook -v ${PWD}:/data artificialintelligence/python-jupyter

And then you can open http://localhost:9000 in your browser.

Any modification to the notebooks may be committed to the Git repository (if you are registered as a contributor), and/or submitted as a pull request.

  • Shut down the Docker container:
$ docker ps
CONTAINER ID        IMAGE                                   COMMAND                  CREATED             STATUS              PORTS                    NAMES
431b12a93ccf        artificialintelligence/python-jupyter   "/bin/sh -c 'jupyt..."   4 minutes ago       Up 4 minutes        0.0.0.0:9000->8888/tcp   friendly_euclid
$ docker kill 431b12a93ccf 

So, all the above could be added to your README.md file.

Question about How fast are NumPy ops.ipynb

Hey just wondering, for the How fast are NumPy ops.ipynb

When measuring the time taken to compute log10 of all the elements in the NumPy array a1, shouldn't you also include the creation of the initial NumPy array?

Line 50 is this:

t1=time.time()
a2=np.log10(a1)
t2 = time.time()
print("With direct NumPy log10 method it took {} seconds".format(t2-t1))
speed.append(t2-t1)

But wouldn't it be fairer to write it like this:

t1=time.time()
a1 = np.array(l1)
a2=np.log10(a1)
t2 = time.time()
print("With direct NumPy log10 method it took {} seconds".format(t2-t1))
speed.append(t2-t1)

I ask because the conversion is an additional step not present in the other methods. In your code, the added line (a1 = np.array(l1)) is line 40.
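The difference between the two measurements discussed here can be sketched as follows. The input list `l1` is a hypothetical stand-in for the notebook's data, and `time.perf_counter` is swapped in for `time.time` because it is better suited to short intervals.

```python
import time

import numpy as np

# Hypothetical input list; the notebook's actual data may differ.
l1 = [float(i) for i in range(1, 200_001)]

# Variant A: time only the vectorized log10 on an existing array.
a1 = np.array(l1)
start = time.perf_counter()
a2 = np.log10(a1)
t_log_only = time.perf_counter() - start

# Variant B: include the list-to-array conversion in the timed region.
start = time.perf_counter()
a2_with_conversion = np.log10(np.array(l1))
t_with_conversion = time.perf_counter() - start

print(f"log10 only:         {t_log_only:.4f} s")
print(f"conversion + log10: {t_with_conversion:.4f} s")
```

Which variant is the fairer comparison depends on whether the data would already live in a NumPy array in practice; reporting both makes the conversion cost visible.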

Statistically significant function in regression model

Hi,

I'm wondering what the yes_no function does in the following notebook:
https://github.com/tirthajyoti/Machine-Learning-with-Python/blob/master/Regression/Regression_Diagnostics.ipynb

def yes_no(b):
    if b:
        return 'Yes'
    else:
        return 'No'

Is it meant to decide whether a parameter is statistically significant for the model?
What does b refer to, and what is the threshold for deciding that a parameter is not statistically significant?

I usually look at the p-values in the statsmodels OLS table, and when they fall below 0.05, they are significant. In this notebook something else seems to be happening, and I'm wondering if you could elaborate a bit on it: What is b? How is it calculated? What is the threshold for b? How do I change the threshold from 0.01 to 0.05? When the p-value in the OLS table is above 0.05 but the yes_no function decides it's significant, what should I do (leave the parameter out or not)?

Kind regards,
Matthias
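One plausible reading of the question above is that `b` is simply a boolean produced by comparing a p-value with a significance level, and `yes_no` only turns that boolean into a label. The sketch below illustrates that reading; it is an assumption, not the notebook's confirmed logic, and the p-values and the helper `significance_table` are hypothetical.

```python
def yes_no(b):
    """Map a boolean to a human-readable label."""
    return "Yes" if b else "No"


def significance_table(p_values, alpha):
    """Label each parameter by whether its p-value falls below alpha."""
    return {name: yes_no(p < alpha) for name, p in p_values.items()}


# Hypothetical p-values, e.g. as read from a statsmodels OLS summary table.
p_values = {"x1": 0.003, "x2": 0.03, "x3": 0.40}

strict = significance_table(p_values, alpha=0.01)
lenient = significance_table(p_values, alpha=0.05)
print("alpha=0.01:", strict)
print("alpha=0.05:", lenient)
```

Under this reading, changing the threshold only means changing the `alpha` passed to the comparison; `yes_no` itself carries no statistical logic.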
