Michael Galarnyk's Projects
Nearly 3 million rows of auto accidents in the USA over several years. I'm trying to do a barplot race....
Minimal examples of data structures and algorithms in Python
R's data.table package extends data.frame. HOMEPAGE:
Data science repo and blog for Johns Hopkins Coursera courses. Please let me know if you have any questions.
Interview stuff for friends
Homework/Classwork for my DSE 200 Python for Data Analysis Class at UC San Diego (UCSD)
Database Management Systems Data Science Masters Course (DSE 201)
Probability and Statistics Using Python Data Science Masters Course at UCSD (DSE 210)
Repo for my graduate data science machine learning class at UC San Diego (UCSD). This course provides a broad introduction to the practical side of machine learning and data analysis. Topics covered include supervised learning (k-nearest neighbor classifiers, decision trees, boosting, and perceptrons) and unsupervised learning (k-means, PCA, and Gaussian mixture models).
MapReduce, streaming analysis, and external memory algorithms, and their implementation using Hadoop and its ecosystem: HBase, Hive, Pig, and Spark. The class includes assignments that analyze large existing databases.
Define fortify and autoplot functions to allow ggplot2 to handle some popular R packages.
Installations for Data Science. Anaconda, RStudio, Spark, TensorFlow, AWS (Amazon Web Services).
Resources for my LinkedIn Learning Courses
Coursera machine learning specialization coursework (Python-based, University of Washington).
DSE 2015
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
GitHub Repo for MGT-6090 Assignment 8 BHC.
Modin: Speed up your Pandas workflows by changing a single line of code
Python tutorials in both Jupyter Notebook and YouTube format.
An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.
Shingho is a PySpark based statistical library designed for Big Data applications.
This repo holds the data for my tutorials, so that people can follow along without needing a Kaggle account or similar wherever possible.
Legally allowable public portion of the UCSD Extension course: Data Analytics Using Python (CSE-41204)
New repo for my web development certificate through UCSD Extension. I am looking forward to the new courses!