Training materials related to data science, artificial intelligence and bioinformatics. Resources which are not available for free are marked ($). You can find links to organizations which provide physical courses (in physicalcourses.md) and links to data sources (in datasources.md). Distance courses by Swedish universities which require official registration are listed in SwedishUniDistanceCourses.md.
Here is a suggested learning path for getting started in data science. The resources for each step are listed further below:
- Install Anaconda and get familiar with its main functions and jupyter notebooks. Alternatively, if your own computer is limited, get familiar with Google colab.
- Learn Python basics
- Get familiar with the main functions of python tools needed for data processing and scientific computing: regular expressions, numpy, pandas
- Get familiar with the basics of data visualization: matplotlib
- Get a conceptual understanding of the core principles of machine learning and deep learning
- Get a basic understanding of the main machine learning libraries: pytorch, keras
- Familiarize yourself with the concepts and tools of data science reproducibility: git, FAIR principles
- Familiarize yourself with the main concepts and tools in your main area of interest, e.g. image analysis, nlp
- Try solving specific tasks you are interested in, e.g. from your research project or daily life, using machine learning, and just continue learning the things that are required to solve these tasks.
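
As a small taste of the data processing step in the path above, here is a minimal sketch that uses only the standard-library `re` module to pull structured values out of free text. The sample string, the gene names, and the pattern are made up for illustration and are not taken from any of the resources listed here.

```python
import re

# Hypothetical sample text; the gene names and values are invented for illustration.
text = "BRCA1 expression=12.5; TP53 expression=7.25; MYC expression=30"

# Capture a symbol (capital letters and digits) followed by its numeric value.
pattern = re.compile(r"([A-Z0-9]+) expression=(\d+(?:\.\d+)?)")

# findall returns one (symbol, value) tuple per match; convert the values to float.
expression = {gene: float(value) for gene, value in pattern.findall(text)}

# Simple summary statistic over the extracted values.
mean_expression = sum(expression.values()) / len(expression)

print(expression)
print(mean_expression)
```

The same extraction could later be done on a whole column of text with pandas, but starting with plain `re` makes it easier to see what the pattern itself matches.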
Anaconda installation
https://www.datacamp.com/community/tutorials/installing-anaconda-windows
- Setting up tensorflow environment https://www.anaconda.com/blog/tensorflow-in-anaconda
Jupyter notebooks
https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook
Markdown basics
https://guides.github.com/features/mastering-markdown/
Google Colab
https://web.eecs.umich.edu/~justincj/teaching/eecs498/FA2020/colab.html
Software engineering best practices
https://www.pythonlikeyoumeanit.com/Module5_OddsAndEnds/Writing_Good_Code.html
https://scikit-learn.org/stable/developers/contributing.html
Hardware recommendations
https://blog.slavv.com/picking-a-gpu-for-deep-learning-3d4795c273b9
https://timdettmers.com/2019/04/03/which-gpu-for-deep-learning/
How to think like a data scientist
https://runestone.academy/runestone/books/published/httlads/index.html
An Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani
Scientific book collection by Springer, many machine learning books included
Runestone Interactive
https://runestoneinteractive.org/pages/library.html
Data8 The Foundations of Data Science course
CS109A: Introduction to Data Science
https://harvard-iacs.github.io/2018-CS109A/
CS109B: Advanced Topics in Data Science from Harvard
https://harvard-iacs.github.io/2018-CS109B/
https://www.analyticsvidhya.com/blog/
Stack Overflow forum
https://nbis-reproducible-research.readthedocs.io/en/latest/
https://github.com/IFB-ElixirFr/IFB-FAIR-bioinfo-training
https://the-turing-way.netlify.app/welcome.html
https://the-turing-way.netlify.app/reproducible-research/vcs.html#rr-vcs
https://swcarpentry.github.io/git-novice/
Official Python documentation
- Tutorial https://docs.python.org/3/tutorial/
PEP8 python style guide
https://www.python.org/dev/peps/pep-0008/#tabs-or-spaces
Google Python style guide
https://google.github.io/styleguide/pyguide.html
ipython
scipy
numpy
- saving numpy tutorial https://machinelearningmastery.com/how-to-save-a-numpy-array-to-file-for-machine-learning/
pandas
matplotlib
scikit-learn
scikit-image
Python courses by University of Michigan on coursera or edx
https://www.coursera.org/specializations/python
https://www.edx.org/bio/charles-severance
Codecademy Python course
https://www.codecademy.com/learn/learn-python
Analytics Vidhya Python course
https://courses.analyticsvidhya.com/courses/introduction-to-data-science
Google's Python class
https://developers.google.com/edu/python/
Google's Python Crash Course on Coursera
https://www.coursera.org/learn/python-crash-course
Corey Schafer's Python Programming Beginner Tutorials
https://www.youtube.com/playlist?list=PL-osiE80TeTskrapNbzXhwoFUiLCjGgY7
Dataquest Data Analyst path (some free, some $)
https://www.dataquest.io/path/data-analyst/
Python for Everybody: Exploring Data In Python 3 by Charles Severance
Learn Python the Hard Way by Zed Shaw
https://learnpythonthehardway.org/python3/
Programming Python, 4th Edition by Mark Lutz ($)
http://shop.oreilly.com/product/9780596158118.do
Learning Python, 5th Edition by Mark Lutz ($)
http://shop.oreilly.com/product/0636920028154.do
A Whirlwind tour of Python by Jake VanderPlas
for people familiar with programming
https://github.com/jakevdp/WhirlwindTourOfPython
Python Data Science Handbook by Jake VanderPlas
https://github.com/jakevdp/PythonDataScienceHandbook
Scientific Computing with Python 3 by Claus Führer, Jan Erik Solem, Olivier Verdier ($)
https://www.oreilly.com/library/view/scientific-computing-with/9781786463517/
How to think like a computer scientist
https://runestone.academy/runestone/books/published/thinkcspy/index.html
Foundations of Python Programming
https://runestone.academy/runestone/books/published/fopp/index.html
CS109 Homework 1. Exploratory Data Analysis
https://nbviewer.jupyter.org/github/cs109/2014/blob/master/homework/HW1.ipynb
List of Python learning resources
https://forums.fast.ai/t/recommended-python-learning-resources/26888
Python NumPy tutorial
http://cs231n.github.io/python-numpy-tutorial/
Scipy tutorial
https://docs.scipy.org/doc/scipy/reference/tutorial/
Matplotlib tutorial
Pandas tutorials
https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html
http://www.gregreda.com/2013/10/26/intro-to-pandas-data-structures/
https://www.analyticsvidhya.com/blog/2014/09/data-munging-python-using-pandas-baby-steps-python/
https://pandas.pydata.org/pandas-docs/stable/getting_started/tutorials.html
Lecture notes on Python
https://github.com/jrjohansson/scientific-python-lectures/tree/master/
https://github.com/NBISweden/workshop-python/tree/ht18
Peter Norvig's python training examples
https://github.com/norvig/pytudes#pytudes-index-of-jupyter-ipython-notebooks
https://www.codecademy.com/learn/learn-r
https://docs.python.org/3.6/library/re.html
https://docs.python.org/3/howto/regex.html
https://www.youtube.com/watch?v=DRR9fOXkfRE&feature=youtu.be
https://www.analyticsvidhya.com/blog/2015/06/regular-expression-python/
https://developers.google.com/edu/python/regular-expressions
https://www.debuggex.com/cheatsheet/regex/python
CS229 Machine learning course from Stanford
https://www.youtube.com/watch?v=PPLop4L2eGk&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN
http://cs229.stanford.edu/syllabus.html
CS221 Artificial Intelligence course from Stanford
https://stanford-cs221.github.io/autumn2019/
CS230 Deep Learning course from Stanford
CS188 Introduction to Artificial Intelligence from Berkeley
https://inst.eecs.berkeley.edu/~cs188/fa20/
https://inst.eecs.berkeley.edu/~cs188/fa18/
CS294-158-SP20 Deep Unsupervised Learning from Berkeley
https://sites.google.com/view/berkeley-cs294-158-sp20/home
CSC321 Neural Networks and Machine Learning from University of Toronto
https://www.cs.toronto.edu/~lczhang/321/index.html
Machine Learning course from VU University in Amsterdam
https://www.youtube.com/watch?v=-pve3oIvxa8&index=1&list=PLCof9EqayQgupldnTvqNy_BThTcME5r93
Fast.ai courses
Material from Andreas Mueller's courses
MIT Deep Learning and Artificial Intelligence Lectures
Deep RL Bootcamp (2017)
https://sites.google.com/view/deep-rl-bootcamp/lectures
Full Stack Deep Learning Bootcamp
https://course.fullstackdeeplearning.com/
Official Pytorch tutorial
https://pytorch.org/tutorials/beginner/nn_tutorial.html
Machine learning book by Hal Daumé III
Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
https://www.deeplearningbook.org/
Neural Networks and Deep Learning by Michael A. Nielsen
http://neuralnetworksanddeeplearning.com/
Introduction to Deep Learning by Eugene Charniak ($)
https://mitpress.mit.edu/books/introduction-deep-learning
Deep Learning with Python by François Chollet
https://www.manning.com/books/deep-learning-with-python
Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig ($)
Hands-on Machine Learning with Scikit-Learn, Keras and TensorFlow by Aurélien Géron ($)
https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/
Notebooks for book exercises: https://github.com/ageron/handson-ml2
Reinforcement Learning, An Introduction by R. Sutton & A.G. Barto
http://incompleteideas.net/sutton/book/the-book-2nd.html (draft)
Artificial Intelligence: Foundations of Computational Agents (2nd Edition) by David L. Poole and Alan K. Mackworth
https://artint.info/2e/html/ArtInt2e.html
Machine Learning Yearning: Technical Strategy for AI Engineers, In the Era of Deep Learning by Andrew Ng
https://www.deeplearning.ai/machine-learning-yearning/
colah's blog
- Understanding LSTMs http://colah.github.io/posts/2015-08-Understanding-LSTMs/
Andrej Karpathy's blog
- Recipe for training neural networks http://karpathy.github.io/2019/04/25/recipe/
- The Unreasonable Effectiveness of Recurrent Neural Networks http://karpathy.github.io/2015/05/21/rnn-effectiveness/
- Deep Reinforcement Learning: Pong from Pixels http://karpathy.github.io/2016/05/31/rl/
Towards Data Science
https://towardsdatascience.com/
https://towardsdatascience.com/intuitively-understanding-variational-autoencoders-1bfe67eb5daf
Overview of activation functions:
https://medium.com/@snaily16/what-why-and-which-activation-functions-b2bf748c0441
NIPS 2016 Tutorial: Generative Adversarial Networks by Ian Goodfellow
https://arxiv.org/abs/1701.00160
https://www.youtube.com/watch?v=AJVyzd0rqdc
AI Lund tv: videos from seminars and workshops @ Lund University
Pytorch tutorial by Jeremy Howard
https://pytorch.org/tutorials/beginner/nn_tutorial.html
Reports on business and societal impact of AI by McKinsey
https://www.mckinsey.com/featured-insights/artificial-intelligence
Reports on business and societal impact of AI by PWC
https://www.pwc.com/gx/en/issues/data-and-analytics/artificial-intelligence.html
Grad-Cam tutorial
Backpropagation https://www.nature.com/articles/323533a0
A Fast Learning Algorithm for Deep Belief Nets https://doi.org/10.1162/neco.2006.18.7.1527
Greedy layer-wise training of deep networks http://papers.nips.cc/paper/3048-greedy-layer-wise-training-of-deep-networks.pdf
Computer vision course from Stanford
https://www.youtube.com/playlist?list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv
- Image classification: https://cs231n.github.io/classification/
EECS 498-007 / 598-005: Deep Learning for Computer Vision from University of Michigan
https://web.eecs.umich.edu/~justincj/teaching/eecs498/FA2020/
https://www.youtube.com/playlist?list=PL5-TkQAfAZFbzxjBHtzdVCWE0Zbhomg7r
Computer Vision: Algorithms and Applications by Richard Szeliski
Computer Vision - A Modern Approach by David A. Forsyth and Jean Ponce ($)
https://github.com/jbhuang0604/awesome-computer-vision
https://distill.pub/2017/feature-visualization/
https://distill.pub/2018/building-blocks/
https://ai.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
https://github.com/jcjohnson/neural-style
Backpropagation Applied to Handwritten Zip Code Recognition (LeNet) http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf
VGG https://arxiv.org/pdf/1409.1556.pdf
GoogLeNet https://storage.googleapis.com/pub-tools-public-publication-data/pdf/43022.pdf
ResNet https://arxiv.org/pdf/1512.03385.pdf
CS224n NLP course from Stanford
http://web.stanford.edu/class/cs224n/
https://www.youtube.com/playlist?list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z
Fast.ai NLP course
https://github.com/fastai/course-nlp
https://www.youtube.com/playlist?list=PLtmWHNX-gukKocXQOkQjuVxglSDYWsSh9
Natural Language Processing from Coursera
https://www.coursera.org/learn/language-processing
Natural Language Processing from Berkeley
https://people.ischool.berkeley.edu/~dbamman/nlp20.html
Applied Natural Language Processing from Berkeley
https://people.ischool.berkeley.edu/~dbamman/info256.html
Applied Text Mining in Python from Univ. of Michigan/Coursera
https://www.coursera.org/learn/python-text-mining/home/welcome
Spacy course
AllenNLP tutorials
https://allennlp.org/tutorials
Speech and Language Processing by Dan Jurafsky and James H. Martin
https://web.stanford.edu/~jurafsky/slp3/
Coreference chapter: https://web.stanford.edu/~jurafsky/slp3/22.pdf
Natural Language Processing by Jacob Eisenstein
https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg
http://u.cs.biu.ac.il/~yogo/nnlp.pdf
Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan & Hinrich Schütze
https://nlp.stanford.edu/IR-book/html/htmledition/irbook.html
Natural Language Processing with PyTorch by Brian McMahan, Delip Rao ($)
https://www.oreilly.com/library/view/natural-language-processing/9781491978221/
Data and Text Processing for Health and Life Sciences by Francisco M. Couto
http://labs.rd.ciencias.ulisboa.pt/book/
Introduction to Natural Language Processing for Text https://towardsdatascience.com/introduction-to-natural-language-processing-for-text-df845750fb63
- The Illustrated Transformer http://jalammar.github.io/illustrated-transformer/
- The Illustrated BERT, ELMO and co http://jalammar.github.io/illustrated-bert/
Peter Bloem Transformers from Scratch http://peterbloem.nl/blog/transformers
https://smerity.com/articles/articles.html
https://towardsdatascience.com/evaluating-text-output-in-nlp-bleu-at-your-own-risk-e8609665a213
Steps for effective text data cleaning (with case study using Python) https://www.analyticsvidhya.com/blog/2014/11/text-data-cleaning-steps-python/
SciSpacy
https://github.com/allenai/scispacy
Python regular expressions documentation
https://docs.python.org/3/library/re.html
Tutorials about text cleaning
https://www.analyticsvidhya.com/blog/2014/11/text-data-cleaning-steps-python/
http://ieva.rocks/2016/08/07/cleaning-text-for-nlp/
https://chrisalbon.com/python/basics/cleaning_text/
http://rjweiss.github.io/text-iriss2013/
Tutorial about coreference resolution with neuralcoref
Tutorial for spacy
Tutorial for Huggingface Tokenization
Lars Juhl Jensen slideshare
https://www.slideshare.net/larsjuhljensen
LSTM http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.676.4320&rep=rep1&type=pdf
Codecademy SQL course https://www.codecademy.com/learn/learn-sql
Elixir training
https://tess.elixir-europe.org/
NBIS course on single-cell RNASeq
https://nbisweden.github.io/workshop-scRNAseq/
List of statistics resources
https://jvns.ca/blog/2017/04/17/statistics-for-programmers/
Immersive Maths (interactive linear algebra book)
Computational Linear Algebra for Coders by Fast.ai
https://github.com/fastai/numerical-linear-algebra/
Mathematics for Machine Learning
Applied Math and Machine Learning Basics chapter in Deep Learning book
https://www.deeplearningbook.org/contents/part_basics.html
Mathematical Methods for Physics and Engineering by Riley, Hobson, Bence
https://scikit-learn.org/stable/datasets/
EU Ethics Guidelines for Trustworthy AI
https://ec.europa.eu/futurium/en/ai-alliance-consultation/guidelines#Top
Multi-Task Learning in the Wilderness, Andrej Karpathy, Jun 15, 2019, ICML
https://slideslive.com/38917690/multitask-learning-in-the-wilderness
Trustworthy Human-Centric AI, Fredrik Heintz, 2020, Lund University
http://ai.lu.se/tv/trustworthy-human-centric-ai/
A conversation about AI risk and AI ethics in the age of covid-19, Jaan Tallinn and Olle Häggström
https://www.chalmers.se/en/centres/chair/news/Pages/webinar-19May2020.aspx