MongoDB-Python-Workshop

Short Description

This tutorial introduces reading, building, and querying data with MongoDB, using Covid-19 research data from the Allen Institute for AI and Covid-19 dashboard data from Johns Hopkins University.

Long Description

MongoDB is a document-oriented database that makes it possible to build and query collections of JSON-like documents.

This tutorial consists of a series of notebooks and exercises. Rather than discussing the MongoDB format in the abstract, participants learn to build and interact with a MongoDB database, with an emphasis on using Python through Jupyter Notebook. Through these exercises, participants will use MongoDB and the Covid-19 datasets from the Allen Institute and Johns Hopkins University to:

  1. Query a sample MongoDB database using filters and aggregations.

  2. Use PyMongo to populate a remote MongoDB database on Atlas with JSON documents downloaded from the CORD-19 research data set.

  3. Query the CORD-19 research data set, create a text search index, and parse results as JSON documents and pandas DataFrames.
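As a rough illustration of the first objective: in PyMongo, a filter is a plain Python dictionary and an aggregation is a list of stage dictionaries. The sketch below is not taken from the workshop notebooks. The field names `country`, `confirmed`, and `date`, and the helper names, are assumptions chosen for illustration.

```python
from datetime import datetime


def build_filter(country, min_confirmed):
    """Filter document: one country with at least `min_confirmed` cases."""
    # Field names are hypothetical; adjust them to the collection's schema.
    return {"country": country, "confirmed": {"$gte": min_confirmed}}


def build_max_per_country_pipeline(since):
    """Aggregation pipeline: highest confirmed count per country since a date."""
    return [
        {"$match": {"date": {"$gte": since}}},
        {"$group": {"_id": "$country", "max_confirmed": {"$max": "$confirmed"}}},
        {"$sort": {"max_confirmed": -1}},
        {"$limit": 5},
    ]


def run_queries(collection):
    """Run both queries against a PyMongo collection supplied by the caller."""
    matches = list(collection.find(build_filter("France", 1000)).limit(10))
    pipeline = build_max_per_country_pipeline(datetime(2020, 3, 1))
    top = list(collection.aggregate(pipeline))
    return matches, top
```

Keeping the filter and pipeline builders separate from the database call makes them easy to inspect in a notebook cell before anything is sent to the server.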

Prerequisites

This tutorial is designed for workshop participants who have some experience working with JSON documents in Python. For background material, please see the Python and JSON workshop material at: https://github.com/geoffswc/python-json-workshop

Software Installation and Setup

To participate in this workshop, you'll need to be able to run Python 3.6+ code. This workshop will use Jupyter Notebook, though you can follow along with a different IDE if you prefer.

Although the focus of this workshop will be on using Python to query a MongoDB database, we will briefly browse a MongoDB collection using the Compass UI. To follow along with that optional section, download the Compass UI at: https://www.mongodb.com/try/download/compass

You will also need to create an Atlas DB account, instantiate a sample database, and upload COVID-19 data to a collection. Instructions for this are in Atlas-Setup.md.
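Once the Atlas cluster exists, connecting from Python comes down to passing a connection string to PyMongo. The following is a minimal sketch rather than the workshop's own setup code. The helper names, the placeholder host `cluster0.example.mongodb.net`, and the credentials are all hypothetical; the real connection string comes from your own Atlas account.

```python
from urllib.parse import quote_plus


def build_atlas_uri(username, password, host, database=""):
    """Build a mongodb+srv URI, percent-escaping the credentials."""
    # Characters such as '@' or '/' in a password must be escaped,
    # otherwise the URI cannot be parsed correctly.
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{host}/{database}?retryWrites=true&w=majority"


def connect(uri):
    """Open a client for the given URI."""
    # Imported here so build_atlas_uri needs only the standard library.
    # Older PyMongo versions also need dnspython for mongodb+srv URIs.
    from pymongo import MongoClient

    return MongoClient(uri)
```

For example, `connect(build_atlas_uri("workshop_user", "p@ss/word", "cluster0.example.mongodb.net"))` would return a client object. Avoid hard-coding real credentials in a notebook that will be shared.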

Workbooks

This tutorial consists of a series of notebooks. Further explanatory text is available in the notebook for each section of the tutorial.

Covid-19-Data-JHU-Overview.ipynb

Connect to and query the Johns Hopkins Covid-19 dataset hosted on MongoDB/Atlas. List available databases and collections. Join (lookup) collections based on common attributes.
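The steps named above (listing databases and collections, then joining two collections) can be sketched roughly as follows. This is an illustration under assumptions, not the notebook's code: the helper names are hypothetical, and the collection and field names passed to the `$lookup` builder are supplied by the caller.

```python
def list_collections(client):
    """Map each database name on the server to its collection names."""
    return {
        name: client[name].list_collection_names()
        for name in client.list_database_names()
    }


def build_lookup_pipeline(from_collection, local_field, foreign_field, as_field):
    """Pipeline with a $lookup stage: a left outer join on a shared attribute."""
    return [
        {
            "$lookup": {
                "from": from_collection,      # collection to join with
                "localField": local_field,    # field in the input documents
                "foreignField": foreign_field,  # field in the joined collection
                "as": as_field,               # output array of matched documents
            }
        },
        {"$limit": 5},  # keep the example output small
    ]
```

A call such as `db["collection_a"].aggregate(build_lookup_pipeline("collection_b", "country", "country", "joined"))` would attach matching `collection_b` documents to each input document under `joined`; the collection and field names here are placeholders.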

Covid-19-Data-JHU-Guide.ipynb

Recreate some of the queries from the Johns Hopkins guide for Python and MongoDB in Jupyter notebook format.

Sample-Atlas-DB.ipynb

Query a MongoDB database hosted on Atlas through Python, including basic querying and regular expressions.
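A regular-expression query in PyMongo is an ordinary filter that uses the `$regex` operator. The sketch below assumes a hypothetical `title` field and hypothetical helper names; it is meant to show the shape of such filters, not the notebook's exact queries.

```python
import re


def title_contains(term, field="title"):
    """Case-insensitive substring filter using the $regex operator."""
    # re.escape keeps characters like '.' from acting as regex wildcards.
    return {field: {"$regex": re.escape(term), "$options": "i"}}


def title_starts_with(prefix, field="title"):
    """Case-sensitive filter anchored at the start of the field value."""
    return {field: {"$regex": "^" + re.escape(prefix)}}
```

Either filter can be passed directly to `collection.find(...)`. Escaping user-supplied text first is a defensive choice; drop it when the input is itself meant to be a pattern.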

Atlas-Mongo-Import-Through-Python.ipynb

Write data to a remote Atlas MongoDB collection through PyMongo. This workbook uses the CORD-19 dataset from the Allen Institute to demonstrate.
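One possible shape for such an import is sketched below: read each JSON file from a local directory, then send the documents to the collection in batches with `insert_many`. The function names, the directory layout (one document per `.json` file), and the batch size are assumptions, not details taken from the workbook.

```python
import json
from pathlib import Path


def read_json_documents(directory):
    """Yield one parsed document per *.json file in `directory`."""
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            yield json.load(handle)


def insert_in_batches(collection, documents, batch_size=500):
    """Insert documents with insert_many, `batch_size` at a time.

    Returns the number of documents sent to the collection.
    """
    batch, inserted = [], 0
    for document in documents:
        batch.append(document)
        if len(batch) == batch_size:
            collection.insert_many(batch)
            inserted += len(batch)
            batch = []
    if batch:  # send whatever is left over
        collection.insert_many(batch)
        inserted += len(batch)
    return inserted
```

Batching avoids holding the whole data set in memory and keeps each request to the remote cluster reasonably small.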

CORD-19-Reseach-Data-Set-Atlas.ipynb

Query the CORD-19 dataset you uploaded to MongoDB. Create and search documents through a text index on a field in the collection.
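Creating a text index and searching it can be sketched as follows. The helper names are hypothetical, and the indexed field is left to the caller because the right choice depends on how the documents were uploaded.

```python
def create_text_index(collection, field):
    """Create a text index on one field and return the index name."""
    return collection.create_index([(field, "text")])


def text_search(collection, terms, limit=10):
    """Run a $text query and return matches ordered by relevance score."""
    cursor = (
        collection.find(
            {"$text": {"$search": terms}},
            {"score": {"$meta": "textScore"}},  # add the relevance score to each result
        )
        .sort([("score", {"$meta": "textScore"})])
        .limit(limit)
    )
    return list(cursor)
```

The returned list of dictionaries can be inspected as JSON-like documents or, if pandas is installed, passed to `pandas.DataFrame(...)` for tabular work.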

Installation and Setup

For the Atlas-related notebooks, you'll need to create an Atlas account. Instructions are in the workbooks.

For the local installation notebooks, you'll need to install MongoDB and populate a document database with JSON files from the CORD-19 dataset. Instructions are in the CORD-19-Reseach-Data-Set.ipynb workbook.

Data and Links

Links to Datasets

Covid-19 Research Dataset from the Allen Institute for AI

https://pages.semanticscholar.org/coronavirus-research

Covid-19 Dashboard dataset from Johns Hopkins University

https://github.com/mongodb-developer/open-data-covid-19

Links to Documentation and Tutorials

MongoDB:

https://www.mongodb.com

MongoDB Reference Manual:

https://docs.mongodb.com/manual/reference/

PyMongo documentation and quickstart:

https://pymongo.readthedocs.io/en/stable/

Johns Hopkins dataset using Python and Mongo

https://developer.mongodb.com/article/johns-hopkins-university-covid-19-data-atlas#covid19-python

Compass (UI for MongoDB)

https://www.mongodb.com/try/download/compass

Contributors

geoffswc
