dataviz-fundamentals's Introduction

Fundamentals of Data Visualization, in Bokeh 📊

⚠️ Note: This repository is a work in progress; "watch" it to keep up with updates!

Project description

This repository hosts Bokeh equivalents for various plots from Fundamentals of Data Visualization by Claus O. Wilke. It provides a collection of interactive data visualizations implemented using the Bokeh library.

The full rendered pages of this repository can be found here

Table of contents (WIP)

  1. Introduction: An overview of the narrative and type of plots to expect.

  2. Visualizing amounts

    • Bar plots: Representing amounts using vertical, horizontal, grouped, and stacked bars.

    • Dot plots and heatmaps: Using dots and colors to represent values.

  3. Visualizing distributions

    • Single distribution histogram and density plots: Showing the distribution of a single variable using histograms or density plots.

    • Multiple distribution histogram and density plot: Comparing multiple distributions using histograms and density plots.

    • Visualizing many distributions at once using boxplots, sina plots and ridgeline plots: Comparing many distributions along a common axis using boxes and whiskers, sina plots, and ridgeline plots.

  4. Visualizing associations

    • Scatter plots and correlograms: Illustrating the relationship between two variables using scatter plots, correlograms and paired data points.

Local setup

To run these notebooks locally, follow these steps:

  1. Clone the repository:

     git clone https://github.com/bokeh/dataviz-fundamentals.git
    
  2. Navigate to the project directory via the terminal or command prompt.

  3. Create a new conda environment and install the required dependencies, replacing <name> with your preferred environment name:

     conda env create -n <name> -f environment.yml
    
  4. Activate the new environment:

     conda activate <name>
    
  5. Open Jupyter Notebook via Anaconda Navigator or from the command line:

     jupyter notebook
    
  6. Open the desired notebook in your web browser and run the cells.

Contributing

Contributions are welcome! If you would like to contribute to this project, please follow the guidelines below:

  • Fork the repository and create your branch.

  • Make your changes and ensure the code follows the project's coding style.

  • Test your changes thoroughly.

  • Run:

      pre-commit install
    

    to install the pre-commit hooks locally.

  • Commit your changes.

  • Submit a pull request with a clear description of your changes.

License & Code of Conduct

This project is licensed under the MIT and BSD 3-Clause licenses. By contributing to this project, you agree to abide by the Bokeh Code of Conduct.

dataviz-fundamentals's People

Contributors

azaya89, bryevdv, pavithraes

dataviz-fundamentals's Issues

Table of contents

Renaming the bar plots file showed that the table of contents (TOC) must be updated with each publication on GH pages so that the links on the Index page point to the corresponding posts.

As a result, this Issue will require ongoing attention throughout the duration of the project.

Tasks:

  • Confirm that the relevant .ipynb and .html files have been added to the repository.
  • Rewrite the TOC, if necessary, to match the file names.
  • Add the .html file as a link to Index.html.

Checklists:

INTRODUCTION POST

  • Is the topic on the TOC?
  • Does it have a functional link in the Index.html file?

BAR PLOTS

  • Is the topic on the TOC?
  • Does it have a functional link in the Index.html file?

DOT PLOTS AND HEATMAP

  • Is the topic on the TOC?
  • Does it have a functional link in the Index.html file?

SINGLE DISTRIBUTION HISTOGRAM AND DENSITY PLOT

  • Is the topic on the TOC?
  • Does it have a functional link in the index.html file?

MULTIPLE DISTRIBUTION HISTOGRAM AND DENSITY PLOT

  • Is the topic on the TOC?
  • Does it have a functional link in the index.html file?

BOXPLOTS AND RIDGELINE PLOTS

  • Is the topic on the TOC?
  • Does it have a functional link in the index.html file?

SCATTER PLOTS AND CORRELOGRAMS

  • Is the topic on the TOC?
  • Does it have a functional link in the index.html file?

GEOSPATIAL DATA

  • Is the topic on the TOC?
  • Does it have a functional link in the index.html file?
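
The "functional link" checks in the checklists above could be automated rather than done by hand. The following is a minimal sketch using only the Python standard library; the function name `broken_local_links` and the assumption that the rendered pages sit in one directory next to the index file are illustrative, not taken from the repository.

```python
from html.parser import HTMLParser
from pathlib import Path


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_local_links(index_html: str, site_dir: Path) -> list[str]:
    """Return local .html links in index_html that have no file in site_dir.

    Hypothetical helper: assumes links are relative to site_dir.
    """
    collector = LinkCollector()
    collector.feed(index_html)
    missing = []
    for href in collector.hrefs:
        # Skip external links and in-page anchors; only local pages are checked.
        if "://" in href or href.startswith(("#", "mailto:")):
            continue
        target = href.split("#", 1)[0]  # drop any fragment before the file lookup
        if target.endswith(".html") and not (site_dir / target).is_file():
            missing.append(href)
    return missing
```

Calling `broken_local_links(Path("Index.html").read_text(), Path("."))` from the directory that holds the rendered pages would list the links that need attention; an empty list means every local link resolves to a file.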

Notes on converting `.rda` to `.tsv` or `.csv` files

I see https://github.com/hnguyentt/dataviz-python/issues/1#issue-1753060960 -- we can credit them if they allow us to use the work. Otherwise, we can write a script to do this ourselves and add some notes in our README.
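
If we end up writing the script ourselves, one possible shape is sketched below. It assumes the third-party `pyreadr` package, whose `read_r` function returns a mapping from R object names to pandas data frames; the function names `output_path` and `convert_rda` and the output naming scheme are made up for illustration. `pyreadr` only handles certain kinds of R objects, so some `.rda` files may need another route.

```python
from pathlib import Path


def output_path(rda_path: Path, object_name: str, out_dir: Path, fmt: str = "csv") -> Path:
    """Build out_dir/<rda stem>_<object name>.<fmt> for one object in an .rda file."""
    if fmt not in ("csv", "tsv"):
        raise ValueError("fmt must be 'csv' or 'tsv'")
    return out_dir / f"{rda_path.stem}_{object_name}.{fmt}"


def convert_rda(rda_path: Path, out_dir: Path, fmt: str = "csv") -> list[Path]:
    """Write every data frame stored in rda_path to its own delimited file."""
    # Third-party dependency (assumption); imported lazily so that
    # output_path stays usable without it.
    import pyreadr

    out_dir.mkdir(parents=True, exist_ok=True)
    sep = "," if fmt == "csv" else "\t"
    written = []
    # read_r returns a dict-like mapping of object name -> pandas DataFrame.
    for name, frame in pyreadr.read_r(str(rda_path)).items():
        target = output_path(rda_path, name, out_dir, fmt)
        frame.to_csv(target, sep=sep, index=False)
        written.append(target)
    return written
```

Writing one file per stored object keeps the script simple when an `.rda` file bundles several data sets.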

Post 1: Bar plots

Topics to cover: Bar plots

Link to WIP blog post: Google doc

Checklist:

Tasks:

  • Notebook created
  • Blog post created
  • Notebook rendered on GitHub pages
  • Blog post published
  • Social media announcements

Checks:

  • Is the notebook reproducible?
  • Does the notebook text and blog post text follow our docs style guide?
  • Do all images have alt-text?
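
The alt-text check can be partly automated against the rendered HTML. This is a rough sketch using only the standard library; the names `AltTextChecker` and `images_missing_alt` are illustrative. It inspects `<img>` tags only and treats an empty `alt` as missing, which is stricter than HTML requires for purely decorative images.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Record the src of every <img> tag that lacks a non-empty alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # A valueless or whitespace-only alt is counted as missing.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "<no src>"))


def images_missing_alt(html: str) -> list[str]:
    """Return the src of each image in html that has no usable alt text."""
    checker = AltTextChecker()
    checker.feed(html)
    return checker.missing
```

Running `images_missing_alt` on the text of a rendered notebook page gives the list of images to fix before the post is published.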

visualizing-geospatial-data

Topics to cover: Cartograms

Link to WIP blog post: incoming

Checklist:

Tasks:

  • Notebook created
  • Blog post created
  • Notebook rendered on GitHub pages
  • Blog post published
  • Social media announcements

Checks:

  • Is the notebook reproducible?
  • Does the notebook text and blog post text follow our docs style guide?
  • Do all images have alt-text?

Introduction files

The Introduction.ipynb and Introduction.html files still contain the old file paths. I have to modify them to reflect the new file paths and to follow the style guide for subsequent posts.

Introduction post.

  • Introduction blog post submitted on Medium
  • README file updated with the current Table of Contents (WIP)
  • Index page created on the GH page with links in the Table of contents
  • Introduction plot made and linked in the GH page
  • Blog post published on Medium

Finalize topics and timeline

Open new issues (for example #3) to track each notebook (and corresponding blog post), which can have subtasks for activities like data preparation.

This is a meta issue to ensure we complete the project management bits.

Update README.md

Add the following to the project readme:

  • Project description
  • Table of contents, with one-line descriptions for each notebook
  • Local setup instructions
  • Contributing/developer guidelines
  • License & CoC note

Pull request creation

PROBLEM DESCRIPTION

I am facing challenges with the way my Pull Requests (PRs) are being created.

ISSUE DETAILS

Typically, when I create a PR in a new branch, I notice that all the commits from other unmerged PRs in different branches are also included. For instance, PR #14 was intended to include only the specific named commit, but it also contained unmerged commits from my main, intro, and bar-plot branches, which I did not intend to include. This makes the PR look disorganised.

WHAT I HAVE TRIED

I have attempted to create a new branch and use git cherry-pick to select only the desired commit to push from that branch. However, this approach has not been successful. I have also tried various suggestions found online, but none of them have resolved the issue. The resulting PRs still include other commits not associated with the specific branch.

WHAT I NEED

I need a straightforward method to create PRs with only a single commit, even in cases where there may be unmerged commits from other branches.
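
One likely cause: a pull request lists every commit on the head branch that is not on the base branch, so a branch started from a local main that already carries unmerged commits brings them along, even when the new commit is cherry-picked on top. A possible fix is to start the branch from the upstream repository's main instead. The sketch below only builds and prints the git commands by default; the remote name `upstream`, the branch name, and the commit hash are placeholders, and `single_commit_branch_commands` is a hypothetical helper.

```python
import subprocess


def single_commit_branch_commands(branch: str, commit: str, base: str = "upstream/main") -> list[list[str]]:
    """Return git commands that create `branch` from `base` with only `commit` on top."""
    remote = base.split("/", 1)[0]
    return [
        ["git", "fetch", remote],               # make sure the base ref is current
        ["git", "switch", "-c", branch, base],  # new branch starts at base, not local main
        ["git", "cherry-pick", commit],         # copy just the one wanted commit
    ]


def run(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it too when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Placeholder values: replace with the real branch name and commit hash.
    run(single_commit_branch_commands("fix-toc-links", "abc1234"))
```

After pushing the new branch, the PR opened from it against the upstream main should show only the cherry-picked commit. If that commit depends on changes from the other unmerged branches, the cherry-pick may conflict and will need resolving by hand.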
