
spring2023tidyverse's Introduction

SPRING2023TIDYVERSE

Spring 2023 Tidyverse create and extend assignments

Farhana Akther

Initial Description and Links:

In this assignment we will get to practice collaborating around a code project with GitHub. We will be practicing our knowledge of TidyVerse functions by creating vignette examples of the packages. I am using a birth dataset from fivethirtyeight.com. This dataset contains U.S. births data for 1994-2003, which is provided by the Centers for Disease Control and Prevention’s (CDC’s) National Center for Health Statistics (NCHS).

  • Github
  • Rpubs
  • fivethirtyeight
    TIDYVERSE EXTEND:

    In the extension part of this assignment I chose Alic's work and used the dplyr package from the TidyVerse to demonstrate its capabilities. I used dplyr to manipulate the dataset with the filter() and summarize() functions, together with sum() and mean(), combined with group_by(), which allowed us to perform our operations “by group”.
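The grouped workflow described above can be sketched roughly as follows. This is a minimal illustration, not the author's actual code: the `toy_births` data frame and its column names (`year`, `month`, `births`) are hypothetical stand-ins for the real dataset.

```r
library(dplyr)

# Hypothetical toy data; the real dataset's column names may differ.
toy_births <- data.frame(
  year   = c(1994, 1994, 1995, 1995),
  month  = c(1, 2, 1, 2),
  births = c(100, 120, 110, 130)
)

births_by_year <- toy_births %>%
  filter(year >= 1994) %>%          # keep only the rows of interest
  group_by(year) %>%                # later steps run once per year
  summarize(
    total_births = sum(births),     # total births within each year
    mean_births  = mean(births)     # average monthly births within each year
  )

print(births_by_year)
```

Because of group_by(), summarize() returns one row per year rather than a single row for the whole table.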

    Waheeb Algabri

    Initial Description and Link:

    For this assignment, we'll be practicing our knowledge of Tidyverse functions by creating vignette examples of the packages that make up Tidyverse. In this project, my goal is to create a programming example or “vignette” that showcases the capabilities of a TidyVerse package, along with a dataset from either fivethirtyeight.com or Kaggle. The aim of this example is to demonstrate how to effectively use the selected TidyVerse package to manipulate, analyze, and visualize the selected dataset.

  • Github
  • Rpubs
    Taha A

    Initial Description and Link:

    For this assignment, we'll be practicing our knowledge of Tidyverse functions by creating vignette examples of the packages that make up Tidyverse. In my case, I wanted to attempt going over the forcats package which focuses on manipulating factor elements in a dataframe, as I have no experience with using it at this point.

  • Github
  • Rpubs

    For my extend I have extended Kory Martin's dplyr create assignment. This is located here.

    Alice D

    Initial Description and Link:

    I've chosen ggplot2 as my tidyverse package to showcase and worked with a dataset from Kaggle showing the number of internet users for various countries between the years 1980 and 2020.

  • Github
  • Rpubs
  • Kaggle

    For my extend, I've chosen to build upon Farhana's implementation of dplyr. This is located here.

    Susanna W

    For this create assignment, I used the following packages and functions to analyze a college majors dataset from FiveThirtyEight.

    Package   Functions
    readr     read_csv()
    dplyr     glimpse(), group_by(), summarise(), mutate()
    ggplot2   ggplot(), geom_bar(), scale_x_continuous(), scale_y_continuous(), labs(), xlab(), ylab(), ggtitle(), theme(), coord_flip()

    Links:

  • Github
  • Rpubs
  • Kaggle

    John Cruz

    Tidyverse SELECT

    I worked with the lubridate package from the Tidyverse and created examples exploring NYC Filming Permits data.

  • Github
  • RPubs
  • NYC Open Data
    Extension by Jacob Silver

    Tidyverse EXTEND

    To extend an example, I built on Keith's fuzzyjoin vignette to work with MTA subway locations and NYC public hospitals.

  • Github
  • RPubs
  • MTA Data
  • Hospital Data

    Gregg Maloy

    TIDYVERSE CREATE

    For this assignment, dplyr was used to conduct a superficial analysis of the 'Music Dataset : 1950 to 2019', which provides a list of songs 'from 1950 to 2019 describing music metadata as sadness, danceability, loudness, acousticness, etc.' More specifically, dplyr was used to analyze aspects of the 'sadness' variable to demonstrate the main functions of the dplyr package. The final product is a playlist covering 1950-2019 that contains the single 'saddest' song from each year.

  • Github
  • Rpubs
  • Kaggle
    TIDYVERSE EXTEND

    The overarching purpose of this assignment was to use GitHub as a collaborative coding tool and to explore its push, pull, clone, and fork capabilities. In this assignment a GitHub repository was cloned, another student's vignette .rmd file was modified, and the modified .rmd file was then submitted back to the original GitHub repository to demonstrate GitHub's collaboration capabilities.

    The vignette which was modified was created by Jlok17. The vignette was modified to improve ggplot readability. More specifically, the plot was reordered and observation value labels were added to a scatter plot.

  • Github
  • Rpubs

    Glen Davis: Create:

    A vignette of example use cases for the purrr library within the tidyverse.

    Daniel Craig

    Readr Vignette

    Yahoo Finance Kaggle Dataset:

  • Kaggle

    Github:

  • Github

    Rpubs:

  • RPubs

    Daniel Craig

    Purrr Extension

    Github:

  • https://github.com/d-ev-craig/DATA607/blob/main/TIDYVERSE%20Create/purrr%20Vignette%20Extension/purrr_Vignette_ext_dcraig.Rmd

    Glen Davis: Extend:

    gdd - extending Mo's tidyverse create submission - starting lines for changes/comments:

  • line 29: combined your string replacements and class coercion into one line
  • line 35: reordered/simplified your group_by/summarize workflow
  • line 66: adjusted your bar plot so that the x-/y-values are in more standard positions (i.e. the x is a category, and the y is numeric), your bars are sorted, and then the coords are flipped so you still achieve what you wanted visually: a horizontal bar plot. This is better than setting the y-value as a category and the x-value as numeric and not doing the coord flip, as it's easy to get confused when you do it that way. Also updated your Amounts since you wanted them to represent millions of dollars, not dollars.
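The bar plot adjustment described for line 66 might look something like the sketch below. It is not the code from the actual vignette: the `donations` data frame, its column names, and the dollar figures are hypothetical, and dividing by 1e6 is only an assumed way of expressing amounts in millions.

```r
library(ggplot2)

# Hypothetical toy data; names and values are illustrative only.
donations <- data.frame(
  owner      = c("Owner A", "Owner B", "Owner C"),
  amount_usd = c(2500000, 500000, 1200000)
)

# Assumed conversion from dollars to millions of dollars.
donations$amount_millions <- donations$amount_usd / 1e6

# Category on x, numeric on y, bars sorted by value, then flipped
# so the final plot is a horizontal bar chart.
p <- ggplot(donations, aes(x = reorder(owner, amount_millions), y = amount_millions)) +
  geom_col() +
  coord_flip() +
  labs(x = "Owner", y = "Amount (millions of USD)")

print(p)
```

Keeping the category on x and the number on y means the aesthetics read the same way as in a vertical bar chart; only coord_flip() changes the orientation.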

    Eddie Xu

    Initial Description and Link:

    For this assignment, I decided to use ggplot2 and an associated map package to build visualizations for data analysis.

  • Github
  • Rpubs
  • Rpubs for Extend

    Gabriel Castellanos

    Initial Description:

    The purpose of this markdown is to provide an introduction to the following 3 packages: forcats, dplyr, and ggplot2. This vignette shows how using these 3 packages from the larger tidyverse can help the user enhance data visuals (using ggplot2).

    Kayleah Griffen

    The objective of this assignment was twofold: (1) to practice collaborating around a code project with GitHub and (2) to use a capability of tidyverse and demonstrate it with a vignette. The code was submitted with a pull request to the GitHub repository https://github.com/acatlin/SPRING2023TIDYVERSE.

    The dataset I chose to work with is data that I obtained from working with the Franklin Community Center. The Franklin Community Center is a nonprofit organization that aims to help families and individuals in Saratoga County. They have been in operation for 40 years and their Food Pantry has been operational since 2018. In 2019, the Food Pantry began using the Oasis database to manage their cases. Each family or individual is assigned a case number, and every time a person from the case comes in to receive a service it is documented. I worked with the Oasis team to understand how to extract data from their database. With the data I extracted, my goal is to create visualizations showing which parts of NY the food bank services are going to.

    The tidyverse capabilities that I wanted to demonstrate using the dataset are extensions of ggplot2. The Simple Features for R package, or sf, can be used in conjunction with ggplot2 to create maps. Additionally, the treemapify package can be used with ggplot2 to make treemaps.

    Kayleah Griffen Tidyverse Extend

    For the tidyverse extend, I chose to work with Taha's code and to demonstrate more of the functionality of the forcats package.

    Umer Farooq

    Create

    This vignette looks at the tidyverse package ggplot2. The purpose of this vignette is to explain how the basics of ggplot2 work and how we can make effective graphs. A healthcare dataset was picked at random from Kaggle to plot with ggplot2.

  • Github
  • Rpubs
    Extend

    In the tidyverse extend assignment I have extended Alex Khaykin's vignette. Below is the GitHub link for that extension:

  • Github

    Alex K

    Initial Description and Link:

    In this assignment we will get to practice collaborating around a code project with GitHub. We will create an example using one or more TidyVerse packages and demonstrate how to use their capabilities. I will use a birth dataset from 'fivethirtyeight.com'.

  • Github
  • Rpubs

    Miguel Gomez

    TidyVerse Create

  • Github
  • Rpubs
  • fivethirtyeight
    TidyVerse Extend

  • Github
  • Rpubs
    Mohamed Hassan

    Initial Description and Link:

    The dataset I used was obtained from Kaggle. It contains the political donations given by American sports team owners to political campaigns and Political Action Committee organizations. Using dplyr, stringr, and ggplot2 from tidyverse, I explored various questions about the dataset.

  • Github
  • Rpubs
  • Kaggle

    Kory Martin

    Tidyverse Create:

    Initial Description and Link:

    For this assignment, I chose the dplyr library in Tidyverse to show how to work with a dataframe built from the Netflix TV Shows and Movies dataset, which was pulled from Kaggle.

  • GitHub
  • Rpubs

    Tidyverse Extend:

    For the extend portion of this assignment, I looked at the code originally created by classmate Coco Donovan, here.

  • GitHub
  • Rpubs

    Shoshana Farber

    Tidyverse CREATE

    This vignette demonstrates some of the capabilities of the tidyr package from the tidyverse suite. It also utilizes dplyr and ggplot2 functions.

    The data set used was from FiveThirtyEight.com and it focused on Elo ratings and other metrics for NBA basketball teams.

    Links:

    Tidyverse EXTEND

    I chose to extend John Cruz's vignette on the lubridate package by focusing on the as_date() function.

    Links:

    Ross Boehme

    Ross Create

  • Github
  • RPubs
  • FiveThirtyEight Article
    Ross Extend

    Extended Alex Khaykin's analysis of Congress members' ages.

  • Github
  • RPubs

    Jian Quan Chen

    Create

    Initial Description and Link:

    For this assignment, I will be creating a programming sample vignette to demonstrate the use of the tidyr package in the tidyverse. I will be working with the “Video Game Sales” (https://www.kaggle.com/datasets/gregorut/videogamesales) dataset from Kaggle. The dataset was generated from a scrape of vgchartz.com and contains the sales of video games that sold more than 100,000 copies from 1980 to 2020.

  • Github
  • Rpubs
    Extend

    This is my extension to Alex Khaykin's vignette on the ggplot2 package in the tidyverse. His "Create" assignment looked at key plots in ggplot2 using the 'congress_age' dataset from fivethirtyeight. So far, he has demonstrated how to create a bar plot, boxplot, violin plot, and a scatterplot. I will expand on this by creating a density plot and histogram as well as showing useful components in the ggplot2 package to improve data visualization.

    Jacob Silver

    This vignette demonstrates how to use the str_replace_all function within the tidyverse's stringr package in the context of tabular data. The data is a compilation of 2,498 articles about data science, pulled from the data science site Kaggle. See link here:
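As a rough illustration of str_replace_all() applied to a column of tabular data, consider the sketch below. The `articles` data frame, its `title` column, and the replacement rules are all hypothetical and are not taken from the vignette or the Kaggle data.

```r
library(stringr)

# Hypothetical toy data standing in for the article table.
articles <- data.frame(
  title = c("Intro to  Data-Science", "Data-Science in R"),
  stringsAsFactors = FALSE
)

# A named character vector applies several pattern = replacement pairs
# in order: hyphens become spaces, then runs of whitespace collapse to one space.
articles$title_clean <- str_replace_all(
  articles$title,
  c("-" = " ", "\\s+" = " ")
)

print(articles)
```

Unlike str_replace(), which changes only the first match in each string, str_replace_all() replaces every match.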

    Joe Garcia

    Initial Description and Links:

    For this assignment, we explore TidyVerse and how to use some of its features. In my case, I use ggplot2 to explore recent college graduates and how unemployment varies by field of study.

  • Github
  • RPubs
  • FiveThirtyEight
  • TidyVerse Extend to Susanna's Github
  • Kaggle

    Keith Colella

    This vignette will introduce the fuzzyjoin package, which enables joining of two datasets based on imperfect matches. This package is very helpful for combining data without unique keys.

  • Github
  • RPubs
    Joshua Lok

    Initial Description and Links:

    In this assignment we will get to practice collaborating around a code project with GitHub. We will create an example using one or more TidyVerse packages and demonstrate how to use their capabilities.

    We will use the Bob Ross dataset from "Fivethirtyeight.com". This dataset records the different elements that appear in Bob Ross's paintings. Here we will use the Tidyverse to show general trends and analyze these elements.

  • Github
  • Rpubs
  • fivethirtyeight

    Coco Donovan

    Using a dataset containing NCAA Women's Basketball rosters for Division I, I performed a basic analysis of the rosters using readr, dplyr and ggplot2.

    data = 'https://raw.githubusercontent.com/Sports-Roster-Data/womens-college-basketball/main/wbb_rosters_2022_23.csv'
