Lab | Pandas Deep Dive

Introduction

By this point in the program, you should have learned how to perform a variety of operations using the Pandas library.

In this lab, you will again work in main.ipynb. Read the instructions and questions in the Jupyter notebook and provide your answers. Make sure to test your answers in Python.

Goals

In this lab, you will examine a data file named apple_store.csv downloadable from this link.

You can also find this data in Ironhack's database:

  • db: appleStore
  • table: data

You may use either source. If you get your data from the database, skip the steps in main.ipynb that import the CSV file.

This data contains information on more than 7,000 Apple Store apps, such as ID, name, size in bytes, price, number of ratings, user rating, and prime genre. You will use Pandas to import the data source and examine the data in order to answer the questions below.
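As a minimal sketch of the import step, the snippet below loads a small in-memory stand-in for apple_store.csv and inspects it. The column names (id, track_name, size_bytes, price, rating_count_tot, user_rating, prime_genre) and all rows are assumptions made up for illustration; check the real file's header before relying on them.

```python
import io

import pandas as pd

# Toy stand-in for apple_store.csv. The column names and rows are
# hypothetical; the real file may use different names.
SAMPLE_CSV = """id,track_name,size_bytes,price,rating_count_tot,user_rating,prime_genre
1,Alpha,1000,0.0,10,4.5,Games
2,Beta,2000,1.99,5,3.0,Games
3,Gamma,1500,0.0,0,0.0,Utilities
"""

# pd.read_csv accepts a file path or any file-like object. In the lab you
# would pass the path to your downloaded CSV instead of the StringIO buffer.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # inferred type of each column
print(df.head())  # first few rows
```

Checking the shape, the dtypes, and the first rows right after loading is a quick way to confirm that the file was parsed the way you expect.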

Challenge Questions

  • How many apps are there in the data source?

  • What is the average rating of all apps?

  • How many apps have an average rating no less than 4?

  • How many genres are there in total for all the apps?

  • What are the top 3 genres with the most apps?

  • Which genre is most likely to contain free apps?

  • If a developer wants to make money by developing and selling Apple Store apps, in which genre should they develop the apps? Please assume all apps cost the same amount of time and expense to develop.
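The questions above map onto a handful of common Pandas operations: counting rows, aggregating a column, filtering with a boolean mask, and grouping by a category. The sketch below shows those patterns on a tiny invented DataFrame; the column names (price, user_rating, prime_genre) are assumptions, and the numbers say nothing about the real data set. The last question is left open on purpose, since it depends on how you choose to define "making money".

```python
import pandas as pd

# Invented toy data; column names are assumed, not taken from the real file.
df = pd.DataFrame(
    {
        "prime_genre": ["Games", "Games", "Games", "Utilities", "Utilities", "Music"],
        "price": [0.0, 1.99, 0.0, 0.0, 0.0, 2.99],
        "user_rating": [4.5, 3.0, 4.0, 0.0, 5.0, 3.5],
    }
)

# Count rows.
n_apps = len(df)

# Aggregate a numeric column.
mean_rating = df["user_rating"].mean()

# A boolean mask sums to the number of True values.
n_high = (df["user_rating"] >= 4).sum()

# Number of distinct categories.
n_genres = df["prime_genre"].nunique()

# value_counts sorts by frequency, so head(3) gives the three largest groups.
top_genres = df["prime_genre"].value_counts().head(3)

# The mean of a boolean column per group is the share of True values per group.
free_share = (
    df.assign(is_free=df["price"].eq(0))
    .groupby("prime_genre")["is_free"]
    .mean()
)

print(n_apps, mean_rating, n_high, n_genres)
print(top_genres)
print(free_share.sort_values(ascending=False))
```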

Deliverables

  • main.ipynb with your responses to each of the questions above.

Submission

Upon completion, add your version of main.ipynb to git. Then commit your changes and push your branch to the remote.

Resources

Pandas Documentation

10 Minutes to Pandas

Google Search

Additional Challenges for the Nerds

If you have completed the apple_store challenge without much difficulty, you will find this tutorial pretty easy. However, it's still a great tutorial to read because it explains much of the thinking behind the code. You can skim through it quickly to check whether there's anything you still don't know.

This is an advanced Pandas tutorial that covers character encoding, the DataFrame apply method, Python lambda expressions, functional programming in Python (which you'll learn later this week), data cleaning (also later this week), and plotting with matplotlib (which you'll learn in Module 2). There is a lot of new information, but if you manage to complete this tutorial, you'll be far ahead of your classmates.
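As a small preview of two of those topics, here is a sketch of the apply method combined with lambda expressions. The DataFrame and its column names are invented for illustration and are not taken from the tutorial.

```python
import pandas as pd

# Invented toy data for illustration only.
df = pd.DataFrame(
    {
        "track_name": ["Alpha", "Beta"],
        "size_bytes": [1_048_576, 5_242_880],
        "price": [0.0, 1.99],
    }
)

# Series.apply calls the lambda once per value in the column.
df["size_mb"] = df["size_bytes"].apply(lambda b: b / 1024**2)

# DataFrame.apply with axis=1 calls the lambda once per row, which lets the
# function read several columns at the same time.
df["label"] = df.apply(
    lambda row: f"{row['track_name']} ({'free' if row['price'] == 0 else 'paid'})",
    axis=1,
)

print(df)
```

For simple arithmetic such as the unit conversion, the vectorized form `df["size_bytes"] / 1024**2` is usually faster; apply is most useful when the logic does not fit a built-in column operation.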

The most challenging part of this course is Module 3. Most students should be able to complete Modules 1 and 2 with moderate effort. What will make you truly stand out is how deep you can dive in Module 3, which depends on how much you accomplish in Modules 1 and 2. Therefore, if you can accomplish more (in both depth and breadth) in the first two modules, we certainly encourage you to.
