
data-explorer-web-app's Introduction


Hello World! 👋


I'm Leah and welcome to my GitHub profile! :octocat:


Fun facts:

👩‍🎓 Former Marketing guru turned Data Science wizard with a Business degree in hand.

✍️ Master of Data Science and Innovation (MDSI) graduate from the University of Technology Sydney, armed with supercharged data skills.

🤔 If you don't write your SQL queries in uppercase, I don't trust you.

📈 Madly passionate about Modern Data Stacks, Data Engineering, DataOps, and saving the day with top-notch data governance practices for enterprise data architectures. Let's optimize that data flow!


Languages and Tools

Hadoop, Spark, AWS, Docker, MongoDB, Kubernetes, Bash, Linux, Jenkins, Git, R, Kafka, PostgreSQL, Python, Cassandra


data-explorer-web-app's People

Contributors

caijiaping, cartierz, laurabayonaf, ndleah


Forkers

abuklifa192

data-explorer-web-app's Issues

Docker run failed due to file does not exist

Describe the bug
docker run fails because Streamlit cannot find the file main_leah.py inside the container.

To Reproduce
Steps to reproduce the behavior:

  1. Set up Docker with a Dockerfile and docker-compose.yml
  2. Go to the project directory and open a terminal
  3. Build the Docker image: run docker build -t streamlitapp:latest . (this succeeds)
  4. Run the Docker image: run docker run -p 8501:8501 streamlitapp:latest
  5. See error:

Usage: streamlit run [OPTIONS] TARGET [ARGS]...

Error: Invalid value: File does not exist: main_leah.py

Screenshots
image

Desktop (please complete the following information):

  • Windows 11

My part is still not running accordingly, and is potentially something to do with the instance of the class.


error while running the unit test for `test_data.py`

Describe the bug
I'm working on test_data.py, the unit test file for data.py, to test the methods of my Dataset class. There are 13 tests in total in my unit test file, one for each of the 13 methods in my Dataset class. When I ran the tests, the result showed 2 errors, related to the get_head() and get_tail() methods.

My defined method in data.py looks like this:

def get_head(self, n=5):
    """
    Return a Pandas DataFrame with the top rows of the loaded dataset
    """
    return self.df.iloc[:n]

Here is how I test get_head() in test_data.py; the test for get_tail() follows exactly the same logic:

class GetHead(unittest.TestCase):
    def test_get_head(self):
        # create sample dataframe
        data = [['tom', 10], ['nick', 15], ['juli', 14], ['annie', 24],
                ['julie', 20], ['leah', 18], ['patrick', 27]]
        test_df = pd.DataFrame(data, columns=['Name', 'Age'])
        test_file_name = "car_accident.csv"
        test_dataset = Dataset(test_df, test_file_name)
        self.assertEqual(test_dataset.get_head(), test_df.head())

However, the test returns this error:

Traceback (most recent call last):
  File "test_data.py", line 234, in test_get_head
    self.assertEqual(test_dataset.get_head(2), test_df.head(2).any())
  File "C:\Users\LEAH NGUYEN\AppData\Local\Programs\Python\Python38\lib\unittest\case.py", line 912, in assertEqual
    assertion_func(first, second, msg=msg)
  File "C:\Users\LEAH NGUYEN\AppData\Local\Programs\Python\Python38\lib\unittest\case.py", line 902, in _baseAssertEqual
    if not first == second:
  File "C:\Users\LEAH NGUYEN\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\core\generic.py", line 1537, in __nonzero__
    raise ValueError(
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I tried to look up this issue online but couldn't find a good explanation that fits my test. Any ideas what this error means and how I can fix it, @caijiaping @Laurabayonaf @cartierz?

To Reproduce
Steps to reproduce the behavior:

  1. Go to \src\test
  2. Open the terminal > Run python test_data.py
  3. See error
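
A possible explanation and fix for the error in this issue: assertEqual evaluates first == second, and for two DataFrames that comparison returns an element-wise boolean DataFrame, which Python cannot reduce to a single True or False. One option is to compare the frames with pandas.testing.assert_frame_equal instead. The sketch below assumes a minimal stand-in for the Dataset class, since the real constructor in data.py is not shown here.

```python
import unittest

import pandas as pd
from pandas.testing import assert_frame_equal


class Dataset:
    """Minimal stand-in for the Dataset class in data.py (assumed shape)."""

    def __init__(self, df, name):
        self.df = df
        self.name = name

    def get_head(self, n=5):
        """Return the first n rows of the loaded dataset."""
        return self.df.iloc[:n]


class GetHead(unittest.TestCase):
    def test_get_head(self):
        data = [['tom', 10], ['nick', 15], ['juli', 14], ['annie', 24],
                ['julie', 20], ['leah', 18], ['patrick', 27]]
        test_df = pd.DataFrame(data, columns=['Name', 'Age'])
        test_dataset = Dataset(test_df, "car_accident.csv")
        # assert_frame_equal raises AssertionError on any mismatch,
        # so unittest reports a normal failure instead of a ValueError
        assert_frame_equal(test_dataset.get_head(), test_df.head())
```

An alternative that stays within unittest's own assertions is self.assertTrue(result.equals(expected)), although it gives a less detailed message when the frames differ.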

altair row limit

Describe the bug
Altair raises a max rows error if the dataset has more than 5000 rows.
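
One possible workaround, sketched under the assumption that a random sample is acceptable for the charts: cut the dataframe down to Altair's default limit of 5000 rows before plotting. The helper name limit_rows is hypothetical and not part of the project.

```python
import pandas as pd

MAX_ROWS = 5000  # Altair's default row limit


def limit_rows(df, max_rows=MAX_ROWS, seed=0):
    """Return df unchanged if it fits, else a random sample of max_rows rows."""
    if len(df) <= max_rows:
        return df
    # A fixed seed keeps the sample stable between reruns of the app
    return df.sample(n=max_rows, random_state=seed).sort_index()
```

If the full data must be plotted, Altair also provides alt.data_transformers.disable_max_rows(), at the cost of larger chart specifications sent to the browser.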

Question related to unittest

Hi @caijiaping ,

I just checked your unit test file and have a question. I saw that you defined only 1 class, with several functions inside it to run the unit tests for the class methods, like this:
image

So when you execute the unit test script in the terminal, how many tests does the result report? For example, {1 test ran successfully} or {3 tests ran successfully, 2 tests failed}.
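
For reference, unittest counts each test_ method as one test, regardless of how many classes they are grouped into. A small sketch with a hypothetical test class (the names are illustrative, not taken from the project):

```python
import unittest


class DatasetMethodTests(unittest.TestCase):
    """Hypothetical test class: one class, several test methods."""

    def test_addition(self):
        self.assertEqual(1 + 1, 2)

    def test_lowercase(self):
        self.assertTrue("a".islower())

    def test_length(self):
        self.assertEqual(len([1, 2, 3]), 3)


def run_and_count():
    """Run the class and return (tests run, failures, errors)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DatasetMethodTests)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.testsRun, len(result.failures), len(result.errors)
```

Here run_and_count() reports 3 tests run for the single class, so a class with several test methods should report that many tests in the terminal summary.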

data display issue with streamlit

Describe the bug
Streamlit can't display a dataframe that contains different data types, e.g. a few numeric columns and string columns in the same dataframe.

Having issues running docker.

I am having issues running docker under Windows 10; Docker Engine failed to start...

The error comes up as soon as I open the app.

  • OS build: 19042.1288
  • Edition: Windows 10 Pro
  • Version: 20H2

Does anyone have suggestions on how to proceed?

Datetime data issue

Describe the bug
The Datetime script does not count missing dates.

To Reproduce
Steps to reproduce the behavior:

  1. Open PowerShell and run streamlit run main_leah.py
  2. Select the Datetime column name
  3. See error

Expected behaviour
Since the test data includes 1 missing data cell, the missing date function should count 1.
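
A minimal sketch of how the missing date count could be computed with pandas. The function name count_missing_dates is hypothetical, and treating unparseable values as missing is an assumption rather than the project's documented behaviour.

```python
import pandas as pd


def count_missing_dates(series):
    """Count entries that are missing or cannot be parsed as dates."""
    # errors="coerce" turns unparseable values into NaT instead of raising
    parsed = pd.to_datetime(series, errors="coerce")
    return int(parsed.isna().sum())
```

For a column with one empty cell, such as pd.Series(["2021-01-01", None, "2021-03-05"]), this returns 1.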

Screenshots
image

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Chrome
  • Version: 95.0

Error before select Date data

Describe the bug
An error message is shown before the user selects a date column (in the overall information section).

To Reproduce
Steps to reproduce the behavior:
  1. Go to the project directory terminal and run streamlit run main_leah.py
  2. See the error in the screenshot below

Expected behavior
It should show a message to inform the user in "4. Information on datetime columns"

Screenshots
image

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Chrome

Error display before converting the datetime column under string format

Describe the bug
In the Overall information section, there is a part where columns that hold DateTime data but are stored in string or object format can be converted to DateTime format, like this:
image

However, before any string column has been converted to datetime format, scrolling down to the Datetime Information section shows the following error:
image

The error disappears once the column has been converted to DateTime format above, but it still makes for a poor UI.

To Reproduce
Steps to reproduce the behavior:

  1. Run the program
  2. Scroll down to the Datetime section
  3. See errors
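
One way to avoid showing a raw error in the datetime section, sketched under the assumption that the section should only run when the dataframe already has at least one datetime column. Both function names are hypothetical.

```python
import pandas as pd


def datetime_columns(df):
    """Return the names of columns that already have a datetime dtype."""
    return list(df.select_dtypes(include=["datetime", "datetimetz"]).columns)


def datetime_section_message(df):
    """Return a notice for the UI when there is nothing to analyse, else None."""
    if not datetime_columns(df):
        return "No datetime columns found. Convert a text column to datetime first."
    return None
```

In the Streamlit script, a non-empty message could be passed to st.info(...) and the rest of the datetime section skipped until a column has been converted.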

Improve code readability

Hi team,

After researching other public projects and referring back to our individual Python assignment, I noticed that in the main script of those projects, the authors structure their code as several functions, one for each section (in our assignment there are 4 sections: 1. Overall Info; 2. Numeric; 3. Text; 4. DateTime), plus a main function, def main():, that ties everything together in just a few lines. This can be illustrated as follows:
image

Thus, I think we can improve our code readability by constructing the code following this method. This is also beneficial for us while creating the flowchart as we have grouped each section as a function.
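
A rough sketch of that layout, with placeholder function names and return values standing in for the real section code (all names here are hypothetical):

```python
def overall_info():
    """Section 1: overall information about the dataset."""
    return "1. Overall Info"


def numeric_info():
    """Section 2: information on numeric columns."""
    return "2. Numeric"


def text_info():
    """Section 3: information on text columns."""
    return "3. Text"


def datetime_info():
    """Section 4: information on datetime columns."""
    return "4. DateTime"


def main():
    """Run each section in order and collect what it produced."""
    sections = [overall_info, numeric_info, text_info, datetime_info]
    return [section() for section in sections]


if __name__ == "__main__":
    for title in main():
        print(title)
```

Each function then maps onto one box in the flowchart, and main() shows the order in which they run.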

Let me know what you guys think?

Error when importing the modules from src folder

Describe the bug
I have a problem when importing the modules from src folder to the streamlit_app.py (where we connect all the modules together and run our program).

My first guess is that the cause is that streamlit_app.py is not in the same folder as my module script (data.py), rather than anything to do with the parent folder.
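
If the folder layout is indeed the cause, one common workaround is to add the src folder to sys.path before importing from it. This is only a sketch: the helper name add_src_to_path is hypothetical, and it assumes the modules live in a folder called src directly under the project root.

```python
import sys
from pathlib import Path


def add_src_to_path(project_root):
    """Put <project_root>/src at the front of sys.path if it is not there yet."""
    src_dir = str(Path(project_root).resolve() / "src")
    if src_dir not in sys.path:
        sys.path.insert(0, src_dir)
    return src_dir
```

In streamlit_app.py this could be called as add_src_to_path(Path(__file__).parent) before a line such as from data import Dataset.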

To Reproduce
Steps to reproduce the behavior:

  1. Go to project directory terminal > run streamlit run main_leah.py
  2. See error

Expected behavior
A successful run should display the app interface as stated in the assessment brief.

Screenshots
image

Desktop (please complete the following information):

  • Windows 11
  • Browser: Chrome

Additional context
N/A

datetime module and datetime file conflict

Describe the bug
Import error because our datetime.py has the same name as Python's built-in datetime module.

To Reproduce
Run test_datetime.py alongside datetime.py (the test file uses from datetime import DateColumn).

Expected behavior

Traceback (most recent call last):
  File "src/test/test_datetime.py", line 9, in <module>
    from datetime import DateColumn
ImportError: cannot import name 'DateColumn' from 'datetime' (/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/datetime.py)
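
The path in the traceback shows that the import resolved to the standard-library datetime module rather than the project file. A small sketch for checking which file Python would load for a given module name (module_origin is a hypothetical helper):

```python
import importlib.util


def module_origin(name):
    """Return the file Python would load for `import name`, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Typically prints a path inside the Python installation, not the project
    print(module_origin("datetime"))
```

A common way out is to rename the project file so it no longer clashes, for example to a name like date_column.py (a suggestion only), and update the import in test_datetime.py to match.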

Cannot read the csv file when running the docker

Describe the bug
When I run the app in the terminal with the command streamlit run main_leah.py, the program works fine. However, when I run the program in a Docker container and access it at http://localhost:8501/, it gives an error while loading the csv file: Error: Request failed with status code 400

To Reproduce
Steps to reproduce the behavior:

  1. Run docker-compose up in the project terminal
  2. Go to 'http://localhost:8501/'
  3. Click the 'Upload csv file' button and choose a csv file
  4. See error

Screenshots
image

Error to select the number of rows over total DF rows

Describe the bug
An error occurs when I select a number of rows greater than the total number of rows in the DF.

To Reproduce
Steps to reproduce the behavior:

  1. Open PowerShell and run the command: streamlit run main_leah.py
  2. Use the slider to select rows
  3. See error

Expected behavior
The maximum value of the slider should be the total number of rows.
The user should be prevented from selecting a number greater than the total number of DF rows.
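
A minimal sketch of the expected behaviour: derive the slider bounds from the dataframe length so the maximum can never exceed the total row count. The helper name slider_bounds and the default of 5 rows are assumptions.

```python
def slider_bounds(total_rows, default=5):
    """Return (min_value, max_value, value) for a row-count slider."""
    if total_rows < 1:
        raise ValueError("dataset has no rows")
    # The starting value is capped so it never exceeds the dataset size
    return 1, total_rows, min(default, total_rows)
```

With Streamlit these values could be passed on as st.slider("Rows", *slider_bounds(len(df))), since st.slider takes min_value, max_value and value in that order after the label.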

Screenshots
image

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Chrome
