dcu-ca683-workspace's People

Contributors

araic, mitalkasundra

dcu-ca683-workspace's Issues

Discuss findings for Excel

In ~100 words let the team know about what you found when you researched using the Iris dataset in Excel

Research SAS/JMP

Install and try out the following tool:
SAS/JMP
and open the Iris dataset in this tool to see what things you can do with it.
Search for what the existing literature (Google Scholar or blogs) says about using the Iris dataset in this tool.

Research python

Install and try out the following tool:
Python
and open the Iris dataset in this tool to see what things you can do with it.
Search for what the existing literature (Google Scholar or blogs) says about using the Iris dataset in this tool.

Discuss findings for SPSS

In ~100 words let the team know about what you found when you researched using the Iris dataset in SPSS

Implement Gaussian Naive Bayes

Use Gaussian Naive Bayes as a classifier to perform the following:

Implement a program that reads from the command line 5 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Petal length of a sample
  3. Petal width of a sample
  4. Sepal width of a sample
  5. Sepal length of a sample

The program should output 1 text string to the command line.

  1. the name of the predicted species.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
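
A minimal sketch of how this program could look in Python, using only the standard library. The header names in `FEATURES` and `LABEL` are assumptions (the issue does not give the headers of iris.csv), and the sketch assumes the training file has no blank cells. It estimates a prior plus a per-feature mean and variance for each species, then picks the species with the highest log posterior.

```python
import csv
import math
import sys
from collections import defaultdict

# Assumed header names, listed in the command-line order from the issue
# (petal length, petal width, sepal width, sepal length). Adjust to iris.csv.
FEATURES = ["petal_length", "petal_width", "sepal_width", "sepal_length"]
LABEL = "species"  # assumed name of the class column


def load_rows(path):
    """Read the training CSV into (feature_vector, label) pairs."""
    with open(path, newline="") as handle:
        return [([float(row[name]) for name in FEATURES], row[LABEL])
                for row in csv.DictReader(handle)]


def fit(rows):
    """Estimate a prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for features, label in rows:
        by_class[label].append(features)
    model = {}
    for label, samples in by_class.items():
        stats = []
        for column in zip(*samples):
            mean = sum(column) / len(column)
            # Population variance with a small floor to avoid division by zero.
            variance = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
            stats.append((mean, variance))
        model[label] = (len(samples) / len(rows), stats)
    return model


def predict(model, sample):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(label):
        prior, stats = model[label]
        score = math.log(prior)
        for x, (mean, variance) in zip(sample, stats):
            score -= 0.5 * math.log(2 * math.pi * variance)
            score -= (x - mean) ** 2 / (2 * variance)
        return score
    return max(model, key=log_posterior)


# Runs only when called with exactly the 5 parameters listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 6:
    training = load_rows(sys.argv[1])
    print(predict(fit(training), [float(value) for value in sys.argv[2:6]]))
```

Saved under a hypothetical name such as `gnb.py`, it would be called as `python gnb.py iris.csv 1.4 0.2 3.5 5.1`.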

Implement summary statistics

Implement a program that reads from the command line 2 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Path to iris test dataset CSV file (comma separated with headers).

The program should output several lines of text strings to the command line.

  1. Summary statistics of the training data
  2. Summary statistics of the test data
  3. Summary statistics of the combined test and training data

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
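
One possible shape for this program, sketched in Python with the standard library only. The choice of statistics (count, mean, standard deviation, min, median, max) is an assumption, since the issue does not say which ones are wanted. The sketch also assumes missing values appear as blank cells and skips them.

```python
import csv
import statistics
import sys


def load_columns(path):
    """Return {header: [float, ...]} for numeric columns, skipping blank cells."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    columns = {}
    for name in (rows[0] if rows else {}):
        cells = [(row[name] or "").strip() for row in rows]
        try:
            values = [float(cell) for cell in cells if cell]
        except ValueError:
            continue  # non-numeric column, for example the species label
        if values:
            columns[name] = values
    return columns


def summarise(values):
    """Basic descriptive statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
    }


def combine(first, second):
    """Concatenate two column dictionaries, column by column."""
    return {name: values + second.get(name, []) for name, values in first.items()}


def report(title, columns):
    """Format one block of output: a title line plus one line per column."""
    lines = [title]
    for name, values in columns.items():
        stats = ", ".join(f"{key}={value:.4g}"
                          for key, value in summarise(values).items())
        lines.append(f"  {name}: {stats}")
    return "\n".join(lines)


# Runs only when called with exactly the 2 file paths listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 3:
    train, test = load_columns(sys.argv[1]), load_columns(sys.argv[2])
    print(report("Training data", train))
    print(report("Test data", test))
    print(report("Combined data", combine(train, test)))
```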

Research SPSS

Install and try out the following tool:
SPSS
and open the Iris dataset in this tool to see what things you can do with it.
Search for what the existing literature (Google Scholar or blogs) says about using the Iris dataset in this tool.

Research Excel

Install and try out the following tool:
Excel
and open the Iris dataset in this tool to see what things you can do with it.
Search for what the existing literature (Google Scholar or blogs) says about using the Iris dataset in this tool.

Research Weka

Install and try out the following tool:
Weka
and open the Iris dataset in this tool to see what things you can do with it.
Search for what the existing literature (Google Scholar or blogs) says about using the Iris dataset in this tool.

Research R

Install and try out the following tool:
R
and open the Iris dataset in this tool to see what things you can do with it.
Search for what the existing literature (Google Scholar or blogs) says about using the Iris dataset in this tool.

Implement logistic regression

Use logistic regression as a classifier to perform the following:

Implement a program that reads from the command line 5 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Petal length of a sample
  3. Petal width of a sample
  4. Sepal width of a sample
  5. Sepal length of a sample

The program should output 1 text string to the command line.

  1. the name of the predicted species.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
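
A rough sketch in standard-library Python, assuming a one-vs-rest scheme (one binary logistic model per species) trained by batch gradient descent on standardised features. The header names, learning rate and epoch count are all assumptions chosen for illustration, not values taken from the issue.

```python
import csv
import math
import sys

# Assumed header names in the command-line order from the issue.
FEATURES = ["petal_length", "petal_width", "sepal_width", "sepal_length"]
LABEL = "species"  # assumed name of the class column


def load_rows(path):
    """Read the training CSV into (feature_vector, label) pairs."""
    with open(path, newline="") as handle:
        return [([float(row[name]) for name in FEATURES], row[LABEL])
                for row in csv.DictReader(handle)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit(rows, learning_rate=0.1, epochs=1000):
    """Train one binary logistic model per class (one-vs-rest)."""
    columns = list(zip(*(features for features, _ in rows)))
    means = [sum(c) / len(c) for c in columns]
    scales = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
              for c, m in zip(columns, means)]
    # Standardise each feature and prepend a constant 1.0 for the bias term.
    scaled = [[1.0] + [(x - m) / s for x, m, s in zip(features, means, scales)]
              for features, _ in rows]
    weights = {}
    for label in sorted({label for _, label in rows}):
        targets = [1.0 if row_label == label else 0.0 for _, row_label in rows]
        w = [0.0] * len(scaled[0])
        for _ in range(epochs):
            gradient = [0.0] * len(w)
            for x, y in zip(scaled, targets):
                error = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
                for j, xj in enumerate(x):
                    gradient[j] += error * xj
            w = [wi - learning_rate * g / len(scaled) for wi, g in zip(w, gradient)]
        weights[label] = w
    return means, scales, weights


def predict(model, sample):
    """Return the class whose binary model gives the highest score."""
    means, scales, weights = model
    x = [1.0] + [(v - m) / s for v, m, s in zip(sample, means, scales)]
    return max(weights,
               key=lambda label: sum(wi * xi for wi, xi in zip(weights[label], x)))


# Runs only when called with exactly the 5 parameters listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 6:
    training = load_rows(sys.argv[1])
    print(predict(fit(training), [float(value) for value in sys.argv[2:6]]))
```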

Implement Linear Discriminant Analysis

Use Linear Discriminant Analysis as a classifier to perform the following:

Implement a program that reads from the command line 5 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Petal length of a sample
  3. Petal width of a sample
  4. Sepal width of a sample
  5. Sepal length of a sample

The program should output 1 text string to the command line.

  1. the name of the predicted species.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
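
A sketch of the classifier in standard-library Python, under a few stated assumptions: the header names are guesses, the pooled covariance gets a tiny ridge term (`ridge`, a hypothetical parameter) so it stays invertible, and the linear system is solved with a hand-written Gaussian elimination rather than a numerical library. Each class k gets the linear score x·Σ⁻¹μ_k − ½ μ_k·Σ⁻¹μ_k + log π_k, and the highest score wins.

```python
import csv
import math
import sys
from collections import defaultdict

# Assumed header names in the command-line order from the issue.
FEATURES = ["petal_length", "petal_width", "sepal_width", "sepal_length"]
LABEL = "species"  # assumed name of the class column


def load_rows(path):
    """Read the training CSV into (feature_vector, label) pairs."""
    with open(path, newline="") as handle:
        return [([float(row[name]) for name in FEATURES], row[LABEL])
                for row in csv.DictReader(handle)]


def solve(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [value] for row, value in zip(matrix, vector)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit(rows, ridge=1e-6):
    """Return {label: (coefficients, intercept)} for the linear discriminants."""
    by_class = defaultdict(list)
    for features, label in rows:
        by_class[label].append(features)
    dims = len(rows[0][0])
    means = {label: [sum(c) / len(c) for c in zip(*samples)]
             for label, samples in by_class.items()}
    # Pooled within-class covariance shared by all classes.
    pooled = [[0.0] * dims for _ in range(dims)]
    for features, label in rows:
        centred = [x - m for x, m in zip(features, means[label])]
        for i in range(dims):
            for j in range(dims):
                pooled[i][j] += centred[i] * centred[j]
    denominator = max(len(rows) - len(by_class), 1)
    for i in range(dims):
        for j in range(dims):
            pooled[i][j] /= denominator
        pooled[i][i] += ridge  # assumption: small ridge keeps the matrix invertible
    model = {}
    for label, mean in means.items():
        coefficients = solve(pooled, mean)
        prior = len(by_class[label]) / len(rows)
        intercept = -0.5 * sum(c * m for c, m in zip(coefficients, mean)) + math.log(prior)
        model[label] = (coefficients, intercept)
    return model


def predict(model, sample):
    """Return the class with the highest linear discriminant score."""
    def score(label):
        coefficients, intercept = model[label]
        return sum(c * x for c, x in zip(coefficients, sample)) + intercept
    return max(model, key=score)


# Runs only when called with exactly the 5 parameters listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 6:
    training = load_rows(sys.argv[1])
    print(predict(fit(training), [float(value) for value in sys.argv[2:6]]))
```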

Implement K-Nearest Neighbors

Use K-Nearest Neighbors as a classifier to perform the following:

Implement a program that reads from the command line 5 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Petal length of a sample
  3. Petal width of a sample
  4. Sepal width of a sample
  5. Sepal length of a sample

The program should output 1 text string to the command line.

  1. the name of the predicted species.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
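
A short sketch in standard-library Python (3.8 or later, for `math.dist`). The header names and the default `k=3` are assumptions; the issue does not specify a value for k or a distance metric, so Euclidean distance and a simple majority vote are used here.

```python
import csv
import math
import sys
from collections import Counter

# Assumed header names in the command-line order from the issue.
FEATURES = ["petal_length", "petal_width", "sepal_width", "sepal_length"]
LABEL = "species"  # assumed name of the class column


def load_rows(path):
    """Read the training CSV into (feature_vector, label) pairs."""
    with open(path, newline="") as handle:
        return [([float(row[name]) for name in FEATURES], row[LABEL])
                for row in csv.DictReader(handle)]


def predict(rows, sample, k=3):
    """Majority vote among the k training rows closest to the sample."""
    nearest = sorted(rows, key=lambda row: math.dist(row[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Runs only when called with exactly the 5 parameters listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 6:
    training = load_rows(sys.argv[1])
    print(predict(training, [float(value) for value in sys.argv[2:6]]))
```

Because the four measurements are on similar scales, the sketch skips feature scaling; that may be worth revisiting if other features are added.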

Investigate MCAR

The sample iris dataset has missing values. We need to perform due diligence and investigate whether the values are missing completely at random (MCAR) or not.

The lecture slides explained how to calculate this.
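
The method from the lecture slides is not reproduced in this issue, so the following is only one common informal check, not necessarily the one the slides describe. For a column with missing values, it splits the rows by whether that column is missing and compares the means of every other column with Welch's t statistic. It assumes missing values appear as blank cells in the CSV.

```python
import csv
import math
import statistics


def load_rows(path):
    """Read the CSV as dicts, mapping blank or non-numeric cells to None."""
    def parse(cell):
        try:
            return float(cell)
        except (TypeError, ValueError):
            return None
    with open(path, newline="") as handle:
        return [{name: parse(cell) for name, cell in row.items()}
                for row in csv.DictReader(handle)]


def welch_t(first, second):
    """Welch's t statistic for the difference in means of two samples."""
    diff = statistics.mean(first) - statistics.mean(second)
    error = math.sqrt(statistics.variance(first) / len(first)
                      + statistics.variance(second) / len(second))
    if error == 0.0:
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / error


def missingness_report(rows, target):
    """For each other column, compare rows where `target` is missing
    against rows where it is present."""
    report = {}
    for name in rows[0]:
        if name == target:
            continue
        missing = [row[name] for row in rows
                   if row[target] is None and row[name] is not None]
        present = [row[name] for row in rows
                   if row[target] is not None and row[name] is not None]
        # A variance needs at least two values in each group.
        if len(missing) > 1 and len(present) > 1:
            report[name] = welch_t(missing, present)
    return report
```

As a rule of thumb, a t statistic far from zero for some column suggests the missingness is related to that column, which would be evidence against MCAR. Values near zero are consistent with MCAR but do not prove it.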

Business Analysis

Steps:

  1. Read the assignment requirements
  2. Read the description of our project (recommender systems)

Deliverables

  1. A written summary of what we are expected to deliver to Andrew, with a bullet point for every deliverable so we can use it as a checklist at the end.
  2. A written description of what a customer would expect from a recommender system. Provide a list of the features, actions, and associated behaviours expected by our "customer". We can use this for QA at the end to check whether we have delivered what was required.

Business Analysis of Requirements

Read the requirements and draft a specification for a developer to follow on what the implementation should be able to do to solve a real-world problem described by the customer.

Discuss findings for R

In ~100 words let the team know about what you found when you researched using the Iris dataset in R

Literature review

Search the internet for blogs and academic papers on which techniques we can use to solve the business problem.

Deliverables

  1. List of academic papers organised by topic
  2. List of relevant blogs
  3. List of relevant algorithms

Technical Design (R&D)

Design how to solve the proposed business problem.

Use the research findings (#9, #8, #6, #10) and business requirements (#11) as inputs.

Provide a technical software architecture document describing exactly how to solve the problem, step by step. The architecture should describe data input, transformations, processes, tables/objects, outputs, etc.

Discuss findings for python

In ~100 words let the team know about what you found when you researched using the Iris dataset in python

Investigate merging of programs

Our model evaluator program takes one input file and divides it into an 80%/20% split.

Our summary statistics program takes 2 input files and generates summary statistics for file 1, file 2, and both combined.

The two programs therefore make different assumptions about the input data: one expects a single file that it splits itself, the other expects two files that have already been split.

Investigate if we need to merge the 2 programs.
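
One option short of merging is a small shared step that performs the split once and writes two CSV files, which both programs could then consume. A sketch under that assumption follows; the function name `split_csv`, the fixed `seed`, and the use of a shuffled split are all hypothetical choices, not a description of the existing evaluator.

```python
import csv
import random


def split_csv(source_path, train_path, test_path, train_fraction=0.8, seed=0):
    """Shuffle the data rows of one CSV and write them out as two CSVs.

    Both output files repeat the header row. Returns (train_count, test_count).
    """
    with open(source_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        rows = list(reader)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(rows)
    cut = round(len(rows) * train_fraction)
    for path, part in ((train_path, rows[:cut]), (test_path, rows[cut:])):
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            writer.writerows(part)
    return cut, len(rows) - cut
```

With this in place, the evaluator and the summary statistics program would see exactly the same training and test rows, which makes their outputs comparable.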

Write Publication

Design a flashy document (LaTeX/Prezi/PowerPoint) to convey the business problem, how we approached solving it, and the utility of our solution.

Discuss findings for Weka

In ~100 words let the team know about what you found when you researched using the Iris dataset in Weka

Data understanding

Steps

  1. Open up the Instacart data set and browse it to see which files and columns might be of interest for a recommender system.
  2. Identify any transformations that might need to be made to the data.
  3. Identify any joins that might need to be made between parts of the data set.

Deliverable
Report summarising the structure of the data, the files and columns of interest for a recommender system, and any transformations that might be needed.

Implement Classification and Regression Trees

Use Classification and Regression Trees as a classifier to perform the following:

Implement a program that reads from the command line 5 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Petal length of a sample
  3. Petal width of a sample
  4. Sepal width of a sample
  5. Sepal length of a sample

The program should output 1 text string to the command line.

  1. the name of the predicted species.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
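
A compact sketch of a classification tree in standard-library Python. It covers only the classification half of CART: binary splits on a single feature, chosen by lowest weighted Gini impurity, with no pruning. The header names and the `max_depth=4` limit are assumptions for illustration.

```python
import csv
import sys
from collections import Counter

# Assumed header names in the command-line order from the issue.
FEATURES = ["petal_length", "petal_width", "sepal_width", "sepal_length"]
LABEL = "species"  # assumed name of the class column


def load_rows(path):
    """Read the training CSV into (feature_vector, label) pairs."""
    with open(path, newline="") as handle:
        return [([float(row[name]) for name in FEATURES], row[LABEL])
                for row in csv.DictReader(handle)]


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_split(rows):
    """Return (impurity, feature index, threshold) of the best split, or None."""
    best = None
    for index in range(len(rows[0][0])):
        values = sorted({features[index] for features, _ in rows})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [label for f, label in rows if f[index] <= threshold]
            right = [label for f, label in rows if f[index] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, index, threshold)
    return best


def build(rows, depth=0, max_depth=4):
    """Grow the tree; a leaf is a label, a node is (index, threshold, left, right)."""
    labels = [label for _, label in rows]
    split = best_split(rows) if depth < max_depth and len(set(labels)) > 1 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    _, index, threshold = split
    left = [row for row in rows if row[0][index] <= threshold]
    right = [row for row in rows if row[0][index] > threshold]
    return (index, threshold,
            build(left, depth + 1, max_depth), build(right, depth + 1, max_depth))


def predict(tree, sample):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        index, threshold, left, right = tree
        tree = left if sample[index] <= threshold else right
    return tree


# Runs only when called with exactly the 5 parameters listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 6:
    training = load_rows(sys.argv[1])
    print(predict(build(training), [float(value) for value in sys.argv[2:6]]))
```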

Implement Support Vector Machines

Use Support Vector Machines as a classifier to perform the following:

Implement a program that reads from the command line 5 parameters:

  1. Path to iris training dataset CSV file (comma separated with headers).
  2. Petal length of a sample
  3. Petal width of a sample
  4. Sepal width of a sample
  5. Sepal length of a sample

The program should output 1 text string to the command line.

  1. the name of the predicted species.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
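
A sketch of a linear soft-margin SVM in standard-library Python. To stay dependency-free it swaps a proper quadratic-programming solver for plain batch subgradient descent on the regularised hinge loss, and it handles the three species with a one-vs-rest scheme. The header names and all hyperparameters are assumptions; a kernel SVM from an established library would be the more usual choice in practice.

```python
import csv
import math
import sys

# Assumed header names in the command-line order from the issue.
FEATURES = ["petal_length", "petal_width", "sepal_width", "sepal_length"]
LABEL = "species"  # assumed name of the class column


def load_rows(path):
    """Read the training CSV into (feature_vector, label) pairs."""
    with open(path, newline="") as handle:
        return [([float(row[name]) for name in FEATURES], row[LABEL])
                for row in csv.DictReader(handle)]


def fit(rows, regularisation=0.01, learning_rate=0.1, epochs=500):
    """Train one linear SVM per class (one-vs-rest) on standardised features."""
    columns = list(zip(*(features for features, _ in rows)))
    means = [sum(c) / len(c) for c in columns]
    scales = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
              for c, m in zip(columns, means)]
    # Standardise each feature and prepend a constant 1.0 for the bias term.
    scaled = [[1.0] + [(x - m) / s for x, m, s in zip(features, means, scales)]
              for features, _ in rows]
    weights = {}
    for label in sorted({label for _, label in rows}):
        targets = [1.0 if row_label == label else -1.0 for _, row_label in rows]
        w = [0.0] * len(scaled[0])
        for _ in range(epochs):
            # Subgradient of (regularisation / 2) * |w|^2 + mean hinge loss.
            gradient = [regularisation * wi for wi in w]
            for x, y in zip(scaled, targets):
                if y * sum(wi * xi for wi, xi in zip(w, x)) < 1.0:
                    # Sample is inside the margin or misclassified.
                    for j, xj in enumerate(x):
                        gradient[j] -= y * xj / len(scaled)
            w = [wi - learning_rate * g for wi, g in zip(w, gradient)]
        weights[label] = w
    return means, scales, weights


def predict(model, sample):
    """Return the class whose hyperplane gives the largest signed score."""
    means, scales, weights = model
    x = [1.0] + [(v - m) / s for v, m, s in zip(sample, means, scales)]
    return max(weights,
               key=lambda label: sum(wi * xi for wi, xi in zip(weights[label], x)))


# Runs only when called with exactly the 5 parameters listed in the issue.
if __name__ == "__main__" and len(sys.argv) == 6:
    training = load_rows(sys.argv[1])
    print(predict(fit(training), [float(value) for value in sys.argv[2:6]]))
```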

Create Team name

We must create a team name, mission statement, slogan and logo.

The marketing should appeal to the major market groupings including APAC, EMEA, AMER.

QA Testing

Given a release candidate test the implementation:

  1. Create a test plan.
  2. Execute the tests.
  3. Record the results.

Test plan should include missing data and performance evaluation. The architecture document says to "Create a Validation Dataset, Split data into 80%:20% ratio. 80% to train model and 20% to validate dataset".

Review the architecture document (#12) and business requirements (#11) to verify that the requirements are met.

The iris dataset is available here
https://github.com/hughpearse/DCU-CA683-workspace/blob/master/iris.csv
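
For the performance evaluation part of the test plan, a small harness along these lines could score any of the classifiers on the 20% validation split. It is a sketch only: the `fit(train_rows)` and `predict(model, features)` interface is an assumed convention, and the majority-class baseline is a hypothetical reference point for comparing results.

```python
import random
from collections import Counter


def split_rows(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows and return (training rows, validation rows)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(fit, predict, rows, train_fraction=0.8, seed=0):
    """Train on the first part of the split, score on the held-out part.

    `fit` takes a list of (features, label) pairs and returns a model;
    `predict` takes (model, features) and returns a label.
    """
    train, validation = split_rows(rows, train_fraction, seed)
    model = fit(train)
    correct = sum(predict(model, features) == label
                  for features, label in validation)
    return correct / len(validation)


def majority_fit(rows):
    """Baseline model: remember the most common label in the training rows."""
    return Counter(label for _, label in rows).most_common(1)[0][0]


def majority_predict(model, features):
    """Baseline prediction: always answer with the remembered label."""
    return model
```

A classifier that cannot beat `accuracy(majority_fit, majority_predict, rows)` on the same split is probably not learning anything useful from the features.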
