check_score's Introduction

measuring the score

To measure the score of any algorithm, I filtered train.csv by year/term.

train and test samples for this script

The output of filtering train.csv with is_booking=1 was divided into two parts:

  1. The first file is meant to "train" your algorithm or to extract any information (recommended, to avoid bias in the score estimation). A file with 2013 + 2014-I, meant for training, was generated: wget http://test-carrillo.web.cern.ch/test-carrillo/kag/exp/train20132014I.csv.zip

  2. The second file is meant to measure the score of your algorithm.
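The split described above can be sketched roughly as follows. This is a minimal illustration, not the script actually used to produce the files: it assumes train.csv has a `date_time` column formatted as `YYYY-MM-DD HH:MM:SS` next to `is_booking` and `hotel_cluster`, and it assumes that "2014-II" means July to December 2014. The function name `split_bookings` and the sample rows are hypothetical.

```python
import csv
import io


def split_bookings(rows):
    """Split booking rows into (train, test) by year/term.

    Assumptions (not confirmed by the README): 'date_time' starts with
    'YYYY-MM', and the test term 2014-II covers July-December 2014.
    """
    train, test = [], []
    for row in rows:
        # Keep bookings only (is_booking=1); clicks are dropped.
        if row["is_booking"] != "1":
            continue
        year = int(row["date_time"][:4])
        month = int(row["date_time"][5:7])
        if year == 2014 and month >= 7:
            test.append(row)   # 2014-II: used to measure the score
        else:
            train.append(row)  # everything else: used for training
    return train, test


# Tiny made-up sample standing in for train.csv.
sample = io.StringIO(
    "date_time,is_booking,hotel_cluster\n"
    "2013-03-01 10:00:00,1,91\n"
    "2014-02-11 08:30:00,0,41\n"
    "2014-05-20 12:00:00,1,5\n"
    "2014-09-02 17:45:00,1,64\n"
)
train, test = split_bookings(csv.DictReader(sample))
print(len(train), len(test))
```

For the real file, the same function could be fed `csv.DictReader(open("train.csv", newline=""))`, at the cost of holding all bookings in memory.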

running the script

"python MPA5.py int_hc_loic.txt submission_XXX.csv" will measure the score of any algorithm's predictions for mytest.csv, where youralg(test_2014II_isbooking.csv)=submission_XXX.csv

The first argument to this script is the file with the true hotel clusters (id hotel_cluster).

If no extra arguments are provided (python MPA5.py int_hc.txt), it will estimate the score of 4 benchmark examples:

  • submit_top5.txt (Most Frequent clusters)
  • submit_random5.txt (random integers organized in the right output format)
  • submit_perfect1.txt (perfect hotel_cluster prediction in the first position)
  • submit_perfect2.txt (perfect hotel_cluster prediction in the second position)

This is the output of python MPA5.py int_hc.txt (run by this script). Benchmarks:

  • for Top 5: 0.073
  • for Random: 0.022
  • for Sample: 0.019
  • for Perfect1: 1.0
  • for Perfect2: 0.5
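The Perfect1 and Perfect2 values are what a mean average precision at 5 (MAP@5) with a single true cluster per row would give. The sketch below assumes that this is the metric computed by MPA5.py, which the README does not state explicitly. The function names and the toy data are hypothetical and are not taken from the script.

```python
def average_precision_at_5(true_cluster, predicted):
    """Return 1/rank of the first hit among the top 5 predictions, else 0."""
    for rank, cluster in enumerate(predicted[:5], start=1):
        if cluster == true_cluster:
            return 1.0 / rank
    return 0.0


def map_at_5(truth, predictions):
    """Average the per-row score over all rows in `truth`.

    truth:       dict mapping id -> true hotel_cluster
    predictions: dict mapping id -> list of predicted clusters, best first
    Rows missing from `predictions` score 0.
    """
    if not truth:
        return 0.0
    total = sum(
        average_precision_at_5(cluster, predictions.get(row_id, []))
        for row_id, cluster in truth.items()
    )
    return total / len(truth)


# Toy data: the true cluster is placed first (perfect1) or second (perfect2).
truth = {0: 91, 1: 41, 2: 5}
perfect1 = {i: [c, 0, 1, 2, 3] for i, c in truth.items()}
perfect2 = {i: [0, c, 1, 2, 3] for i, c in truth.items()}

print(map_at_5(truth, perfect1))  # 1.0
print(map_at_5(truth, perfect2))  # 0.5
```

A hit in first position contributes 1 and a hit in second position contributes 1/2, which is why the two "perfect" benchmarks land on exactly 1.0 and 0.5.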

This is what we see in the Public Score (run by Kaggle's score estimation):

  • Most Frequent Benchmark 0.059
  • Random Guess Benchmark 0.023
  • Sample Submission Benchmark 0.017

The results are compatible. The remaining differences are due to the lower statistics available to this script, and also to the difference in how the data is split: the Kaggle script splits by year, whereas here we split by number of lines.
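As a rough cross-check of the Random benchmark, the expected score of a random guess can be computed directly. This sketch assumes a MAP@5-style score, 100 equally likely hotel clusters, and 5 distinct random guesses per row. None of these assumptions is stated in the README, and real clusters are not uniformly distributed, so the number is only indicative.

```python
# Assumed number of distinct hotel clusters (not stated in the README).
N_CLUSTERS = 100

# With 5 distinct random guesses, the true cluster sits at rank k with
# probability 1/N_CLUSTERS, and a hit at rank k is worth 1/k.
expected_random = sum(1.0 / k for k in range(1, 6)) / N_CLUSTERS

print(round(expected_random, 3))
```

Under these assumptions the value comes out close to the Random figures reported by both this script and the public leaderboard.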

check_score's People

Contributors

camilocarrillo
