Name of QuantLet : Quantlet_Extraction_Evaluation_Visualisation

Published in : ''

Description : 'Extracts, grades and clusters the Quantlets in the GitHub organization Quantlet using the classes in modules/QUANTLET.py and modules/METAFILE.py. The program can extract, update and save the data, model topics with Latent Semantic Analysis (LSA), compute different clusterings and visualize them with t-distributed Stochastic Neighbor Embedding (t-SNE).'

Keywords : Text analysis, LSA, t-SNE, clustering, kmeans clustering, spectral clustering, visualisation

See also : ''

Author : Marius Sterling

Submitted : September 18 2018 by Marius Sterling

Example : 

Picture1

PYTHON Code

from modules.QUANTLET import QUANTLET
import os

filename = 'data_file_'
github_token = None
USER = 'Quantlet'
# Create the folders in the list if they do not exist yet
for i in ['data']:
    os.makedirs(i, exist_ok=True)

# Look for already saved files; if there are none, download all data and save it
f = sorted([i for i in os.listdir('data') if 'json' in i and filename in i])
if not f:
    q = QUANTLET(github_token=github_token, user=USER)
    q.download_metafiles_from_user()
    name = 'data/' + filename
    name += q.get_last_commit().strftime('%Y%m%d')
    name += '.json'
    q.save(name)
else:
    q = QUANTLET.load('data/' + f[-1])

# Update all existing metafiles in q
q.update_existing_metafiles()

# Update all existing metafiles and search for new Quantlets
q.update_all_metafiles(since=q.last_full_check)

# Save the updated data
name = 'data/' + filename
name += q.get_last_commit().strftime('%Y%m%d')
name += '.json'
q.save(name)


# Select the poorly graded Quantlets (grades C, D or F)
grades = q.grading()
grades.loc[grades['q_quali'].isin(['C','D','F'])]

# Extract corpus and dictionary, the document-term matrix dtm and the tf-idf weighted corpus
c,d      = q.get_corpus_dictionary()
dtm      = q.get_document_term_matrix(corpus=c,dictionary=d)
c_tfidf  = q.get_corpus_tfidf(c,d)

# Fit the LSA model on the tf-idf corpus and extract the document-topic matrix X
lsa      = q.lsa_model(corpus=c_tfidf, dictionary=d, num_topics=20)
X        = q.get_lsa_matrix(lsa, corpus=c_tfidf, dictionary=d)

# cluster the Quantlets with K-Means into groups
cl,_     = q.cl_kmeans(X=X, n_clusters=20)

# Name the clusters (top_n=4 terms per label) and plot the clustering with t-SNE
named_cl = q.topic_labels(cl=cl,document_topic_matrix=X, lsa=lsa, top_n=4)
q.tsne(X, named_cl, n_iter=2500, save_directory='',save_ending='kmeans', file_type='png')
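
The snapshot handling at the top of the script depends on the date suffix in the file name: because the suffix is formatted as `%Y%m%d`, a plain string sort puts the newest file last, which is why `f[-1]` is loaded. A minimal standalone sketch of that naming and selection logic follows. The helper names `snapshot_name` and `latest_snapshot` are hypothetical and not part of `modules/QUANTLET.py`; the sketch also matches on the `.json` ending rather than on the substring `'json'`.

```python
from datetime import datetime

PREFIX = 'data_file_'  # same prefix as `filename` in the script above


def snapshot_name(last_commit, prefix=PREFIX):
    """Build 'data/<prefix>YYYYMMDD.json' from a datetime (hypothetical helper)."""
    return 'data/' + prefix + last_commit.strftime('%Y%m%d') + '.json'


def latest_snapshot(file_names, prefix=PREFIX):
    """Return the newest matching snapshot name, or None if there is none.

    Zero-padded YYYYMMDD suffixes sort chronologically as plain strings.
    """
    matches = sorted(
        name for name in file_names
        if name.startswith(prefix) and name.endswith('.json')
    )
    return matches[-1] if matches else None


files = ['data_file_20180918.json', 'notes.txt', 'data_file_20180101.json']
print(snapshot_name(datetime(2018, 9, 18)))
print(latest_snapshot(files))
```

If the date format were changed to something that is not zero-padded or not ordered year-month-day, the string sort would no longer be chronological and the selection would need to parse the dates instead.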

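For readers unfamiliar with the tf-idf weighting applied before the LSA step, here is a small pure-Python sketch on a toy corpus. It uses the textbook form tf * log(N / df), which is an assumption made for illustration; the weighting and normalisation actually used by `get_corpus_tfidf` may differ. The function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenised document.

    weight = term count in the document * log(N / document frequency)
    """
    n_docs = len(docs)
    # Document frequency: the number of documents each term occurs in.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        weighted.append(
            {t: c * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weighted


docs = [
    ['kmeans', 'clustering', 'plot'],
    ['spectral', 'clustering', 'plot'],
    ['option', 'pricing', 'plot'],
]
weights = tfidf(docs)
# 'plot' occurs in every document, so its weight is log(3 / 3) = 0.
# 'kmeans' occurs in only one document, so it gets the largest weight.
print(weights[0])
```

Terms shared by all documents carry no weight, so the subsequent topic model is driven by the terms that distinguish documents from each other.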