SentenTree

SentenTree is a novel text visualization technique for summarizing a collection of social media text, for example taking thousands of tweets or more and summarizing what they are about. The aim of this project is to create a visualization that is cheap to compute but still represents the connected thoughts in the words.

SentenTree example

See DEMO

Author

Publication

Mengdie Hu, Krist Wongsuphasawat and John Stasko. Visualizing Social Media Content with SentenTree, in IEEE Transactions on Visualization and Computer Graphics 2016.

Installation

npm install sententree

Example usage

<div id="vis"></div>

// Assumes d3 and the SentenTree classes are already in scope.
d3.tsv('data/demo.tsv', (error, data) => {
  // Stop early if the file could not be loaded
  if (error) throw error;

  // data format is [{ id, text, count }]
  const model = new SentenTreeBuilder()
    .buildModel(data);

  new SentenTreeVis('#vis')
    // change the number to limit how many graphs are rendered
    .data(model.getRenderedGraphs(3))
    .on('nodeClick', node => {
      console.log('node', node);
    });
});
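
The example above expects each record to have the shape `{ id, text, count }`, but this README does not say how `count` is derived. The following is a minimal sketch, assuming `count` means the number of times an identical text occurs in the collection. That meaning is an assumption, not something the source confirms, and `buildRecords` is a hypothetical helper that is not part of SentenTree.

```javascript
// Hypothetical helper: collapse raw texts into [{ id, text, count }].
// ASSUMPTION: `count` = number of identical texts; the README does not define it.
function buildRecords(texts) {
  const counts = new Map();
  for (const raw of texts) {
    const text = raw.trim();
    if (text === '') continue; // skip empty entries
    counts.set(text, (counts.get(text) || 0) + 1);
  }
  // A Map iterates in insertion order, so ids follow first appearance.
  return Array.from(counts, ([text, count], index) => ({ id: index, text, count }));
}

const records = buildRecords([
  'world cup final tonight',
  'world cup final tonight',
  'what a goal',
]);
console.log(records);
```

If `count` turns out to mean something else (for instance a retweet count), only the aggregation step would need to change; the record shape stays the same.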

For developers

Install dependencies via npm or yarn

$ npm install

Then run a local instance via

$ npm run start

License

Copyright 2014 Twitter, Inc. Licensed under the Apache License Version 2.0

Contributors

juliaferraioli, kristw

Issues

JS dependencies need updating

npm WARN deprecated [email protected]: 🙌  Thanks for using Babel: we recommend using babel-preset-env now: please read babeljs.io/env to update!
npm WARN deprecated [email protected]: uglifyjs is deprecated - use uglify-js instead.
npm WARN deprecated [email protected]: gulp-util is deprecated - replace it, following the guidelines at https://medium.com/gulpjs/gulp-util-ca3b1f9f9ac5
npm WARN deprecated [email protected]: wrench.js is deprecated! You should check out fs-extra (https://github.com/jprichardson/node-fs-extra) for any operations you were using wrench for. Thanks for all the usage over the years.
npm WARN deprecated [email protected]: no longer maintained
npm WARN deprecated [email protected]: please upgrade to graceful-fs 4 for compatibility with current and future versions of Node.js
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: Use uuid module instead
npm WARN deprecated [email protected]: ReDoS vulnerability parsing Set-Cookie https://nodesecurity.io/advisories/130
npm WARN deprecated [email protected]: This version is no longer maintained. Please upgrade to the latest version.
npm WARN deprecated [email protected]: This version is no longer maintained. Please upgrade to the latest version.
npm WARN deprecated [email protected]: This version is no longer maintained. Please upgrade to the latest version.
npm WARN deprecated [email protected]: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated [email protected]: express 2.x series is deprecated
npm WARN deprecated [email protected]: please upgrade to graceful-fs 4 for compatibility with current and future versions of Node.js
npm WARN deprecated [email protected]: connect 1.x series is deprecated

creating my own dataset

What is count?
Tweet text and tweet ID are pretty obvious, but what is count here? The retweet count or the favorite count?

How can I use this

Hi,
I can't manage to use this cool tool.
I have installed it, now what? Even if I clone the repo and run the demo, it doesn't seem to work like it does on the site.

How to use termWeights option

Hi! My name is Matt Britton, I'm a student at Georgia Tech. My advisor is Alex Endert, a member of John Stasko's department.

I am working on a project to use Sententrees in a visualization of threaded replies in a forum (e.g. Reddit). My objective is to make it easier to navigate and summarize a large conversation.

I have a prototype created with a working Sententree, but the algorithm tends to choose irrelevant words with low content value, e.g. I, would, think, not, like, etc. My guess is that these words predominate because the text in a forum, unlike tweets, has a lot more structure and includes more prepositions, articles, conjunctions, etc. than the corpus used in your examples.

I'm looking at ways to address this, and before I do my own text preprocessing, I'd like to investigate the "termWeights" object that can be passed to SententreeModel() as part of the "options" parameter. It looks like this value is parsed and passed to SententreeModel.growSeq(), but from what I can tell, it is not actually implemented there yet.

Can you confirm that my understanding of this code is correct? If so, I may choose to implement weighting myself - can you give me a sense of what you envisioned for this feature and how you intended it to function?

Best,

Matt
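
One way to approach the preprocessing route mentioned above is to strip low-content words from each text before building the model. The following is a minimal sketch under stated assumptions: `STOP_WORDS` and `removeStopWords` are hypothetical names, the word list is illustrative rather than exhaustive, and this is not a description of how `termWeights` works inside SentenTree.

```javascript
// Illustrative stop-word list (assumption); a real list would be much longer.
const STOP_WORDS = new Set([
  'i', 'would', 'think', 'not', 'like', 'the', 'a', 'an', 'and', 'of', 'to',
]);

// Hypothetical helper: drop stop words from the `text` of each
// { id, text, count } record, leaving the other fields untouched.
function removeStopWords(records, stopWords = STOP_WORDS) {
  return records
    .map(record => {
      const kept = record.text
        .split(/\s+/)
        .filter(token => {
          // Compare on a lowercased token with punctuation stripped.
          const normalized = token.toLowerCase().replace(/[^a-z0-9']/g, '');
          return token !== '' && !stopWords.has(normalized);
        });
      return { ...record, text: kept.join(' ') };
    })
    .filter(record => record.text !== ''); // discard records with nothing left
}

const cleaned = removeStopWords([
  { id: 1, text: 'I would not like the new layout', count: 4 },
  { id: 2, text: 'I think not', count: 1 },
]);
console.log(cleaned);
```

Filtering before the model is built keeps the library itself unchanged, at the cost of removing those words from the rendered sentences as well.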

How do I preprocess the data?

I see there are some scripts in the archive/preprocess folder but what is the order of execution? And what's the input format of the first one? There are no comments inside :-(

Generate a new dataset?

Is there any code in the repository that, given a set of tweets, produces a suitable dataset?
I mean a dataset in the format [{ id, text, count }].

I could do it manually but how should I compute count for each tweet? Is it the number of retweets? What is it exactly?
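
The usage example loads `data/demo.tsv` with `d3.tsv`, which takes its column names from a header row, so a dataset file would need `id`, `text` and `count` columns. The following is a minimal sketch of serializing records to that layout. `toTsv` is a hypothetical helper, not part of the repository, and the sample values are made up; how `count` should be computed is left open here.

```javascript
// Hypothetical helper: serialize [{ id, text, count }] to TSV with a header row.
function toTsv(records) {
  // Tabs and line breaks inside a field would break the row layout,
  // so replace each run of them with a single space.
  const clean = value => String(value).replace(/[\t\r\n]+/g, ' ');
  const header = ['id', 'text', 'count'].join('\t');
  const rows = records.map(r => [r.id, r.text, r.count].map(clean).join('\t'));
  return [header, ...rows].join('\n');
}

const tsv = toTsv([
  { id: '1', text: 'first tweet\twith a tab', count: 3 },
  { id: '2', text: 'second tweet', count: 1 },
]);
console.log(tsv);
```

Texts that begin with a double quote may need extra escaping, since d3's delimiter-separated parser treats a leading quote as the start of a quoted field.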
