myria-web's Introduction

A web front-end for Myria

This is a Google App Engine app.

Dependencies

You must have the Google App Engine SDK for Python installed locally. During setup, be sure to select the option to create symbolic links to the Python utilities so that they are available from the command line.

Initial setup

  1. This project uses the UW eScience Raco project, which is configured as a submodule. After cloning this repository, you must run:
git submodule init
git submodule update
Then set up the module as described in the [Raco README](https://github.com/uwescience/raco/blob/master/README.md).
  2. The PLY library used to parse programs in the Myria language uses a precompiled parsetab.py in the raco submodule. This file is not required, but it dramatically speeds up the parser load time (which is incurred on every request to the app). To generate it, run
scripts/myrial examples/reachable.myl

in the raco subdirectory.

  3. Launch the local App Engine emulator. I prefer to use Google's GoogleAppEngineLauncher application (installed with the SDK), which provides a GUI to control the emulator. From the menu, select Add Existing Application and add the myria-web/appengine directory.

Alternatively, from the command line, you may launch:

dev_appserver.py /path/to/myria-web/appengine

And then point your browser at localhost:8080 to view the application.

Changing the Myria Hostname

To change the Myria instance from the default (vega), modify appengine/myria_web_main.py, changing the hostname and port variables. The GAE application at localhost:8080 picks up the changes automatically.

Which branch to be on

There are two notable branches in the myria-web repository: master and production.

Which one to use depends on your goals: stay on master to modify the latest myria-web, or switch to the production branch to run a stable version of the interface.

Updating the code

To update the submodule to the latest from master, run this code:

git submodule update --recursive --remote

Before that, you might also need to run:

git submodule init

Run the tests

Install the developer dependencies.

pip install -r requirements-dev.txt

Download a local copy of Google App Engine

curl -O https://storage.googleapis.com/appengine-sdks/featured/google_appengine_1.9.22.zip
unzip -q google_appengine_1.9.22.zip

Run

nosetests test/test_myria_down.py test/test_myria_up.py test_style.py -w appengine --with-gae --gae-lib-root=google_appengine

Run without Google App Engine

It is possible to run myria-web without a dependence on Google App Engine. Right now, myria-web uses the paste module to run the web application.

Install the developer dependencies.

pip install -r requirements-dev.txt

Follow only steps 1 and 2 of Initial setup.

Finally, start the server.

cd appengine
python myria_web_main.py

Issues

The Google App Engine GUI has a Logs button that can be helpful for diagnosing issues with the Myria web app.

myria-web's People

Contributors

billhowe, bmyerz, brandonhaynes, dhalperi, domoritz, ericgribkoff, hello-josh, jingjingwang, jortiz16, ljorr1, radion, ryanmaas, senderista, stechu

myria-web's Issues

Why is production the main branch?

I think master should be the main branch for this repo, and we should only merge from master into production. This is better because we would see the current state of master, including its README, by default. It would also avoid problems like me accidentally pushing changes to production ;-)

data upload

(small) Data upload through the web interface is going to be critical to get people to try it out.

Status is not being updated

The web page still shows the query status as running, even though the queries section shows that the query executed successfully.

[Screenshot: 2014-03-19, 3:09:43 AM]

partition-aware catalog and optimizer

from @billhowe:

The catalog should know how a relation is partitioned so we don't re-partition needlessly (this will take some noodling, but note that it's simpler than "the complete optimizer for materialized views").

(@dhalperi: this will take optimizer changes too!)

Query Result Preview

After a query has finished, display the first N tuples (for example, 50) and the total number of tuples in the result tab.

Pagination could be considered as well.

Some background (please skip if you already buy this proposal):
I am using Myria to do some interactive exploration of Freebase data. Myria is already quite awesome since it is much, much faster than single-machine postgres. Yikes! With a result preview, the process would be much smoother, since I would not have to download the data and open the downloaded file. Also, the data could be very large.

Currently I use:
curl <data_url> | head
This is bad because it makes the download query fail.
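
A preview along these lines could be sketched as follows. This is only an illustration, assuming the result is available as an iterable of tuples; `preview_result` and its parameters are hypothetical names, not part of myria-web.

```python
from itertools import islice


def preview_result(tuples, n=50):
    """Return the first n tuples and the total tuple count.

    `tuples` is any iterable of result rows (an assumption for this sketch).
    """
    it = iter(tuples)
    head = list(islice(it, n))
    # Count the remaining rows without holding them in memory.
    total = len(head) + sum(1 for _ in it)
    return head, total


rows = ((i, i * i) for i in range(1000))
head, total = preview_result(rows, n=3)
print(head)   # [(0, 0), (1, 1), (2, 4)]
print(total)  # 1000
```

Note that this sketch still consumes the whole result to count it; a server-side count would avoid that.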

AVG(LONG) is broken

My belief is that the rewrite we did to turn AVG(x) into SUM(x)/COUNT(x) [for distributed aggregates] broke this.

We want the average to be a double, but we are getting long.
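
The suspected cause can be illustrated outside Myria. The sketch below is not the Raco rewrite itself; it only shows, in Python 3, how SUM(x)/COUNT(x) on integer inputs truncates under integer division unless one side is converted to a floating-point type first.

```python
values = [1, 2, 2]

total = sum(values)   # SUM(x): an integer
count = len(values)   # COUNT(x): an integer

# Integer division, as a LONG / LONG expression would behave: truncated.
avg_as_long = total // count

# Converting the sum to a double before dividing keeps the fraction.
avg_as_double = float(total) / count

print(avg_as_long)    # 1
print(avg_as_double)  # 1.6666666666666667
```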

add swagger-ui

swagger is now in the production java; now we need to add swagger-ui to the production web.

Visualization TODOs

General

  • add a view for what operators did (aggregate how long, e.g. in a bar chart)
  • better way to return to overview view

Graph @tmoreau89

  • match colors between graph and fragment vis
  • prototype graph in d3
  • transitions
  • edge rendering + transitions
  • show workers for each fragment (tooltip)
  • separate expanding node from clicking on node
  • default expanding based on # of fragments
  • give the lines a weight depending on the number of tuples
  • add zooming functionality
  • edges should be polylines/paths
  • give the fragments better names (that are not the id)
  • users can click on label

Fragment visualization @aaasz

  • add labels for worker
  • animate brush
  • fix small brush (should have minimum width)
  • add number of tuples returned to popover
  • summary for each worker in popup
  • show a text that the data is not loaded and you have to select something
  • grid lines should be connected between synchronized charts (area and lanes)
  • change order of ruler, area and grid
  • make it faster
  • zooming should make the brushes smaller/ larger
  • ruler also in gantt chart
  • show at what points data was sent and where (dots and lines, with link to network chart)
  • zoom controls (+ and - button, scrolling or zoom control like a slider)
  • explode summary for one fragment on one worker
  • # tuples returned should be different when called (it doesn't make sense to show it then)
  • less data in tooltips

Network visualization @ujaved

  • add labels for src and dest
  • add a way to select a whole worker and show the whole column/ row
  • allow us to load data for multiple connections between fragments
  • add a way to select multiple pixels
  • highlight which pixels are selected
  • show multiple lines in line chart, add labels at the end of the line (don't use colors)
  • highlight hovered column/ row using a background
  • add circles for data values in line chart lines to X and Y when hovering (disappear when not hovering)
  • clear selections button
  • group lines in line chart by dest or origin worker (on demand)
  • add a way to reorder the workers based on some clustering
  • bar charts for aggregated sent/received data

What needs to be run to generate dev_appserver.py ?

It's unclear to me how to set up myria-web to run with app engine. Is it a python program? Do you need to run a setup script?

If the answers to these questions could be added to the README that would be great.

be able to permalink a query

Right now the URL for an editor page does not tell you what you were doing. For instance, I can't link you directly into SQL mode. Make links more useful, and ultimately be able to "permalink" our examples for short queries.

This does necessarily mandate that queries be short because we can't have GET links over 2k chars or whatever. Maybe this means we should start using the GAE datastore and let you reference queries by the hash of the program or something like that.
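
The hash-based idea could look roughly like the sketch below. It is an assumption-laden illustration: a plain dictionary stands in for the GAE datastore, and `save_query`, `load_query`, and the `/editor?q=` URL shape are hypothetical names, not existing myria-web code.

```python
import hashlib

# In-memory stand-in for a persistent store such as the GAE datastore.
_saved_queries = {}


def save_query(program, language="myrial"):
    """Store a program and return a short key derived from its content."""
    payload = (language + "\n" + program).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    key = digest[:12]  # truncated for shorter URLs; collisions are possible in principle
    _saved_queries[key] = {"language": language, "program": program}
    return key


def load_query(key):
    """Look up a previously saved program, or None if the key is unknown."""
    return _saved_queries.get(key)


key = save_query("T1 = SCAN(TwitterK);\nSTORE(T1, JustX);")
print("/editor?q=" + key)           # hypothetical permalink shape
print(load_query(key)["language"])  # myrial
```

Because the key depends only on the language and program text, saving the same query twice yields the same link, and the URL length no longer grows with the program.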

debug output (I think) is killing performance

There seems to be a multi-second lag in submitting queries. FWICT during this time there is a LOT of debug output generated, especially recursive printing of query plans. I suspect that's the cause.

@7andrew7 can you please look into this? Most of the messages are yours, although I doubt the bug actually is.

jinja-templates: rather than populating HTML, return JSON and use javascript to generate HTML

makes pages smaller and reuses code -- we need the javascript to re-render page elements anyway.

Really, though, I'm basing everything about this issue on one discussion with @gaorlov, who I'm probably mis-representing, and I'm not well informed. There are reasonable other takes:

http://stackoverflow.com/questions/1284381/why-is-it-a-bad-practice-to-return-generated-html-instead-of-json-or-is-it

So I guess this is really just a "think about it" issue.

Try to use string as relation key

When I try to run

T1 = SCAN(TwitterK);
T2 = [FROM T1 EMIT $0];
STORE (T2, JustX);

I get

Error 500 (Internal Server Error): 

Traceback (most recent call last):
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/lib/webapp2-2.5.2/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/Users/dominik/Developer/UW/myria-web/appengine/myria_web_main.py", line 386, in post
    compiled = compile_to_json(query, cached_logicalplan, physicalplan, catalog)
  File "/Users/dominik/Developer/UW/myria-web/appengine/raco/myrialang.py", line 972, in compile_to_json
    apply_schema_recursive(root_op, catalog)
  File "/Users/dominik/Developer/UW/myria-web/appengine/raco/myrialang.py", line 933, in apply_schema_recursive
    apply_schema_recursive(child, catalog)
  File "/Users/dominik/Developer/UW/myria-web/appengine/raco/myrialang.py", line 933, in apply_schema_recursive
    apply_schema_recursive(child, catalog)
  File "/Users/dominik/Developer/UW/myria-web/appengine/raco/myrialang.py", line 933, in apply_schema_recursive
    apply_schema_recursive(child, catalog)
  File "/Users/dominik/Developer/UW/myria-web/appengine/raco/myrialang.py", line 915, in apply_schema_recursive
    rel_scheme = catalog.get_scheme(rel_key)
  File "/Users/dominik/Developer/UW/myria-web/appengine/myria_web_main.py", line 101, in get_scheme
    'userName': rel_key.user,
AttributeError: 'unicode' object has no attribute 'user'

Changing

diff --git a/raco/myrialang.py b/raco/myrialang.py
index 5c3607f..3e3de49 100644
--- a/raco/myrialang.py
+++ b/raco/myrialang.py
@@ -2,6 +2,7 @@ from collections import defaultdict

 from raco import algebra
 from raco import rules
+from raco import relation_key
 from raco.scheme import Scheme
 from raco import expression
 from raco.language import Language
@@ -911,7 +912,7 @@ def apply_schema_recursive(operator, catalog):
             rel_scheme = catalog.get_scheme(rel_key)
         elif isinstance(operator, MyriaScanTemp):
             # Temp Scan. Is this handled correctly? No clue.
-            rel_key = operator.name
+            rel_key = relation_key.RelationKey.from_string(operator.name)
             rel_scheme = catalog.get_scheme(rel_key)

         if rel_scheme:

Does not solve the problem, because compile errors (in compile to JSON) then show up.

save state when moving forward and backward

Right now we do a lot of automatic AJAX stuff on page load. One side effect of this is that if you accidentally hit the back button while writing a query, you lose all state. Yikes! This just bit us twice in a row.

The website should pick up where you left off if you hit backwards & forwards. A good model is GitHub, which somehow does this magically.

MyriaL examples don't work

The sigma clipping example leads to the following error

Error 500 (Internal Server Error): 

Traceback (most recent call last):
  File "/base/data/home/runtimes/python27/python27_lib/versions/third_party/webapp2-2.5.2/webapp2.py", line 570, in dispatch
    return method(*args, **kwargs)
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/myria_web_main.py", line 291, in post
    self.get()
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/myria_web_main.py", line 297, in get
    plan = get_logical_plan(query, language)
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/myria_web_main.py", line 68, in get_logical_plan
    return get_plan(query, language, 'logical')
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/myria_web_main.py", line 54, in get_plan
    processor.evaluate(parsed)
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/interpreter.py", line 263, in evaluate
    method(*statement[1:])
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/interpreter.py", line 327, in dowhile
    self.__materialize_result(_id, expr, body_ops)
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/interpreter.py", line 267, in __materialize_result
    child_op = self.ep.evaluate(expr)
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/interpreter.py", line 39, in evaluate
    return method(*expr[1:])
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/interpreter.py", line 138, in bagcomp
    orig_op, _info = multiway.merge(from_args)
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/multiway.py", line 67, in merge
    return (op, __calculate_offsets(from_args))
  File "/base/data/home/apps/s~myria-web/1.373713340623326520/raco/myrial/multiway.py", line 45, in __calculate_offsets
    index += len(from_args[_id].scheme())
TypeError: object of type 'NoneType' has no len()

README instructions to generate parsetab.py don't run

myrial.py no longer exists in datalogcompiler.

Running ./scripts/myrial examples/reachable.myl in datalogcompiler generates error:

Traceback (most recent call last):
  File "./scripts/myrial", line 5, in <module>
    import raco.myrial.interpreter as interpreter
ImportError: No module named raco.myrial.interpreter

How does one generate this file?

preview results

Some kind of dumb preview of results (mostly a UI thing, not a deep internal fancy thing)

postgres issues with doubles

See query http://vega.cs.washington.edu:1776/query/query-410:

AllNorm = SCAN(armbrustlab:seaflow:allnorm2);

AllBounds = SELECT Cruise, Day, File_Id
     , MIN(pe_norm) as min_pe
     , MIN(fsc_small_norm) as min_fsc_small
     , MIN(fsc_perp_norm) as min_fsc_perp
     , MIN(chl_small_norm) as min_chl_small
     , MAX(pe_norm) as max_pe
     , MAX(fsc_small_norm) as max_fsc_small
     , MAX(fsc_perp_norm) as max_fsc_perp
     , MAX(chl_small_norm) as max_chl_small
FROM AllNorm;

STORE(AllBounds, armbrustlab:seaflow:allbounds2);

This query, which seemingly inserts data that is read from the database, throws the following error:

edu.washington.escience.myria.DbException: org.postgresql.util.PSQLException: ERROR: \"4.9E-324\" is out of range for type double precision\n  Where: COPY armbrustlab MyriaSysTemp allbounds2, line 55, column max_pe: \"4.9E-324\"

Looks like this is a known issue and a bug of some sort in PostgreSQL [1], [2]. There may even be a fix [3], but it is not in the postgres 9.1 that we currently run.
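
For context, 4.9E-324 is how Java prints Double.MIN_VALUE, the smallest positive subnormal double, which lies far below the smallest normal double (about 2.2E-308). The Python check below illustrates this; it says nothing about PostgreSQL's own parser and is only meant to show why this particular value is an edge case.

```python
import sys

value = float("4.9E-324")

# The smallest positive subnormal double; Python prints it as 5e-324.
print(value == 5e-324)             # True
# It is below the smallest *normal* double.
print(value < sys.float_info.min)  # True
# Halving it underflows to zero, so nothing smaller is representable.
print(value / 2 == 0.0)            # True
```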

profiling: what does `tuples returned` mean?

https://demo.myria.cs.washington.edu/profile?queryId=2234

Click fragment 2 and zoom to the beginning of the window.

[Screenshot: 2014-04-09, 12:59:57 AM]

V7 is the TableScan; V2 is the ShuffleProducer.

Mouse-over (72, dark blue bar) says V7: null returned, ...
Mouse-over (72, light blue bar) says V2: 38 tuples returned, ...

I'm trying to figure out how V7 returns null then V2 returns 38 tuples.

  • Maybe V7's null is real -- this triggered sending a batch over the network. In this case, there must be an earlier non-null result, but I can't move the window farther left and I can't mouse over it.
  • Maybe V7's null is wrong and it really returned 38 tuples.

I'm not sure which, or something else.

Perhaps relatedly, I'm confused how V0, a DbInsert, returns tuples.

Some sort of SAFEDIV function?

SAFEDIV(a,b, optional c) would expand to

CASE WHEN b = 0 THEN c ELSE a/b END

and if c is not provided we would just use 0.
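
The intended semantics can be sketched in Python 3 as below. This mirrors the CASE expression above and is not an existing Raco or MyriaL builtin; the lowercase name `safediv` is chosen for the sketch only.

```python
def safediv(a, b, c=0):
    """Return a / b, or the fallback c (default 0) when b is zero."""
    return c if b == 0 else a / b


print(safediv(6, 3))      # 2.0
print(safediv(1, 0))      # 0
print(safediv(1, 0, -1))  # -1
```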

Myrial language reference

Now that Raco provides a way to inspect the keywords, I think we should extend it to provide a description of what the keyword does (perhaps Python docstring for builtins and something similar for expression library?). Then we can have a "language reference" tab in the editor, which would be extremely useful I think!

history restoration broken - race condition?

My history restoration is either not working or is being over-written; I always see Datalog when I load the page although something else may flicker briefly.

I also see this error:

[Screenshot: 2014-05-07, 1:03:57 AM]

Pushing computation in-database

Everything before the first shuffle (broadcast, collect) can be pushed into the database using a QueryScan operator instead of a TableScan operator.
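
One way to picture this is to find the run of operators that precedes the first shuffle-like operator. The sketch below assumes a plan is a simple linear list of operator names starting at the scan; the operator names and `split_pushdown` are illustrative and do not reflect Raco's actual plan representation.

```python
# Operators that move data between workers (names are illustrative).
NETWORK_OPS = {"Shuffle", "Broadcast", "Collect"}


def split_pushdown(operators):
    """Split a linear plan into (pushable prefix, remainder).

    Everything before the first network operator could, in principle,
    be expressed as a single database query.
    """
    for i, op in enumerate(operators):
        if op in NETWORK_OPS:
            return operators[:i], operators[i:]
    return operators, []


plan = ["TableScan", "Select", "Project", "Shuffle", "Join", "Store"]
pushable, rest = split_pushdown(plan)
print(pushable)  # ['TableScan', 'Select', 'Project']
print(rest)      # ['Shuffle', 'Join', 'Store']
```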
